Abstract
A data set of 1040 drug candidates was divided into a training set of 832 compounds and a test set of 208 compounds. The training set was used to estimate a model for classification into two classes with respect to membrane permeation in a cell-based assay: 1) apparent permeability below 4 × 10⁻⁶ cm/s and 2) apparent permeability of 4 × 10⁻⁶ cm/s or higher. Nine molecular descriptors were calculated for each compound, and six classification techniques were applied: k-Nearest Neighbor, Linear and Quadratic Discriminant Analysis, Discriminant Adaptive Nearest-Neighbor, Soft Independent Modeling of Class Analogy, and Classification Tree. A Discriminant Adaptive Nearest-Neighbor model based on four descriptors (number of flexible bonds, number of hydrogen bond donors, molecular weight, and molecular polar surface area) was selected as the best model. The selection was based on cross-validation and a new weighted classification accuracy measure introduced in this study. In the test set of 208 compounds, 9% were not classified. The false positive rate was 0.08 and the sensitivity was 0.76.
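The reported quantities (class boundary, unclassified fraction, false positive rate, sensitivity) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the high-permeability class (class 2) is the "positive" class, that a prediction of `None` marks an unclassified compound, and that unclassified compounds are left out of the two rates; the abstract does not state any of these. The function names and the toy labels are hypothetical, and the weighted classification accuracy measure is not reproduced because its definition is not given here.

```python
from typing import Optional, Sequence, Tuple

# Class boundary stated in the abstract, in cm/s.
THRESHOLD_CM_PER_S = 4e-6


def permeability_class(papp_cm_per_s: float) -> int:
    """Return 1 for apparent permeability >= 4e-6 cm/s, else 0.

    Treating the high-permeability class as the positive class (1)
    is an assumption made for this sketch.
    """
    return 1 if papp_cm_per_s >= THRESHOLD_CM_PER_S else 0


def classification_rates(
    y_true: Sequence[int],
    y_pred: Sequence[Optional[int]],
) -> Tuple[float, float, float]:
    """Return (sensitivity, false positive rate, unclassified fraction).

    A prediction of None means the compound was not classified.
    Such compounds are excluded from sensitivity and false positive
    rate (an assumption) and counted in the unclassified fraction.
    """
    tp = fp = tn = fn = unclassified = 0
    for truth, pred in zip(y_true, y_pred):
        if pred is None:
            unclassified += 1
        elif truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    false_positive_rate = fp / (fp + tn) if (fp + tn) else float("nan")
    unclassified_fraction = unclassified / len(y_true) if len(y_true) else float("nan")
    return sensitivity, false_positive_rate, unclassified_fraction


# Toy labels for illustration only; these are not data from the study.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 1, None, None]
toy_rates = classification_rates(toy_true, toy_pred)
```

With real test-set labels and predictions, the same function would return the three figures quoted at the end of the abstract, provided the assumptions above match the conventions used in the study.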
| Original language | English |
|---|---|
| Journal | QSAR and Combinatorial Science |
| Volume | 24 |
| Issue number | 4 |
| Pages (from-to) | 449-457 |
| ISSN | 1611-020X |
| DOIs | |
| Publication status | Published - 2005 |
| Externally published | Yes |
Keywords
- Drug candidates
- Membrane permeation
- k-Nearest Neighbor (k-NN)
- Linear Discriminant Analysis (LDA)
- Quadratic Discriminant Analysis (QDA)
- Discriminant Adaptive Nearest-Neighbor (DANN)
- Soft Independent Modeling of Class Analogy (SIMCA)
Fingerprint
Dive into the research topics of 'Classification of Membrane Permeability of Drug Candidates: A Methodological Investigation'. Together they form a unique fingerprint.