Predicting Mouse Liver Microsomal Stability with “Pruned” Machine Learning Models and Public Data

  • Research Paper
Pharmaceutical Research

ABSTRACT

Purpose

Mouse efficacy studies are a critical hurdle in advancing translational research on potential therapeutic compounds for many diseases. Although mouse liver microsomal (MLM) stability studies are not a perfect surrogate for in vivo studies of metabolic clearance, they are the initial model system used to assess metabolic stability. Consequently, we explored the development of machine learning models that can enhance the probability of identifying compounds possessing MLM stability.

Methods

Published assays on MLM half-life values were identified in PubChem, reformatted, and curated to create a training set with 894 unique small molecules. These data were used to construct machine learning models assessed with internal cross-validation, external tests with a published set of antitubercular compounds, and independent validation with an additional diverse set of 571 compounds (PubChem data on percent metabolism).

Results

“Pruning” out the moderately unstable / moderately stable compounds from the training set produced models with superior predictive power. Bayesian models displayed the best predictive power for identifying compounds with a half-life ≥1 h.
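The pruning idea can be illustrated with a minimal sketch. It assumes half-lives in minutes and takes the ≥1 h (60 min) cutoff for "stable" from the text above; the 30 min upper bound for "unstable" and the names `Compound` and `label_with_pruning` are hypothetical choices made only for illustration, since the abstract does not state the boundaries of the pruned band.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Compound:
    cid: int               # hypothetical identifier field
    half_life_min: float   # MLM half-life in minutes


def label_with_pruning(
    compounds: List[Compound],
    stable_min: float = 60.0,    # half-life >= 1 h counts as stable (from the abstract)
    unstable_max: float = 30.0,  # ASSUMPTION: illustrative cutoff, not taken from the abstract
) -> List[Tuple[Compound, int]]:
    """Label compounds 1 (stable) or 0 (unstable) and drop the middle band."""
    if unstable_max > stable_min:
        raise ValueError("unstable_max must not exceed stable_min")
    labeled = []
    for c in compounds:
        if c.half_life_min >= stable_min:
            labeled.append((c, 1))
        elif c.half_life_min < unstable_max:
            labeled.append((c, 0))
        # Compounds between the two cutoffs are "pruned" from the training set.
    return labeled
```

Setting `unstable_max` equal to `stable_min` recovers the unpruned, full training set, which makes it straightforward to compare the two variants under the same modeling pipeline.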

Conclusions

Our results suggest that the pruning strategy may be of general benefit in improving test set enrichment and in providing machine learning models with enhanced predictive value for the MLM stability of small organic molecules. This work represents the most exhaustive study to date of machine learning approaches applied to MLM data from public sources.
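As a rough illustration of what test set enrichment can mean in practice, the sketch below computes the positive predictive value (PPV) of a binary classifier and divides it by the fraction of stable compounds in the test set. Defining enrichment as PPV over the base rate is an assumption made here for illustration; the abstract does not give the formula used in the study, and the function name `ppv_and_enrichment` is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def ppv_and_enrichment(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Optional[Tuple[float, float]]:
    """Return (PPV, enrichment) for binary labels, 1 = stable and 0 = unstable.

    ASSUMPTION: enrichment is taken as PPV divided by the fraction of
    truly stable compounds in the test set.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    base_rate = sum(1 for t in y_true if t == 1) / len(y_true)
    if tp + fp == 0 or base_rate == 0:
        return None  # undefined when nothing is predicted stable or nothing is stable
    ppv = tp / (tp + fp)
    return ppv, ppv / base_rate
```

An enrichment above 1 under this definition would indicate that compounds predicted to be stable are more likely to be stable than compounds picked at random from the same test set.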

Abbreviations

ADME/Tox: Absorption, distribution, metabolism, excretion and toxicity

CDD: Collaborative Drug Discovery

FCFP_6: Molecular function class fingerprints of maximum diameter 6

HLM: Human liver microsomal stability

HTS: High throughput screens

Mtb: Mycobacterium tuberculosis

PPV: Positive predictive value

QSAR: Quantitative structure-activity relationships

ROC: Receiver-operator characteristic

SAR: Structure-activity relationship

SVM: Support vector machine

ACKNOWLEDGMENTS AND DISCLOSURES

J.S.F., S.E., and A.L.P. were supported by Award Number 1U19AI109713 NIH/NIAID for the “Center to develop therapeutic countermeasures to high-threat bacterial agents,” from the National Institutes of Health: Centers of Excellence for Translational Research (CETR). S.E. and J.S.F. were also supported in part by Award Number 9R44TR000942-02 “Biocomputation across distributed private datasets to enhance drug discovery” from the National Center for Advancing Translational Sciences. We thank Dr. John Piwinski for suggesting that an MLM t1/2 of ≥60 min was ideal, but a t1/2 of ≥30 min was not significantly unfavorable. S.E. kindly acknowledges Alex Clark, Molecular Materials Informatics, Inc., and Krishna Dole and colleagues at Collaborative Drug Discovery, Inc., for their development of CDD Models. We thank Thomas Mayo at BIOVIA (formerly known as Accelrys, Inc.) for providing S.E. and J.S.F. with Discovery Studio and Pipeline Pilot. We also thank Jodi Shaulsky at BIOVIA and Katalin Nadassy for assistance with setting up and maintaining the license server and Pipeline Pilot server. S.E. is a consultant for Collaborative Drug Discovery, Inc.

Author information

Corresponding author

Correspondence to Joel S. Freundlich.

Electronic supplementary material

Supplementary Material Available: Supplemental information consists of 14 figures (i.e., a workflow describing the percent compound left set, many internal and external ROC curves, and PCA plots), 8 tables and commentary (describing internal and external results of SVM and RP-Random Forest models, as well as additional half-life Bayesians that were created using different types of 2D topological fingerprints and numbers of bins), and the sdf files of the curated, full and pruned versions of the MLM half-life and percent compound left datasets. These curated sdf files contain the PubChem CID numbers, structural information, MLM stability data, qualifiers/notes (such as < or > and comments on the details of certain assay methods or the series of compounds of which that molecule is a member), our binary classification (1 = stable and 0 = unstable), and the AID reference numbers that cite the source of the assay results on PubChem for each and every compound utilized. This material is available free of charge via the Internet at http://link.springer.com/journal/11095/.

Supplementary Material 1 (DOCX 2331 kb)

Supplementary Material 2 (SDF 2101 kb)

Supplementary Material 3 (SDF 2211 kb)

Supplementary Material 4 (SDF 59 kb)

Supplementary Material 5 (SDF 2877 kb)

Supplementary Material 6 (SDF 3370 kb)

Cite this article

Perryman, A.L., Stratton, T.P., Ekins, S. et al. Predicting Mouse Liver Microsomal Stability with “Pruned” Machine Learning Models and Public Data. Pharm Res 33, 433–449 (2016). https://doi.org/10.1007/s11095-015-1800-5

  • DOI: https://doi.org/10.1007/s11095-015-1800-5
