
Structural models in the assessment of protein druggability based on HTS data

Journal of Computer-Aided Molecular Design

Abstract

Insight into the potential of target proteins to bind small molecules with high affinity can be derived from their three-dimensional structures, especially those of their binding pockets. The present study uses high-throughput screening (HTS) results on various targets to obtain predictive mathematical models that identify a minimal set of structural parameters contributing significantly to the hit rates, or to the affinity of the protein binding pockets for small molecular entities. To focus on the target-variation aspect of the data, we consider compounds commonly tested against the HTS targets. We identify ‘four-parameter’ models with \( R^{2} \), \( R_{\text{adj}}^{2} \), SEE, and LOO \( q^{2} \) values of 0.70, 0.60, 0.27 and 0.50, respectively, or better. Cross-validation exercises show that our regression models apply well to varied data sets. These models can therefore be used to estimate hit rates for HTS campaigns and thereby to prioritize drug targets before they undergo such resource-intensive experimental screening and follow-up.
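The four statistics quoted in the abstract can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit with an intercept, the common convention of computing the leave-one-out statistic as one minus PRESS divided by the total sum of squares about the overall mean, and purely synthetic data. The function names (`fit_ols`, `predict`, `regression_stats`) and the random descriptors are hypothetical and are not taken from the study.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns coefficients, intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients (intercept first) to a 2-D descriptor array."""
    return coef[0] + X @ coef[1:]


def regression_stats(X, y):
    """Return R^2, adjusted R^2, SEE and leave-one-out q^2 for a linear model."""
    n, p = X.shape
    coef = fit_ols(X, y)
    resid = y - predict(coef, X)
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())

    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R^2 penalises the number of fitted parameters p.
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    # Standard error of estimate uses n - p - 1 degrees of freedom.
    see = float(np.sqrt(ss_res / (n - p - 1)))

    # Leave-one-out: refit without sample i, then predict sample i.
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        c = fit_ols(X[keep], y[keep])
        press += float((y[i] - predict(c, X[i:i + 1])[0]) ** 2)
    q2 = 1.0 - press / ss_tot  # assumed convention: overall mean in SS_tot

    return {"r2": r2, "r2_adj": r2_adj, "see": see, "q2": q2}


# Synthetic stand-in data: 20 hypothetical targets, 4 hypothetical descriptors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 4))
y_demo = X_demo @ np.array([0.5, -0.3, 0.2, 0.1]) + 0.2 * rng.normal(size=20)

stats = regression_stats(X_demo, y_demo)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Because each leave-one-out residual is at least as large in magnitude as the corresponding fitted residual, the \( q^{2} \) value from this sketch never exceeds \( R^{2} \), which matches the ordering of the values reported in the abstract.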


Notes

  1. Tanimoto TT, IBM Internal Report, November 17, 1957.

Abbreviations

CCT: Common compounds tested

FDA: Food and Drug Administration (U.S. Department of Health and Human Services)

HR: Hit rate

HTS: High-throughput screening

IQR: Interquartile range

LOO: Leave-one-out

MWSS: Model without site score

NME: New/novel molecular entity

NMR: Nuclear magnetic resonance

SEE: Standard error of estimate


Acknowledgments

We would like to thank Dr. Stefan Schmitt (SS) for offering valuable suggestions during the course of this project. We are also grateful to SS, Drs. Bheemarao Ugarkar, Manoranjan Panda and Raghuram Tangirala for their comments on the manuscript.

Author information

Corresponding author

Correspondence to Kothandaraman Seshadri.

About this article

Cite this article

Gupta, A., Gupta, A.K. & Seshadri, K. Structural models in the assessment of protein druggability based on HTS data. J Comput Aided Mol Des 23, 583–592 (2009). https://doi.org/10.1007/s10822-009-9279-y

  • DOI: https://doi.org/10.1007/s10822-009-9279-y
