Customizing scoring functions for docking

Journal of Computer-Aided Molecular Design

Abstract

Empirical scoring functions used in protein-ligand docking calculations are typically trained on a dataset of complexes with known affinities with the aim of generalizing across different docking applications. We report a novel method of scoring-function optimization that supports the use of additional information to constrain scoring-function parameters, which can be used to focus a scoring function’s training towards a particular application, such as screening enrichment. The approach combines multiple instance learning, positive data in the form of ligands of protein binding sites of known and unknown affinity and binding geometry, and negative (decoy) data of ligands thought not to bind particular protein binding sites or known not to bind in particular geometries. Performance of the method for the Surflex-Dock scoring function is shown in cross-validation studies and in eight blind test cases. Tuned functions optimized with a sufficient amount of data exhibited either improved or undiminished screening performance relative to the original function across all eight complexes. Analysis of the changes to the scoring function suggests that modifications can be learned that are related to protein-specific features such as active-site mobility.
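As a rough illustration of how multiple instance learning can combine these three kinds of training data, the sketch below scores each ligand by its best-scoring pose and adds one penalty term per data type. It is not the Surflex-Dock scoring function or the authors' optimization procedure: the linear scorer, the pose feature vectors, the squared-hinge penalties, and the `active_floor` and `decoy_ceiling` thresholds are all assumptions made for the example, and decoy geometries for known binders are not modeled.

```python
from typing import Callable, List, Sequence, Tuple

Pose = Sequence[float]  # hypothetical: a pose reduced to a feature vector
Scorer = Callable[[Pose], float]


def linear_scorer(weights: Sequence[float]) -> Scorer:
    """Toy linear scoring function standing in for a real empirical one."""
    def score(pose: Pose) -> float:
        return sum(w * x for w, x in zip(weights, pose))
    return score


def bag_score(score: Scorer, poses: List[Pose]) -> float:
    """Multiple-instance view: a ligand (bag of poses) gets its best pose score."""
    return max(score(p) for p in poses)


def mil_loss(
    score: Scorer,
    known: List[Tuple[List[Pose], float]],  # ligands with measured affinity
    actives: List[List[Pose]],              # binders of unknown affinity
    decoys: List[List[Pose]],               # ligands thought not to bind
    active_floor: float = 6.0,              # assumed minimum score for a binder
    decoy_ceiling: float = 4.0,             # assumed maximum score for a decoy
) -> float:
    loss = 0.0
    # Known affinities: squared error between best-pose score and measurement.
    for poses, affinity in known:
        loss += (bag_score(score, poses) - affinity) ** 2
    # Actives of unknown affinity: penalize only scores below the floor.
    for poses in actives:
        loss += max(0.0, active_floor - bag_score(score, poses)) ** 2
    # Decoys: penalize only scores above the ceiling.
    for poses in decoys:
        loss += max(0.0, bag_score(score, poses) - decoy_ceiling) ** 2
    return loss


scorer = linear_scorer([1.0, 2.0])
example_loss = mil_loss(
    scorer,
    known=[([[1.0, 2.0], [3.0, 1.0]], 5.0)],
    actives=[[[1.0, 1.0]]],
    decoys=[[[2.0, 2.0], [0.0, 1.0]]],
)
```

Because the actives and decoys contribute only one-sided penalties, they constrain the parameters without requiring an affinity value, which is one way to read the abstract's description of using additional information to focus training on an application such as screening enrichment.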

Acknowledgements

The authors gratefully acknowledge the NIH for partial funding of the work (grant GM070481). Dr. Jain has a financial interest in BioPharmics LLC, a biotechnology company whose main focus is the development of methods for computational modeling in drug discovery. Tripos Inc. has exclusive commercial distribution rights for Surflex-Dock, licensed from BioPharmics LLC.

Author information

Corresponding author

Correspondence to Ajay N. Jain.

About this article

Cite this article

Pham, T.A., Jain, A.N. Customizing scoring functions for docking. J Comput Aided Mol Des 22, 269–286 (2008). https://doi.org/10.1007/s10822-008-9174-y

  • DOI: https://doi.org/10.1007/s10822-008-9174-y
