Applications of Big Data and AI-Driven Technologies in CADD (Computer-Aided Drug Design)

  • Protocol
Computational Drug Discovery and Design

Part of the book series: Methods in Molecular Biology (MIMB, volume 2714)

Abstract

Big data and AI-driven methodologies have advanced dramatically in computer-aided drug design (CADD). Drug design is expensive and time-consuming, largely because of its biomedical complexity. CADD offers effective and efficient strategies for overcoming these obstacles when designing and developing a new medicine. Data pre-processing methods are introduced to prepare raw data for consistent and repeatable application of big data and AI methodologies. In drug development, big data and AI technologies can be used to predict absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties, to find binding sites in target proteins, and to conduct structure-based virtual screening. Together, data pre-processing and big data and AI techniques make possible the accurate and thorough analysis of large amounts of biomedical data and the design of prediction models for drug design. In the era of biomedical big data, knowledge of the biological, chemical, or pharmacological structures of biomedical entities relevant to drug design should be analyzed with suitable big data and AI approaches.

Acknowledgments

This work was financially supported in 2023 by the College of Public Policy at Korea University.

Corresponding author

Correspondence to Jai Woo Lee.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Seo, S., Lee, J.W. (2024). Applications of Big Data and AI-Driven Technologies in CADD (Computer-Aided Drug Design). In: Gore, M., Jagtap, U.B. (eds) Computational Drug Discovery and Design. Methods in Molecular Biology, vol 2714. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3441-7_16

  • DOI: https://doi.org/10.1007/978-1-0716-3441-7_16

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-3440-0

  • Online ISBN: 978-1-0716-3441-7

  • eBook Packages: Springer Protocols
