Classification models for predicting the bioactivity of pan-TRK inhibitors and SAR analysis

Zhao, Xiaoman; Kong, Yue; Ji, Yueshan; Xin, Xiulan; Chen, Liang; Chen, Guang; Yu, Changyuan

doi:10.1007/s11030-023-10735-2

Original Article
Published: 01 November 2023

Molecular Diversity (2023)

Xiaoman Zhao¹,², Yue Kong¹, Yueshan Ji¹, Xiulan Xin², Liang Chen², Guang Chen¹ & Changyuan Yu¹

Abstract

Tropomyosin receptor kinases (TRKs) are important broad-spectrum anticancer targets. Oncogenic rearrangement of the NTRK gene disrupts the extracellular domain and the epitopes for therapeutic antibodies, making small-molecule inhibitors essential for treating NTRK fusion-driven tumors. In this work, several algorithms were used to construct descriptor-based and nondescriptor-based models, and the models were evaluated by outer 10-fold cross-validation. To find a model with good generalization ability, the dataset was partitioned by random splitting and by cluster splitting to construct in-domain and cross-domain models, respectively. Among the 48 models built, the model combining the deep neural network (DNN) algorithm with extended-connectivity fingerprint (ECFP4) descriptors achieved excellent performance under both dataset splits. The results indicate that the DNN algorithm generalizes well and that the richness of features plays a vital role in predicting molecules from unseen regions of chemical space. Additionally, we combined the clustering results with decision tree models built on fingerprint descriptors to perform structure–activity relationship (SAR) analysis. The analysis found that nitrogen-containing aromatic heterocyclic and benzo-fused heterocyclic structures play a crucial role in enhancing the activity of TRK inhibitors.
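
The contrast between random splitting (in-domain) and cluster splitting (cross-domain) described in the abstract can be illustrated with a small sketch. This is not the authors' implementation, which is available in their repository. The leader-style clustering, the Tanimoto threshold of 0.5, the toy bit sets standing in for ECFP4 fingerprints, and all function names are assumptions made for illustration, since the abstract does not state which clustering algorithm or parameters were used.

```python
import random
from typing import Dict, FrozenSet, List, Tuple

# Each fingerprint is the set of "on" bit indices (a toy stand-in for ECFP4).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared on-bits divided by the union of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def leader_cluster(fps: List[Fingerprint], threshold: float) -> List[int]:
    """Assign each fingerprint to the first cluster whose leader is at least
    `threshold` similar; otherwise it becomes the leader of a new cluster."""
    leaders: List[Fingerprint] = []
    labels: List[int] = []
    for fp in fps:
        for cluster_id, leader in enumerate(leaders):
            if tanimoto(fp, leader) >= threshold:
                labels.append(cluster_id)
                break
        else:  # no leader was similar enough
            leaders.append(fp)
            labels.append(len(leaders) - 1)
    return labels


def random_split(n: int, test_fraction: float, seed: int) -> Tuple[List[int], List[int]]:
    """In-domain split: shuffle indices and hold out a fixed fraction."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def cluster_split(labels: List[int], test_fraction: float, seed: int) -> Tuple[List[int], List[int]]:
    """Cross-domain split: hold out whole clusters until the test set
    reaches roughly `test_fraction` of the data."""
    clusters: Dict[int, List[int]] = {}
    for index, cluster_id in enumerate(labels):
        clusters.setdefault(cluster_id, []).append(index)
    order = list(clusters)
    random.Random(seed).shuffle(order)
    target = max(1, round(len(labels) * test_fraction))
    test: List[int] = []
    for cluster_id in order:
        if len(test) >= target:
            break
        test.extend(clusters[cluster_id])
    held_out = set(test)
    train = [i for i in range(len(labels)) if i not in held_out]
    return train, sorted(test)


def toy_fingerprints() -> List[Fingerprint]:
    """Three hypothetical scaffold families; members share three core bits."""
    return [
        frozenset({1, 2, 3, 4}), frozenset({1, 2, 3, 5}), frozenset({1, 2, 3, 6}),
        frozenset({10, 11, 12, 13}), frozenset({10, 11, 12, 14}),
        frozenset({20, 21, 22, 23}), frozenset({20, 21, 22, 24}), frozenset({20, 21, 22, 25}),
    ]


if __name__ == "__main__":
    fps = toy_fingerprints()
    labels = leader_cluster(fps, threshold=0.5)
    print("cluster labels:", labels)
    print("random split (train, test):", random_split(len(fps), 0.25, seed=0))
    print("cluster split (train, test):", cluster_split(labels, 0.25, seed=0))
```

With the toy data, `cluster_split` holds out one whole family, so no test fingerprint shares a cluster with any training fingerprint, whereas `random_split` can place members of the same family on both sides. A model that scores well under both schemes is the kind of result the abstract reports for the DNN and ECFP4 combination.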

Graphical abstract

Workflow for generating predictive models for TRK inhibitors.

Data availability

The datasets used in this work can be found in the Supplementary Information. The source of the Reaxys dataset can be found at https://www.reaxys.com/. The source of the ChEMBL dataset can be found at https://www.ebi.ac.uk/chembl/.

Code availability

The source code used in this work is freely available in a GitHub repository (https://github.com/CathyZakZhao/Classifier-for-Trk-Inhibitor).

Funding

This work was supported by The Research on National Reference Material and Product Development of Natural Products (SG030801).

Author information

Authors and Affiliations

College of Life Science and Technology, Beijing University of Chemical Technology, 15 BeiSanHuan East Road, Beijing, 100029, People’s Republic of China
Xiaoman Zhao, Yue Kong, Yueshan Ji, Guang Chen & Changyuan Yu
College of Bioengineering, No. 9 Liangshuihe 1st Street, Beijing, 100176, People’s Republic of China
Xiaoman Zhao, Xiulan Xin & Liang Chen

Contributions

XZ and YK conceived the experiments. XZ and YJ collected and organized data. XZ evaluated the models. XX and LC performed analysis. CY and GC modified the language. CY contributed to project administration. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Changyuan Yu.

Ethics declarations

Conflicts of interest

The authors confirm that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 997 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, X., Kong, Y., Ji, Y. et al. Classification models for predicting the bioactivity of pan-TRK inhibitors and SAR analysis. Mol Divers (2023). https://doi.org/10.1007/s11030-023-10735-2

Received: 14 July 2023
Accepted: 22 September 2023
Published: 01 November 2023
DOI: https://doi.org/10.1007/s11030-023-10735-2
