Open Access, CC BY 4.0 license. Published by De Gruyter Open Access, February 17, 2020

Atom-specific persistent homology and its application to protein flexibility analysis

  • David Bramer and Guo-Wei Wei

Abstract

Persistent homology has recently had considerable success in biomolecular data analysis. It examines the topological relationships, or connectivity, of a group of atoms in a molecule at a variety of scales and renders a family of topological representations of the molecule. However, persistent homology is rarely employed for the analysis of atomic properties, such as biomolecular flexibility analysis or B-factor prediction. This work introduces atom-specific persistent homology to provide a local, atomic-level representation of a molecule via a global topological tool. This is achieved through the construction of a pair of conjugated sets of atoms and the corresponding conjugated simplicial complexes, as well as conjugated topological spaces. The difference between the topological invariants of the pair of conjugated sets is measured by the bottleneck and Wasserstein metrics, and leads to an atom-specific topological representation of individual atomic properties in a molecule. Atom-specific topological features are integrated with various machine learning algorithms, including gradient boosting trees and convolutional neural networks, for protein thermal fluctuation analysis and B-factor prediction. Extensive numerical results indicate that the proposed method provides a powerful topological tool for analyzing and predicting localized information in complex macromolecules.
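The idea of comparing a pair of conjugated atom sets can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: the conjugated pair is taken to be a point set with and without the atom of interest; only dimension-0 homology of a Vietoris-Rips filtration is used, with the scale parameter equal to the pairwise distance; and the infinite bar is ignored. Under these assumptions every finite bar is born at 0 and dies at a minimum-spanning-tree edge length, so both metrics reduce to a one-dimensional alignment that a small dynamic program can solve. The function names `h0_deaths`, `h0_distance`, and `atom_specific_distances` are hypothetical.

```python
import math
from operator import add


def h0_deaths(points):
    """Finite H0 death values of a Vietoris-Rips filtration (all births are 0).

    Assumption: the filtration value of an edge is its length, so the deaths
    are the edge lengths of a minimum spanning tree (Prim's algorithm).
    """
    n = len(points)
    if n < 2:
        return []
    best = [math.dist(points[0], p) for p in points]  # distance to the tree
    in_tree = [False] * n
    in_tree[0] = True
    deaths = []
    for _ in range(n - 1):
        j = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        deaths.append(best[j])
        in_tree[j] = True
        for i in range(n):
            if not in_tree[i]:
                best[i] = min(best[i], math.dist(points[j], points[i]))
    return sorted(deaths)


def h0_distance(a, b, combine):
    """Distance between two H0 diagrams given as sorted death lists.

    combine=max gives the bottleneck distance; combine=add gives the
    1-Wasserstein distance (L-infinity ground metric). A bar (0, d) costs
    d / 2 to send to the diagonal and |d1 - d2| to match with another bar.
    Because all births are 0, an optimal matching can be taken in sorted
    order, so an edit-distance style table is enough.
    """
    n, m = len(a), len(b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = combine(dp[i - 1][0], a[i - 1] / 2)
    for j in range(1, m + 1):
        dp[0][j] = combine(dp[0][j - 1], b[j - 1] / 2)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(
                combine(dp[i - 1][j - 1], abs(a[i - 1] - b[j - 1])),
                combine(dp[i - 1][j], a[i - 1] / 2),
                combine(dp[i][j - 1], b[j - 1] / 2),
            )
    return dp[n][m]


def atom_specific_distances(points, index):
    """(bottleneck, Wasserstein) between diagrams with and without one atom."""
    full = h0_deaths(points)
    reduced = h0_deaths(points[:index] + points[index + 1:])
    return h0_distance(full, reduced, max), h0_distance(full, reduced, add)


# Toy coordinates (hypothetical, not from any protein structure).
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(atom_specific_distances(atoms, 3))  # (1.5, 1.5)
```

In this toy case, removing the isolated fourth atom deletes the single long bar, so both distances equal half its length. Higher-dimensional features and general diagrams would need a dedicated persistent homology package rather than this shortcut.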

Received: 2019-11-14
Accepted: 2020-01-09
Published Online: 2020-02-17

© 2020 David Bramer et al., published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 30.5.2024 from https://www.degruyter.com/document/doi/10.1515/cmb-2020-0001/html