Persistence diagrams with linear machine learning models

Obayashi, Ippei; Hiraoka, Yasuaki; Kimura, Masao

doi:10.1007/s41468-018-0013-5

Published: 05 May 2018

Volume 1, pages 421–449 (2018)

Journal of Applied and Computational Topology

Abstract

Persistence diagrams are widely recognized as compact descriptors of multiscale topological features in data. When many datasets are available, statistical features embedded in their persistence diagrams can be extracted by machine learning. For practical applications, it is particularly important to be able to map those statistical features of persistence diagrams back explicitly to the original data space (inverse analysis). In this paper, we propose a unified method for this inverse analysis that combines linear machine learning models with persistence images. The method is applied to point clouds and cubical sets, demonstrating the capability of the statistical inverse analysis and its advantages.
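As a rough illustration of why a linear model on persistence images lends itself to inverse analysis, the following is a minimal sketch, not the authors' implementation. It assumes a simplified persistence image (Gaussian bumps weighted by lifetime; the weight function, `sigma`, and the grid are illustrative choices) and a closed-form ridge regression written in NumPy. The helper names `persistence_image`, `fit_ridge`, and `toy_diagram`, and the synthetic two-class data, are hypothetical. Because the model is linear, each learned coefficient corresponds to one cell of the birth-death grid, so the coefficient vector can be reshaped and read as a map over the diagram.

```python
import numpy as np

def persistence_image(diagram, grid, sigma=0.05):
    """Vectorize a diagram (rows of (birth, death)) on a square grid.

    Simplified, assumed form: a sum of Gaussians weighted by lifetime.
    """
    births, deaths = np.meshgrid(grid, grid, indexing="ij")
    image = np.zeros_like(births)
    for b, d in diagram:
        lifetime = d - b  # assumed weight; other weight functions are possible
        image += lifetime * np.exp(
            -((births - b) ** 2 + (deaths - d) ** 2) / (2 * sigma ** 2)
        )
    return image.ravel()

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression on centered data; returns the coefficients."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)

def toy_diagram(center, rng, n_points=3, jitter=0.01):
    """Hypothetical synthetic diagram: a few pairs scattered around `center`."""
    return np.asarray(center) + rng.normal(0.0, jitter, size=(n_points, 2))

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 20)
centers = {0: (0.2, 0.4), 1: (0.6, 0.9)}  # class-specific (birth, death) regions
labels = np.array([0, 1] * 20)
X = np.stack(
    [persistence_image(toy_diagram(centers[int(c)], rng), grid) for c in labels]
)

# One coefficient per grid cell: reshape to view it on the birth-death plane.
coef = fit_ridge(X, labels.astype(float)).reshape(len(grid), len(grid))
i, j = np.unravel_index(np.argmax(coef), coef.shape)
peak_birth, peak_death = grid[i], grid[j]
print(f"largest positive coefficient near birth={peak_birth:.2f}, death={peak_death:.2f}")
```

In this toy setting the largest positive coefficient should fall near the region where the class-1 diagrams concentrate. Identifying which birth-death pairs lie in such a region, and then locating them in the original data, is the part of the inverse analysis that this sketch does not cover.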

Notes

A topological space X with $\tilde{H}_{q}(X)=0$ for any q is called acyclic, where $\tilde{H}_q(X)$ is the reduced homology of X.
A multiset is a set in which each point carries a multiplicity.
In Robins et al. (2016), the birth/death positions are called critical points.
CGAL: https://www.cgal.org/ (Da et al. 2017).
Scikit-learn: http://scikit-learn.org/ (Pedregosa et al. 2011).
http://www.wpi-aimr.tohoku.ac.jp/hiraoka_labo/homcloud/index.en.html.
DIPHA: A Distributed Persistent Homology Algorithm (Bauer et al. 2014).
SciPy: Open Source Scientific Tools for Python, 2001-, http://www.scipy.org/ (Jones et al. 2011–).
https://opencv.org/
In the paper (Kimura et al. 2017), images in the final stage are also used. In this paper, we only use early and intermediate stage images to focus on the initial changes in the reaction.

References

Adams, H., Chepushtanova, S., Emerson, T., Hanson, E., Kirby, M., Motta, F., Neville, R., Peterson, C., Shipman, P., Ziegelmeier, L.: Persistence images: a stable vector representation of persistent homology. J. Mach. Learn. Res. 18(8), 1–35 (2017)
Bauer, U., Kerber, M., Reininghaus, J.: Distributed computation of persistent homology. Proceedings of the Sixteenth Workshop on Algorithm Engineering and Experiments (ALENEX) (2014)
Bauer, U., Kerber, M., Reininghaus, J., Wagner, H.: Phat—persistent homology algorithms toolbox. J. Symb. Comput. 78, 76–90 (2017)
Bingham, N.H., Fry, J.M.: Regression—Linear Models in Statistics. Springer, Berlin (2010)
Bishop, C.M.: Pattern Recognition and Machine Learning (Information Science and Statistics). Springer, Berlin (2007)
Bubenik, P.: Statistical topological data analysis using persistence landscapes. J. Mach. Learn. Res. 16(1), 77–102 (2015)
Buchet, M., Hiraoka, Y., Obayashi, I.: Persistent homology and materials informatics. In: Tanaka, I. (ed.) Nanoinformatics, pp. 75–95. Springer, Berlin (2018)
Carlsson, G.: Topology and data. Bull. Am. Math. Soc. 46, 255–308 (2009)
Chazal, F., Glisse, M., Labruère, C., Michel, B.: Convergence rates for persistence diagram estimation in topological data analysis. J. Mach. Learn. Res. 16, 3603–3635 (2015)
Chan, J.M., Carlsson, G., Rabadan, R.: Topology of viral evolution. PNAS 110(46), 18566–18571 (2013)
Cohen-Steiner, D., Edelsbrunner, H., Harer, J.: Stability of persistence diagrams. Discret. Comput. Geom. 37(1), 103–120 (2007)
Csurka, G., Bray, C., Dance, C., Fan, L.: Visual categorization with bags of keypoints. In: Proceedings of ECCV Workshop on Statistical Learning in Computer Vision, pp. 59–74 (2004)
Da, T.K.F., Loriot, S., Yvinec, M.: 3D Alpha Shapes. CGAL User and Reference Manual 4.11, CGAL Editorial Board (2017)
Delgado-Friedrichs, O., Robins, V., Sheppard, A.: Morse theory and persistent homology for topological analysis of 3D images of complex materials. In: 2014 IEEE International Conference on Image Processing (ICIP), pp. 4872–4876 (2014)
Delgado-Friedrichs, O., Robins, V., Sheppard, A.: Skeletonization and partitioning of digital images using discrete Morse theory. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 654–666 (2015)
de Silva, V., Ghrist, R.: Coverage in sensor networks via persistent homology. Algebraic Geom. Topol. 7, 339–358 (2007)
Dey, T.K., Hirani, A.N., Krishnamoorthy, B.: Optimal homologous cycles, total unimodularity and linear programming. SIAM J. Comput. 40(4), 1026–1044 (2011)
Edelsbrunner, H., Letscher, D., Zomorodian, A.: Topological persistence and simplification. Discret. Comput. Geom. 28(4), 511–533 (2002)
Edelsbrunner, H., Harer, J.: Computational Topology: An Introduction. AMS, Providence (2010)
Escolar, E.G., Hiraoka, Y.: Optimal cycles for persistent homology via linear programming. Optimization in the Real World Toward Solving Real-World Optimization Problems, pp. 79–96. Springer Japan, Osaka (2016)
Fasy, B.T., Lecci, F., Rinaldo, A., Wasserman, L., Balakrishnan, S., Singh, A.: Confidence sets for persistence diagrams. Ann. Stat. 42(6), 2301–2339 (2014)
Hiraoka, Y., Nakamura, T., Hirata, A., Escolar, E.G., Matsue, K., Nishiura, Y.: Hierarchical structures of amorphous solids characterized by persistent homology. Proc. Nat. Acad. Sci. USA 113, 7035–7040 (2016)
Ichinomiya, T., Obayashi, I., Hiraoka, Y.: Persistent homology analysis of craze formation. Phys. Rev. E 95(1), 012504 (2017)
Jones, E., Oliphant, T., Peterson, P., et al.: SciPy: Open source scientific tools for Python. http://www.scipy.org/ (2001–) [Online; accessed 2018-01-20]
Kaczynski, T., Mischaikow, K., Mrozek, M.: Computational Homology. Springer, Berlin (2004)
Kimura, M., Obayashi, I., Takeuchi, Y., Hiraoka, Y.: Finding trigger sites in heterogeneous reactions using persistent-homology without preliminary material scientific information. Sci. Rep. 8, 3553 (2018)
Kusano, G., Fukumizu, K., Hiraoka, Y.: Persistence weighted Gaussian kernel for topological data analysis. In: Proceedings of the 33rd International Conference on Machine Learning, JMLR: W&CP 48, pp. 2004–2013 (2016)
Kusano, G., Fukumizu, K., Hiraoka, Y.: Kernel method for persistence diagrams via kernel embedding and weight factor. Accepted in Journal of Machine Learning Research
Lowe, D.G.: Object recognition from local scale invariant features. In: Proc. of IEEE International Conference on Computer Vision, pp. 1150–1157 (1999)
Nowak, E., Jurie, F., Triggs, B.: Sampling Strategies for Bag-of-Features Image Classification. In: Computer Vision – ECCV 2006: 9th European Conference on Computer Vision, Graz, Austria, May 7-13, 2006, Proceedings, Part IV, pp. 490–503 (2006)
Otter, N., Porter, M.A., Tillmann, U., Grindrod, P., Harrington, H.A.: A roadmap for the computation of persistent homology. arXiv:1506.08903
Pearson, D.A., Bradley, R.M., Motta, F.C., Shipman, P.D.: Producing nanodot arrays with improved hexagonal order by patterning surfaces before ion sputtering. Phys. Rev. E 92(6), 062401 (2015)
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Rajan, K.: Materials informatics. Mater. Today 8(10), 38–45 (2005)
Rajan, K.: Materials informatics. Mater. Today 15(11), 470 (2012)
Reininghaus, J., Huber, S., Bauer, U., Kwitt, R.: A Stable Multi-Scale Kernel for Topological Machine Learning. 2015 IEEE Conference on Computer Vision and Pattern Recognition, 4741–4748 (2015)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Methodol.) 58(1), 267–288 (1996)
Robins, V., Turner, K.: Principal component analysis of persistent homology rank functions with case studies of spatial point patterns, sphere packing and colloids. Phys. D 334, 99–117 (2016)
Robins, V., Saadatfar, M., Delgado-Friedrichs, O., Sheppard, A.P.: Percolating length scales from topological persistence analysis of micro-CT images of porous materials. Water Resour. Res. 52(1), 315–329 (2016)
Saadatfar, M., Takeuchi, H., Francois, N., Robins, V., Hiraoka, Y.: Pore configuration landscape of granular crystallisation. Nat. Commun. 8, 15082 (2017). https://doi.org/10.1038/ncomms15082
Sivic, J., Zisserman, A.: Video Google: A Text Retrieval Approach to Object Matching in Videos. In: Proc. of IEEE International Conference on Computer Vision, pp. 1470–1477 (2003)
Turner, K., Mileyko, Y., Mukherjee, S., Harer, J.: Fréchet means for distributions of persistence diagrams. Discret. Comput. Geom. 52(1), 44–70 (2014)
Zomorodian, A., Carlsson, G.: Computing persistent homology. Discret. Comput. Geom. 33(2), 249–274 (2005)

Author information

Authors and Affiliations

Advanced Institute for Materials Research (WPI-AIMR), Tohoku University, 2-1-1 Katahira, Aoba-ku, Sendai, 980-8577, Japan
Ippei Obayashi
Kyoto University Institute for Advanced Study, Kyoto University, Yoshida Ushinomiya-cho, Sakyo-ku, Kyoto, 606-8501, Japan
Yasuaki Hiraoka
Center for Advanced Intelligence Project, RIKEN, Wako, Japan
Yasuaki Hiraoka
Center for Materials research by Information Integration (CMI2), National Institute for Materials Science (NIMS), Tsukuba, Japan
Yasuaki Hiraoka
Photon Factory, Institute of Materials Structure Science, High Energy Accelerator Research Organization, Tsukuba, Japan
Masao Kimura
Department of Materials Structure Science, School of High Energy Accelerator Science, SOKENDAI (The Graduate University for Advanced Studies), Tsukuba, Japan
Masao Kimura

Corresponding author

Correspondence to Ippei Obayashi.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

This work is partially supported by JSPS KAKENHI Grant Number JP 16K17638, JST CREST Mathematics 15656429, the JST “Materials research by Information Integration” Initiative (MI²I) project of the Support Program for Starting Up Innovation Hub, Structural Materials for Innovation Strategic Innovation Promotion Program D72 and D66, and the New Energy and Industrial Technology Development Organization (NEDO).

A Algorithm for generating random images

The algorithm for generating random binary images is given by Algorithm 2. It takes six parameters: $W, N, S \in \mathbb{N}$, $\sigma_1 > 0$, $\sigma_2 > 0$, and $t > 0$. The white pixels of the generated image are given by the orbits of the Brownian motion of $N$ particles on a flat torus of size $W \times W$. The parameters $S$ and $\sigma_1$ determine the length of each orbit, and $\sigma_2$ and $t$ determine the radii of the particles. In this paper we fix $W=300$, $\sigma_1 = 4$, $\sigma_2 = 2$, and $t = 0.01$, and vary only $N$ and $S$. As $N$ and $S$ become larger, the generated image tends to have more white pixels.
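Since Algorithm 2 itself is not reproduced here, the following is only a minimal sketch of one plausible reading of the description above, not the authors' algorithm. The assumptions are: each particle starts at a uniformly random position, takes $S$ Gaussian steps of standard deviation $\sigma_1$ per coordinate, and every visited position paints a disk whose radius is where a Gaussian profile of width $\sigma_2$ drops to the threshold $t$, that is, $\sigma_2\sqrt{2\ln(1/t)}$. That radius rule, the function name `random_binary_image`, and the `seed` argument are all hypothetical.

```python
import numpy as np

def random_binary_image(W=300, N=10, S=20, sigma1=4.0, sigma2=2.0, t=0.01, seed=None):
    """Sketch: white pixels traced by N Brownian particles on a W x W flat torus."""
    if not 0.0 < t < 1.0:
        raise ValueError("this sketch assumes 0 < t < 1")
    rng = np.random.default_rng(seed)

    # Uniform starting positions, then S Gaussian increments per particle.
    start = rng.uniform(0.0, W, size=(N, 2))
    steps = rng.normal(0.0, sigma1, size=(S, N, 2))
    orbit = np.concatenate([start[None], start[None] + np.cumsum(steps, axis=0)])
    points = (orbit % W).reshape(-1, 2)  # wrap around the torus

    # Assumed radius: where exp(-r^2 / (2 sigma2^2)) falls to the threshold t.
    radius = sigma2 * np.sqrt(2.0 * np.log(1.0 / t))

    coords = np.arange(W)
    image = np.zeros((W, W), dtype=bool)
    for x, y in points:
        # Periodic (torus) distance along each axis.
        dx = np.abs(coords - x)
        dx = np.minimum(dx, W - dx)
        dy = np.abs(coords - y)
        dy = np.minimum(dy, W - dy)
        image |= (dx[:, None] ** 2 + dy[None, :] ** 2) <= radius ** 2
    return image

sparse = random_binary_image(W=100, N=3, S=3, seed=0)
dense = random_binary_image(W=100, N=30, S=30, seed=0)
print(sparse.mean(), dense.mean())  # fraction of white pixels in each image
```

Under these assumptions, increasing `N` and `S` enlarges the white area, which matches the qualitative behavior described above. The exact pixel statistics of the paper's images should not be expected from this sketch.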

Random-looking images of this kind are frequently obtained from experimental measurements in materials science, such as X-CT and TEM (Kimura et al. 2017). Such seemingly disordered images are expected to be useful for materials informatics, and one of the motivations of this paper is to develop a universal framework for this purpose.

About this article

Cite this article

Obayashi, I., Hiraoka, Y. & Kimura, M. Persistence diagrams with linear machine learning models. J Appl. and Comput. Topology 1, 421–449 (2018). https://doi.org/10.1007/s41468-018-0013-5

Received: 06 July 2017
Accepted: 16 April 2018
Published: 05 May 2018
Issue Date: June 2018
DOI: https://doi.org/10.1007/s41468-018-0013-5
