Dimension-Reduced Clustering of Functional Data via Subspace Separation

Journal of Classification

Abstract

We propose a new method that simultaneously finds an optimal cluster structure of functions and an optimal subspace for clustering. The method minimizes a distance between functional objects and their projections while imposing clustering penalties. It includes existing approaches to functional cluster analysis and dimension reduction, such as functional principal component k-means (Yamamoto, 2012) and functional factorial k-means (Yamamoto and Terada, 2014), as special cases. We show that these existing methods can perform poorly when a disturbing structure exists and that the proposed method overcomes this drawback by using subspace separation. We also propose a novel model selection procedure, which can be applied to other joint analyses of dimension reduction and clustering. We apply the proposed method to artificial and real data to demonstrate its performance relative to existing approaches.
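
To make the idea of joint dimension reduction and clustering concrete, the following is a minimal sketch of a finite-dimensional analogue: reduced k-means in the spirit of De Soete and Carroll (1994), applied to curves sampled on a common grid. It is not the method proposed in the article, and it has no subspace separation. For a loading matrix A with orthonormal columns, the loss ||X - U F A^T||^2 splits into a projection term ||X - X A A^T||^2 plus a clustering term ||X A - U F||^2, which loosely mirrors the "distance to projections plus clustering penalty" structure described above. The function name `reduced_kmeans`, the rank-based starting partition, and the simulated sine and cosine curves are all assumptions made for illustration.

```python
import numpy as np

def reduced_kmeans(X, n_clusters, n_components, n_iter=50):
    """Alternating least squares for min ||X - U F A^T||_F^2.

    X is an (n_curves, n_gridpoints) array of curves sampled on a common grid.
    Returns cluster labels, the loading matrix A, and the loss per iteration.
    """
    X = X - X.mean(axis=0)                      # centre each grid point
    n = X.shape[0]
    # Deterministic start (an assumption of this sketch): cut the first
    # principal component scores into equal-sized groups by rank.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    ranks = np.argsort(np.argsort(X @ Vt[0]))
    labels = ranks * n_clusters // n
    history = []
    for _ in range(n_iter):
        U = np.eye(n_clusters)[labels]          # (n, K) membership indicators
        counts = U.sum(axis=0)
        # Cluster means in the full space; an empty cluster gets a zero mean.
        means = (U.T @ X) / np.maximum(counts, 1.0)[:, None]
        # Subspace step: leading eigenvectors of the between-cluster matrix.
        B = means.T @ (counts[:, None] * means)
        _, eigvecs = np.linalg.eigh(B)          # eigenvalues in ascending order
        A = eigvecs[:, -n_components:]
        # Assignment step: nearest centroid within the subspace.
        scores, centroids = X @ A, means @ A
        d2 = ((scores[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        history.append(float(np.sum((X - centroids[new_labels] @ A.T) ** 2)))
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, A, history

# Hypothetical data: two groups of noisy curves on a grid of 50 points.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
truth = np.repeat([0, 1], 20)
shapes = np.stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
curves = shapes[truth] + 0.05 * rng.standard_normal((40, 50))

labels, A, history = reduced_kmeans(curves, n_clusters=2, n_components=1)
```

In this sketch the subspace and the partition are updated in turn, so neither step increases the loss. A genuinely functional version would replace the grid values with basis-expansion coefficients and an appropriate inner product, which is beyond what the abstract specifies.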

References

  • ARABIE, P., and HUBERT, L. (1994), “Cluster Analysis in Marketing Research”, in Advanced Methods of Marketing Research, ed. R.P. Bagozzi, Oxford: Blackwell, pp. 160–189.

  • BESSE, P.C., and RAMSAY, J.O. (1986), “Principal Components Analysis of Sampled Functions”, Psychometrika, 51, 285–311.

  • BOUVEYRON, C., and JACQUES, J. (2011), “Model-Based Clustering of Time Series in Group-Specific Functional Subspaces”, Advances in Data Analysis and Classification, 5, 281–300.

  • CALIŃSKI, T., and HARABASZ, J. (1974), “A Dendrite Method for Cluster Analysis”, Communications in Statistics, 3, 1–27.

  • DE SOETE, G., and CARROLL, J.D. (1994), “K-Means Clustering in a Low-Dimensional Euclidean Space”, in New Approaches in Classification and Data Analysis, eds. E. Diday, Y. Lechevallier, M. Schader, P. Bertrand, and B. Burtschy, Heidelberg: Springer, pp. 212–219.

  • DUNFORD, N., and SCHWARTZ, J.T. (1988), Linear Operators, Spectral Theory, Self Adjoint Operators in Hilbert Space, Part 2, New York: Interscience.

  • FERRATY, F., and VIEU, P. (2006), Nonparametric Functional Data Analysis, New York: Springer.

  • FRIEDMAN, J., HASTIE, T., and TIBSHIRANI, R. (2010), “Regularization Paths for Generalized Linear Models via Coordinate Descent”, Journal of Statistical Software, 33, 1–22.

  • GATTONE, S.A., and ROCCI, R. (2012), “Clustering Curves on a Reduced Subspace”, Journal of Computational and Graphical Statistics, 21, 361–379.

  • GREEN, P.J., and SILVERMAN, B.W. (1994), Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach, London: Chapman and Hall.

  • HARTIGAN, J.A., and WONG, M.A. (1979), “Algorithm AS 136: A K-means Clustering Algorithm”, Journal of the Royal Statistical Society, Series C, 28, 100–108.

  • HASTIE, T., BUJA, A., and TIBSHIRANI, R. (1995), “Penalized Discriminant Analysis”, The Annals of Statistics, 23, 73–102.

  • HUBERT, L., and ARABIE, P. (1985), “Comparing Partitions”, Journal of Classification, 2, 193–218.

  • ILLIAN, J.B., PROSSER, J.I., BAKER, K.L., and RANGEL-CASTRO, J.I. (2009), “Functional Principal Component Data Analysis: A New Method for Analysing Microbial Community Fingerprints”, Journal of Microbiological Methods, 79, 89–95.

  • JENNRICH, R.I. (2001), “A Simple General Procedure for Orthogonal Rotation”, Psychometrika, 66, 289–306.

  • JENNRICH, R.I. (2002), “A Simple General Procedure for Oblique Rotation”, Psychometrika, 67, 7–20.

  • LLOYD, S. (1982), “Least Squares Quantization in PCM”, IEEE Transactions on Information Theory, 28, 128–137.

  • MACQUEEN, J. (1967), “Some Methods for Classification and Analysis of Multivariate Observations”, in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1, eds. L.M. Le Cam and J. Neyman, Berkeley, CA: University of California Press, pp. 281–297.

  • MAZUMDER, R., FRIEDMAN, J., and HASTIE, T. (2011), “Sparsenet: Coordinate Descent with Nonconvex Penalties”, Journal of the American Statistical Association, 106, 1125–1138.

  • OCAÑA, F.A., AGUILERA, A.M., and VALDERRAMA, M.J. (1999), “Functional Principal Components Analysis by Choice of Norm”, Journal of Multivariate Analysis, 71, 262–276.

  • RAMSAY, J.O., and SILVERMAN, B.W. (2005), Functional Data Analysis (2nd ed.), New York: Springer-Verlag.

  • REISS, P.T., and OGDEN, R.T. (2007), “Functional Principal Component Regression and Functional Partial Least Squares”, Journal of the American Statistical Association, 102, 984–996.

  • SILVERMAN, B.W. (1996), “Smoothed Functional Principal Components Analysis by Choice of Norm”, The Annals of Statistics, 24, 1–24.

  • SUYUNDYKOV, R., PUECHMOREL, S., and FERRE, L. (2010), “Multivariate Functional Data Clusterization by PCA in Sobolev Space Using Wavelets”, Hyper Articles en Ligne (https://hal.archives-ouvertes.fr/): inria-00494702.

  • TIBSHIRANI, R., WALTHER, G., and HASTIE, T. (2001), “Estimating the Number of Clusters in a Data Set via the Gap Statistic”, Journal of the Royal Statistical Society, Series B, 63, 411–423.

  • TIMMERMAN, M.E., CEULEMANS, E., KIERS, H., and VICHI, M. (2010), “Factorial and Reduced K-Means Reconsidered”, Computational Statistics and Data Analysis, 54, 1858–1871.

  • VICHI, M., and KIERS, H.A.L. (2001), “Factorial K-Means Analysis for Two-Way Data”, Computational Statistics and Data Analysis, 37, 49–64.

  • VICHI, M., ROCCI, R., and KIERS, H.A.L. (2007), “Simultaneous Component and Clustering Methods for Three-Way Data: Within and Between Approaches”, Journal of Classification, 24, 71–98.

  • WANG, J. (2010), “Consistent Selection of the Number of Clusters via Crossvalidation”, Biometrika, 97, 893–904.

  • YAMAMOTO, M. (2012), “Clustering of Functional Data in a Low-Dimensional Subspace”, Advances in Data Analysis and Classification, 6, 219–247.

  • YAMAMOTO, M., and HWANG, H. (2014), “A General Formulation of Cluster Analysis with Dimension Reduction and Subspace Separation”, Behaviormetrika, 41, 115–129.

  • YAMAMOTO, M., and TERADA, Y. (2014), “Functional Factorial k-Means Analysis”, Computational Statistics and Data Analysis, 79, 133–148.

Author information

Correspondence to Michio Yamamoto.

Cite this article

Yamamoto, M., Hwang, H. Dimension-Reduced Clustering of Functional Data via Subspace Separation. J Classif 34, 294–326 (2017). https://doi.org/10.1007/s00357-017-9232-z