General location model with factor analyzer covariance matrix structure and its applications

Amiri, Leila; Khazaei, Mojtaba; Ganjali, Mojtaba

doi:10.1007/s11634-016-0258-6

Regular Article
Published: 31 May 2016

Volume 11, pages 593–609, (2017)

Advances in Data Analysis and Classification

Abstract

The general location model (GLOM) is a well-known model for analyzing mixed data. In the GLOM, the joint distribution of the variables is decomposed into the conditional distribution of the continuous variables given the categorical outcomes and the marginal distribution of the categorical variables. The first version of the GLOM assumes that the covariance matrices of the continuous multivariate distributions are equal across cells, where the cells are formed by the different combinations of the categorical variables. In this paper, GLOMs are considered under both equal and unequal covariance matrices. Three covariance structures are used across cells: the same factor analyzer, factor analyzers with unequal specific variance matrices (in general and parsimonious forms), and factor analyzers with common factor loadings. These structures serve both to model the covariance structure and to reduce the number of parameters. The maximum likelihood estimates of the parameters are computed via the EM algorithm. As an application of these models, we investigate the classification of continuous variables within cells. Based on these models, classification is carried out for conventional as well as high-dimensional data sets. Finally, to show the applicability of the proposed models for classification, results from analyzing three real data sets are presented.
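To make the parameter-reduction argument concrete, the sketch below builds cell covariance matrices of the standard factor-analyzer form, Sigma = Lambda Lambda^T + Psi with Psi diagonal, and compares parameter counts with unrestricted covariances. It is an illustration only, not the authors' implementation: the abstract does not give the exact parameterization, so the sketch assumes that loadings are shared across cells and only the specific variances differ. All names (`fa_covariance`, `shared_loadings`, `cell_psi`) and the dimensions are hypothetical.

```python
import numpy as np


def fa_covariance(loadings, specific_variances):
    """Return Sigma = Lambda Lambda^T + Psi, with Psi diagonal."""
    loadings = np.asarray(loadings, dtype=float)
    psi = np.asarray(specific_variances, dtype=float)
    return loadings @ loadings.T + np.diag(psi)


def n_cov_params_full(p):
    """Free parameters in one unrestricted p x p covariance matrix."""
    return p * (p + 1) // 2


def n_cov_params_fa(p, q):
    """Free parameters in one q-factor analyzer covariance.

    p*q loadings plus p specific variances, minus q(q-1)/2
    because the loadings are identified only up to rotation.
    """
    return p * q + p - q * (q - 1) // 2


# Assumed toy dimensions: p continuous variables, q factors, n_cells cells.
p, q, n_cells = 10, 2, 4
rng = np.random.default_rng(0)

# Assumption: one loading matrix shared by all cells,
# with a separate vector of specific variances per cell.
shared_loadings = rng.normal(size=(p, q))
cell_psi = rng.uniform(0.5, 1.5, size=(n_cells, p))
cell_covs = [fa_covariance(shared_loadings, psi) for psi in cell_psi]

# Covariance parameter counts under the two extremes and the assumed structure.
full_unequal = n_cells * n_cov_params_full(p)
fa_shared_loadings = p * q - q * (q - 1) // 2 + n_cells * p

print("unrestricted, unequal across cells:", full_unequal)
print("shared loadings, cell-specific Psi:", fa_shared_loadings)
```

With many continuous variables and few factors, the factor-analyzer count grows linearly in p rather than quadratically, which is why such structures are attractive for high-dimensional data.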

Acknowledgments

The authors are grateful to the referees for their helpful comments and valuable suggestions, which have greatly improved the quality of the paper.

Author information

Authors and Affiliations

Department of Statistics, Faculty of Mathematical Sciences, Shahid Beheshti University, PO BOX: 1983969411, Tehran, Iran
Leila Amiri, Mojtaba Khazaei & Mojtaba Ganjali

Corresponding author

Correspondence to Mojtaba Khazaei.

About this article

Cite this article

Amiri, L., Khazaei, M. & Ganjali, M. General location model with factor analyzer covariance matrix structure and its applications. Adv Data Anal Classif 11, 593–609 (2017). https://doi.org/10.1007/s11634-016-0258-6

Received: 01 October 2015
Revised: 04 May 2016
Accepted: 23 May 2016
Published: 31 May 2016
Issue Date: September 2017
DOI: https://doi.org/10.1007/s11634-016-0258-6
