Open Access
June 2010 Transposable regularized covariance models with an application to missing data imputation
Genevera I. Allen, Robert Tibshirani
Ann. Appl. Stat. 4(2): 764-790 (June 2010). DOI: 10.1214/09-AOAS314

Abstract

Missing data estimation is an important challenge with high-dimensional data arranged in the form of a matrix. Typically this data matrix is transposable, meaning that either the rows, columns or both can be treated as features. To model transposable data, we present a modification of the matrix-variate normal, the mean-restricted matrix-variate normal, in which the rows and columns each have a separate mean vector and covariance matrix. By placing additive penalties on the inverse covariance matrices of the rows and columns, these so-called transposable regularized covariance models allow for maximum likelihood estimation of the mean and nonsingular covariance matrices. Using these models, we formulate EM-type algorithms for missing data imputation in both the multivariate and transposable frameworks. We present theoretical results exploiting the structure of our transposable models that allow these models and imputation methods to be applied to high-dimensional data. Simulations and results on microarray data and the Netflix data show that these imputation techniques often outperform existing methods and offer a greater degree of flexibility.

Citation

Download Citation

Genevera I. Allen. Robert Tibshirani. "Transposable regularized covariance models with an application to missing data imputation." Ann. Appl. Stat. 4 (2) 764 - 790, June 2010. https://doi.org/10.1214/09-AOAS314

Information

Published: June 2010
First available in Project Euclid: 3 August 2010

zbMATH: 1194.62079
MathSciNet: MR2758420
Digital Object Identifier: 10.1214/09-AOAS314

Keywords: Covariance estimation , EM algorithm , imputation , Matrix-variate normal , transposable data

Rights: Copyright © 2010 Institute of Mathematical Statistics

Vol.4 • No. 2 • June 2010
Back to Top