On Covariate Importance for Regression Models with Multivariate Response

Abstract

We address the question of identifying the relative importance of covariates for model response, a form of sensitivity analysis. Relative importance is typically implemented as part of the model building procedure, e.g., forward variable selection or backward elimination. Here, we take a different perspective. We assume a model with multiple covariates and multivariate response has been selected and formulate criteria to assess covariate importance. Hence, with regard to covariates, our approach is joint, post model fitting, rather than conditional or sequential model creation. The noteworthy challenge we accommodate is the handling of multivariate response where individual regressions may give differing, perhaps conflicting, relative importances. In addition, we recognize that, according to the model specification, importance/sensitivity to covariates may be a global or a local issue. For models with multivariate response, we provide a criterion that (i) produces one sensitivity coefficient for each covariate, (ii) takes into account the model specification of uncertainty, and (iii) is based only on the model parameters but does not require a distribution on the covariates. However, with a prior on the covariates, in special cases, we show that comparison of covariates using this criterion gives the same results as comparison of marginal variances of the inverse predictive distributions of the covariates. We illustrate with an application examining sensitivity of tree abundance to climate.

Supplementary materials accompanying this paper appear on-line.
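As a concrete illustration of the criterion summarized in the abstract: in the notation used in the appendices below, the sensitivity coefficient for covariate \(j\) is \(\mathcal {I}_j = \mathbf {b}_j^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_j\), where \(\mathbf {b}_j\) is the \(j\)th column of the coefficient matrix \(B\) and \(\Sigma _{Y|X}\) is the error covariance of the multivariate response. The following is a minimal sketch of that computation, using simulated, hypothetical values rather than anything from the paper's tree-abundance application.

```python
import numpy as np

# Hypothetical fitted quantities (illustrative only; not estimates from the paper).
rng = np.random.default_rng(0)
r, p = 4, 3                          # r responses, p covariates
B = rng.normal(size=(r, p))          # coefficient matrix; column b_j belongs to covariate j
A = rng.normal(size=(r, r))
Sigma_YgX = A @ A.T + np.eye(r)      # model error covariance Sigma_{Y|X}

# One sensitivity coefficient per covariate: I_j = b_j' Sigma_{Y|X}^{-1} b_j.
I = np.diag(B.T @ np.linalg.solve(Sigma_YgX, B))
print(I)
print(np.argsort(-I))                # covariates ranked from most to least important
```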

Acknowledgments

The authors thank Jim Clark for motivating this work and for useful conversations, Kai Zhu for help in preparing the FIA dataset that we analyzed, and the anonymous Associate Editor and reviewers for helpful comments on earlier versions of this paper. The work of the authors was supported in part by NSF CMG 0934595 and NSF CDI 0940671.

Author information

Corresponding author

Correspondence to Jenný Brynjarsdóttir.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 146 KB)

Appendices

Appendix 1: Proof of Theorem 2.2

The model in (3) and (4) implies that the joint distribution of \(\mathbf {Y}\) and \(\mathbf {X}\) is Gaussian and therefore the conditional distribution \([ \mathbf {X}|\mathbf {Y},\theta ]\) is also Gaussian with covariance matrix

$$\begin{aligned} \Sigma _{X|Y}&= \left( \Sigma _{XX}^{-1} + B^\mathrm{T} \Sigma _{Y|X}^{-1} B \right) ^{-1} \\&= \begin{pmatrix} \frac{\sigma ^2}{\sigma ^4 - \sigma _{12}^2} + \mathbf {b}_1^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_1 & \frac{-\sigma _{12}}{\sigma ^4 - \sigma _{12}^2} + \mathbf {b}_1^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_2 \\ \frac{-\sigma _{12}}{\sigma ^4 - \sigma _{12}^2} + \mathbf {b}_2^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_1 & \frac{\sigma ^2}{\sigma ^4 - \sigma _{12}^2} + \mathbf {b}_2^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_2 \end{pmatrix}^{-1}. \end{aligned}$$

It follows, since the \((1,1)\) element of the inverse of a \(2 \times 2\) positive definite matrix is its \((2,2)\) element divided by its (positive) determinant, and vice versa, that

$$\begin{aligned} \mathrm {Var}(X_1|\mathbf {Y},\theta )&\le \mathrm {Var}(X_2|\mathbf {Y},\theta ) \\ \Leftrightarrow \qquad \frac{\sigma ^2}{\sigma ^4 - \sigma _{12}^2} + \mathbf {b}_2^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_2&\le \frac{\sigma ^2}{\sigma ^4 - \sigma _{12}^2} + \mathbf {b}_1^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_1 \\ \Leftrightarrow \qquad \mathbf {b}_2^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_2&\le \mathbf {b}_1^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_1 . \end{aligned}$$
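A quick numerical sanity check of this equivalence, using arbitrary simulated parameter values rather than quantities from the paper: with two covariates of equal marginal variance, the covariate with the larger \(\mathbf {b}_j^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_j\) has the smaller conditional variance given \(\mathbf {Y}\).

```python
import numpy as np

rng = np.random.default_rng(0)
r = 3                                # hypothetical response dimension
sigma2, sigma12 = 2.0, 0.6           # equal marginal variances, covariance sigma_12
Sigma_XX = np.array([[sigma2, sigma12],
                     [sigma12, sigma2]])
B = rng.normal(size=(r, 2))          # coefficient matrix with columns b_1, b_2
A = rng.normal(size=(r, r))
Sigma_YgX = A @ A.T + np.eye(r)      # positive-definite Sigma_{Y|X}

# Conditional (inverse-predictive) covariance of X given Y, as in the display above.
Sigma_XgY = np.linalg.inv(np.linalg.inv(Sigma_XX)
                          + B.T @ np.linalg.solve(Sigma_YgX, B))

# Sensitivity coefficients I_j = b_j' Sigma_{Y|X}^{-1} b_j.
I = np.diag(B.T @ np.linalg.solve(Sigma_YgX, B))

# The covariate with the larger I_j has the smaller conditional variance.
assert (Sigma_XgY[0, 0] <= Sigma_XgY[1, 1]) == (I[0] >= I[1])
print(np.diag(Sigma_XgY), I)
```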

Appendix 2: Proof of Remark 2.3

Theorem 6.1

Let

$$\begin{aligned} \mathbf {Y} | \mathbf {X}, B, \Sigma _{Y|X}&\sim \mathcal {N}_r ( B \mathbf {X} , \Sigma _{Y|X}) \quad \text {and} \quad \mathbf {X} | \Sigma _{XX} \sim \mathcal {N}_p ( {\varvec{\mu }}_X, \Sigma _{XX} ). \end{aligned}$$
(21)

Let \(\tilde{\mathbf {X}}\) be the standardized \(\mathbf {X}\), i.e., \(\tilde{\mathbf {X}} = D^{-1} \mathbf {X}\) where \( D = \mathrm{diag}(\sigma _{X_1}, \ldots , \sigma _{X_p})\) and \(\sigma _{X_j}\) is the marginal standard deviation of \(X_j\). Let

$$\begin{aligned} \mathbf {Y} | \tilde{\mathbf {X}}, \tilde{B}, \tilde{\Sigma }_{Y|X}&\sim \mathcal {N}_r ( \tilde{B} \tilde{\mathbf {X}} , \tilde{\Sigma }_{Y|X}) \quad \mathrm{and} \quad \tilde{\mathbf {X}} | R \sim \mathcal {N}_p ( \tilde{{\varvec{\mu }}}_X, R ) \end{aligned}$$
(22)

be the corresponding relationship between \(\mathbf {Y}\) and \(\tilde{\mathbf {X}}\) where \(R\) is a correlation matrix. Then

$$\begin{aligned} \tilde{B}^\intercal \tilde{\Sigma }_{Y|X}^{-1} \tilde{B} = D B^\intercal \Sigma _{Y|X}^{-1} B D . \end{aligned}$$
(23)

Proof

It follows from (21) that \(\mathbf {Y}\) and \(\mathbf {X}\) are jointly normal, and we denote the joint covariance matrix by \(\Sigma = \begin{pmatrix} \Sigma _{YY} & \Sigma _{YX} \\ \Sigma _{XY} & \Sigma _{XX} \end{pmatrix}\), where \(\Sigma _{YY} = \Sigma _{Y|X} + B \Sigma _{XX} B^\intercal \) and \(\Sigma _{YX} = B \Sigma _{XX}\). Also, the joint distribution of \(\mathbf {Y}\) and \(\tilde{\mathbf {X}}\) is normal with covariance matrix \(\tilde{\Sigma } = \begin{pmatrix} \Sigma _{YY} & \Sigma _{YX} D^{-1} \\ D^{-1}\Sigma _{XY} & D^{-1} \Sigma _{XX} D^{-1} \end{pmatrix}\). Furthermore, \(B\) and \(\tilde{B}\) can be written as \(B = \Sigma _{YX} \Sigma _{XX}^{-1}\) and \(\tilde{B} = \tilde{\Sigma }_{YX} R^{-1}\), where \(\tilde{\Sigma }_{YX} = \Sigma _{YX} D^{-1}\) and \(R = D^{-1} \Sigma _{XX} D^{-1}\). Finally, note that \(\tilde{\Sigma }_{Y|X} = \Sigma _{Y|X}\). Putting all this together, we get

$$\begin{aligned} \tilde{B}^\intercal \tilde{\Sigma }_{Y|X}^{-1} \tilde{B}&= (D \Sigma _{XX}^{-1} D ) D^{-1}\Sigma _{XY} \Sigma _{Y|X}^{-1} \Sigma _{YX} D^{-1} ( D \Sigma _{XX}^{-1} D)\\&= D \Sigma _{XX}^{-1} \Sigma _{XY} \Sigma _{Y|X}^{-1} \Sigma _{YX} \Sigma _{XX}^{-1} D\\&= D B^\intercal \Sigma _{Y|X}^{-1} B D . \end{aligned}$$
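The identity (23) can also be checked numerically. The sketch below builds \(\tilde{B}\) exactly as in the proof, \(\tilde{B} = \tilde{\Sigma }_{YX} R^{-1}\), from randomly generated, hypothetical \(B\), \(\Sigma _{Y|X}\), and \(\Sigma _{XX}\) (none of these values come from the paper).

```python
import numpy as np

rng = np.random.default_rng(1)
r, p = 3, 2                          # hypothetical dimensions
B = rng.normal(size=(r, p))          # coefficients of Y on the original X
A = rng.normal(size=(r, r))
Sigma_YgX = A @ A.T + np.eye(r)      # Sigma_{Y|X}, unchanged by rescaling X
C = rng.normal(size=(p, p))
Sigma_XX = C @ C.T + np.eye(p)       # covariance of X

# Quantities used in the proof: Sigma_YX = B Sigma_XX, D = diag of marginal sd's, R = correlation.
Sigma_YX = B @ Sigma_XX
D = np.diag(np.sqrt(np.diag(Sigma_XX)))
D_inv = np.linalg.inv(D)
R = D_inv @ Sigma_XX @ D_inv

# Coefficients of Y on the standardized covariates: B_tilde = Sigma_YX D^{-1} R^{-1}.
B_tilde = Sigma_YX @ D_inv @ np.linalg.inv(R)

lhs = B_tilde.T @ np.linalg.solve(Sigma_YgX, B_tilde)
rhs = D @ B.T @ np.linalg.solve(Sigma_YgX, B) @ D
assert np.allclose(lhs, rhs)         # identity (23)
```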

Proof of Remark 2.3

$$\begin{aligned} \mathrm {Var}(X_1/\sigma _{X_1}|\mathbf {Y},\theta ) = \mathrm {Var}(\tilde{X}_1|\mathbf {Y},\theta )&\le \mathrm {Var}(\tilde{X}_2|\mathbf {Y},\theta ) = \mathrm {Var}(X_2/\sigma _{X_2}|\mathbf {Y},\theta ) \\ \qquad \qquad \qquad \text {if and only if } \qquad \tilde{\mathbf {b}}_1^\mathrm{T} \tilde{\Sigma }_{Y|X}^{-1} \tilde{\mathbf {b}}_1&\ge \tilde{\mathbf {b}}_2^\mathrm{T} \tilde{\Sigma }_{Y|X}^{-1} \tilde{\mathbf {b}}_2 \\ \qquad \qquad \qquad \text {if and only if } \qquad \sigma ^2_{X_1} \mathcal {I}_1 = \sigma ^2_{X_1} \mathbf {b}_1^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_1&\ge \sigma ^2_{X_2} \mathbf {b}_2^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_2 = \sigma ^2_{X_2} \mathcal {I}_2. \end{aligned}$$

The first line follows from Theorem 2.1 and the second line follows from Theorem 6.1.

Appendix 3: Theorem for \(p>2\)

Theorem 7.1

Let \(\mathbf {Y}\) and \(\mathbf {X}\) be \(r\) and \(p\) dimensional random vectors where \(p>2\). Let

$$\begin{aligned} \mathbf {Y} | \mathbf {X} \sim \mathcal {N}_r ({\varvec{\mu }}_{Y|X} + B\mathbf {X}, \Sigma _{Y|X} ) \quad \mathrm{and} \quad \mathbf {X} \sim \mathcal {N}_p ({\varvec{\mu }}_X, \Sigma _{XX}). \end{aligned}$$
(24)

For distinct integers \(k, k^\prime \in \{1, \ldots , p\}\), let \(\mathbf {X}_s = (X_k,X_{k^\prime })^\mathrm{T}\) and let \(\mathbf {X}_{-s}\) be the vector of the remaining elements of \(\mathbf {X}\). Also, assume that the marginal variances of \(X_k\) and \(X_{k^\prime }\) are equal, i.e., \( \Sigma _{X_sX_s} = \begin{pmatrix} \sigma ^2 & \sigma _{12} \\ \sigma _{12} & \sigma ^2 \end{pmatrix} \), and that the parameters \(\theta = ({\varvec{\mu }}_{Y|X}, {\varvec{\mu }}_X, B, \Sigma _{Y|X}, \Sigma _{XX})\) are known. Then

$$\begin{aligned} \mathrm {Var}(X_k|\mathbf {Y}-B_{-s}\mathbf {X}_{-s},\theta )&\le \mathrm {Var}(X_{k^\prime }|\mathbf {Y}-B_{-s}\mathbf {X}_{-s},\theta )\\ \mathrm{if\,and\,only\,if\,} \qquad \mathcal {I}_{k} = \mathbf {b}_{k}^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_{k}&\ge \mathbf {b}_{k^\prime }^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_{k^\prime } = \mathcal {I}_{k^\prime }, \end{aligned}$$

where \(\mathbf {b}_k\) is the \(k\)th column of \(B\) and \(B_{-s}\) is \(B\) without the \(k\)th and \(k^\prime \)th columns.

Proof

Let \(\mathbf {U} = \mathbf {Y} - B_{-s} \mathbf {X}_{-s}\). Then

$$\begin{aligned} \mathbf {U} | \mathbf {X}_s \sim \mathcal {N}_r ({\varvec{\mu }}_{Y|X} + B_s\mathbf {X}_s, \Sigma _{Y|X} ) \quad \text {and} \quad \mathbf {X}_s \sim \mathcal {N}_2 ({\varvec{\mu }}_{X_s}, \Sigma _{X_s X_s} ), \end{aligned}$$
(25)

where \(B_s = \begin{bmatrix} \mathbf {b}_k&\mathbf {b}_{k^\prime } \end{bmatrix}\). By Theorem 2.1, it then follows that

$$\begin{aligned} \mathrm {Var}(X_k|\mathbf {U},\theta ) \le \mathrm {Var}(X_{k^\prime }|\mathbf {U},\theta ) \quad \Leftrightarrow \quad \mathbf {b}_{k^\prime }^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_{k^\prime } \le \mathbf {b}_{k}^\mathrm{T} \Sigma _{Y|X}^{-1} \mathbf {b}_k . \end{aligned}$$
(26)

To see why (25) holds, we note that (24) implies that the joint distribution of \(\mathbf {Y}\), \(\mathbf {X}_s\), and \(\mathbf {X}_{-s}\) is normal. Hence, the joint distribution of \(\mathbf {U}\) and \(\mathbf {X}_s\) is also normal, since \((\mathbf {U}^\mathrm{T},\mathbf {X}^\mathrm{T}_s)^\mathrm{T}\) is a linear transformation of \((\mathbf {Y}^\mathrm{T},\mathbf {X}^\mathrm{T}_s, \mathbf {X}^\mathrm{T}_{-s})^\mathrm{T}\). Therefore, the conditional distribution of \(\mathbf {U}|\mathbf {X}_s\) is normal. The mean vector and covariance matrix can be found via iterated expectation and the law of total variance:

$$\begin{aligned} E(\mathbf {U} | \mathbf {X}_s)&= E\left( \left. E(\mathbf {Y} - B_{-s} \mathbf {X}_{-s} | \mathbf {X}_s, \mathbf {X}_{-s} ) \right| \mathbf {X}_s \right) \\&= E\left( \left. {\varvec{\mu }}_{Y|X} + B_{-s} \mathbf {X}_{-s}+ B_s \mathbf {X}_s - B_{-s} \mathbf {X}_{-s} \right| \mathbf {X}_s \right) \\&= E\left( \left. {\varvec{\mu }}_{Y|X} + B_s \mathbf {X}_s \right| \mathbf {X}_s \right) = {\varvec{\mu }}_{Y|X} + B_s \mathbf {X}_s \\ \text {and} \quad \mathrm {Var}(\mathbf {U} | \mathbf {X}_s)&= \mathrm {Var}\left( \left. E(\mathbf {Y} - B_{-s} \mathbf {X}_{-s} | \mathbf {X}_{-s}, \mathbf {X}_{s}) \right| \mathbf {X}_s\right) \\&\quad + E\left( \mathrm {Var}( \left. \mathbf {Y} - B_{-s} \mathbf {X}_{-s} | \mathbf {X}_{-s}, \mathbf {X}_{s} ) \right| \mathbf {X}_s \right) \\&= \mathrm {Var}\left( \left. {\varvec{\mu }}_{Y|X} + B_s \mathbf {X}_s \right| \mathbf {X}_s\right) + E\left( \mathrm {Var}( \left. \mathbf {Y} | \mathbf {X}_{-s}, \mathbf {X}_{s} ) \right| \mathbf {X}_s \right) \\&= 0 + E\left( \left. \Sigma _{Y|X} \right| \mathbf {X}_s \right) = \Sigma _{Y|X}. \end{aligned}$$
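The reduction to \(\mathbf {U} = \mathbf {Y} - B_{-s}\mathbf {X}_{-s}\) can likewise be verified numerically. The sketch below (arbitrary simulated parameters with \(p = 4\); nothing here comes from the paper) computes \(\mathrm {Var}(\mathbf {X}_s \mid \mathbf {U})\) directly from the joint Gaussian distribution of \((\mathbf {U}, \mathbf {X}_s)\) and confirms that the ordering of the two conditional variances matches the reversed ordering of \(\mathcal {I}_k\) and \(\mathcal {I}_{k^\prime }\).

```python
import numpy as np

rng = np.random.default_rng(2)
r, p = 3, 4
k, kp = 0, 1                                   # hypothetical choice of X_s = (X_k, X_k')
s = [k, kp]
ns = [j for j in range(p) if j not in s]

B = rng.normal(size=(r, p))                    # coefficient matrix, columns b_1,...,b_p
A = rng.normal(size=(r, r))
Sigma_YgX = A @ A.T + np.eye(r)                # positive-definite Sigma_{Y|X}

# Covariance of X with equal marginal variances for X_k and X_k', as the theorem assumes.
C = rng.normal(size=(p, p))
S = C @ C.T + np.eye(p)
Rho = S / np.outer(np.sqrt(np.diag(S)), np.sqrt(np.diag(S)))   # valid correlation matrix
sd = rng.uniform(0.8, 1.5, size=p)
sd[kp] = sd[k]                                 # equal marginal standard deviations
Sigma_XX = np.outer(sd, sd) * Rho

# Joint covariance blocks of (Y, X) implied by (24), then of (U, X_s) with U = Y - B_{-s} X_{-s}.
Sigma_YY = Sigma_YgX + B @ Sigma_XX @ B.T
Sigma_YX = B @ Sigma_XX
B_ns = B[:, ns]
Sigma_UU = (Sigma_YY - Sigma_YX[:, ns] @ B_ns.T - B_ns @ Sigma_YX[:, ns].T
            + B_ns @ Sigma_XX[np.ix_(ns, ns)] @ B_ns.T)
Sigma_UXs = Sigma_YX[:, s] - B_ns @ Sigma_XX[np.ix_(ns, s)]

# Conditional covariance of X_s given U, and the sensitivity coefficients I_j.
Sigma_XsgU = Sigma_XX[np.ix_(s, s)] - Sigma_UXs.T @ np.linalg.solve(Sigma_UU, Sigma_UXs)
I = np.diag(B.T @ np.linalg.solve(Sigma_YgX, B))

assert (Sigma_XsgU[0, 0] <= Sigma_XsgU[1, 1]) == (I[k] >= I[kp])
print(np.diag(Sigma_XsgU), I[k], I[kp])
```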

About this article

Cite this article

Brynjarsdóttir, J., Gelfand, A.E. On Covariate Importance for Regression Models with Multivariate Response. JABES 19, 479–500 (2014). https://doi.org/10.1007/s13253-014-0187-9
