Improper use of endogenous formative variables

https://doi.org/10.1016/j.jbusres.2012.08.006

Abstract

Researchers often develop and test conceptual models containing formative variables. In many cases, these formative variables are specified as being endogenous. This article provides a clarification of formative variable theory, distinguishing between the formative latent variable and the formative composite variable. When an endogenous latent variable relies on formative indicators for measurement, empirical studies can say nothing about the relationship between exogenous variables and the endogenous formative latent variable: conclusions can only be drawn regarding the exogenous variables' relationships with a composite variable. The authors also show the dangers associated with developing theory about antecedents to endogenous formative variables at the (aggregate) formative latent variable level. Modeling relationships with endogenous formative variables at the (disaggregate) indicator level informs richer theory development, and encourages more precise empirical testing. When antecedents' relationships with endogenous formative variables are modeled at the formative latent variable level rather than the formative indicator level, theory construction can verge on the superficial, and empirical findings can be ambiguous in substantive meaning.

Introduction

Formative variables are receiving increasing attention in business research (Diamantopoulos, 2008), as the Journal of Business Research 2008 special issue on formative indicators demonstrates. Formative indicators are used in different ways in the literature. For instance, Cadogan, Souchon, and Procter (2008, p. 1263) model each of the three dimensions underpinning market-oriented behavior in a formative way, to create a toolkit with “diagnostic capabilities which can help managers understand how to improve the quality of market orientation within the firm”. Diamantopoulos and Siguaw (2006) and Ruiz, Gremler, Washburn, and Carrión (2008) compare the performance of scale development procedures adopting reflective and formative assumptions. Diamantopoulos and Siguaw (2006, p. 263) conclude, “the choice of measurement perspective impacts on the content, parsimony and criterion validity” of the measures they develop, while Ruiz et al. (2008, p. 1287) contend, “the formative index significantly outperforms a reflective measure in terms of criterion validity”. These studies, and others like them, demonstrate the potential utility of modeling variables using formative indicators.

Studies also model formative measures as endogenous latent variables in structural models: these models attempt to explain variance of the formative latent variable, and test hypotheses about the causes of the explained variance. Two problematic issues are apparent in many such studies. First, as the next section demonstrates, researchers relying on formative indicators can never know how a formative latent variable varies, and can say little with confidence about the amount of variance explained in a formative latent variable. As a result, unless a census of formative indicators is used (in which case, and as discussed subsequently, the researcher is using a formative composite variable rather than a formative latent variable), one never knows how a potential antecedent variable is related to a formative latent variable. Without such a census of formative indicators, all one can do is comment on how the proposed antecedent covaries with the subset of formative indicators used: one cannot necessarily generalize the observed covariances to the full population of defining formative indicators.
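The subset problem described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' analysis: the function name, the four indicators, the unit weights, and the effect sizes are all hypothetical, chosen so that the one unmeasured indicator (y4) responds to the antecedent in the opposite direction to the three measured ones.

```python
import numpy as np


def simulate_subset_vs_census(n=100_000, seed=0):
    """Compare an antecedent's covariance with a measured subset of
    formative indicators against its covariance with the full census.

    All numbers are hypothetical and serve only as an illustration.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)  # hypothetical antecedent

    # Assumed effects of x on four defining indicators y1..y4.
    # y4 is treated as the indicator the researcher did NOT measure.
    effects = np.array([0.5, 0.5, 0.5, -1.5])
    y = x[:, None] * effects + rng.standard_normal((n, 4))

    subset = y[:, :3].sum(axis=1)  # composite of measured indicators only
    census = y.sum(axis=1)         # composite of all defining indicators

    cov_subset = np.cov(x, subset)[0, 1]
    cov_census = np.cov(x, census)[0, 1]
    return cov_subset, cov_census


if __name__ == "__main__":
    cov_subset, cov_census = simulate_subset_vs_census()
    print(f"cov(x, measured subset): {cov_subset:.3f}")
    print(f"cov(x, full census):     {cov_census:.3f}")
```

Under these assumed effects, the antecedent covaries strongly and positively with the measured subset, yet its covariance with the census composite is close to zero. A researcher who sees only y1 to y3 has no way of detecting this from the data.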

Second, researchers interested in identifying exogenous causes of endogenous formative variables often make the mistake of modeling the endogenous variable at the aggregated formative variable level. Given that a “variable's formative indicators may have different antecedent factors, and those antecedents may not influence all indicators the same way” (Cadogan et al., 2008), failure to model antecedents at the disaggregated formative item level can obscure true relationships in the population, either hiding existing relationships, or suggesting the presence of non-existent relationships. As such, when antecedents to endogenous formative variables are modeled with causal paths affecting the formative variable, rather than affecting the formative indicators, the empirical findings have uncertain interpretation.
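A short simulation can show how modeling at the aggregate level may both hide and suggest relationships. The sketch below is illustrative only: the antecedents x and z, the three indicators, the unit-weighted composite, and the coefficients are hypothetical assumptions and are not taken from the article.

```python
import numpy as np


def slopes_by_level(n=100_000, seed=1):
    """Return simple regression slopes of each indicator, and of their
    unit-weighted composite, on two hypothetical antecedents x and z."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)  # assumed: raises y1, lowers y2, no effect on y3
    z = rng.standard_normal(n)  # assumed: affects y1 only

    y1 = 0.6 * x + 0.6 * z + rng.standard_normal(n)
    y2 = -0.6 * x + rng.standard_normal(n)
    y3 = rng.standard_normal(n)
    outcomes = {"y1": y1, "y2": y2, "y3": y3, "composite": y1 + y2 + y3}

    def ols_slope(predictor, outcome):
        # Slope of a simple regression of outcome on predictor.
        return np.cov(predictor, outcome)[0, 1] / np.var(predictor, ddof=1)

    return {
        name: {"x": ols_slope(x, values), "z": ols_slope(z, values)}
        for name, values in outcomes.items()
    }


if __name__ == "__main__":
    for name, slopes in slopes_by_level().items():
        print(f"{name:>9}: slope on x = {slopes['x']:+.3f}, "
              f"slope on z = {slopes['z']:+.3f}")
```

With these assumed coefficients, the composite-level slope on x is near zero although x affects two indicators in opposite directions, while the composite-level slope on z is clearly positive although z touches only one of the three indicators. Only the indicator-level slopes reveal which relationships are actually present.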

The purpose of the current note is to demonstrate the reasoning behind the conclusions presented above, and to make recommendations regarding the appropriate use of endogenous formative variables. In order to set the context, the next section outlines the assumptions of the formative indicator model, contrasting them with the more traditional reflective measurement model. The authors then explain why researchers can never know how a formative latent variable varies, and examine the potential problems arising from researchers erroneously modeling antecedents to endogenous formative variables at the endogenous variable level. Finally, recommendations for future research practice are provided, and the conditions under which antecedents to formative variables can appropriately be modeled at the endogenous variable level (as opposed to the indicator level) are discussed.

Reflective and formative indicators

Bollen and Lennox (1991) distinguish between two sets of measurement assumptions. The first is based on classical test theory, utilizing what are known as reflective items. Reflective items are dependent on the value of a latent variable, with the latent variable determining the item scores. On the other hand, the formative perspective treats items as being determinants of the latent variable: formative variables are defined by their items (Bagozzi & Fornell, 1982). Fig. 1a shows a path diagram
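The two sets of assumptions are commonly written as a pair of equations. The following is a sketch in the general notation associated with Bollen and Lennox (1991); the symbols are assumptions for illustration and may not match the labels used in this article's figures.

```latex
% Reflective model: each item y_i depends on the latent variable \eta,
% with loading \lambda_i and item-specific error \varepsilon_i.
y_i = \lambda_i \eta + \varepsilon_i, \qquad i = 1, \dots, p

% Formative model: the latent variable \eta depends on its items x_1..x_q,
% with weights \gamma_j and a disturbance term \zeta.
\eta = \gamma_1 x_1 + \gamma_2 x_2 + \dots + \gamma_q x_q + \zeta
```

In the reflective equation the direction of dependence runs from the latent variable to the items, whereas in the formative equation it runs from the items to the latent variable.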

The problem of endogenous formative latent variables in substantive empirical analyses

Importantly, since a formative latent variable is defined by its indicators (Jarvis et al., 2003), a change in the value of a formative latent variable cannot occur independently of a change in the value of one or more of its indicators. When a formative latent variable changes in magnitude, the change must occur because (a) the magnitude of one or more of the measured formative indicators changes, and/or (b) the magnitude of one or more of those formative indicators not measured

The problem of modeling endogenous formative variables at the aggregate variable level

Even if researchers are cognizant of the distinction between formative latent and composite variables, there remain serious problems with conceptualizing and modeling antecedents at the formative composite level. Unfortunately, numerous existing studies develop and test applied theory at the composite variable level (see Fig. 6) when the appropriate approach would be to model relationships at the disaggregated composite indicator level (see Fig. 4). In the following section, the potential

Implications, recommendations and conclusions

The difference between a formative latent variable and a composite variable is not trivial: as Fig. 5 demonstrates, the magnitude and variance of a formative latent variable, and its covariances with other variables in a nomological net, are unknowable without additional measurement (the magnitude of y4) and additional information on the construct definition (the magnitude of Fig. 5’s γ4, β4, and β5 paths). Researchers looking to model an antecedent to a formative latent variable in their

Acknowledgment

Thanks to John Antonakis, Woojung Chang, Adamantios Diamantopoulos, Heiner Evanschitzky, George Franke, Anssi Tarkiainen, Martin Wetzels, Arch Woodside, and the anonymous reviewers for their useful comments and suggestions for improving this manuscript. All errors and omissions remain the sole responsibility of the authors.

References (30)

  • R.P. Bagozzi et al. Theoretical concepts, measurements, and meaning.
  • H.M. Blalock. Conceptualization and measurement in the social sciences (1982).
  • K.A. Bollen. Interpretational confounding is due to misspecification, not to type of indicator: Comment on Howell, Breivik, and Wilcox (2007). Psychological Methods (2007).
  • K. Bollen et al. Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin (1991).
  • D. Borsboom. The attack of the psychometricians. Psychometrika (2006).