Abstract
This article offers a formal identification analysis of the problem of comparing coefficients from linear probability models (LPM) between groups. We show that differences in coefficients from these models can result not only from genuine differences in effects, but also from differences in one or more of the following three components: outcome truncation, scale parameters, and the distributional shape of the predictor variable. These results point to limitations in using LPM coefficients for group comparisons. We also provide Monte Carlo simulations and real examples to illustrate these limitations, and we suggest a restricted approach to using LPM coefficients in group comparisons.
Notes
We note that the probit model is also widely used in economics. In our search, we identified 24 articles in QJE using the probit, with the vast majority (23) reporting marginal effects implied by the probit. Of the 24 articles, eight make some form of comparison between groups (reporting marginal effects).
Allison (1999a, b) and Williams (2009) suggest the heteroskedastic (location-scale) probit or logit model as a possible solution to these issues. However, the utility of these models is disputed. Long (2009) and Breen et al. (2014) show that these models cannot separate coefficient differences between groups from differences in latent error dispersion, unless unrealistic assumptions are maintained. We therefore refrain from considering this class of models any further in this paper.
Exceptions are Lewbel et al. (2012) and Athey and Imbens (2006), who show that the LPM does not identify the treatment effect of interest in difference-in-differences (DiD) models. As DiD models involve comparing trends among groups, their results can be taken to support the results we report in this article.
The error term is inherently heteroscedastic (Goldberger 1964), but, because we are interested in identification, this issue does not concern us here.
Here we use the fact that showing \(A\Rightarrow B\) is equivalent to showing \(\lnot { B} \Rightarrow \lnot { A}\).
References
Ai, C., Norton, E.C.: Interaction terms in logit and probit models. Econ Lett 80, 123–129 (2003)
Allison, P.D.: Comparing logit and probit coefficients across groups. Sociolog Methods Res 28, 186–208 (1999a)
Allison, P.: Logistic regression using SAS: theory and application. SAS Institute, Cary, NC (1999b)
Amemiya, T.: Qualitative response models: a survey. J Econ Lit 19, 1483–1536 (1981)
Athey, S., Imbens, G.W.: Identification and inference in nonlinear difference-in-differences models. Econometrica 74, 431–497 (2006)
Breen, R., Holm, A., Karlson, K.B.: Correlations and non-linear probability models. Forthcoming, Sociological Methods and Research. (2014)
Goldberger, A.S.: Econometric theory. Wiley, New York (1964)
Greene, W.H.: Testing hypotheses about interaction terms in non-linear models. Econ Lett 107, 291–296 (2010)
Greene, W.H.: Econometric analysis, 7th edn. Prentice Hall, Upper Saddle River (2011)
Karaca-Mandic, P., Norton, E.C., Dowd, B.: Interaction terms in nonlinear models. Health Ser Res 47, 255–274 (2011)
Lewbel, A., Dong, Y., Yang, T.T.: Comparing features of convenient estimators for binary choice models with endogenous regressors. Can J Econ 45, 809–829 (2012)
Long, J.S.: Group comparisons in logit and probit using predicted probabilities. Unpublished working paper 25 June 2009. (2009)
Mare, R.D.: Change and stability in educational stratification. Am Sociolog Rev 46, 72–87 (1981)
Mare, R.D.: Response: statistical models of educational stratification: Hauser and Andrew’s model for school transitions. Sociolog Methodol 36, 27–37 (2006)
Norton, E.C., Wang, H., Ai, C.: Computing interaction effects and standard errors in logit and probit models. Stata J 4, 154–167 (2004)
Swait, J., Louviere, J.: The role of the scale parameter in the estimation and comparison of multinomial logit models. J Market Res 30, 305–314 (1993)
Williams, R.: Using heterogeneous choice models to compare logit and probit coefficients across groups. Sociolog Methods Res 37, 531–559 (2009)
Wooldridge, J.M.: Econometric analysis of cross section and panel data. MIT Press, Cambridge (2002)
Xie, Y.: Values and limitations of statistical models. Res Stratif Soc Mob 29, 343–349 (2011)
Appendix
1.1 Identical sign of LPM and LLM coefficients
We want to show (10), i.e.
From (11) we have that
We show that \(\beta ^{*}>0\Rightarrow \beta ^{\textit{LPM}}>0\). The situation in which \(\beta ^{*}<0\Rightarrow \beta ^{\textit{LPM}}<0\) follows in an equivalent way.
Assume that \(\beta ^{*}>0\), which implies that F(.) is an increasing function of x. Rewrite (11) as
from which we obtain
We can then evaluate each of the terms. For the first term we have that \((x-E(x))<0\) and that \(F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) <F\left( {\frac{\alpha +\beta ^{*}E(x)}{\sigma }} \right) \) for \(x<E(x)\). This further implies that
For the second term we have that \((x-E(x))>0\) and that \(F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) >F\left( {\frac{\alpha +\beta ^{*}E(x)}{\sigma }} \right) \) for \(x>E(x)\), so that:
It then follows that
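Next we show that \(\beta ^{*}=0\Rightarrow \beta ^{\textit{LPM}}=0\). This follows directly: when \(\beta ^{*}=0\), \(F\left( {\frac{\alpha }{\sigma }} \right) \) does not depend on \(x\), so its covariance with \(x\), and hence \(\beta ^{\textit{LPM}}\), is zero.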
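Now we have established that \(\beta ^{*}>0\Rightarrow \beta ^{\textit{LPM}}>0\), \(\beta ^{*}=0\Rightarrow \beta ^{\textit{LPM}}=0\), and \(\beta ^{*}<0\Rightarrow \beta ^{\textit{LPM}}<0\).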
To show that \(\beta ^{\textit{LPM}}>0\Rightarrow \beta ^{*}>0\) we can instead show the contrapositive, \(\beta ^{*}\le 0\Rightarrow \beta ^{\textit{LPM}}\le 0\) (see the final entry under Notes). If \(\beta ^{*}\le 0\) then either \(\beta ^{*}=0\) or \(\beta ^{*}<0\). If \(\beta ^{*}=0\) we know from equation (10) that \(\beta ^{\textit{LPM}}=0\). Similarly, \(\beta ^{*}<0\) leads to \(\beta ^{\textit{LPM}}<0\). Hence \(\beta ^{\textit{LPM}}\le 0\). In a similar way we can show that \(\beta ^{\textit{LPM}}=0\Rightarrow \beta ^{*}=0\) and \(\beta ^{\textit{LPM}}<0\Rightarrow \beta ^{*}<0\).
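The sign argument can also be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a probit link for \(F\), a discrete predictor with hypothetical support points `xs` and probabilities `ps`, and the population LPM slope written as cov(y,x)/var(x). The helper names `normal_cdf`, `lpm_slope` and `sign` are introduced here for illustration only.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, used as the link F (a probit assumption)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lpm_slope(xs, ps, alpha, beta_star, sigma):
    """Population LPM slope cov(y, x) / var(x) for a discrete predictor.

    xs, ps: support points and probabilities of x (ps must sum to 1).
    The outcome satisfies P(y = 1 | x) = F((alpha + beta_star * x) / sigma).
    """
    mean_x = sum(p * x for x, p in zip(xs, ps))
    var_x = sum(p * (x - mean_x) ** 2 for x, p in zip(xs, ps))
    cov_xy = sum(
        p * (x - mean_x) * normal_cdf((alpha + beta_star * x) / sigma)
        for x, p in zip(xs, ps)
    )
    return cov_xy / var_x


def sign(v, tol=1e-12):
    """Return -1, 0 or 1, treating values within tol of zero as zero."""
    if abs(v) < tol:
        return 0
    return 1 if v > 0 else -1


# Hypothetical, deliberately skewed distribution for x.
xs = [-1.0, 0.0, 0.5, 3.0]
ps = [0.4, 0.3, 0.2, 0.1]

# Compare the sign of the LPM slope with the sign of beta_star on a grid.
checks = []
for beta_star in (-2.0, -0.3, 0.0, 0.3, 2.0):
    for alpha in (-1.5, 0.0, 1.5):
        for sigma in (0.5, 1.0, 3.0):
            slope = lpm_slope(xs, ps, alpha, beta_star, sigma)
            checks.append(sign(slope) == sign(beta_star))

all_agree = all(checks)
print("signs agree on the whole grid:", all_agree)
```

The same helper illustrates the scale issue raised in the abstract: holding \(\beta ^{*}\) fixed, `lpm_slope(xs, ps, 0.0, 1.0, 1.0)` and `lpm_slope(xs, ps, 0.0, 1.0, 2.0)` return different values, although only \(\sigma \) differs between the two calls.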
We have thus shown:
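Collecting the implications established in this subsection, the result can be stated compactly as follows. This is a restatement in sign notation and may differ in form from the authors' original display.

```latex
\[
\beta^{*} > 0 \;\Leftrightarrow\; \beta^{\textit{LPM}} > 0, \qquad
\beta^{*} = 0 \;\Leftrightarrow\; \beta^{\textit{LPM}} = 0, \qquad
\beta^{*} < 0 \;\Leftrightarrow\; \beta^{\textit{LPM}} < 0,
\]
\[
\text{that is,} \quad
\operatorname{sign}\!\left(\beta^{\textit{LPM}}\right)
= \operatorname{sign}\!\left(\beta^{*}\right).
\]
```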
1.2 Second-order Taylor approximation
With one predictor variable, \(x\), with mean \(\gamma \), we derive cov(y,x):
We approximate (16) by a second-order Taylor approximation around \(\beta ^{*}=0\):
yielding (13).
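As a sketch of how such an expansion can be carried out, assume that \(P(y=1\mid x)=F\left( \frac{\alpha +\beta ^{*}x}{\sigma }\right) \) with \(F\) twice differentiable and \(f=F'\), and write \(\mu _{3}\) for the third central moment of \(x\) (notation introduced here, not taken from the article). The steps below are a reconstruction under these assumptions and need not match the authors' equations (16) and (13) term by term.

```latex
\begin{align*}
g(\beta^{*}) &:= \operatorname{cov}(y,x)
  = E\!\left[(x-\gamma)\,
    F\!\left(\tfrac{\alpha+\beta^{*}x}{\sigma}\right)\right], \\
g(0) &= F\!\left(\tfrac{\alpha}{\sigma}\right) E[x-\gamma] = 0, \\
g'(0) &= \tfrac{1}{\sigma}\, f\!\left(\tfrac{\alpha}{\sigma}\right)
  E[(x-\gamma)\,x]
  = \tfrac{1}{\sigma}\, f\!\left(\tfrac{\alpha}{\sigma}\right)
    \operatorname{var}(x), \\
g''(0) &= \tfrac{1}{\sigma^{2}}\, f'\!\left(\tfrac{\alpha}{\sigma}\right)
  E[(x-\gamma)\,x^{2}]
  = \tfrac{1}{\sigma^{2}}\, f'\!\left(\tfrac{\alpha}{\sigma}\right)
    \left(\mu_{3} + 2\gamma \operatorname{var}(x)\right), \\
\beta^{\textit{LPM}} = \frac{\operatorname{cov}(y,x)}{\operatorname{var}(x)}
  &\approx f\!\left(\tfrac{\alpha}{\sigma}\right)\frac{\beta^{*}}{\sigma}
  + \frac{1}{2}\, f'\!\left(\tfrac{\alpha}{\sigma}\right)
    \left(\frac{\beta^{*}}{\sigma}\right)^{2}
    \left(\frac{\mu_{3}}{\operatorname{var}(x)} + 2\gamma\right).
\end{align*}
```

Under these assumptions the leading term depends on \(\sigma \) and on where \(\alpha /\sigma \) falls in the density \(f\), while the second-order term depends on the mean and the third central moment of \(x\). This is consistent with the three components named in the abstract: outcome truncation, scale parameters, and the distributional shape of the predictor.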
Cite this article
Holm, A., Ejrnæs, M. & Karlson, K. Comparing linear probability model coefficients across groups. Qual Quant 49, 1823–1834 (2015). https://doi.org/10.1007/s11135-014-0057-0
DOI: https://doi.org/10.1007/s11135-014-0057-0