Comparing linear probability model coefficients across groups

Holm, Anders; Ejrnæs, Mette; Karlson, Kristian

doi:10.1007/s11135-014-0057-0


Published: 26 July 2014

Quality & Quantity, Volume 49, pages 1823–1834 (2015)



Abstract

This article offers a formal identification analysis of the problem of comparing coefficients from linear probability models (LPM) between groups. We show that differences in coefficients from these models can result not only from genuine differences in effects, but also from differences in one or more of the following three components: outcome truncation, scale parameters and distributional shape of the predictor variable. These results point to limitations in using LPM coefficients for group comparisons. We also provide Monte Carlo simulations and real examples to illustrate these limitations, and we suggest a restricted approach to using LPM coefficients in group comparisons.
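The three components named in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the Monte Carlo design used in the article: it assumes a probit-type latent model $y^{*}=\alpha +\beta ^{*}x+\sigma e$ with standard normal $e$ and $y=1$ if $y^{*}>0$, and the function names, sample size, seed and parameter values are all hypothetical choices made for illustration. Every group has the same latent effect $\beta ^{*}=1$; the groups differ only in scale, in baseline probability (truncation) or in the shape of the predictor distribution.

```python
import numpy as np

def lpm_slope(x, y):
    """OLS slope of y on x (with intercept): cov(x, y) / var(x)."""
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

def simulate_y(x, alpha, beta_star, sigma, rng):
    """Binary outcome from the assumed latent model
    y* = alpha + beta_star * x + sigma * e, e ~ N(0, 1), y = 1 if y* > 0."""
    e = rng.normal(0.0, 1.0, size=x.shape[0])
    return (alpha + beta_star * x + sigma * e > 0).astype(float)

rng = np.random.default_rng(12345)  # arbitrary seed, for reproducibility only
n = 200_000
beta_star = 1.0                     # identical latent effect in every group

# Group A: reference group (standard normal predictor, sigma = 1, alpha = 0).
x_a = rng.normal(0.0, 1.0, size=n)
slope_a = lpm_slope(x_a, simulate_y(x_a, 0.0, beta_star, 1.0, rng))

# Group B: larger scale parameter (sigma = 3), otherwise as group A.
x_b = rng.normal(0.0, 1.0, size=n)
slope_b = lpm_slope(x_b, simulate_y(x_b, 0.0, beta_star, 3.0, rng))

# Group C: higher baseline probability (alpha = 2), so outcomes pile up near 1.
x_c = rng.normal(0.0, 1.0, size=n)
slope_c = lpm_slope(x_c, simulate_y(x_c, 2.0, beta_star, 1.0, rng))

# Group D: skewed predictor with the same mean (0) and variance (1) as group A.
x_d = rng.exponential(1.0, size=n) - 1.0
slope_d = lpm_slope(x_d, simulate_y(x_d, 0.0, beta_star, 1.0, rng))

for name, slope in [("A", slope_a), ("B", slope_b), ("C", slope_c), ("D", slope_d)]:
    print(f"group {name}: LPM slope = {slope:.3f}")
```

Under these assumptions the estimated LPM slopes differ between groups although $\beta ^{*}$ is the same in all of them, which is the identification problem the article analyses.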


Notes

We note that the probit model is also widely used in economics. In our search, we identified 24 articles in QJE using the probit, with the vast majority (23) reporting marginal effects implied by the probit. Of the 24 articles, we identified eight making some form of comparison between groups (reporting marginal effects).
Allison (1999a, b) and Williams (2009) suggest the heteroskedastic (location-scale) probit or logit model as a possible solution to these issues. However, the utility of these models is disputed. Long (2009) and Breen et al. (2014) show that these models cannot separate coefficient differences between groups from differences in latent error dispersion, unless unrealistic assumptions are maintained. We therefore refrain from considering this class of models any further in this paper.
Exceptions are Lewbel et al. (2012) and Athey and Imbens (2006), who show that the LPM does not identify the treatment effect of interest in difference-in-differences (DiD) models. As DiD models involve comparing trends among groups, their results can be taken to support the results we report in this article.
The error term is inherently heteroscedastic (Goldberger 1964), but, because we are interested in identification, this issue does not concern us here.
Here we use the fact that showing $A\Rightarrow B$ is equivalent to showing $\lnot { B} \Rightarrow \lnot { A}$.

References

Ai, C., Norton, E.C.: Interaction terms in logit and probit models. Econ Lett 80, 123–129 (2003)
Allison, P.D.: Comparing logit and probit coefficients across groups. Sociolog Methods Res 28, 186–208 (1999a)
Allison, P.: Logistic regression using SAS: theory and application. SAS Institute, Cary, NC (1999b)
Amemiya, T.: Qualitative response models: a survey. J Econ Lit 19, 1483–1536 (1981)
Athey, S., Imbens, G.W.: Identification and inference in nonlinear difference-in-differences models. Econometrica 74, 431–497 (2006)
Breen, R., Holm, A., Karlson, K.B.: Correlations and non-linear probability models. Forthcoming, Sociological Methods and Research (2014)
Goldberger, A.S.: Econometric theory. Wiley, New York (1964)
Greene, W.H.: Testing hypotheses about interaction terms in non-linear models. Econ Lett 107, 291–296 (2010)
Greene, W.H.: Econometric analysis, 7th edn. Prentice Hall, Upper Saddle River (2011)
Karaca-Mandic, P., Norton, E.C., Dowd, B.: Interaction terms in nonlinear models. Health Serv Res 47, 255–274 (2011)
Lewbel, A., Dong, Y., Yang, T.T.: Comparing features of convenient estimators for binary choice models with endogenous regressors. Can J Econ 45, 809–829 (2012)
Long, J.S.: Group comparisons in logit and probit using predicted probabilities. Unpublished working paper, 25 June 2009 (2009)
Mare, R.D.: Change and stability in educational stratification. Am Sociolog Rev 46, 72–87 (1981)
Mare, R.D.: Response: statistical models of educational stratification: Hauser and Andrew’s model for school transitions. Sociolog Methodol 36, 27–37 (2006)
Norton, E.C., Wang, H., Ai, C.: Computing interaction effects and standard errors in logit and probit models. Stata J 4, 154–167 (2004)
Swait, J., Louviere, J.: The role of the scale parameter in the estimation and comparison of multinomial logit models. J Market Res 30, 305–314 (1993)
Williams, R.: Using heterogeneous choice models to compare logit and probit coefficients across groups. Sociolog Methods Res 37, 531–559 (2009)
Wooldridge, J.M.: Econometric analysis of cross section and panel data. MIT Press, Cambridge (2002)
Xie, Y.: Values and limitations of statistical models. Res Stratif Soc Mob 29, 343–349 (2011)


Author information

Authors and Affiliations

Department of Sociology, University of Copenhagen, Øster Farimagsgade 5, bld. 16, 1014 Copenhagen, Denmark
Anders Holm & Kristian Karlson
The Danish National Centre for Social Research, Copenhagen, Denmark
Anders Holm
Department of Economics, University of Copenhagen, Copenhagen, Denmark
Mette Ejrnæs


Corresponding author

Correspondence to Anders Holm.

Appendix

1.1 Identical sign of LPM and LLM coefficients

We want to show (10), i.e.

$$\begin{aligned} \beta ^{*}=0&\Leftrightarrow \beta ^{\textit{LPM}}=0 \\ \beta ^{*}>0&\Leftrightarrow \beta ^{\textit{LPM}}>0 \\ \beta ^{*}<0&\Leftrightarrow \beta ^{\textit{LPM}}<0. \end{aligned}$$

From (11) we have that $\beta ^{\textit{LPM}}$ has the same sign as the covariance between $y$ and $x$,

$$\begin{aligned} \int {(x-E(x))F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) g(x)\partial x} \end{aligned}$$

We show that $\beta ^{*}>0\Rightarrow \beta ^{\textit{LPM}}>0$. The situation in which $\beta ^{*}<0\Rightarrow \beta ^{\textit{LPM}}<0$ follows in an equivalent way.

Assume that $\beta ^{*}>0$, which implies that F(.) is an increasing function of x. Rewrite (11) as

$$\begin{aligned} \int {(x-E(x))F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) g(x)\partial x}&= \int _{-\infty }^{E(x)} (x-E(x))F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) g(x)\partial x\\&\quad +\int _{E(x)}^\infty {(x-E(x))F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) g(x)\partial x} \end{aligned}$$

from which we obtain

$$\begin{aligned} \int {(x-E(x))F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) g(x)\partial x}&> \int _{-\infty }^{E(x)} (x-E(x))F\left( {\frac{\alpha +\beta ^{*}E(x)}{\sigma }} \right) g(x)\partial x\\&\quad +\int _{E(x)}^\infty {(x-E(x))F\left( {\frac{\alpha +\beta ^{*}E(x)}{\sigma }} \right) g(x)\partial x} \end{aligned}$$

We can then evaluate each of the terms. For the first term we have that $(x-E(x))<0$ and that $F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) <F\left( {\frac{\alpha +\beta ^{*}E(x)}{\sigma }} \right) $ for $x<E(x)$. This further implies that

$$\begin{aligned} \int _{-\infty }^{E(x)} {(x-E(x))F\left( {\frac{\alpha +\beta ^{*}E(x)}{\sigma }} \right) g(x)\partial x<\int _{-\infty }^{E(x)} {(x-E(x))F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) g(x)\partial x} } . \end{aligned}$$

For the second term we have that $(x-E(x))>0$ and that $F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) >F\left( {\frac{\alpha +\beta ^{*}E(x)}{\sigma }} \right) $ for $x>E(x)$, so that:

$$\begin{aligned} \int _{E(x)}^\infty {(x-E(x))F\left( {\frac{\alpha +\beta ^{*}E(x)}{\sigma }} \right) g(x)\partial x} <\int _{E(x)}^\infty {(x-E(x))F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) g(x)\partial x} . \end{aligned}$$

It then follows that

$$\begin{aligned} \int _{-\infty }^\infty {(x-E(x))F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) g(x)\partial x} >\int _{-\infty }^\infty {(x-E(x))F\left( {\frac{\alpha +\beta ^{*}E(x)}{\sigma }} \right) g(x)\partial x} =0 \end{aligned}$$

Next we show that $\beta ^{*}=0\Rightarrow \beta ^{\textit{LPM}}=0$. This follows straightforwardly from

$$\begin{aligned} \int {(x-E(x))F\left( {\frac{\alpha }{\sigma }} \right) g(x)\partial x} =0. \end{aligned}$$

Now we have established that

$$\begin{aligned} \beta ^{*}>0&\Rightarrow \beta ^{\textit{LPM}}>0 \\ \beta ^{*}=0&\Rightarrow \beta ^{\textit{LPM}}=0 \\ \beta ^{*}<0&\Rightarrow \beta ^{\textit{LPM}}<0 \end{aligned}$$

To show that $\beta ^{\textit{LPM}}>0\Rightarrow \beta ^{*}>0$ we can instead show that $\beta ^{*}\le 0\Rightarrow \beta ^{\textit{LPM}}\le 0$ (see footnote 5). If $\beta ^{*}\le 0$ then either $\beta ^{*}=0$ or $\beta ^{*}<0$. If $\beta ^{*}=0$ we know from the result established above that $\beta ^{\textit{LPM}}=0$. Similarly, $\beta ^{*}<0$ leads to $\beta ^{\textit{LPM}}<0$. In either case we have that $\beta ^{\textit{LPM}}\le 0$. In a similar way we can show that $\beta ^{\textit{LPM}}=0\Rightarrow \beta ^{*}=0$ and $\beta ^{\textit{LPM}}<0\Rightarrow \beta ^{*}<0$.

We have thus shown:

$$\begin{aligned} \beta ^{*}>0&\Leftrightarrow \beta ^{\textit{LPM}}>0 \\ \beta ^{*}=0&\Leftrightarrow \beta ^{\textit{LPM}}=0 \\ \beta ^{*}<0&\Leftrightarrow \beta ^{\textit{LPM}}<0. \end{aligned}$$
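The sign result can be checked numerically by evaluating the integral $\int (x-E(x))F\left( \frac{\alpha +\beta ^{*}x}{\sigma }\right) g(x)\partial x$ on a grid. The sketch below is an illustration under assumptions that are not taken from the article: $F$ is the logistic distribution function, $g$ is a skewed mixture of two normal densities, and the parameter values, the grid and the tolerance used to classify a value as zero are arbitrary. The function name `covariance_integral` is hypothetical.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def covariance_integral(alpha, beta_star, sigma):
    """Riemann-sum approximation of
    integral (x - E(x)) F((alpha + beta_star * x) / sigma) g(x) dx,
    with logistic F and a two-component normal mixture g (assumptions)."""
    x = np.linspace(-12.0, 12.0, 24001)
    dx = x[1] - x[0]
    # Skewed predictor density: 70% N(-1, 0.5^2) and 30% N(2, 1^2).
    g = 0.7 * normal_pdf(x, -1.0, 0.5) + 0.3 * normal_pdf(x, 2.0, 1.0)
    mass = np.sum(g) * dx                      # close to 1; used to normalise
    mean_x = np.sum(x * g) * dx / mass
    F = 1.0 / (1.0 + np.exp(-(alpha + beta_star * x) / sigma))
    return np.sum((x - mean_x) * F * g) * dx

def sign_with_tolerance(value, tol=1e-10):
    """Return -1, 0 or 1, treating |value| < tol as numerically zero."""
    if abs(value) < tol:
        return 0
    return 1 if value > 0 else -1

beta_values = [-2.0, -0.5, 0.0, 0.5, 2.0]
signs = [sign_with_tolerance(covariance_integral(0.3, b, 1.5)) for b in beta_values]

for b, s in zip(beta_values, signs):
    print(f"beta* = {b:+.1f}: sign of the covariance integral = {s:+d}")
```

For these inputs the sign of the integral, and hence of $\beta ^{\textit{LPM}}$, matches the sign of $\beta ^{*}$, as the proof requires; a numerical check of this kind illustrates the result for particular choices of $F$ and $g$ and does not replace the general argument.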

1.2 Second-order Taylor approximation

With one predictor variable, $x$, with mean $\gamma $, we derive cov(y,x):

$$\begin{aligned} \hbox {cov}(y,x)&= E(xy)-E(x)E(y) = E(xE(y|x))-E(x)E(y) \nonumber \\&= \int {xF\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) } g(x)\partial x-\gamma \int {F\left( {\frac{\alpha +\beta ^{*}x}{\sigma }} \right) g(x)\partial x} . \end{aligned}$$

(16)

We approximate (16) by a second-order Taylor approximation around $\beta ^{*}=0$:

$$\begin{aligned} \hbox {cov}(y,x)&\approx \int {\left( {xF\left( {\frac{\alpha }{\sigma }} \right) +x\beta ^{*}f\left( {\frac{\alpha }{\sigma }} \right) \frac{x}{\sigma }+(\beta ^{*})^{2}xf^{\prime }\left( {\frac{\alpha }{\sigma }} \right) \left( {\frac{x}{\sigma }} \right) ^{2}} \right) g(x)\partial x} \\&\quad -\gamma \int {\left( {F\left( {\frac{\alpha }{\sigma }} \right) +\beta ^{*}f\left( {\frac{\alpha }{\sigma }} \right) \frac{x}{\sigma }+(\beta ^{*})^{2}f^{\prime }\left( {\frac{\alpha }{\sigma }} \right) \frac{x^{2}}{\sigma ^{2}}} \right) g(x)\partial x} \\&= F\left( {\frac{\alpha }{\sigma }} \right) \gamma +\beta ^{*}f\left( {\frac{\alpha }{\sigma }} \right) \left( {\frac{E(x^{2})}{\sigma }} \right) +(\beta ^{*})^{2}f^{\prime }\left( {\frac{\alpha }{\sigma }} \right) \left( {\frac{E(x^{3})}{\sigma ^{2}}} \right) \\&\quad -\left( {\gamma F\left( {\frac{\alpha }{\sigma }} \right) +\beta ^{*}f\left( {\frac{\alpha }{\sigma }} \right) \frac{\gamma ^{2}}{\sigma }+(\beta ^{*})^{2}f^{\prime }\left( {\frac{\alpha }{\sigma }} \right) \frac{E(x^{2})\gamma }{\sigma ^{2}}} \right) \\&= \beta ^{*}f\left( {\frac{\alpha }{\sigma }} \right) \left( {\frac{E(x^{2})-\gamma ^{2}}{\sigma }} \right) +(\beta ^{*})^{2}f^{\prime }\left( {\frac{\alpha }{\sigma }} \right) \left( {\frac{E(x^{3})-E(x^{2})\gamma }{\sigma ^{2}}} \right) \\&= f\left( {\frac{\alpha }{\sigma }} \right) \left( {\frac{\beta ^{*}\sigma ^{2}_x }{\sigma }} \right) +(\beta ^{*})^{2}f^{\prime }\left( {\frac{\alpha }{\sigma }} \right) \left( {\frac{E(x^{3})-E(x^{2})\gamma }{\sigma ^{2}}} \right) , \end{aligned}$$

yielding (13).
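The leading term of this approximation can be compared with a numerically exact value. The sketch below assumes a logistic $F$ (so that $f=F(1-F)$) and a standard normal predictor, for which $\gamma =0$ and $E(x^{3})=0$, so the moment factor $E(x^{3})-E(x^{2})\gamma $ of the second-order term vanishes and only the first-order term remains. Dividing the covariance by $\sigma ^{2}_x$ then gives an approximate LPM slope of $f(\alpha /\sigma )\beta ^{*}/\sigma $. These distributional choices, the parameter values and the function names are assumptions made for illustration, not taken from the article.

```python
import numpy as np

def logistic_cdf(z):
    """Logistic distribution function F(z)."""
    return 1.0 / (1.0 + np.exp(-z))

def exact_lpm_slope(alpha, beta_star, sigma):
    """cov(y, x) / var(x) by a Riemann sum, with E(y|x) = F((alpha + beta_star x)/sigma)
    and a standard normal predictor density g (assumption of this sketch)."""
    x = np.linspace(-10.0, 10.0, 20001)
    dx = x[1] - x[0]
    g = np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)
    mean_x = np.sum(x * g) * dx
    var_x = np.sum((x - mean_x) ** 2 * g) * dx
    F = logistic_cdf((alpha + beta_star * x) / sigma)
    cov_yx = np.sum((x - mean_x) * F * g) * dx
    return cov_yx / var_x

def first_order_lpm_slope(alpha, beta_star, sigma):
    """Leading term f(alpha / sigma) * beta_star / sigma, with logistic f = F(1 - F)."""
    F0 = logistic_cdf(alpha / sigma)
    return F0 * (1.0 - F0) * beta_star / sigma

alpha, beta_star = 0.5, 0.1          # small beta*, where the expansion is accurate
results = {}
for sigma in (1.0, 2.0):             # same beta*, two different scale parameters
    exact = exact_lpm_slope(alpha, beta_star, sigma)
    approx = first_order_lpm_slope(alpha, beta_star, sigma)
    results[sigma] = (exact, approx)
    print(f"sigma = {sigma}: exact = {exact:.5f}, first-order = {approx:.5f}")
```

With the same $\beta ^{*}$ in both cases, the slope roughly halves when $\sigma $ doubles, which illustrates how the scale parameter enters the LPM coefficient. For larger $\beta ^{*}$ or a skewed predictor the higher-order terms matter and the first-order value is less accurate.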


About this article

Cite this article

Holm, A., Ejrnæs, M. & Karlson, K. Comparing linear probability model coefficients across groups. Qual Quant 49, 1823–1834 (2015). https://doi.org/10.1007/s11135-014-0057-0

Issue Date: September 2015
