Elsevier

Journal of Econometrics

Volume 170, Issue 2, October 2012, Pages 491-498
Efficiency bounds for estimating linear functionals of nonparametric regression models with endogenous regressors

https://doi.org/10.1016/j.jeconom.2012.05.018

Abstract

Let Y=μ(X)+ε, where μ is unknown and E[ε|X]≠0 with positive probability but there exist instrumental variables W such that E[ε|W]=0 w.p.1. It is well known that such nonparametric regression models are generally “ill-posed” in the sense that the map from the data to μ is not continuous. In this paper, we derive the efficiency bounds for estimating certain linear functionals of μ without assuming μ itself to be identified.

Introduction

Models containing unknown functions are commonly used in econometrics and statistics. For instance, consider the model for an observed random vector (Y,X) given by Y=μ(X)+ε, where μ is an unknown function and ε is an unobserved random variable. If E[ε|X]=0, then μ(x)=E[Y|X=x] and nonparametric regression methods can be used for inference about μ. Now suppose that the condition E[ε|X]=0 is not satisfied. This typically occurs whenever some components of X are determined endogenously. In this case, μ is no longer a conditional expectation. Nonetheless, estimation of μ may still be possible provided there exists a random vector W such that E[ε|W]=0. Unfortunately, the results of Ai and Chen (2003), Hall and Horowitz (2005), Darolles et al. (2006), Severini and Tripathi (2006), and Blundell et al. (2007) show that estimators of μ can have very poor rates of convergence because such models are “ill-posed” under general conditions. Thus, even relatively large sample sizes may not be of much help in accurately estimating μ.
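The effect of endogeneity described above can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a hypothetical linear special case μ(x)=βx, Gaussian draws, and an unobserved confounder u shared by X and ε, so that E[ε|X]≠0 while E[ε|W]=0. All function names are illustrative. In this design the least-squares slope does not recover β, whereas the slope built from the instrument does.

```python
import numpy as np


def simulate_linear_iv(n, beta=2.0, seed=0):
    """Draw (Y, X, W) with an endogenous X and an instrument W (assumed design)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)                # instrument, independent of u and the noises
    u = rng.normal(size=n)                # unobserved confounder
    x = w + u + rng.normal(size=n)        # X depends on u, so X is endogenous
    eps = u + 0.5 * rng.normal(size=n)    # error shares u with X, yet E[eps | W] = 0
    y = beta * x + eps                    # hypothetical linear mu(x) = beta * x
    return y, x, w


def ols_slope(y, x):
    """Least-squares slope cov(X, Y) / var(X); inconsistent when E[eps | X] != 0."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]


def iv_slope(y, x, w):
    """Instrumental-variables slope cov(W, Y) / cov(W, X)."""
    return np.cov(w, y)[0, 1] / np.cov(w, x)[0, 1]


if __name__ == "__main__":
    y, x, w = simulate_linear_iv(200_000)
    print("OLS slope:", ols_slope(y, x))
    print("IV slope: ", iv_slope(y, x, w))
```

In this design cov(X, ε)=1 and var(X)=3, so the least-squares slope converges to β+1/3 rather than β. The linear case is well-posed; the ill-posedness discussed in the text arises once μ is left unrestricted.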

In contrast, it may be possible to accurately estimate certain features of μ, such as linear functionals of the form E[ψ(X)μ(X)] and ∫ψ(x)μ(x)dx, where ψ is a known function. In particular, it may be possible to estimate such a linear functional at the usual parametric rate of convergence, even when μ itself is not identified.

Economists are often interested in estimating linear functionals of unknown functions. For instance, Stock (1989) estimates the contrast between functionals of E[Y|X] using before-and-after policy intervention data. Letting Y denote the market demand and X the price, Newey and McFadden (1994) consider estimating ∫_a^b E[Y|X=x]dx, the approximate change in consumer surplus for a given price change. Additional examples can be found in Brown and Newey (1998), Ai and Chen (2005, 2007, 2009), and Darolles et al. (2006).

The main objective of this paper is to derive the efficiency bounds for estimating linear functionals of μ when it is not a conditional expectation, without assuming μ to be identified. There are at least two reasons why such efficiency bounds are important. One is that efficiency bounds can be used to recognize, and in some cases help construct, an asymptotically efficient estimator of a linear functional. That is, if an estimator has asymptotic variance equal to the efficiency bound, then it is asymptotically efficient. A second use of the efficiency bounds derived in this paper is in understanding nonparametric regression models with endogenous regressors. Efficiency bounds for linear functionals allow us to measure the relative difficulty in estimating different features of the function μ, thus telling us what may be learned from the data about μ. For instance, we are able to characterize a condition that is necessary for n^{1/2}-estimability of these functionals when they are identified. This is particularly important in the present context since estimation of μ itself is generally quite difficult.

Estimation of functionals of μ has been considered by Ai and Chen (2005, 2007, 2009) and Darolles et al. (2006). Ai and Chen (2005, 2009) derive the efficiency bound for estimating functionals of μ when μ is identified. Ai and Chen (2007, Example 2.2) consider estimating a weighted average derivative of μ and show that their estimator is n^{1/2}-consistent and asymptotically normal. The contribution of our paper is to derive an efficiency bound for estimating functionals of μ that remains valid even when μ is not identified, using a proof that differs from that of Ai and Chen. A discussion of the n^{1/2}-rate of convergence of inner products can be found in Darolles et al. (2006), cf. their Section 4.3 (pp. 31–35). The results in this paper complement those obtained earlier by Ai and Chen and by Darolles, Florens, and Renault.

The outline of the paper is as follows. The model under consideration is described in detail in Section 2. Section 3 contains a discussion of identification and ill-posedness in this model, and the relationship between ill-posedness and n^{1/2}-estimability is considered in Section 4. The efficiency bound for a functional of the unknown function is presented in Section 5. Proofs are in Appendix A, and some useful results are collected in Appendix B.

Section snippets

The model

Consider the nonparametric regression model Y=μ(X)+ε, where X is a vector of regressors some or all of which are endogenous so that E[ε|X]≠0 with positive probability. The functional form of μ is unknown; we only assume that it lies in L_2(X), the set of real-valued functions of X that are square integrable with respect to the distribution of X. We assume that ε satisfies the conditional moment restriction E[ε|W]=0 w.p.1, where W denotes a vector of instrumental variables (IVs); conditions

Identification and ill-posedness

In this section we briefly describe what we mean by the identifiable part of μ and the sense in which the equation defining it can be ill-posed. Cf. Kress (1999) for the definition of ill-posed linear equations. Some recent papers that discuss identification conditions and results for ill-posed econometric models include Ai and Chen (2003), Newey and Powell (2003), Hall and Horowitz (2005), Darolles et al. (2006), and Blundell et al. (2007). Severini and Tripathi (2006) have more on

Ill-posedness and n^{1/2}-estimability

As mentioned earlier, ill-posedness of (3.1) can lead to very poor rates of convergence for estimators of P_{N(T)^⊥}μ. In fact, convergence can be so slow that n^{1/2}-estimability of E[ψP_{N(T)^⊥}μ] may not be possible for certain well behaved ψ. [An obvious exception is ψ≡1, for which E[ψP_{N(T)^⊥}μ]=EY is n^{1/2}-estimable irrespective of the correlation between X and ε.] The aim of this section is to characterize the ψ’s for which the corresponding expectation functionals are not n^{1/2}-estimable and make

The efficiency bound

Following the discussion in Section 3, let θ≡E[ψP_{N(T)^⊥}μ] denote the parameter of interest. In this section we determine the efficiency bound for estimating θ when

Assumption 5.1

ψ∈R(T′), i.e., there exists δ∈L_2(W), not necessarily uniquely defined, such that T′δ=ψ.
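One way to see what this range condition buys is a simulation. The sketch below rests on assumptions not shown in this excerpt: the operator in Assumption 5.1 is read as the map δ ↦ E[δ(W)|X], and the design is a hypothetical Gaussian one with X=W+V, δ(w)=w, and hence ψ(x)=E[W|X=x]=x/2. Under that reading, iterated expectations and E[ε|W]=0 give E[ψ(X)μ(X)]=E[δ(W)μ(X)]=E[δ(W)Y], so the functional can be estimated by a plain sample average even though μ is never estimated. In this Gaussian design μ is identified, so the projection in θ plays no role. All names are illustrative.

```python
import numpy as np


def mu(x):
    """Hypothetical structural function, chosen only for illustration."""
    return x + np.sin(x)


def delta(w):
    """Assumed function of the instrument with E[delta(W) | X] = psi(X)."""
    return w


def psi(x):
    """E[W | X = x] = x / 2 when X = W + V with W, V independent N(0, 1)."""
    return x / 2.0


def simulate_design(n, rho=1.0, seed=0):
    """Draw (Y, X, W) with X endogenous through V and E[eps | W] = 0."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)
    v = rng.normal(size=n)
    x = w + v
    eps = rho * v + 0.5 * rng.normal(size=n)   # correlated with X, not with W
    y = mu(x) + eps
    return y, x, w


def functional_estimates(y, x, w):
    """Return (infeasible benchmark, instrument-based average, naive average)."""
    target = np.mean(psi(x) * mu(x))   # uses the true mu, unavailable in practice
    iv_based = np.mean(delta(w) * y)   # sample analogue of E[delta(W) Y]
    naive = np.mean(psi(x) * y)        # ignores endogeneity, off by E[psi(X) eps]
    return target, iv_based, naive


if __name__ == "__main__":
    print(functional_estimates(*simulate_design(400_000)))
```

With rho=1 the naive average is off by E[ψ(X)ε]=1/2 in this design, while the instrument-based average tracks the benchmark. The case ψ≡1 corresponds to δ≡1, which gives the sample mean of Y.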

For maximum generality, the bound is derived under minimal assumptions on μ. In particular, μ is allowed to be underidentified, i.e., N(T)≠{0}, and the equation defining P_{N(T)^⊥}μ is allowed to be ill-posed, i.e., R(T) is not assumed to be

Conclusion

We derive a necessary condition for n^{1/2}-estimability as well as the efficiency bounds for estimating E[ψμ] and ∫ψμ when μ is underidentified and the model defining it is ill-posed.

Acknowledgments

We thank the co-editors and two anonymous referees for comments that greatly improved this paper. We also thank Gary Chamberlain, Enno Mammen, Whitney Newey, and participants at several seminars for helpful suggestions and conversations. The first author is grateful for financial support from the NSF.

References (25)

  • B.W. Brown et al., Efficient semiparametric estimation of expectations, Econometrica (1998)
  • G. Chamberlain, Efficiency bounds for semiparametric regression, Econometrica (1992)