Weighted rank estimation for nonparametric transformation models with doubly truncated data

  • Research Article
  • Journal of the Korean Statistical Society

Abstract

Doubly truncated data often arise when event times are observed only if they fall within subject-specific intervals. We analyze doubly truncated data using nonparametric transformation models, where an unknown monotonically increasing transformation of the response variable is equal to an unknown monotonically increasing function of a linear combination of the covariates plus a random error with an unspecified log-concave probability density function. Furthermore, we assume that the truncation variables are conditionally independent of the response variable given the covariates and leave the conditional distributions of truncation variables given the covariates unspecified. For estimation of regression parameters, we propose a weighted rank (WR) estimation procedure and establish the consistency and asymptotic normality of the resulting estimator. The limiting covariance matrix of the WR estimator can be estimated by a resampling technique, which does not involve nonparametric density estimation or numerical derivatives. A numerical study is conducted and suggests that the proposed methodology works well in practice, and an illustration based on real data is provided.

References

  • Abrevaya, J. (1999). Rank estimation of a transformation model with observed truncation. Econometrics Journal, 2, 292–305.

  • Amemiya, T. (1985). Advanced Econometrics. Cambridge, MA: Harvard University Press.

  • Austin, D., Simon, D. K., & Betensky, R. A. (2014). Computationally simple estimation and improved efficiency for special cases of double truncation. Lifetime Data Analysis, 20(3), 335–354.

  • Bhattacharya, P. K., Chernoff, H., & Yang, S. S. (1983). Nonparametric estimation of the slope of a truncated regression. The Annals of Statistics, 11, 505–514.

  • Borzadaran, G. R. M., & Borzadaran, H. A. M. (2011). Log-concavity property for some well-known distributions. Surveys in Mathematics and its Applications, 6, 203–219.

  • Cavanagh, C., & Sherman, R. P. (1998). Rank estimators for monotonic index models. Journal of Econometrics, 84(2), 351–381.

  • Cheng, S. C., Wei, L. J., & Ying, Z. (1995). Analysis of transformation models with censored data. Biometrika, 82(4), 835–845.

  • Efron, B., & Petrosian, V. (1999). Nonparametric methods for doubly truncated data. Journal of the American Statistical Association, 94, 824–834.

  • Emura, T., Hu, Y. H., & Konno, Y. (2017). Asymptotic inference for maximum likelihood estimators under the special exponential family with double-truncation. Statistical Papers, 58(3), 877–909.

  • Emura, T., & Konno, Y. (2012). Multivariate normal distribution approaches for dependently truncated data. Statistical Papers, 53, 133–149.

  • Emura, T., Konno, Y., & Michimae, H. (2015). Statistical inference based on the nonparametric maximum likelihood estimator under double-truncation. Lifetime Data Analysis, 21(3), 397–418.

  • Emura, T., & Wang, W. (2010). Testing quasi-independence for truncation data. Journal of Multivariate Analysis, 101, 223–239.

  • Emura, T., & Wang, W. (2016). Semiparametric inference for an accelerated failure time model with dependent truncation. Annals of the Institute of Statistical Mathematics, 68(5), 1073–1094.

  • Frank, G., & Dörre, A. (2017). Linear regression with randomly double-truncated data. South African Statistical Journal, 51(1), 1–18.

  • Han, A. K. (1987). Nonparametric analysis of a generalized regression model: The maximum rank correlation estimator. Journal of Econometrics, 35(2–3), 303–316.

  • Horowitz, J. L. (1996). Semiparametric estimation of a regression model with an unknown transformation of the dependent variable. Econometrica, 64, 103–137.

  • Hu, Y. H., & Emura, T. (2015). Maximum likelihood estimation for a special exponential family under random double-truncation. Computational Statistics, 30(4), 1199–1229.

  • Jin, Z., Ying, Z., & Wei, L. J. (2001). A simple resampling method by perturbing the minimand. Biometrika, 88, 381–390.

  • Kalbfleisch, J. D., & Lawless, J. F. (1989). Inferences based on retrospective ascertainment: An analysis of the data on transfusion related AIDS. Journal of the American Statistical Association, 84, 360–372.

  • Khan, S., & Tamer, E. (2007). Partial rank estimation of duration models with general forms of censoring. Journal of Econometrics, 136(1), 251–280.

  • Kim, J. P., Lu, W., Sit, T., & Ying, Z. (2013). A unified approach to semiparametric transformation models under generalized biased sampling schemes. Journal of the American Statistical Association, 108, 217–227.

  • Liang, H. Y., & Iglesias-Pérez, M. C. (2018). Weighted estimation of conditional mean function with truncated, censored and dependent data. Statistics, 52, 1249–1269.

  • Liu, H., Ning, J., Qin, J., & Shen, Y. (2016). Semiparametric maximum likelihood inference for truncated or biased-sampling data. Statistica Sinica, 26, 1087–1115.

  • Mandel, M., de Uña-Álvarez, J., Simon, D. K., & Betensky, R. A. (2018). Inverse probability weighted Cox regression for doubly truncated data. Biometrics, 74, 481–487.

  • McLaren, C., Wagstaff, M., Brittegram, G., & Jacobs, A. (1991). Detection of two-component mixtures of lognormal distributions in grouped, doubly truncated data analysis of red blood cell volume distributions. Biometrics, 47, 607–622.

  • Medley, G. F., Anderson, R. M., Cox, D. R., & Billard, L. (1987). Incubation periods of AIDS in patients via blood transfusion. Nature, 328, 719–721.

  • Moreira, C., & de Uña-Álvarez, J. (2010). Bootstrapping the NPMLE for doubly truncated data. Journal of Nonparametric Statistics, 22, 567–583.

  • Moreira, C., & de Uña-Álvarez, J. (2012). Kernel density estimation with doubly-truncated data. Electronic Journal of Statistics, 6, 501–521.

  • Moreira, C., de Uña-Álvarez, J., & Meira-Machado, L. (2016). Nonparametric regression with doubly truncated data. Computational Statistics and Data Analysis, 93, 294–307.

  • Moreira, C., & Van Keilegom, I. (2013). Bandwidth selection for kernel density estimation with doubly truncated data. Computational Statistics and Data Analysis, 61, 107–123.

  • Nolan, D., & Pollard, D. (1987). U-processes: Rates of convergence. The Annals of Statistics, 15(2), 780–799.

  • Rennert, L., & Xie, S. X. (2018). Cox regression model with doubly truncated data. Biometrics, 74, 725–733.

  • Samworth, R. J. (2018). Recent progress in log-concave density estimation. Statistical Science, 33(4), 493–509.

  • Shen, P. S. (2010). Nonparametric analysis of doubly truncated data. Annals of the Institute of Statistical Mathematics, 62, 835–853.

  • Shen, P. S. (2011). Testing quasi-independence for doubly truncated data. Journal of Nonparametric Statistics, 23, 1–9.

  • Shen, P. S. (2013). Regression analysis of interval censored and doubly truncated data with linear transformation models. Computational Statistics, 28, 581–596.

  • Shen, P. S. (2016). Analysis of transformation models with doubly truncated data. Statistical Methodology, 30, 15–30.

  • Shen, P. S., & Liu, Y. (2017). Pseudo maximum likelihood estimation for the Cox model with doubly truncated data. Statistical Papers. https://doi.org/10.1007/s00362-016-0870-8.

  • Shen, P. S., & Liu, Y. (2019). Pseudo MLE for semiparametric transformation model with doubly truncated data. Journal of the Korean Statistical Society, 48, 384–395.

  • Shen, P. S., Liu, Y., Maa, D. P., & Ju, Y. (2017). Analysis of transformation models with right-truncated data. Statistics, 51, 404–418.

  • Sherman, R. (1993). The limiting distribution of the maximum rank correlation estimator. Econometrica, 61(1), 123–137.

  • Sherman, R. (1994). Maximal inequalities for degenerate U-processes with applications to optimization estimators. The Annals of Statistics, 22(1), 439–459.

  • Song, X., Ma, S., Huang, J., & Zhou, X. H. (2007). A semiparametric approach for the nonparametric transformation survival model with multiple covariates. Biostatistics, 8(2), 197–211.

  • Wang, H. (2007). A note on iterative marginal optimization: A simple algorithm for maximum rank correlation estimation. Computational Statistics and Data Analysis, 51, 2803–2812.

  • Wang, S.-H., & Chiang, C.-T. (2019). Maximum partial-rank correlation estimation for left-truncated and right-censored survival data. Statistica Sinica, 29, 2141–2161.

  • Ying, Z., Yu, W., Zhao, Z., & Zheng, M. (2019). Regression analysis of doubly truncated data. Journal of the American Statistical Association. https://doi.org/10.1080/01621459.2019.1585252.

Acknowledgements

We are grateful to the anonymous reviewers and the editor for a number of constructive and helpful comments and suggestions that have clearly improved our manuscript.

Author information

Corresponding author

Correspondence to Xiaohui Yuan.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

In this appendix, we will sketch the proof of the asymptotic results in Theorems 2.1 and 2.2. The conditional density of \(\tilde{Y}\) given \(\tilde{W}=w\) is given by \(f_{\tilde{Y}|\tilde{W}}(y|w)=f\{H(y)-G(w^\textsf {T}{\varvec{\beta }}^*)\}h(y)\), where \(h(y)=dH(y)/dy\). Then, the conditional density of Y given \((W^\textsf {T},L,R)=(w^\textsf {T},l,r)\) can be written as

$$\begin{aligned} \frac{f\{H(y)-G(w^\textsf {T}{\varvec{\beta }}^*)\}h(y)I(l<y<r)}{F\{H(r)-G(w^\textsf {T}{\varvec{\beta }}^*)\}-F\{H(l)-G(w^\textsf {T}{\varvec{\beta }}^*)\}}. \end{aligned}$$

Let \(Z=(W^\textsf {T},Y,L,R)^\textsf {T}\) denote an observation from the distribution P on the set \(\mathcal {Z}\subseteq \mathbb {R}^{p+1}\times \mathbb {R}\times \mathbb {R}\times \mathbb {R}\). For each \(z=(w^{\textsf {T}},y,l,r)^\textsf {T}\) in \(\mathcal {Z}\) and each \({\varvec{\theta }}\) in \(\Theta\), define

$$\begin{aligned} \varrho (z,{\varvec{\theta }})&= E\{ \Psi (Z,z)I(W^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<w^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}))\}\\&\quad+\, E\{\Psi (z, Z)I(W^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})>w^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}))\}, \end{aligned}$$

where \(w=(x^*,x^\textsf {T})^\textsf {T}\) and \(\Psi (Z_1, Z_2)=I(L_2<Y_1< R_2)I(L_1<Y_2< R_1)I(Y_1< Y_2)\). Write \(\nabla _m\) for the mth partial derivative operator of the function \(\varrho (z,{\varvec{\theta }})\) with respect to \({\varvec{\theta }}=(\theta _1,\ldots ,\theta _p)^{\textsf {T}}\in \mathbb {R}^p\), and let

$$\begin{aligned} |\nabla _m|\varrho (z,{\varvec{\theta }})=\sum _{i_1,\ldots ,i_m\in \{1,\ldots ,p\}}\left| \frac{\partial ^m}{\partial \theta _{i_1}\cdots \partial \theta _{i_m}}\varrho (z,{\varvec{\theta }})\right| . \end{aligned}$$

To establish all the large-sample properties in this paper, we require the following conditions:

  1. C0

    (a) \(\tilde{Z}_1,\ldots ,\tilde{Z}_{\tilde{N}}\) are independent copies of \(\tilde{Z}\), where \(\tilde{Z}=(\tilde{W}^\textsf {T},\tilde{Y},\tilde{L},\tilde{R})^\textsf {T}\); (b) \(H(\tilde{Y})=G(\tilde{W}^\textsf {T}{\varvec{\beta }}^*)+\tilde{\varepsilon }\), where \(H(\cdot )\) and \(G(\cdot )\) are strictly increasing and continuously differentiable functions, and \(\tilde{W}\) and \(\tilde{\varepsilon }\) are independent; (c) there exists a positive constant \(c_0\) such that \(0<f(a)\le c_0<+\infty\) for \(a\in \mathbb {R}\), and the function \(\eta (a)=\log \{f(a)\}\) has a continuous second derivative \(\eta ''(a)\) with \(\eta ''(a)<0\) for all \(a\in \mathbb {R}\); (d) \((\tilde{L},\tilde{R})\) and \(\tilde{Y}\) are conditionally independent given \(\tilde{W}\); (e) for \(i=1,\ldots ,\tilde{N}\), \(\tilde{Z}_i\) is observed if and only if \(\tilde{L}_i<\tilde{Y}_i<\tilde{R}_i\). Moreover, let \(N=\sum _{i=1}^{\tilde{N}}I(\tilde{L}_i<\tilde{Y}_i<\tilde{R}_i)\) denote the number of observations and let \(Z_i=(W_i^\textsf {T},Y_i,L_i,R_i)^\textsf {T}\), \(i=1,\ldots ,N\), be the observed \(\tilde{Z}_i\)’s, where the \(\varepsilon _i\)’s are the corresponding error terms and \(W_i=(X_i^*,X_i^\textsf {T})^\textsf {T}\);

  2. C1

    Let \(\mathcal {W}\) denote the support of W. (a) there exists a positive constant \(c_1\) such that, for all \((w_1,w_2)\in \mathcal {W}\times \mathcal {W}\), \(P(L_1\vee L_2< R_1\wedge R_2|W_1=w_1,W_2=w_2)\ge c_1>0\); (b)

    $$\begin{aligned} \int _{-\infty }^{+\infty }\int _{-\infty }^{+\infty } I(y_1<y_2)h(y_1)h(y_2)dy_1dy_2<+\infty ; \end{aligned}$$
    (10)

  3. C2

    The set \(\mathcal {W}\) is not contained in any proper linear subspace of \(\mathbb {R}^{p+1}\);

  4. C3

    The vector of covariates, \(W=(X^*,X^\textsf {T})^\textsf {T}\), is of full rank, and \(X^*\) has an everywhere-positive Lebesgue density conditional on X;

  5. C4

    The unknown parameter \({\varvec{\beta }}^*={\varvec{\beta }}({\varvec{\theta }}^*)=(1,{\varvec{\theta }}^{*\textsf {T}})^{\textsf {T}}\), where \({\varvec{\theta }}^*\) lies in the interior of the parameter space \(\Theta\), which is a compact subset of \(\mathbb {R}^p\);

  6. C5

    (a) Let \(\mathcal {B}\) denote a neighborhood of \({\varvec{\theta }}^*\). For each z, all mixed second partial derivatives of \(\varrho (z,{\varvec{\theta }})\) exist on \(\mathcal {B}\). There is a function \(\rho (z)\) such that \(E\{\rho (Z)\}<+\infty\) and for all \(z\in \mathcal {Z}\) and \({\varvec{\theta }}\) in \(\mathcal {B}\),

    $$\begin{aligned} \Vert \nabla _2\varrho (z,{\varvec{\theta }})-\nabla _2\varrho (z,{\varvec{\theta }}^*)\Vert \le \rho (z)\Vert {\varvec{\theta }}-{\varvec{\theta }}^*\Vert ; \end{aligned}$$

    (b) \(E\{\Vert \nabla _1\varrho (Z,{\varvec{\theta }}^*)\Vert ^2\}<+\infty\); (c) \(E\{|\nabla _2|\varrho (Z,{\varvec{\theta }}^*)\}<+\infty\);

    (d) \(E\{\nabla _2\varrho (Z,{\varvec{\theta }}^*)\}\) is negative definite.
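
To make the sampling scheme in C0 concrete, the following Python sketch draws latent observations and keeps only those with \(L<Y<R\). It is an illustration only: the choices \(H(y)=y\), \(G(u)=u\), logistic errors, normal covariates and uniform truncation limits are assumptions made here and are not taken from the paper, and the function name `simulate_doubly_truncated` is hypothetical. The sketch mirrors C0 alone and makes no attempt to verify C1 to C5.

```python
import numpy as np

def simulate_doubly_truncated(n_latent, theta_star, seed=0):
    """Draw latent (W, Y, L, R) and keep only rows with L < Y < R, as in C0(e)."""
    rng = np.random.default_rng(seed)
    theta_star = np.asarray(theta_star, dtype=float)
    p = len(theta_star)
    # beta* = (1, theta*^T)^T, first coefficient normalised to one as in C4.
    beta_star = np.concatenate(([1.0], theta_star))
    # Covariates W = (X*, X^T)^T; standard normal is an illustrative assumption.
    W = rng.normal(size=(n_latent, p + 1))
    # Logistic errors have a log-concave density (illustrative choice for C0(c)).
    eps = rng.logistic(size=n_latent)
    # Assumed transformations: H(y) = y and G(u) = u, so Y = W^T beta* + eps.
    Y = W @ beta_star + eps
    # Truncation limits drawn independently of eps, hence conditionally
    # independent of Y given W, as in C0(d). The ranges are arbitrary.
    L = rng.uniform(-6.0, 0.0, size=n_latent)
    R = L + rng.uniform(2.0, 8.0, size=n_latent)
    keep = (L < Y) & (Y < R)
    return W[keep], Y[keep], L[keep], R[keep]
```

Calling `simulate_doubly_truncated(2000, [0.5, -1.0])` returns the \(N\le \tilde{N}\) retained rows; the latent sample size \(\tilde{N}\) and the discarded rows are, as in the paper's setting, not available to the analyst.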

Remark A.1

Most of the above conditions are standard for a semiparametric monotonic linear index model (Cavanagh and Sherman 1998). The additional conditions concern the truncation mechanism and the distributional assumption on the random error. Condition C0 defines the structure that generates the observations. Conditions C0–C4 guarantee the identifiability of \({\varvec{\theta }}^*\). Condition C5 collects standard regularity conditions sufficient to support an argument based on a Taylor expansion of \(\varrho (z,{\varvec{\theta }})\) about \({\varvec{\theta }}^*\). Since \(f(a)>0\) for all \(a\in \mathbb {R}\), F(a) is a strictly increasing function of \(a\in \mathbb {R}\). Let \(\mathcal {L}\) and \(\mathcal {R}\) denote the support of L and R, respectively. For all \((l,r,w)\in \mathcal {L}\times \mathcal {R}\times \mathcal {W}\) with \(l<r\), it follows that \(F\{H(r)-G(w^\textsf {T}{\varvec{\beta }}^*)\}-F\{H(l)-G(w^\textsf {T}{\varvec{\beta }}^*)\}>0\). Since H(a) is a strictly increasing and continuously differentiable function of a, we have \(h(a)>0\), \(a\in \mathbb {R}\). It follows that, for all \(a<b\),

$$\begin{aligned} \int _{a}^{b}\int _{a}^{b} I(y_1<y_2)h(y_1)h(y_2)dy_1dy_2>0. \end{aligned}$$
(11)
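
For finite \(a<b\), the double integral in (11) can be evaluated in closed form, which makes the positivity explicit. The following short verification is a sketch that uses only \(h(y)=dH(y)/dy\) and the symmetry of \(h(y_1)h(y_2)\) in \((y_1,y_2)\): the regions \(\{y_1<y_2\}\) and \(\{y_1>y_2\}\) contribute equally, and the diagonal has Lebesgue measure zero, so

$$\begin{aligned} \int _{a}^{b}\int _{a}^{b} I(y_1<y_2)h(y_1)h(y_2)dy_1dy_2=\frac{1}{2}\left\{ \int _{a}^{b}h(y)dy\right\} ^2=\frac{1}{2}\{H(b)-H(a)\}^2>0, \end{aligned}$$

where the final inequality holds because \(H\) is strictly increasing.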

Lemma A.1

Let \(\eta (a)=\log \{f(a)\}\), \(a\in \mathbb {R}\). Assume that \(\eta (a)\) is twice continuously differentiable. Then \(\eta ''(a)<0\), \(a\in \mathbb {R}\) implies that

$$\begin{aligned} \frac{f(a)}{f(a-t)}>\frac{f(b)}{f(b-t)},\quad \text{ for } \ a<b\ \text{ and } \ t>0, \end{aligned}$$
(12)

with \(a,\ b,\ a-t,\ b-t\in \mathbb {R}\).

Proof of Lemma A.1

Since \(\eta '(a)>\eta '(b)\) for \(a<b\), there exists \(\delta >0\) such that

$$\begin{aligned} \frac{f(a)-f(a-t)}{t f(a-t)}>\frac{f(b)-f(b-t)}{t f(b-t)},\quad \text{ for } \ a<b\ \text{ and } \ 0<t<\delta . \end{aligned}$$
(13)

Thus, (12) is satisfied for \(0<t<\delta\). Furthermore, for \(u\in \mathbb {R}_+\), there exist constants \(v_j\in \mathbb {R}\), \(j=1,\ldots ,m\) such that

$$\begin{aligned} \frac{f(a)}{f(a-u)}=\prod _{j=1}^m\frac{f(a+v_{j-1})}{f(a+v_{j})}>\prod _{j=1}^m\frac{f(b+v_{j-1})}{f(b+v_{j})}=\frac{f(b)}{f(b-u)}, \end{aligned}$$

where \(v_{0}=0\), \(v_{m}=-u\), \(0<v_{j-1}-v_j<\delta\), \(j=1,\ldots ,m\). \(\square\)
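
Inequality (12) can also be checked numerically for specific densities. The sketch below (Python with NumPy; all function names are hypothetical) evaluates \(\log \{f(a)/f(a-t)\}\) on an increasing grid and verifies that it is strictly decreasing in \(a\), which is equivalent to (12) for grid points \(a<b\). The standard normal and logistic densities, both log-concave, serve as illustrative examples. A check on a finite grid is of course not a proof.

```python
import numpy as np

def log_ratio(log_f, a, t):
    """Return log{f(a) / f(a - t)} computed from the log-density log_f."""
    return log_f(a) - log_f(a - t)

def std_normal_logpdf(x):
    """Log-density of the standard normal distribution."""
    return -0.5 * x**2 - 0.5 * np.log(2.0 * np.pi)

def logistic_logpdf(x):
    """Log of exp(-x) / (1 + exp(-x))^2, written in a numerically stable form."""
    return -x - 2.0 * np.logaddexp(0.0, -x)

def check_inequality_12(log_f, grid, shifts):
    """True if f(a)/f(a-t) > f(b)/f(b-t) for all grid points a < b and all t in shifts.

    The grid must be strictly increasing, so it suffices to check that the
    log-ratio decreases between consecutive grid points.
    """
    for t in shifts:
        r = log_ratio(log_f, grid, t)
        if not np.all(np.diff(r) < 0):
            return False
    return True

grid = np.linspace(-5.0, 5.0, 201)
shifts = [0.1, 1.0, 3.0]
normal_ok = check_inequality_12(std_normal_logpdf, grid, shifts)
logistic_ok = check_inequality_12(logistic_logpdf, grid, shifts)
```

For a density that is not log-concave, such as the Cauchy density, the same check fails on this grid, which is consistent with the role that \(\eta ''(a)<0\) plays in the lemma.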

Proof of Theorem 2.1

The proof is very similar to that of Theorem 1 of Cavanagh and Sherman (1998). Write \(Q({\varvec{\theta }})\) for \(E\{Q_N({\varvec{\theta }})\}\). We will show that (i) \(Q({\varvec{\theta }})\) is uniquely maximized at \({\varvec{\theta }}^*\); (ii) \(\sup _{{{\varvec{\theta }}}\in \Theta }|Q_N({\varvec{\theta }})-Q({\varvec{\theta }})|=o_p(1)\); (iii) \(Q({\varvec{\theta }})\) is continuous. Consistency will then follow from standard arguments using the compactness of \(\Theta\) (see, for example, Amemiya 1985, pp. 106–107).

Let \(O_i=(W_i^{\textsf {T}},L_i,R_i)^{\textsf {T}}\) and

$$\begin{aligned} Q({\varvec{\theta }})= E[\Lambda _{ij}I(Y_i<Y_j)I\{W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}],\ \ i\ne j. \end{aligned}$$

To establish consistency, we first show that \(Q({\varvec{\theta }})\) has its maximizer at \({\varvec{\theta }}={\varvec{\theta }}^*\). For all \(i\ne j\), we have that

$$\begin{aligned}&E[\Lambda _{ij}I(Y_i<Y_j)I\{W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}|O_{i},O_{j}]\\&\quad =E[\Lambda _{ij}I(Y_i<Y_j)|O_{i},O_{j}]I\{W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}\\&\quad = \xi _{ij}I(L_i\vee L_j<R_i\wedge R_j)I\{W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}, \end{aligned}$$

where \(\xi _{ij}=\frac{\int _{L_i\vee L_j}^{R_i\wedge R_j}\int _{L_i\vee L_j}^{R_i\wedge R_j} I(y_1<y_2)f\{H(y_1)-G(W_i^\textsf {T}{\varvec{\beta }}^*)\}f\{H(y_2)-G(W_j^\textsf {T}{\varvec{\beta }}^*)\}h(y_1)h(y_2)dy_1dy_2}{\zeta _i\zeta _j}\) and \(\zeta _t=F\{H(R_t)-G(W_t^\textsf {T}{\varvec{\beta }}^*)\}-F\{H(L_t)-G(W_t^\textsf {T}{\varvec{\beta }}^*)\}\), \(t=i,j\). Analogously, we can derive an expression for \(\xi _{ji}\), that is,

$$\begin{aligned} \xi _{ji}=\frac{\int _{L_i\vee L_j}^{R_i\wedge R_j}\int _{L_i\vee L_j}^{R_i\wedge R_j} I(y_1<y_2)f\{H(y_1)-G(W_j^\textsf {T}{\varvec{\beta }}^*)\}f\{H(y_2)-G(W_i^\textsf {T}{\varvec{\beta }}^*)\}h(y_1)h(y_2)dy_1dy_2}{\zeta _i\zeta _j}. \end{aligned}$$

By using Lemma A.1 with condition C0(c) and setting \(a=H(y_1)-G(u_1)\), \(b=H(y_2)-G(u_1)\) and \(t=G(u_2)-G(u_1)\) in (12), it is easy to verify that

$$\begin{aligned}&f\{H(y_1)-G(u_1)\}f\{H(y_2)-G(u_2)\} \\&\quad >f\{H(y_1)-G(u_2)\}f\{H(y_2)-G(u_1)\}, \end{aligned}$$
(14)

where \(u_1<u_2\) and \(y_1<y_2\). When \(W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}^*)<W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}^*)\) and \(y_1<y_2\), by (14), we have

$$\begin{aligned}&D(y_1,y_2|W_i^\textsf {T}{\varvec{\beta }}^*,W_j^\textsf {T}{\varvec{\beta }}^*) \\&\quad =:f\{H(y_1)-G(W_i^\textsf {T}{\varvec{\beta }}^*)\}f\{H(y_2)-G(W_j^\textsf {T}{\varvec{\beta }}^*)\} \\&\qquad -\, f\{H(y_1)-G(W_j^\textsf {T}{\varvec{\beta }}^*)\}f\{H(y_2)-G(W_i^\textsf {T}{\varvec{\beta }}^*)\}>0. \end{aligned}$$
(15)
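
For readers who want the intermediate step, the passage from (12) to (14) can be spelled out as follows; this is a sketch that uses only the substitution stated before (14). With \(a=H(y_1)-G(u_1)\), \(b=H(y_2)-G(u_1)\) and \(t=G(u_2)-G(u_1)\), the strict monotonicity of \(H\) and \(G\) together with \(y_1<y_2\) and \(u_1<u_2\) gives \(a<b\) and \(t>0\), while \(a-t=H(y_1)-G(u_2)\) and \(b-t=H(y_2)-G(u_2)\). Inequality (12) then reads

$$\begin{aligned} \frac{f\{H(y_1)-G(u_1)\}}{f\{H(y_1)-G(u_2)\}}>\frac{f\{H(y_2)-G(u_1)\}}{f\{H(y_2)-G(u_2)\}}, \end{aligned}$$

and multiplying both sides by the positive quantity \(f\{H(y_1)-G(u_2)\}f\{H(y_2)-G(u_2)\}\) yields (14).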

By (15), (10) and (11), if \(W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}^*)<W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}^*)\) and \(L_i\vee L_j<R_i\wedge R_j\), we have

$$\begin{aligned}&\xi _{ij}-\xi _{ji}\\&\quad =\int _{L_i\vee L_j}^{R_i\wedge R_j}\int _{L_i\vee L_j}^{R_i\wedge R_j} \frac{D(y_1,y_2|W_i^\textsf {T}{\varvec{\beta }}^*,W_j^\textsf {T}{\varvec{\beta }}^*)}{\zeta _i\zeta _j}I(y_1<y_2) h(y_1)h(y_2)dy_1dy_2>0. \end{aligned}$$

Similarly, one can show that \(\xi _{ji}>\xi _{ij}\), if \(W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}^*)<W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}^*)\) and \(L_i\vee L_j<R_i\wedge R_j\).

By symmetry,

$$\begin{aligned} Q({\varvec{\theta }})&= \frac{1}{2}E\{I(L_i\vee L_j<R_i\wedge R_j) \\&\quad\times \,[\xi _{ij}I\{W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}+\xi _{ji}I\{W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}) <W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}]\}. \end{aligned}$$
(16)

The value of \(I(L_i\vee L_j<R_i\wedge R_j)[\xi _{ij}I\{W_i^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})<W_j^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})\}+\xi _{ji}I\{W_j^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})<W_i^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})\}]\) may be less than \(I(L_i\vee L_j<R_i\wedge R_j)(\xi _{ij}\vee \xi _{ji})\), depending on \({\varvec{\theta }}\). However, condition C3 ensures that

$$\begin{aligned}&\frac{1}{2}I(L_i\vee L_j<R_i\wedge R_j)[\xi _{ij}I\{W_i^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})<W_j^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})\} \\&\qquad +\,\xi _{ji}I\{W_j^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})<W_i^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})\}] \\&\quad \le \frac{1}{2}I(L_i\vee L_j<R_i\wedge R_j)(\xi _{ij}\vee \xi _{ji}) \\&\quad \equiv \frac{1}{2}I(L_i\vee L_j<R_i\wedge R_j)[\xi _{ij}I\{W_i^\textsf {T}{\varvec{\beta }}({\varvec{\theta }}^*)<W_j^\textsf {T}{\varvec{\beta }}({\varvec{\theta }}^*)\} \\&\qquad +\,\xi _{ji}I\{W_j^\textsf {T}{\varvec{\beta }}({\varvec{\theta }}^*)<W_i^\textsf {T}{\varvec{\beta }}({\varvec{\theta }}^*)\}] \end{aligned}$$
(17)

with probability one. Taking expectation with respect to \(\{O_i,O_j\}\) on both sides of the inequality in (17), we have \(Q({\varvec{\theta }})\le Q({\varvec{\theta }}^*)\), which shows that \({\varvec{\theta }}^*\) is a maximizer of \(Q({\varvec{\theta }})\). Let

$$\begin{aligned} \eta _{ij}(s,t)=\frac{\int _{L_i\vee L_j}^{R_i\wedge R_j}\int _{L_i\vee L_j}^{R_i\wedge R_j} I(y_1<y_2)f\{H(y_1)-G(s)\}f\{H(y_2)-G(t)\}h(y_1)h(y_2)dy_1dy_2}{\zeta _i\zeta _j}. \end{aligned}$$

We can write

$$\begin{aligned}&Q({\varvec{\theta }}^*)\\&\quad =\frac{1}{2}E [\max \{\eta _{12}(W_{1}^{\textsf {T}}{\varvec{\beta }}^*,W_{2}^{\textsf {T}}{\varvec{\beta }}^*),\eta _{12}(W_{2}^{\textsf {T}}{\varvec{\beta }}^*,W_{1}^{\textsf {T}}{\varvec{\beta }}^*)\}I(L_1\vee L_2<R_1\wedge R_2)] \end{aligned}$$

and

$$\begin{aligned}&Q({\varvec{\theta }}) \\&\quad =\frac{1}{2}E\{[\eta _{12}(W_{1}^{\textsf {T}}{\varvec{\beta }}^*,W_{2}^{\textsf {T}}{\varvec{\beta }}^*)I\{W_{1}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<W_{2}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}+\eta _{12}(W_{2}^{\textsf {T}}{\varvec{\beta }}^*,W_{1}^{\textsf {T}}{\varvec{\beta }}^*) \\&\qquad \times \, I\{W_{1}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})>W_{2}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}]I(L_1\vee L_2<R_1\wedge R_2)\}. \end{aligned}$$
(18)

We now apply the method of proof by contradiction to show that \({\varvec{\theta }}^*\) is the unique maximizer of \(Q({\varvec{\theta }})\). Suppose that for some \({\varvec{\theta }}\in \Theta\) and \({\varvec{\theta }}\ne {\varvec{\theta }}^*\),

$$\begin{aligned} Q({\varvec{\theta }})&= \frac{1}{2}E[\max \{\eta _{12}(W_{1}^{\textsf {T}}{\varvec{\beta }}^*,W_{2}^{\textsf {T}}{\varvec{\beta }}^*),\eta _{12}(W_{2}^{\textsf {T}}{\varvec{\beta }}^*,W_{1}^{\textsf {T}}{\varvec{\beta }}^*)\} \\&\quad\times \, I(L_1\vee L_2<R_1\wedge R_2)]. \end{aligned}$$
(19)

Deduce from (18) and (19) that

$$\begin{aligned}&P[\{\eta _{12}(W_{2}^{\textsf {T}}{\varvec{\beta }}^*,W_{1}^{\textsf {T}}{\varvec{\beta }}^*)-\eta _{12}(W_{1}^{\textsf {T}}{\varvec{\beta }}^*,W_{2}^{\textsf {T}}{\varvec{\beta }}^*)\} \\&\quad \times \,\{ W_{1}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})-W_{2}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}>0|L_1\vee L_2<R_1\wedge R_2]=1. \end{aligned}$$
(20)

Let \(S_{\mathcal{X}}\) denote the support of \(X=(X_1,\ldots ,X_p)^{\textsf {T}}\) and \(\text{ CH }_{\mathcal{X}}\) denote the convex hull of \(S_{\mathcal{X}}\). That is, \(\text{ CH }_{\mathcal{X}}\) is the smallest convex set containing \(S_{\mathcal{X}}\). Condition C2 implies that \(\text{ CH }_{\mathcal{X}}\) is a p-dimensional subset of \(\mathbb {R}^p\) and so has a nonempty interior. Select a point \(\mu\) from this interior and define \(I_\mu =\{(t,\mu ^{\textsf {T}})^{\textsf {T}}:t\in \mathbb {R}\}\). Notice that the definition of \(I_\mu\) and condition C4 together imply that \(\{\iota ^{\textsf {T}}{\varvec{\beta }}^*:\iota \in I_\mu \}\equiv \mathbb {R}\). Relations (10), (11) and (14) guarantee the existence of a point \(\iota _0\) in \(I_\mu\) such that \(s_0=\iota _0^{\textsf {T}}{\varvec{\beta }}^*\) lies in the support of \(W^{\textsf {T}}{\varvec{\beta }}^*\) and, for \(w\in \mathcal {W}\),

$$\begin{aligned}&\eta _{12}(s_{0},s)<\eta _{12}(s,s_{0})\ \ \text{ if }\ \ \ s=w^{\textsf {T}}{\varvec{\beta }}^*<s_0 \ \text{ and }\ \ L_1\vee L_2<R_1\wedge R_2;\\&\quad \eta _{12}(s_{0},s)>\eta _{12}(s,s_{0})\ \ \text{ if }\ \ \ s=w^{\textsf {T}}{\varvec{\beta }}^*>s_0 \ \text{ and }\ \ L_1\vee L_2<R_1\wedge R_2. \end{aligned}$$

Define the \((p+1)\)-dimensional open wedges

$$\begin{aligned} G_1({\varvec{\theta }})&= \{w^{\textsf {T}}{\varvec{\beta }}^*<\iota _0^{\textsf {T}}{\varvec{\beta }}^*\}\{w^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})>\iota _0^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}\ \ \text{ and }\\ G_2({\varvec{\theta }})&= \{w^{\textsf {T}}{\varvec{\beta }}^*>\iota _0^{\textsf {T}}{\varvec{\beta }}^*\}\{w^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<\iota _0^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}. \end{aligned}$$

If \(W_1\in G_1({\varvec{\theta }})\), \(W_2\in G_2({\varvec{\theta }})\) and \(L_1\vee L_2<R_1\wedge R_2\), then

$$\begin{aligned}&\eta _{12}(W_{2}^{\textsf {T}}{\varvec{\beta }}^*,W_{1}^{\textsf {T}}{\varvec{\beta }}^*)<\eta _{12}(W_{1}^{\textsf {T}}{\varvec{\beta }}^*,W_{2}^{\textsf {T}}{\varvec{\beta }}^*)\\&\quad \text{ while }\ \ W_{1}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})>W_{2}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}) \ \text{ and }\ \ L_1\vee L_2<R_1\wedge R_2, \end{aligned}$$

which contradicts (20) if

$$\begin{aligned} P\{W_1\in G_1({\varvec{\theta }}), W_2\in G_2({\varvec{\theta }}), L_1\vee L_2<R_1\wedge R_2\}>0. \end{aligned}$$
(21)

We now show that (21) actually holds for \({\varvec{\theta }}\ne {\varvec{\theta }}^*\). Since

$$\begin{aligned}&c_1P\{W_1\in G_1({\varvec{\theta }})\}P\{ W_2\in G_2({\varvec{\theta }})\}\\&\quad \le E[I\{W_1\in G_1({\varvec{\theta }}), W_2\in G_2({\varvec{\theta }})\}P\{ L_1\vee L_2<R_1\wedge R_2|W_1,W_2\}]\\&\quad =E[I\{W_1\in G_1({\varvec{\theta }}), W_2\in G_2({\varvec{\theta }})\}E\{I(L_1\vee L_2<R_1\wedge R_2)|W_1,W_2\}]\\&\quad =P\{W_1\in G_1({\varvec{\theta }}), W_2\in G_2({\varvec{\theta }}), L_1\vee L_2<R_1\wedge R_2\}\\&\quad \le P\{W_1\in G_1({\varvec{\theta }})\}P\{ W_2\in G_2({\varvec{\theta }})\}, \end{aligned}$$

we only need to show that

$$\begin{aligned} P\{W_1\in G_1({\varvec{\theta }})\}P\{ W_2\in G_2({\varvec{\theta }})\}>0 \end{aligned}$$
(22)

holds for \({\varvec{\theta }}\ne {\varvec{\theta }}^*\).

For each \({\varvec{\theta }}\) in \(\Theta\), define

$$\begin{aligned} M_{{\varvec{\theta }}} = \{w^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})=\iota _0^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}\ \quad \text{ and }\quad L_{{\varvec{\theta }}} = M_{{\varvec{\theta }}}\cap M_{{\varvec{\theta }}^*}. \end{aligned}$$

Note that \(G_1({\varvec{\theta }})\) and \(G_2({\varvec{\theta }})\) are delimited by the p-dimensional hyperplanes \(M_{{\varvec{\theta }}}\) and \(M_{{\varvec{\theta }}^*}\), and for \({\varvec{\theta }}\ne {\varvec{\theta }}^*\), \(L_{{\varvec{\theta }}}\) is a \((p-1)\)-dimensional hyperplane in \(\mathbb {R}^{p+1}\). Consider the projections

$$\begin{aligned} P_0({\varvec{\theta }}) = \{x\in \text{ CH }_{\mathcal{X}}: (t,x^{\textsf {T}})^{\textsf {T}}\in L_{{\varvec{\theta }}}\ \text{ for } \text{ some }\ t\in \mathbb {R}\} \end{aligned}$$

and for \(j=1,2\),

$$\begin{aligned} P_j({\varvec{\theta }}) = \{x\in \text{ CH }_{\mathcal{X}}: (t,x^{\textsf {T}})^{\textsf {T}}\in G_j({\varvec{\theta }})\ \text{ for } \text{ some }\ t\in \mathbb {R}\}. \end{aligned}$$

That is, \(P_0({\varvec{\theta }})\) projects \(L_{{\varvec{\theta }}}\) into \(\text{ CH }_{\mathcal{X}}\) and \(P_j({\varvec{\theta }})\) projects \(G_j({\varvec{\theta }})\) into \(\text{ CH }_{\mathcal{X}}\). Also note that \(\{P_j({\varvec{\theta }}): j=0,1,2\}\) partitions \(\text{ CH }_{\mathcal{X}}\).

Since both \(M_{{\varvec{\theta }}}\) and \(M_{{\varvec{\theta }}^*}\) contain \(\iota _0\), \(L_{{\varvec{\theta }}}\) must contain \(\iota _0\). Since \(\iota _0\) is an element of \(I_\mu\), \(P_0({\varvec{\theta }})\) must contain \(\mu\). Since \(\mu\) is an interior point of \(\text{ CH }_{\mathcal{X}}\), \(P_0({\varvec{\theta }})\) cannot contain an entire \((p-1)\)-dimensional face of \(\text{ CH }_{\mathcal{X}}\). But then each \(P_j({\varvec{\theta }})\) must contain at least one point of \(S_{\mathcal{X}}\), implying

$$\begin{aligned} \int _{P_j({\varvec{\theta }})\cap S_{\mathcal{X}}} dF_X(x)>0,\quad j=1,2, \end{aligned}$$
(23)

where \(F_X(\cdot )\) denotes the distribution of X.

For each x in \(S_{\mathcal{X}}\), write \(l_x\) for the line through x parallel to the first coordinate axis. If \({\varvec{\theta }}\ne {\varvec{\theta }}^*\), then there must be a nonzero angle between \(M_{{\varvec{\theta }}}\) and \(M_{{\varvec{\theta }}^*}\), and so at least one of \(M_{{\varvec{\theta }}}\) and \(M_{{\varvec{\theta }}^*}\) must intersect \(l_x\). Write \(t(x|{\varvec{\theta }})\) for the first component of \(M_{{\varvec{\theta }}}\cap l_x\). If \(M_{{\varvec{\theta }}}\cap l_x\) is empty, define \(t(x|{\varvec{\theta }})=\infty\). Then

$$\begin{aligned} P\{W\in G_j({\varvec{\theta }})\}=\int _{P_j({\varvec{\theta }})\cap S_{\mathcal{X}}}\left[ \int _{\min \{t(x|{\varvec{\theta }}),t(x|{\varvec{\theta }}^*)\}}^{\max \{t(x|{\varvec{\theta }}),t(x|{\varvec{\theta }}^*)\}}f(t|x)dt\right] dF_X(x),\quad j=1,2, \end{aligned}$$

where \(f(\cdot | x)\) denotes the conditional density of \(X^*\) given \(X=x\). Since \({\varvec{\theta }}\ne {\varvec{\theta }}^*\), \(t(x|{\varvec{\theta }})\ne t(x|{\varvec{\theta }}^*)\) for each x in \(S_{\mathcal{X}}\). This, C3, and (23) imply that \(P\{W\in G_j({\varvec{\theta }})\}>0\), \(j=1,2\). Thus, (21) actually holds. This establishes (i).

For each \({\varvec{\theta }}\in \Theta\) and each \((z_1,z_2)\) in \(\mathcal {Z}\times \mathcal {Z}\) define

$$\begin{aligned} f(z_1,z_2,{\varvec{\theta }})=\Psi (z_1,z_2)I\{w_{1}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<w_{2}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}-{Q}({\varvec{\theta }}). \end{aligned}$$

Then

$$\begin{aligned} Q_{N}({\varvec{\theta }})-{Q}({\varvec{\theta }})=\mathbb {U}_Nf(\cdot ,\cdot ,{\varvec{\theta }}), \end{aligned}$$

where \(\mathbb {U}_N\) denotes the random measure putting mass \(1/(N^2-N)\) on each pair \((Z_i,Z_j)\), \(i\ne j\). That is, \(\{\mathbb {U}_Nf(\cdot ,\cdot ,{\varvec{\theta }}): {\varvec{\theta }}\in \Theta \}\) is a zero-mean U-process of order 2. A trivial modification of the argument given in Sherman (1993, Section 5) shows that \(\{f(\cdot ,\cdot ,{\varvec{\theta }}):{\varvec{\theta }}\in \Theta \}\) is Euclidean for the envelope \(|\Psi (z_1,z_2)|+E\{|\Psi (Z_1,Z_2)|\}\). It then follows from Corollary 7 of Sherman (1994, Section 6) that

$$\begin{aligned} \sup _{{{\varvec{\theta }}}\in \Theta }|\mathbb {U}_Nf(\cdot ,\cdot ,{\varvec{\theta }})|=O_p(N^{-1/2}). \end{aligned}$$

This establishes (ii).
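The order-2 U-process structure of \(Q_N({\varvec{\theta }})\) can be made concrete with a short numerical sketch. The code below averages \(\Psi (Z_i,Z_j)I\{W_i^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<W_j^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}\) over all ordered pairs \(i\ne j\), taking \(\Psi (Z_i,Z_j)=\Lambda _{ij}I(Y_i<Y_j)\) as suggested by the form of \(a_{ij}({\varvec{\theta }})\) in the proof of Theorem 2.2. Several pieces are assumptions made only for illustration: the function name `weighted_rank_objective`, the normalization \({\varvec{\beta }}({\varvec{\theta }})=(1,{\varvec{\theta }}^{\textsf {T}})^{\textsf {T}}\), and the argument `pair_weights`, a hypothetical \(N\times N\) array standing in for the weights \(\Lambda _{ij}\), whose construction from the truncation variables is not reproduced here.

```python
import numpy as np

def weighted_rank_objective(theta, y, w, pair_weights):
    """Sketch of Q_N(theta): the average over ordered pairs i != j of
    pair_weights[i, j] * 1{y_i < y_j} * 1{w_i' beta < w_j' beta}.

    Assumption: beta(theta) = (1, theta), i.e. the first coefficient is
    normalized to one. pair_weights is a hypothetical stand-in for Lambda_ij.
    """
    beta = np.concatenate(([1.0], np.atleast_1d(np.asarray(theta, dtype=float))))
    index = w @ beta                                  # linear index w_i' beta(theta)
    n = y.shape[0]
    psi = pair_weights * (y[:, None] < y[None, :])    # Psi_ij; zero on the diagonal
    concordant = index[:, None] < index[None, :]      # 1{index_i < index_j}
    return float((psi * concordant).sum() / (n * n - n))

# Tiny deterministic case: all three pairs with y_i < y_j are concordant.
y_small = np.array([1.0, 2.0, 3.0])
w_small = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
q_small = weighted_rank_objective([0.0], y_small, w_small, np.ones((3, 3)))

# Simulated monotone-transformation data with unit weights (no truncation).
rng = np.random.default_rng(0)
n = 200
w = rng.normal(size=(n, 2))
y = np.exp(w @ np.array([1.0, 0.5]) + 0.1 * rng.normal(size=n))
unit_weights = np.ones((n, n))
q_true = weighted_rank_objective([0.5], y, w, unit_weights)
q_wrong = weighted_rank_objective([-2.0], y, w, unit_weights)
print(q_small, q_true, q_wrong)
```

With unit weights the objective reduces to an unweighted rank-correlation criterion, so in this toy setting it is larger at the data-generating value 0.5 than at a value far from it. The objective is a step function of \({\varvec{\theta }}\), so any maximization in practice would need a derivative-free search.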

Finally, fix \({\varvec{\theta }}\in \Theta\), and let \({\varvec{\theta }}_m\) denote a sequence of elements of \(\Theta\) converging to \({\varvec{\theta }}\) as m tends to infinity. Condition C3 implies that

$$\begin{aligned} P\{W_{1}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})=W_{2}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}=0. \end{aligned}$$

This in turn implies that

$$\begin{aligned}&\Psi (z_1,z_2)I\{w_{1}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}_m)<w_{2}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}_m)\}-\Psi (z_1,z_2)I\{w_{1}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<w_{2}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\} \\&\quad \rightarrow 0 \end{aligned}$$
(24)

as \(m\rightarrow \infty\), for almost all \((z_1,z_2)\in \mathcal {Z}\times \mathcal {Z}\). Take the expectation of

$$\begin{aligned} \Psi (Z_1,Z_2)I\{W_{1}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}_m)<W_{2}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}_m)\}-\Psi (Z_1,Z_2)I\{W_{1}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<W_{2}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}, \end{aligned}$$

then apply the dominated convergence theorem with \(2|\Psi (Z_1,Z_2)|\) as the dominating function to establish (iii). Hence, consistency is proved. \(\square\)

Proof of Theorem 2.2

To establish asymptotic normality, let \(\epsilon _N({\varvec{\theta }})=Q_{N}({\varvec{\theta }})-{Q}({\varvec{\theta }})\). A standard decomposition of U-statistics gives

$$\begin{aligned} \epsilon _N({\varvec{\theta }})-\epsilon _N({\varvec{\theta }}^*)= \frac{1}{N}\sum _{i=1}^Nb_i({\varvec{\theta }})+\frac{1}{N^2-N}\sum _{i<j}d_{ij}({\varvec{\theta }}), \end{aligned}$$

where

$$\begin{aligned} b_i({\varvec{\theta }})&= E[a_{ij}({\varvec{\theta }})+a_{ji}({\varvec{\theta }})-2E\{a_{ij}({\varvec{\theta }})\}|Z_i],\\ d_{ij}({\varvec{\theta }})&= a_{ij}({\varvec{\theta }})+a_{ji}({\varvec{\theta }})-2E\{a_{ij}({\varvec{\theta }})\}-b_i({\varvec{\theta }})-b_j({\varvec{\theta }}),\\ a_{ij}({\varvec{\theta }})&= \Lambda _{ij}I(Y_i<Y_j) \\&\quad\times [I\{W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})<W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }})\}-I(W_{i}^{\textsf {T}}{\varvec{\beta }}^*<W_{j}^{\textsf {T}}{\varvec{\beta }}^*)]. \end{aligned}$$

Note that \(E\{b_i({\varvec{\theta }})\} = 0\) for \({\varvec{\theta }}\in \Theta\); and \(b_i({\varvec{\theta }}^*)=0\). Under Condition C5, a Taylor expansion gives

$$\begin{aligned} \frac{1}{N}\sum _{i=1}^Nb_i({\varvec{\theta }})= ({\varvec{\theta }}-{\varvec{\theta }}^*)^{\textsf {T}}\frac{1}{N}\sum _{i=1}^N\dot{b}_{i}({\varvec{\theta }}^*)+o_p(\Vert {\varvec{\theta }}-{\varvec{\theta }}^*\Vert ^2), \end{aligned}$$

where \(\dot{b}_{i}({\varvec{\theta }})=\partial b_{i}({\varvec{\theta }})/\partial {\varvec{\theta }}\). Similar to the proof of uniform convergence in Theorem 2.1, we need to show that, for any sequence \(\kappa _N\) of order o(1),

$$\begin{aligned} \sup _{\Vert {{\varvec{\theta }}}-{{\varvec{\theta }}}^*\Vert \le \kappa _N}~\left| \frac{1}{N^2-N}\sum _{i<j}d_{ij}({\varvec{\theta }})\right| =o_p(N^{-1}). \end{aligned}$$
(25)

The identical subgraph set and Vapnik–Chervonenkis class set arguments of Sherman (1993, Section 5), together with Corollary 17 and Corollary 21 in Nolan and Pollard (1987), show that the class of functions \(d_{ij}({\varvec{\theta }})\) is Euclidean. The Euclidean property, together with Corollary 8 of Sherman (1994), guarantees that (25) holds.
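The decomposition of \(\epsilon _N({\varvec{\theta }})-\epsilon _N({\varvec{\theta }}^*)\) into a projection term in \(b_i\) and a degenerate term in \(d_{ij}\) is an algebraic identity, which can be checked on a toy kernel. The sketch below does not use the paper's \(a_{ij}({\varvec{\theta }})\); it assumes a stand-in kernel \(a(z_1,z_2)=z_1I(z_1<z_2)\) with \(Z\sim U(0,1)\), chosen only because its conditional expectations are available in closed form: \(E\{a(z,Z)\}=z(1-z)\), \(E\{a(Z,z)\}=z^2/2\) and \(E\{a(Z_1,Z_2)\}=1/6\).

```python
import numpy as np

def kernel(z1, z2):
    # Toy asymmetric kernel a(z1, z2) = z1 * 1{z1 < z2} (an assumption, not the paper's a_ij).
    return (z1 < z2) * z1

def projection(z):
    # b(z) = E[a(z, Z) + a(Z, z)] - 2 E[a(Z1, Z2)] for Z ~ U(0, 1).
    return z * (1.0 - z) + 0.5 * z**2 - 1.0 / 3.0

rng = np.random.default_rng(1)
n = 50
z = rng.uniform(size=n)
mean_a = 1.0 / 6.0

a = kernel(z[:, None], z[None, :])        # a[i, j] = a(z_i, z_j); diagonal is zero
b = projection(z)                         # b_i
d = a + a.T - 2.0 * mean_a - b[:, None] - b[None, :]   # d_ij

# Left side: centered U-statistic over ordered pairs i != j.
lhs = a.sum() / (n * n - n) - mean_a
# Right side: (1/N) sum_i b_i + (N^2 - N)^{-1} sum_{i<j} d_ij.
linear_term = b.mean()
upper = np.triu_indices(n, k=1)
degenerate_term = d[upper].sum() / (n * n - n)
print(lhs, linear_term + degenerate_term)
```

The two sides agree up to floating-point rounding for any sample, since the identity does not depend on the distribution of the data. The probabilistic content of the proof lies in the sizes of the two terms: the projection term drives the limiting distribution, while (25) shows that the degenerate term is negligible uniformly near \({\varvec{\theta }}^*\).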

Notice that \(Q({\varvec{\theta }})=2^{-1}E\{\varrho (Z,{\varvec{\theta }})\}\). For \({\varvec{\theta }}\) in a neighbourhood of \({\varvec{\theta }}^*\), by condition C5, we have

$$\begin{aligned} {Q}({\varvec{\theta }})&= {Q}({\varvec{\theta }}^*)+({\varvec{\theta }}-{\varvec{\theta }}^*)^{\textsf {T}}u({\varvec{\theta }}^*)-\frac{1}{2}({\varvec{\theta }}-{\varvec{\theta }}^*)^{\textsf {T}}A({\varvec{\theta }}^*)({\varvec{\theta }}-{\varvec{\theta }}^*)\\&\quad+\,o_p(\Vert {\varvec{\theta }}-{\varvec{\theta }}^*\Vert ^2)\\&= {Q}({\varvec{\theta }}^*)-\frac{1}{2}({\varvec{\theta }}-{\varvec{\theta }}^*)^{\textsf {T}}A({\varvec{\theta }}^*)({\varvec{\theta }}-{\varvec{\theta }}^*)+o_p(\Vert {\varvec{\theta }}-{\varvec{\theta }}^*\Vert ^2), \end{aligned}$$

where \(u({\varvec{\theta }})=\partial {Q}({\varvec{\theta }})/\partial {\varvec{\theta }}\), \(A({\varvec{\theta }})=-\partial ^2 {Q}({\varvec{\theta }})/\partial {\varvec{\theta }}\partial {\varvec{\theta }}^{\textsf {T}}\) and \(u({\varvec{\theta }}^*)=0\). Under Condition C5, the matrix \(A({\varvec{\theta }})\) is invertible for \(\Vert {\varvec{\theta }}-{\varvec{\theta }}^*\Vert \le \kappa _N\). It then follows that

$$\begin{aligned}&Q_{N}({\varvec{\theta }})\\&\quad ={Q}({\varvec{\theta }})+\epsilon _N({\varvec{\theta }})\\&\quad =-\frac{1}{2}({\varvec{\theta }}-{\varvec{\theta }}^*)^{\textsf {T}}A({\varvec{\theta }}^*)({\varvec{\theta }}-{\varvec{\theta }}^*)+({\varvec{\theta }}-{\varvec{\theta }}^*)^{\textsf {T}}\frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)\\&\qquad +\,{Q}({\varvec{\theta }}^*)+\epsilon _N({\varvec{\theta }}^*)+o_p(\Vert {\varvec{\theta }}-{\varvec{\theta }}^*\Vert ^2)+o_p(N^{-1})\\&\quad =\ell _N({\varvec{\theta }})+{Q}({\varvec{\theta }}^*)+\epsilon _N({\varvec{\theta }}^*)+o_p(\Vert {\varvec{\theta }}-{\varvec{\theta }}^*\Vert ^2)+o_p(N^{-1}), \end{aligned}$$

where

$$\begin{aligned}&\ell _N({\varvec{\theta }})\\&\quad =-\frac{1}{2}({\varvec{\theta }}-{\varvec{\theta }}^*)^{\textsf {T}}A({\varvec{\theta }}^*)({\varvec{\theta }}-{\varvec{\theta }}^*)+({\varvec{\theta }}-{\varvec{\theta }}^*)^{\textsf {T}}\left[ \frac{1}{N}\sum _{i=1}^N \dot{b}_i({\varvec{\theta }}^*)\right] \\&\quad =-\frac{1}{2}\left( A^{1/2}({\varvec{\theta }}^*)\left[ {\varvec{\theta }}-{\varvec{\theta }}^*-A^{-1}({\varvec{\theta }}^*)\frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)\right] \right) ^{\textsf {T}}\\&\qquad \times \left( A^{1/2}({\varvec{\theta }}^*)\left[ {\varvec{\theta }}-{\varvec{\theta }}^*-A^{-1}({\varvec{\theta }}^*)\frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)\right] \right) \\&\qquad +\frac{1}{2}\left[ \frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)\right] ^{\textsf {T}}A^{-1}({\varvec{\theta }}^*)\left[ \frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)\right] . \end{aligned}$$

Hence, the maximizer of \(\ell _N({\varvec{\theta }})\) is \(\hat{{\varvec{\gamma }}}={\varvec{\theta }}^*+A^{-1}({\varvec{\theta }}^*)\frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)\). Since \(\hat{{\varvec{\gamma }}}\) maximizes \(\ell _N({\varvec{\theta }})\) and \(\hat{{\varvec{\theta }}}\) is the maximizer of \(Q_{N}({\varvec{\theta }})\),

$$\begin{aligned} 0&\le \ell _N(\hat{{\varvec{\gamma }}})-\ell _N(\hat{{\varvec{\theta }}}) \\&= \{\ell _N(\hat{{\varvec{\gamma }}})+{Q}({\varvec{\theta }}^*)+\epsilon _N({\varvec{\theta }}^*)-Q_{N}(\hat{{\varvec{\gamma }}})\} \\&\quad-\,\{\ell _N(\hat{{\varvec{\theta }}})+{Q}({\varvec{\theta }}^*)+\epsilon _N({\varvec{\theta }}^*)-Q_{N}(\hat{{\varvec{\theta }}})\} \\&\quad-\,\{Q_N(\hat{{\varvec{\theta }}})-Q_N(\hat{{\varvec{\gamma }}})\} \\ &\le \{\ell _N(\hat{{\varvec{\gamma }}})+{Q}({\varvec{\theta }}^*)+\epsilon _N({\varvec{\theta }}^*)-Q_{N}(\hat{{\varvec{\gamma }}})\} \\&\quad-\{\ell _N(\hat{{\varvec{\theta }}})+{Q}({\varvec{\theta }}^*)+\epsilon _N({\varvec{\theta }}^*)-Q_{N}(\hat{{\varvec{\theta }}})\} \\&= o_p(\Vert \hat{{\varvec{\theta }}}-{\varvec{\theta }}^*\Vert ^2)+o_p(\Vert \hat{{\varvec{\gamma }}}-{\varvec{\theta }}^*\Vert ^2)+o_p(N^{-1}). \end{aligned}$$
(26)

On the other hand, in view of the expression for \(\ell _N\),

$$\begin{aligned}&\ell _N(\hat{{\varvec{\gamma }}})-\ell _N(\hat{{\varvec{\theta }}}) \\&\quad =\frac{1}{2}\left( A^{1/2}({\varvec{\theta }}^*)\left[ \hat{{\varvec{\theta }}}-{\varvec{\theta }}^*-A^{-1}({\varvec{\theta }}^*)\frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)\right] \right) ^{\textsf {T}} \\&\qquad \times \left( A^{1/2}({\varvec{\theta }}^*)\left[ \hat{{\varvec{\theta }}}-{\varvec{\theta }}^*-A^{-1}({\varvec{\theta }}^*)\frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)\right] \right) . \\ \end{aligned}$$
(27)

Combining (26) and (27), we obtain

$$\begin{aligned} \hat{{\varvec{\theta }}}= {\varvec{\theta }}^*+A^{-1}({\varvec{\theta }}^*)\frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)+o_p(\Vert \hat{{\varvec{\theta }}}-{\varvec{\theta }}^*\Vert )+o_p(\Vert \hat{{\varvec{\gamma }}}-{\varvec{\theta }}^*\Vert )+o_p(N^{-1/2}). \end{aligned}$$

Obviously, \(\hat{{\varvec{\gamma }}}-{\varvec{\theta }}^*=A^{-1}({\varvec{\theta }}^*)\frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)=O_p(N^{-1/2}).\) It follows that \(\hat{{\varvec{\theta }}}-{\varvec{\theta }}^*=O_p(N^{-1/2})\) and

$$\begin{aligned} \hat{{\varvec{\theta }}}={\varvec{\theta }}^*+A^{-1}({\varvec{\theta }}^*)\frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)+o_p(N^{-1/2}). \end{aligned}$$

Applying the central limit theorem to \(N^{-1/2}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)\) completes the proof of Theorem 2.2. \(\square\)
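The completing-the-square step behind (26) and (27) can be verified numerically. The sketch below is an illustration only: it treats \(A({\varvec{\theta }}^*)\) and the vectors \(\dot{b}_i({\varvec{\theta }}^*)\) as known simulated inputs, which they are not in practice, and all function names are hypothetical. It checks that \(\hat{{\varvec{\gamma }}}={\varvec{\theta }}^*+A^{-1}({\varvec{\theta }}^*)N^{-1}\sum _i\dot{b}_i({\varvec{\theta }}^*)\) maximizes \(\ell _N\) and that the gap \(\ell _N(\hat{{\varvec{\gamma }}})-\ell _N({\varvec{\theta }})\) equals the quadratic form in (27). It also computes the sandwich form \(A^{-1}VA^{-1}\), with V the covariance of \(\dot{b}_i({\varvec{\theta }}^*)\), which is what the final expansion would imply for the limiting covariance of \(N^{1/2}(\hat{{\varvec{\theta }}}-{\varvec{\theta }}^*)\) if the \(\dot{b}_i({\varvec{\theta }}^*)\) are i.i.d. with mean zero and finite variance.

```python
import numpy as np

def quadratic_approx(theta, theta_star, A, b_bar):
    # l_N(theta) = -0.5 (theta - theta*)' A (theta - theta*) + (theta - theta*)' b_bar
    diff = theta - theta_star
    return -0.5 * diff @ A @ diff + diff @ b_bar

def one_step_maximizer(theta_star, A, b_bar):
    # gamma_hat = theta* + A^{-1} b_bar, the maximizer of the quadratic above
    return theta_star + np.linalg.solve(A, b_bar)

def sandwich_covariance(A, b_dot):
    # A^{-1} V A^{-1}, with V the sample covariance of the rows of b_dot
    V = np.atleast_2d(np.cov(b_dot, rowvar=False))
    A_inv = np.linalg.inv(A)
    return A_inv @ V @ A_inv.T

rng = np.random.default_rng(0)
p, n = 3, 500
M = rng.normal(size=(p, p))
A = M @ M.T + 3.0 * np.eye(p)            # simulated positive definite A(theta*)
theta_star = np.array([0.5, -1.0, 2.0])
b_dot = rng.normal(size=(n, p))          # simulated stand-ins for b_dot_i(theta*)
b_bar = b_dot.mean(axis=0)

gamma_hat = one_step_maximizer(theta_star, A, b_bar)
theta_any = theta_star + rng.normal(size=p)   # an arbitrary competitor

gap = (quadratic_approx(gamma_hat, theta_star, A, b_bar)
       - quadratic_approx(theta_any, theta_star, A, b_bar))
quad_form = 0.5 * (theta_any - gamma_hat) @ A @ (theta_any - gamma_hat)
cov = sandwich_covariance(A, b_dot)
print(gap, quad_form)
```

Because the gap is a positive definite quadratic form in \({\varvec{\theta }}-\hat{{\varvec{\gamma }}}\), bounding it by \(o_p(N^{-1})\) forces \(\hat{{\varvec{\theta }}}\) to lie within \(o_p(N^{-1/2})\) of \(\hat{{\varvec{\gamma }}}\), which is the content of the final display. The plug-in sandwich here is only a device for reading the expansion; the paper estimates the limiting covariance by resampling rather than by estimating \(A({\varvec{\theta }}^*)\) directly.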


Cite this article

Liu, T., Yuan, X. & Sun, J. Weighted rank estimation for nonparametric transformation models with doubly truncated data. J. Korean Stat. Soc. 50, 1–24 (2021). https://doi.org/10.1007/s42952-020-00057-6
