Abstract
Doubly truncated data often arise when event times are observed only if they fall within subject-specific intervals. We analyze doubly truncated data using nonparametric transformation models, in which an unknown monotonically increasing transformation of the response variable equals an unknown monotonically increasing function of a linear combination of the covariates plus a random error with an unspecified log-concave probability density function. Furthermore, we assume that the truncation variables are conditionally independent of the response variable given the covariates, and we leave the conditional distributions of the truncation variables given the covariates unspecified. For estimating the regression parameters, we propose a weighted rank (WR) estimation procedure and establish the consistency and asymptotic normality of the resulting estimator. The limiting covariance matrix of the WR estimator can be estimated by a resampling technique that involves neither nonparametric density estimation nor numerical derivatives. A numerical study suggests that the proposed methodology works well in practice, and an illustration based on real data is provided.
References
Abrevaya, J. (1999). Rank estimation of a transformation model with observed truncation. Econometrics Journal, 2, 292–305.
Amemiya, T. (1985). Advanced Econometrics. Cambridge, MA: Harvard University Press.
Austin, D., Simon, D. K., & Betensky, R. A. (2014). Computationally simple estimation and improved efficiency for special cases of double truncation. Lifetime Data Analysis, 20(3), 335–354.
Bhattacharya, P. K., Chernoff, H., & Yang, S. S. (1983). Nonparametric estimation of the slope of a truncated regression. The Annals of Statistics, 11, 505–514.
Borzadaran, G. R. M., & Borzadaran, H. A. M. (2011). Log-concavity property for some well-known distributions. Surveys in Mathematics and its Applications, 6, 203–219.
Cavanagh, C., & Sherman, R. P. (1998). Rank estimators for monotonic index models. Journal of Econometrics, 84(2), 351–381.
Cheng, S. C., Wei, L. J., & Ying, Z. (1995). Analysis of transformation models with censored data. Biometrika, 82(4), 835–845.
Efron, B., & Petrosian, V. (1999). Nonparametric methods for doubly truncated data. Journal of the American Statistical Association, 94, 824–834.
Emura, T., Hu, Y. H., & Konno, Y. (2017). Asymptotic inference for maximum likelihood estimators under the special exponential family with double-truncation. Statistical Papers, 58(3), 877–909.
Emura, T., & Konno, Y. (2012). Multivariate normal distribution approaches for dependently truncated data. Statistical Papers, 53, 133–149.
Emura, T., Konno, Y., & Michimae, H. (2015). Statistical inference based on the nonparametric maximum likelihood estimator under double-truncation. Lifetime Data Analysis, 21(3), 397–418.
Emura, T., & Wang, W. (2010). Testing quasi-independence for truncation data. Journal of Multivariate Analysis, 101, 223–239.
Emura, T., & Wang, W. (2016). Semiparametric inference for an accelerated failure time model with dependent truncation. Annals of the Institute of Statistical Mathematics, 68(5), 1073–1094.
Frank, G., & Dörre, A. (2017). Linear regression with randomly double-truncated data. South African Statistical Journal, 51(1), 1–18.
Han, A. K. (1987). Nonparametric analysis of a generalized regression model: The maximum rank correlation estimator. Journal of Econometrics, 35(2–3), 303–316.
Horowitz, J. L. (1996). Semiparametric estimation of a regression model with an unknown transformation of the dependent variable. Econometrica, 64, 103–137.
Hu, Y. H., & Emura, T. (2015). Maximum likelihood estimation for a special exponential family under random double-truncation. Computational Statistics, 30(4), 1199–1229.
Jin, Z., Ying, Z., & Wei, L. J. (2001). A simple resampling method by perturbing the minimand. Biometrika, 88, 381–390.
Kalbfleisch, J. D., & Lawless, J. F. (1989). Inference based on retrospective ascertainment: An analysis of the data on transfusion-related AIDS. Journal of the American Statistical Association, 84, 360–372.
Khan, S., & Tamer, E. (2007). Partial rank estimation of duration models with general forms of censoring. Journal of Econometrics, 136(1), 251–280.
Kim, J. P., Lu, W., Sit, T., & Ying, Z. (2013). A unified approach to semiparametric transformation models under generalized biased sampling schemes. Journal of the American Statistical Association, 108, 217–227.
Liang, H. Y., & Iglesias-Pérez, M. C. (2018). Weighted estimation of conditional mean function with truncated, censored and dependent data. Statistics, 52, 1249–1269.
Liu, H., Ning, J., Qin, J., & Shen, Y. (2016). Semiparametric maximum likelihood inference for truncated or biased-sampling data. Statistica Sinica, 26, 1087–1115.
Mandel, M., de Uña-Álvarez, J., Simon, D. K., & Betensky, R. A. (2018). Inverse probability weighted Cox regression for doubly truncated data. Biometrics, 74, 481–487.
McLaren, C., Wagstaff, M., Brittenham, G., & Jacobs, A. (1991). Detection of two-component mixtures of lognormal distributions in grouped, doubly truncated data: Analysis of red blood cell volume distributions. Biometrics, 47, 607–622.
Medley, G. F., Anderson, R. M., Cox, D. R., & Billard, L. (1987). Incubation period of AIDS in patients infected via blood transfusion. Nature, 328, 719–721.
Moreira, C., & de Uña-Álvarez, J. (2010). Bootstrapping the NPMLE for doubly truncated data. Journal of Nonparametric Statistics, 22, 567–583.
Moreira, C., & de Uña-Álvarez, J. (2012). Kernel density estimation with doubly-truncated data. Electronic Journal of Statistics, 6, 501–521.
Moreira, C., de Uña-Álvarez, J., & Meira-Machado, L. (2016). Nonparametric regression with doubly truncated data. Computational Statistics and Data Analysis, 93, 294–307.
Moreira, C., & Van Keilegom, I. (2013). Bandwidth selection for kernel density estimation with doubly truncated data. Computational Statistics and Data Analysis, 61, 107–123.
Nolan, D., & Pollard, D. (1987). U-processes: Rates of convergence. The Annals of Statistics, 15(2), 780–799.
Rennert, L., & Xie, S. X. (2018). Cox regression model with doubly truncated data. Biometrics, 74, 725–733.
Samworth, R. J. (2018). Recent progress in log-concave density estimation. Statistical Science, 33(4), 493–509.
Shen, P. S. (2010). Nonparametric analysis of doubly truncated data. Annals of the Institute of Statistical Mathematics, 62, 835–853.
Shen, P. S. (2011). Testing quasi-independence for doubly truncated data. Journal of Nonparametric Statistics, 23, 1–9.
Shen, P. S. (2013). Regression analysis of interval censored and doubly truncated data with linear transformation models. Computational Statistics, 28, 581–596.
Shen, P. S. (2016). Analysis of transformation models with doubly truncated data. Statistical Methodology, 30, 15–30.
Shen, P. S., & Liu, Y. (2017). Pseudo maximum likelihood estimation for the Cox model with doubly truncated data. Statistical Papers. https://doi.org/10.1007/s00362-016-0870-8.
Shen, P. S., & Liu, Y. (2019). Pseudo MLE for semiparametric transformation model with doubly truncated data. Journal of the Korean Statistical Society, 48, 384–395.
Shen, P. S., Liu, Y., Maa, D. P., & Ju, Y. (2017). Analysis of transformation models with right-truncated data. Statistics, 51, 404–418.
Sherman, R. (1993). The limiting distribution of the maximum rank correlation estimator. Econometrica, 61(1), 123–137.
Sherman, R. (1994). Maximal inequalities for degenerate U-processes with applications to optimization estimators. The Annals of Statistics, 22(1), 439–459.
Song, X., Ma, S., Huang, J., & Zhou, X. H. (2007). A semiparametric approach for the nonparametric transformation survival model with multiple covariates. Biostatistics, 8(2), 197–211.
Wang, H. (2007). A note on iterative marginal optimization: A simple algorithm for maximum rank correlation estimation. Computational Statistics and Data Analysis, 51, 2803–2812.
Wang, S.-H., & Chiang, C.-T. (2019). Maximum partial-rank correlation estimation for left-truncated and right-censored survival data. Statistica Sinica, 29, 2141–2161.
Ying, Z., Yu, W., Zhao, Z., & Zheng, M. (2019). Regression analysis of doubly truncated data. Journal of the American Statistical Association. https://doi.org/10.1080/01621459.2019.1585252.
Acknowledgements
We are grateful to the anonymous reviewers and the editor for a number of constructive and helpful comments and suggestions that have clearly improved our manuscript.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Appendix
In this appendix, we sketch the proofs of the asymptotic results stated in Theorems 2.1 and 2.2. The conditional density of \(\tilde{Y}\) given \(\tilde{W}=w\) is \(f_{\tilde{Y}|\tilde{W}}(y|w)=f\{H(y)-G(w^\textsf {T}{\varvec{\beta }}^*)\}h(y)\), where \(h(y)=dH(y)/dy\). Then, the conditional density of Y given \((W^\textsf {T},L,R)=(w^\textsf {T},l,r)\) can be written as
Let \(Z=(W^\textsf {T},Y,L,R)^\textsf {T}\) denote an observation from the distribution P on the set \(\mathcal {Z}\subseteq \mathbb {R}^{p+1}\times \mathbb {R}\times \mathbb {R}\times \mathbb {R}\). For each \(z=(w^{\textsf {T}},y,l,r)^\textsf {T}\) in \(\mathcal {Z}\) and each \({\varvec{\theta }}\) in \(\Theta\), define
where \(w=(x^*,x^\textsf {T})^\textsf {T}\) and \(\Psi (Z_1, Z_2)=I(L_2<Y_1< R_2)I(L_1<Y_2< R_1)I(Y_1< Y_2)\). Write \(\nabla _m\) for the mth partial derivative operator of the function \(\varrho (z,{\varvec{\theta }})\) with respect to \({\varvec{\theta }}=(\theta _1,\ldots ,\theta _p)^{\textsf {T}}\in \mathbb {R}^p\), and let
To establish all the large-sample properties in this paper, we require the following conditions:
C0
(a) \(\tilde{Z}_1,\ldots ,\tilde{Z}_{\tilde{N}}\) are independent copies of \(\tilde{Z}\), where \(\tilde{Z}=(\tilde{W}^\textsf {T},\tilde{Y},\tilde{L},\tilde{R})^\textsf {T}\); (b) \(H(\tilde{Y})=G(\tilde{W}^\textsf {T}{\varvec{\beta }}^*)+\tilde{\varepsilon }\), where \(H(\cdot )\) and \(G(\cdot )\) are strictly increasing and continuously differentiable functions, and \(\tilde{W}\) and \(\tilde{\varepsilon }\) are independent; (c) there exists a positive constant \(c_0\) such that \(0<f(a)\le c_0<+\infty\) for \(a\in \mathbb {R}\), and the function \(\eta (a)=\log \{f(a)\}\) has a continuous second derivative \(\eta ''(a)\) with \(\eta ''(a)<0\) for all \(a\in \mathbb {R}\); (d) \((\tilde{L},\tilde{R})\) and \(\tilde{Y}\) are conditionally independent given \(\tilde{W}\); (e) for \(i=1,\ldots ,\tilde{N}\), \(\tilde{Z}_i\) is observed if and only if \(\tilde{L}_i<\tilde{Y}_i<\tilde{R}_i\). Moreover, let \(N=\sum _{i=1}^{\tilde{N}}I(\tilde{L}_i<\tilde{Y}_i<\tilde{R}_i)\) denote the number of observations, and let \(Z_i=(W_i^\textsf {T},Y_i,L_i,R_i)^\textsf {T}\), \(i=1,\ldots ,N\), denote the observed \(\tilde{Z}_i\)’s, where the \(\varepsilon _i\)’s are the corresponding error terms and \(W_i=(X_i^*,X_i^\textsf {T})^\textsf {T}\);
C1
Let \(\mathcal {W}\) denote the support of W. (a) there exists a positive constant \(c_1\) such that, for all \((w_1,w_2)\in \mathcal {W}\times \mathcal {W}\), \(P(L_1\vee L_2< R_1\wedge R_2|W_1=w_1,W_2=w_2)\ge c_1>0\); (b)
$$\begin{aligned} \int _{-\infty }^{+\infty }\int _{-\infty }^{+\infty } I(y_1<y_2)h(y_1)h(y_2)\,dy_1dy_2<+\infty ; \end{aligned}$$(10)
C2
The set \(\mathcal {W}\) is not contained in any proper linear subspace of \(\mathbb {R}^{p+1}\);
C3
The vector of covariates, \(W=(X^*,X^\textsf {T})^\textsf {T}\), is of full rank, and \(X^*\) has an everywhere-positive Lebesgue density conditional on X;
C4
The unknown parameter \({\varvec{\beta }}^*={\varvec{\beta }}({\varvec{\theta }}^*)=(1,{\varvec{\theta }}^{*\textsf {T}})\), where \({\varvec{\theta }}^*\) lies in the interior of the parameter space \(\Theta\), which is a compact subset of \(\mathbb {R}^p\);
C5
(a) Let \(\mathcal {B}\) denote a neighborhood of \({\varvec{\theta }}^*\). For each z, all mixed second partial derivatives of \(\varrho (z,{\varvec{\theta }})\) exist on \(\mathcal {B}\). There is a function \(\rho (z)\) such that \(E\{\rho (Z)\}<+\infty\) and for all \(z\in \mathcal {Z}\) and \({\varvec{\theta }}\) in \(\mathcal {B}\),
$$\begin{aligned} \Vert \nabla _2\varrho (z,{\varvec{\theta }})-\nabla _2\varrho (z,{\varvec{\theta }}^*)\Vert \le \rho (z)\Vert {\varvec{\theta }}-{\varvec{\theta }}^*\Vert ; \end{aligned}$$(b) \(E\{\Vert \nabla _1\varrho (Z,{\varvec{\theta }}^*)\Vert ^2\}<+\infty\); (c) \(E\{|\nabla _2|\varrho (Z,{\varvec{\theta }}^*)\}<+\infty\);
(d) \(E\{\nabla _2\varrho (Z,{\varvec{\theta }}^*)\}\) is negative definite.
Remark A.1
Most of the above conditions are standard for a semiparametric monotonic linear index model (Cavanagh and Sherman 1998); the additional conditions concern the truncation mechanism and the distributional assumption on the random error. C0 defines the structure that generates the observations. Conditions C0–C4 guarantee the identifiability of \({\varvec{\theta }}^*\). Condition C5 collects standard regularity conditions sufficient to support an argument based on a Taylor expansion of \(\varrho (z,{\varvec{\theta }})\) about \({\varvec{\theta }}^*\). Since \(f(a)>0\) for all \(a\in \mathbb {R}\), F(a) is a strictly increasing function of \(a\in \mathbb {R}\). Let \(\mathcal {L}\) and \(\mathcal {R}\) denote the supports of L and R, respectively. For all \((l,r,w)\in \mathcal {L}\times \mathcal {R}\times \mathcal {W}\) with \(l<r\), it follows that \(F\{H(r)-G(w^\textsf {T}{\varvec{\beta }}^*)\}-F\{H(l)-G(w^\textsf {T}{\varvec{\beta }}^*)\}>0\). Since H(a) is a strictly increasing and continuously differentiable function of a, we have \(h(a)>0\), \(a\in \mathbb {R}\). It follows that, for all \(a<b\),
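For intuition, the observation scheme described in condition C0 can be mimicked by a small simulation: draw a latent sample, then retain only subjects whose response falls strictly between their truncation limits, as in C0(e). The concrete choices below (\(H(y)=\log y\), G the identity, standard normal error, uniform truncation limits, \(p=1\)) are illustrative assumptions only; none of them is prescribed by the paper.

```python
import numpy as np

def simulate_doubly_truncated(n_latent, theta, seed=0):
    """Draw latent (W, Y, L, R) with H(Y) = G(W' beta) + eps, beta = (1, theta),
    using the illustrative choices H = log, G = identity, eps ~ N(0, 1)
    (a log-concave error density), then keep only observations with
    L < Y < R, as in condition C0(e)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(n_latent, 2))          # covariates (X*, X)
    beta = np.array([1.0, theta])               # beta(theta) = (1, theta')
    eps = rng.normal(size=n_latent)             # log-concave random error
    y = np.exp(w @ beta + eps)                  # H(y) = log y  =>  y = exp(.)
    # Truncation limits drawn independently of Y given W (condition C0(d))
    l = rng.uniform(0.0, 1.0, size=n_latent)
    r = l + rng.uniform(1.0, 10.0, size=n_latent)
    keep = (l < y) & (y < r)                    # double-truncation thinning
    return w[keep], y[keep], l[keep], r[keep]
```

The observed sample size \(N\) is random, being the number of latent subjects that survive the thinning step.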
Lemma A.1
Let \(\eta (a)=\log \{f(a)\}\), \(a\in \mathbb {R}\). Assume that \(\eta (a)\) is twice continuously differentiable. Then \(\eta ''(a)<0\), \(a\in \mathbb {R}\) implies that
with \(a,\ b,\ a-t,\ b-t\in \mathbb {R}\).
Proof of Lemma A.1
Since \(\eta '(a)>\eta '(b)\) for \(a<b\), there exists \(\delta >0\) such that
Thus, (12) is satisfied for \(0<t<\delta\). Furthermore, for \(u\in \mathbb {R}_+\), there exist constants \(v_j\in \mathbb {R}\), \(j=1,\ldots ,m\) such that
where \(v_{0}=0\), \(v_{m}=-u\), \(0<v_{j-1}-v_j<\delta\), \(j=1,\ldots ,m\). \(\square\)
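Although display (12) is not reproduced here, the log-concavity argument behind Lemma A.1 can be checked numerically. For a strictly log-concave density, \(\eta ''<0\) makes \(\eta (a)-\eta (a-t)\) strictly decreasing in a for \(t>0\), so \(\eta (a)+\eta (b-t)>\eta (a-t)+\eta (b)\) whenever \(a<b\) and \(t>0\). The sketch below verifies this gap for the standard normal density; reading (12) as exactly this inequality is our assumption, not a quotation of the paper.

```python
import math

def log_normal_density(a):
    """eta(a) = log phi(a) for the standard normal, a strictly
    log-concave density (eta''(a) = -1 < 0)."""
    return -0.5 * a * a - 0.5 * math.log(2.0 * math.pi)

def concavity_gap(a, b, t):
    """eta(a) + eta(b - t) - eta(a - t) - eta(b); strictly positive
    whenever a < b and t > 0 for any density with eta'' < 0."""
    eta = log_normal_density
    return eta(a) + eta(b - t) - eta(a - t) - eta(b)
```

For the standard normal the quadratic terms cancel and the gap equals \(t(b-a)\) exactly, which makes the positivity transparent.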
Proof of Theorem 2.1
The proof is very similar to that of Theorem 1 of Cavanagh and Sherman (1998). Write \(Q({\varvec{\theta }})\) for \(E\{Q_N({\varvec{\theta }})\}\). We will show that (i) \(Q({\varvec{\theta }})\) is uniquely maximized at \({\varvec{\theta }}^*\); (ii) \(\sup _{{{\varvec{\theta }}}\in \Theta }|Q_N({\varvec{\theta }})-Q({\varvec{\theta }})|=o_p(1)\); (iii) \(Q({\varvec{\theta }})\) is continuous. Consistency then follows from standard arguments using the compactness of \(\Theta\) (see, for example, Amemiya 1985, pp. 106–107).
Let \(O_i=(W_i^{\textsf {T}},L_i,R_i)^{\textsf {T}}\) and
To establish consistency, we first show that \(Q({\varvec{\theta }})\) has its maximizer at \({\varvec{\theta }}={\varvec{\theta }}^*\). For all \(i\ne j\), we have that
where \(\xi _{ij}=\frac{\int _{L_i\vee L_j}^{R_i\wedge R_j}\int _{L_i\vee L_j}^{R_i\wedge R_j} I(y_1<y_2)f\{H(y_1)-G(W_i^\textsf {T}{\varvec{\beta }}^*)\}f\{H(y_2)-G(W_j^\textsf {T}{\varvec{\beta }}^*)\}h(y_1)h(y_2)dy_1dy_2}{\zeta _i\zeta _j}\) and \(\zeta _t=F\{H(R_t)-G(W_t^\textsf {T}{\varvec{\beta }}^*)\}-F\{H(L_t)-G(W_t^\textsf {T}{\varvec{\beta }}^*)\}\), \(t=i,j\). Analogously, we can derive an expression for \(\xi _{ji}\), that is,
By using Lemma A.1 with condition C0(c) and setting \(a=H(y_1)-G(u_1)\), \(b=H(y_2)-G(u_1)\) and \(t=G(u_2)-G(u_1)\) in (12), it is easy to verify that
where \(u_1<u_2\) and \(y_1<y_2\). When \(W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}^*)<W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}^*)\) and \(y_1<y_2\), by (14), we have
By (15), (10) and (11), if \(W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}^*)<W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}^*)\) and \(L_i\vee L_j<R_i\wedge R_j\), we have
Similarly, one can show that \(\xi _{ji}>\xi _{ij}\), if \(W_{j}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}^*)<W_{i}^{\textsf {T}}{\varvec{\beta }}({\varvec{\theta }}^*)\) and \(L_i\vee L_j<R_i\wedge R_j\).
By symmetry,
The value of \(I(L_i\vee L_j<R_i\wedge R_j)[\xi _{ij}I\{W_i^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})<W_j^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})\}+\xi _{ji}I\{W_j^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})<W_i^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})\}]\) may be less than \(I(L_i\vee L_j<R_i\wedge R_j)(\xi _{ij}\vee \xi _{ji})\), depending on \({\varvec{\theta }}\). However, condition C3 ensures that
with probability one. Taking expectation with respect to \(\{O_i,O_j\}\) on both sides of the inequality in (17), we have \(Q({\varvec{\theta }})\le Q({\varvec{\theta }}^*)\), which shows that \({\varvec{\theta }}^*\) is a maximizer of \(Q({\varvec{\theta }})\). Let
We can write
and
We now apply the method of proof by contradiction to show that \({\varvec{\theta }}^*\) is the unique maximizer of \(Q({\varvec{\theta }})\). Suppose that for some \({\varvec{\theta }}\in \Theta\) and \({\varvec{\theta }}\ne {\varvec{\theta }}^*\),
Deduce from (18) and (19) that
Let \(S_{\mathcal{X}}\) denote the support of \(X=(X_1,\ldots ,X_p)^{\textsf {T}}\) and \(\text{ CH }_{\mathcal{X}}\) denote the convex hull of \(S_{\mathcal{X}}\); that is, \(\text{ CH }_{\mathcal{X}}\) is the smallest convex set containing \(S_{\mathcal{X}}\). Condition C2 implies that \(\text{ CH }_{\mathcal{X}}\) is a p-dimensional subset of \(\mathbb {R}^p\) and so has a nonempty interior. Select a point \(\mu\) from this interior and define \(I_\mu =\{(t,\mu ^{\textsf {T}})^{\textsf {T}}:t\in \mathbb {R}\}\). Notice that the definition of \(I_\mu\) and condition C4 together imply that \(\{\iota ^{\textsf {T}}{\varvec{\beta }}^*:\iota \in I_\mu \}\equiv \mathbb {R}\). Displays (10), (11) and (14) guarantee the existence of a point \(\iota _0\) in \(I_\mu\) such that \(s_0=\iota _0^{\textsf {T}}{\varvec{\beta }}^*\) lies in the support of \(W^{\textsf {T}}{\varvec{\beta }}^*\) and, for \(w\in \mathcal {W}\),
Define the \((p+1)\)-dimensional open wedges
If \(W_1\in G_1({\varvec{\theta }})\), \(W_2\in G_2({\varvec{\theta }})\) and \(L_1\vee L_2<R_1\wedge R_2\), then
which contradicts (20) if
We now show that (21) actually holds for \({\varvec{\theta }}\ne {\varvec{\theta }}^*\). Since
we only need to show that
holds for \({\varvec{\theta }}\ne {\varvec{\theta }}^*\).
For each \({\varvec{\theta }}\) in \(\Theta\), define
Note that \(G_1({\varvec{\theta }})\) and \(G_2({\varvec{\theta }})\) are delimited by the p-dimensional hyperplanes \(M_{{\varvec{\theta }}}\) and \(M_{{\varvec{\theta }}^*}\), and for \({\varvec{\theta }}\ne {\varvec{\theta }}^*\), \(L_{{\varvec{\theta }}}\) is a \((p-1)\)-dimensional hyperplane in \(\mathbb {R}^{p+1}\). Consider the projections
and for \(j=1,2\),
That is, \(P_0({\varvec{\theta }})\) projects \(L_{{\varvec{\theta }}}\) into \(\text{ CH }_{\mathcal{X}}\) and \(P_j({\varvec{\theta }})\) projects \(G_j({\varvec{\theta }})\) into \(\text{ CH }_{\mathcal{X}}\). Also note that \(\{P_j({\varvec{\theta }}): j=0,1,2\}\) partitions \(\text{ CH }_{\mathcal{X}}\).
Since both \(M_{{\varvec{\theta }}}\) and \(M_{{\varvec{\theta }}^*}\) contain \(\iota _0\), \(L_{{\varvec{\theta }}}\) must contain \(\iota _0\). Since \(\iota _0\) is an element of \(I_\mu\), \(P_0({\varvec{\theta }})\) must contain \(\mu\). Since \(\mu\) is an interior point of \(\text{ CH }_{\mathcal{X}}\), \(P_0({\varvec{\theta }})\) cannot contain an entire \((p-1)\)-dimensional face of \(\text{ CH }_{\mathcal{X}}\). But then each \(P_j({\varvec{\theta }})\) must contain at least one point of \(S_{\mathcal{X}}\), implying
where \(F_X(\cdot )\) denotes the distribution of X.
For each x in \(S_{\mathcal{X}}\), write \(l_x\) for the line through x parallel to the first coordinate axis. If \({\varvec{\theta }}\ne {\varvec{\theta }}^*\), then there must be a nonzero angle between \(M_{{\varvec{\theta }}}\) and \(M_{{\varvec{\theta }}^*}\), and so at least one of \(M_{{\varvec{\theta }}}\) and \(M_{{\varvec{\theta }}^*}\) must intersect \(l_x\). Write \(t(x|{\varvec{\theta }})\) for the first component of \(M_{{\varvec{\theta }}}\cap l_x\); if \(M_{{\varvec{\theta }}}\cap l_x\) is empty, define \(t(x|{\varvec{\theta }})=\infty\). Then
where \(f(\cdot | x)\) denotes the conditional density of \(X^*\) given \(X=x\). Since \({\varvec{\theta }}\ne {\varvec{\theta }}^*\), \(t(x|{\varvec{\theta }})\ne t(x|{\varvec{\theta }}^*)\) for each x in \(S_{\mathcal{X}}\). This, C3, and (23) imply that \(P\{W\in G_j({\varvec{\theta }})\}>0\), \(j=1,2\). Thus, (21) actually holds. This establishes (i).
For each \({\varvec{\theta }}\in \Theta\) and each \((z_1,z_2)\) in \(\mathcal {Z}\times \mathcal {Z}\) define
Then
where \(\mathbb {U}_N\) denotes the random measure putting mass \(1/(N^2-N)\) on each pair \((Z_i,Z_j)\), \(i\ne j\). That is, \(\{\mathbb {U}_Nf(\cdot ,\cdot ,{\varvec{\theta }}): {\varvec{\theta }}\in \Theta \}\) is a zero-mean U-process of order 2. A trivial modification of the argument given in Sherman (1993, Section 5) shows that \(\{f(\cdot ,\cdot ,{\varvec{\theta }}):{\varvec{\theta }}\in \Theta \}\) is Euclidean for the envelope \(|\Psi (z_1,z_2)|+E\{|\Psi (Z_1,Z_2)|\}\). Deduce from Corollary 7 of Sherman (1994, Section 6) that
This establishes (ii).
Finally, fix \({\varvec{\theta }}\in \Theta\), and let \({\varvec{\theta }}_m\) denote a sequence of elements of \(\Theta\) converging to \({\varvec{\theta }}\) as m tends to infinity. Condition C3 implies that
This in turn implies that
as \(m\rightarrow \infty\), for almost all \((z_1,z_2)\in \mathcal {Z}\times \mathcal {Z}\). Take expectation of
then apply the dominated convergence theorem with \(2|\Psi (Z_1,Z_2)|\) as the dominating function to establish (iii). Hence, consistency is proved. \(\square\)
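For readers who wish to experiment, the rank-type objective built from the pairwise weight \(\Psi\) can be coded directly. The exact definition of \(Q_N({\varvec{\theta }})\) appears in Section 2 of the paper; the version below, an average of \(\Psi (Z_i,Z_j)I\{W_i^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})<W_j^\textsf {T}{\varvec{\beta }}({\varvec{\theta }})\}\) over ordered pairs, is our assumed reading based on the comparisons used in this proof, not the authors' code.

```python
import numpy as np

def psi(y_i, y_j, l_i, r_i, l_j, r_j):
    """Pairwise weight Psi(Z_i, Z_j) = I(l_j < y_i < r_j) I(l_i < y_j < r_i) I(y_i < y_j)."""
    return float((l_j < y_i < r_j) and (l_i < y_j < r_i) and (y_i < y_j))

def wr_criterion(theta, w, y, l, r):
    """Assumed WR objective: average of Psi(Z_i, Z_j) * I(w_i' beta < w_j' beta)
    over ordered pairs i != j, with beta(theta) = (1, theta')."""
    beta = np.concatenate(([1.0], np.atleast_1d(theta)))
    index = w @ beta                      # linear index W' beta(theta)
    n = len(y)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and index[i] < index[j]:
                total += psi(y[i], y[j], l[i], r[i], l[j], r[j])
    return total / (n * (n - 1))
```

Because the objective depends on \({\varvec{\theta }}\) only through indicator comparisons of the linear index, it is a step function; in practice one maximizes it by grid search or a derivative-free optimizer rather than gradient methods.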
Proof of Theorem 2.2
To establish asymptotic normality, let \(\epsilon _N({\varvec{\theta }})=Q_{N}({\varvec{\theta }})-{Q}({\varvec{\theta }})\). A standard decomposition of U-statistics gives
where
Note that \(E\{b_i({\varvec{\theta }})\} = 0\) for \({\varvec{\theta }}\in \Theta\) and that \(b_i({\varvec{\theta }}^*)=0\). Under Condition C5, a Taylor expansion gives
where \(\dot{b}_{i}({\varvec{\theta }})=\partial b_{i}({\varvec{\theta }})/\partial {\varvec{\theta }}\). Similar to the proof of uniform convergence in Theorem 2.1, we need to show that, for any sequence \(\kappa _N\) of order o(1),
The identical subgraph set and Vapnik–Chervonenkis class arguments of Sherman (1993, Section 5), together with Corollaries 17 and 21 in Nolan and Pollard (1987), show that the class of functions \(d_{ij}({\varvec{\theta }})\) is Euclidean. The Euclidean property, together with Corollary 8 of Sherman (1994), guarantees that (25) holds.
Notice that \(Q({\varvec{\theta }})=2^{-1}E\{\varrho (Z,{\varvec{\theta }})\}\). For \({\varvec{\theta }}\) in a neighborhood of \({\varvec{\theta }}^*\), by condition C5, we have
where \(u({\varvec{\theta }})=\partial {Q}({\varvec{\theta }})/\partial {\varvec{\theta }}\), \(A({\varvec{\theta }})=-\partial ^2 {Q}({\varvec{\theta }})/\partial {\varvec{\theta }}\partial {\varvec{\theta }}^{\textsf {T}}\) and \(u({\varvec{\theta }}^*)=0\). Under Condition C5, the matrix \(A({\varvec{\theta }})\) is invertible for \(\Vert {\varvec{\theta }}-{\varvec{\theta }}^*\Vert \le \kappa _N\). It then follows that
where
Hence, the maximizer of \(\ell _N({\varvec{\theta }})\) is \(\hat{{\varvec{\gamma }}}={\varvec{\theta }}^*+A^{-1}({\varvec{\theta }}^*)\frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)\). Since \(\hat{{\varvec{\theta }}}\) is the maximizer of \(Q_{N}({\varvec{\theta }})\),
On the other hand, in view of the expression for \(\ell _N\),
Combining (26) and (27), we obtain
Obviously, \(\hat{{\varvec{\gamma }}}-{\varvec{\theta }}^*=A^{-1}({\varvec{\theta }}^*)\frac{1}{N}\sum _{i=1}^N\dot{b}_i({\varvec{\theta }}^*)=O_p(N^{-1/2}).\) It follows that \(\hat{{\varvec{\theta }}}-{\varvec{\theta }}^*=O_p(N^{-1/2})\) and
By the central limit theorem, the proof of Theorem 2.2 is complete. \(\square\)
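The limiting covariance matrix can be estimated without density estimation or numerical derivatives by perturbing the objective, in the spirit of Jin et al. (2001). The sketch below reweights each pair's contribution by a product of i.i.d. Exp(1) multipliers and re-maximizes; the exponential-weight choice, the grid-search optimizer, the scalar-parameter setting, and the pairwise form of the objective are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def perturbed_criterion(theta, w, y, l, r, xi):
    """Pairwise rank objective with pair (i, j) reweighted by xi_i * xi_j,
    where xi are i.i.d. positive weights with mean 1 and variance 1."""
    beta = np.concatenate(([1.0], np.atleast_1d(theta)))
    index = w @ beta
    n = len(y)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j or index[i] >= index[j]:
                continue
            if (l[j] < y[i] < r[j]) and (l[i] < y[j] < r[i]) and (y[i] < y[j]):
                total += xi[i] * xi[j]
    return total / (n * (n - 1))

def resample_se(w, y, l, r, grid, n_rep=200, seed=1):
    """Standard error of a scalar estimate via Exp(1) perturbation weights;
    each replicate re-maximizes the perturbed objective over a grid."""
    rng = np.random.default_rng(seed)
    n = len(y)
    reps = []
    for _ in range(n_rep):
        xi = rng.exponential(1.0, size=n)
        vals = [perturbed_criterion(t, w, y, l, r, xi) for t in grid]
        reps.append(grid[int(np.argmax(vals))])
    return float(np.std(reps, ddof=1))
```

The empirical spread of the replicate maximizers then approximates the sampling variability of the WR estimator, with no smoothing parameters to tune.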
Cite this article
Liu, T., Yuan, X. & Sun, J. Weighted rank estimation for nonparametric transformation models with doubly truncated data. J. Korean Stat. Soc. 50, 1–24 (2021). https://doi.org/10.1007/s42952-020-00057-6