1 Introduction

The subject of this work is random coefficient regression (RCR) models in which only one observation per observational unit (individual) is possible. These models are popular e.g. in psychology (see Freund and Holling (2008)) and pharmacokinetics (see Patan and Bogacka (2007)). The main purpose of the present paper is to obtain an analytical solution for the designs that are optimal for the prediction of random effects. Optimal designs for prediction in RCR models have been discussed e.g. in Gladitz and Pilz (1982), Fedorov and Hackl (1997), Prus and Schwabe (2016) and Prus (2023). In Gladitz and Pilz (1982) and Prus and Schwabe (2016) the number of observations per individual is required to be not smaller than the number of unknown model parameters, which excludes the case of only one observation. Also the same design for all individuals has been assumed in the both papers. Fedorov and Hackl (1997) investigated models with specific regression functions. The models considered in the present work may be seen as a particular case (with one individual per group) of the multiple-group models discussed in Prus (2023). However, the solution developed in that paper is based on the assumption of sufficient number of observations per individual, i.e. not smaller than the number of parameters. Models with only one observation per individual were discussed e.g. in Patan and Bogacka (2007) and Graßhoff et al. (2012). Patan and Bogacka (2007) investigated non-linear mixed-effects models. Graßhoff et al. (2012) proposed a solution for optimal designs in form of an optimality condition for RCR models. In these papers optimal design were determined for estimation of fixed effects. However, not much has been done for prediction of random effects.

In the present work we determine optimal designs for prediction of random effects for a group of selected individuals in RCR models with one observation per individual. The obtained design criterion (linear criterion) results in a multiple-design problem, for which the standard approach for design optimization proposed in Kiefer (1974) cannot be applied. We make use of the optimality conditions for multiple-design problems proposed in Prus (2022). The obtained analytical results are illustrated by examples of linear regression models with particular covariance matrices.

The paper has the following structure: In the second section the model is specified and the mean squared error (MSE) matrix for the best linear unbiased prediction (BLUP) of the random effects is determined. In Sect. 3 the linear design criterion for the prediction and the resulting optimality conditions are formulated. The analytical results are illustrated by examples in Sect. 4. The paper is concluded by a short discussion in Sect. 5.

2 Model specification

In this work we the consider RCR models with only one observation per observational unit (individual) of the following form:

$$\begin{aligned} {Y}_{i}=\textbf{f}(x_{i})^\top {\varvec{\beta }} _i+ \varepsilon _{i}, \quad i=1, \dots , n, \quad x_i\in \mathcal {X}, \end{aligned}$$
(1)

where \(Y_i\) is an observation at the i-th individual, n is the number of individuals, \(\textbf{f} =(f_1, \dots ,f_p)^\top \) is a vector of known regression functions, experimental settings \(x_i\) come from an experimental region \(\mathcal {X}\). The observational errors \(\varepsilon _{i}\) are assumed to have zero mean and common variance \(\sigma ^2>0\). The individual parameters \(\varvec{\beta }_i=( \beta _{i1}, \dots , \beta _{ip})^\top \) have unknown expected value (population mean) \(\textrm{E}\,(\varvec{\beta }_i)={\varvec{\beta }}\) and known covariance matrix \(\textrm{Cov}\,(\varvec{\beta }_i)=\sigma ^2\textbf{D}\). All individual parameters \(\varvec{\beta }_{i}\) and all observational errors \(\varepsilon _{i}\) are assumed to be uncorrelated.

According to Graßhoff et al. (2012) the covariance matrix of the best linear unbiased estimator (BLUE) \(\hat{\varvec{\beta }}\) for the population parameter (fixed effects) \(\varvec{\beta }\) is given by

$$\begin{aligned} \textrm{Cov}(\hat{\varvec{\beta }})=\sigma ^2\textbf{M}^{-1}, \end{aligned}$$
(2)

where

$$\begin{aligned}{} & {} \textbf{M}=\sum _{i=1}^n\textbf{M}(x_i), \end{aligned}$$
(3)
$$\begin{aligned}{} & {} \textbf{M}(x_i)=\textbf{g}(x_i)\textbf{g}(x_i)^\top , \end{aligned}$$
(4)
$$\begin{aligned}{} & {} \textbf{g}(x_i)=\frac{1}{\sqrt{\textbf{f}(x_i)^\top \textbf{D}\textbf{f}(x_i)+1}}\textbf{f}(x_i). \end{aligned}$$
(5)

In this work we focus on random effects, in particular on the individual deviations from the mean: \(\textbf{b}_i=\varvec{\beta }_i-\varvec{\beta }\) (see e.g. Prus and Schwabe (2013)). Namely, we consider the situation in which our main interest is in some selected individuals. We determine optimal experimental settings for prediction of the individual effects for the k selected individuals: \({\varvec{\Psi }}=\frac{1}{k}\sum _{i=1}^k\textbf{b}_i\), for \(k\in [p,n-p]\). We assume \(n\ge 2p\), otherwise \([p,n-p]=\emptyset \). Note that the order of the individuals does not matter in this case. Therefore, we can consider k first individuals without loss of generality.

Further we search for experimental settings that minimize the MSE matrix of the BLUP \(\hat{{\varvec{\Psi }}}\) for the individual deviations \({\varvec{\Psi }}\).

Lemma 1

The MSE matrix of the BLUP \(\hat{{\varvec{\Psi }}}\) is given by

$$\begin{aligned} \textrm{Cov}(\hat{{\varvec{\Psi }}}-{\varvec{\Psi }})=\sigma ^2\left( \frac{1}{k}\textbf{D}-\frac{1}{k^2}\textbf{D}\textbf{M}_{k}\textbf{D}+\frac{1}{k^2}\textbf{D}\textbf{M}_{k}\textbf{M}^{-1}\textbf{M}_{k}\textbf{D}\right) , \end{aligned}$$
(6)

where \(\textbf{M}_{k}=\sum _{i=1}^k\textbf{M}(x_i)\).

The proof of Lemma 1 is deferred to Appendix A.

3 Experimental design

We consider the following two groups of individuals: the k selected individuals build the first group (Group 1), and the second group (Group 2) consists of the \(n-k\) remaining individuals. We also allow for group-specific restrictions with respect to experimental settings, i.e. the experimental regions for the two groups of individuals may differ from each other. In practice it can be useful in case of some particular restrictions in two centers/clinics/etc. Further \(\mathcal {X}_{\ell }\) denotes the experimental region for group \(\ell \), \(\ell =1,2\), and \(\mathcal {X}_1\cup \mathcal {X}_2=\mathcal {X}\). The particular case \(\mathcal {X}_1=\mathcal {X}_2=\mathcal {X}\) will be later considered more detailed. We add the group index to experimental settings for clear notation and we define an exact design in group \(\ell \) as

$$\begin{aligned} \xi _{\ell ,e}= \left( \begin{array}{c} x_{\ell 1},\dots , x_{\ell N_\ell } \\ m_{\ell 1},\dots , m_{\ell N_\ell } \end{array} \right) ,\end{aligned}$$

where \(x_{\ell j}\in \mathcal {X}_\ell \) are the support point of \(\xi _{\ell ,e}\), \(m_{\ell j}>0\) is the number of observations at \(x_{\ell j}\) with \(\sum _{j=1}^{N_1}{m_{1j}}=k\) and \(\sum _{j=1}^{N_2}{m_{2j}}=n-k\) and \(N_{\ell }=|\{x_{\ell 1},\dots , x_{\ell N_\ell }\}|\). Note that \(m_{\ell j}\) can be larger than one in case if observations for more than one individual are taken at point \(x_{\ell j}\). Note also that the set \(\{x_{1 1},\dots , x_{1 N_1}\}\) coincides with \(\{x_1, \dots x_k\}\) in model (1) (and \(\{x_{2 1},\dots , x_{2 N_2}\}\) coincides with \(\{x_{k+1}, \dots x_n\}\)).

For analytical purposes we also introduce approximate designs:

$$\begin{aligned} \xi _{\ell }= \left( \begin{array}{c} x_{\ell 1},\dots , x_{\ell \tilde{N_\ell }} \\ w_{\ell 1},\dots , w_{\ell \tilde{N_\ell }} \end{array} \right) ,\end{aligned}$$

where \(w_{\ell j}> 0\) denotes the weight of observations at \(x_{\ell j}\), \(\sum _{j=1}^{\tilde{N_\ell }}{w_{\ell j}}=1\), \(\ell =1,2\), and \(\tilde{N_{\ell }}=|\{x_{\ell 1},\dots , x_{\ell \tilde{N_\ell }}\}|\). Note that \(\tilde{N_{\ell }}=N_{\ell }\) for exact designs. Further we use the notation \(\xi =(\xi _1, \xi _2)\) for the pair of group designs.

We also define the information (or moment) matrices for the first and second design as

$$\begin{aligned} \textbf{M}_{1,\xi }=\sum _{j=1}^{\tilde{N_1}}w_{1j}\textbf{M}(x_{1j}), \end{aligned}$$

and

$$\begin{aligned} \textbf{M}_{2,\xi }=\sum _{j=1}^{\tilde{N_2}}w_{2j}\textbf{M}(x_{2j}), \end{aligned}$$

respectively.

For exact designs we obtain \(w_{1j}=m_{1j}/k\), \(w_{2j}=m_{2j}/(n-k)\), \(\textbf{M}_{k}=k\textbf{M}_{1,\xi }\) and

$$\begin{aligned} \textbf{M}=k\textbf{M}_{1,\xi }+(n-k)\textbf{M}_{2,\xi }. \end{aligned}$$

In this work we focus on the linear criterion for the prediction of individual deviations for the selected individuals, which is defined for exact designs as

$$\begin{aligned} \Phi _L=\text {tr}\left( \textrm{Cov}(\mathcal {L}\hat{{\varvec{\Psi }}}-\mathcal {L}{\varvec{\Psi }})\right) , \end{aligned}$$
(7)

where \(\mathcal {L}\) denotes the transformation matrix.

Neglecting the constants that have no influence on designs, we obtain for approximate designs the following results.

Theorem 1

The linear criterion for the prediction of individual effects \({\varvec{\Psi }}\) is given by

$$\begin{aligned} \Phi _L(\xi )=-\text {tr}\left[ \tilde{\textbf{L}}\left( \frac{1}{k}\textbf{M}_{1,\xi }^{-1}+\frac{1}{n-k}\textbf{M}_{2,\xi }^{-1}\right) ^{-1}\right] , \end{aligned}$$
(8)

where \(\tilde{\textbf{L}}=\textbf{D}\textbf{L}\textbf{D}\) and \(\textbf{L}=\mathcal {L}^\top \mathcal {L}\).

The proof of this result is deferred to Appendix B.

Note that we assume matrices \(\textbf{M}_{1,\xi }\) and \(\textbf{M}_{2,\xi }\) and consequently \(\textbf{M}_{k}\) non-singular, which requires \(k\in [p,n-p]\). Otherwise linear criterion (7) cannot be written in form (8).

As we can see by formula (8), the linear criterion depends on two designs simultaneously. In this case the general equivalence theorem (see Kiefer (1974)) cannot be directly applied. Instead we use the extended version for multiple-design problems proposed in Prus (2022). To make use of the equivalence theorems presented in that work, we have to verify convexity of the criterion.

Lemma 2

The linear criterion for the prediction of individual effects \({\varvec{\Psi }}\) is convex with respect to \((\textbf{M}_{1,\xi },\textbf{M}_{2,\xi })\).

For the proof of Lemma 2 see Appendix C.

As the linear criterion for the prediction of the individual deviations is differentiable and convex with respect to both moment matrices, optimality conditions can be formulated.

Theorem 2

Approximate designs \(\xi ^*=(\xi _1^*,\xi _2^*)\) are L-optimal for the prediction of the individual effects \({\varvec{\Psi }}\) iff

$$\begin{aligned}{} & {} \textbf{g}(x)^\top \textbf{M}_{1,\xi ^*}^{-1}\left( \frac{1}{k}\textbf{M}_{1,\xi ^*}^{-1}+\frac{1}{n-k}\textbf{M}_{2,\xi ^*}^{-1}\right) ^{-1}\tilde{\textbf{L}}\left( \frac{1}{k}\textbf{M}_{1,\xi ^*}^{-1}+\frac{1}{n-k}\textbf{M}_{2,\xi ^*}^{-1}\right) ^{-1}\textbf{M}_{1,\xi ^*}^{-1}\textbf{g}(x) \nonumber \\{} & {} \quad \le \text {tr}\left[ \textbf{M}_{1,\xi ^*}^{-1}\left( \frac{1}{k}\textbf{M}_{1,\xi ^*}^{-1}+\frac{1}{n-k}\textbf{M}_{2,\xi ^*}^{-1}\right) ^{-1}\tilde{\textbf{L}}\left( \frac{1}{k}\textbf{M}_{1,\xi ^*}^{-1}+\frac{1}{n-k}\textbf{M}_{2,\xi ^*}^{-1}\right) ^{-1}\right] , \end{aligned}$$
(9)

for all \(x \in \mathcal {X}_1\), and

$$\begin{aligned}{} & {} \textbf{g}(x)^\top \textbf{M}_{2,\xi ^*}^{-1}\left( \frac{1}{k}\textbf{M}_{1,\xi ^*}^{-1}+\frac{1}{n-k}\textbf{M}_{2,\xi ^*}^{-1}\right) ^{-1}\tilde{\textbf{L}}\left( \frac{1}{k}\textbf{M}_{1,\xi ^*}^{-1}+\frac{1}{n-k}\textbf{M}_{2,\xi ^*}^{-1}\right) ^{-1}\textbf{M}_{2,\xi ^*}^{-1}\textbf{g}(x) \nonumber \\{} & {} \quad \le \text {tr}\left[ \textbf{M}_{2,\xi ^*}^{-1}\left( \frac{1}{k}\textbf{M}_{1,\xi ^*}^{-1}+\frac{1}{n-k}\textbf{M}_{2,\xi ^*}^{-1}\right) ^{-1}\tilde{\textbf{L}}\left( \frac{1}{k}\textbf{M}_{1,\xi ^*}^{-1}+\frac{1}{n-k}\textbf{M}_{2,\xi ^*}^{-1}\right) ^{-1}\right] , \end{aligned}$$
(10)

for all \(x \in \mathcal {X}_2\).

For support points of \(\xi _1^*\) and \(\xi _2^*\) equalities hold in (9) and (10), respectively.

The proof is deferred to Appendix D.

Note that optimal designs depend on the dispersion matrix of random effects \(\textbf{D}\), the total number of individuals n and the number of selected individuals k.

Note also that in case where the design region is the same for all individuals \(\mathcal {X}_1=\mathcal {X}_2=\mathcal {X}\), optimal designs may be also the same for both groups, i.e. \(\textbf{M}_{1,\xi ^*}=\textbf{M}_{2,\xi ^*}\). In this situation design criterion (8) simplifies to

$$\begin{aligned} \Phi _L(\xi )=-\text {tr}\left[ \tilde{\textbf{L}}\textbf{M}_{1,\xi }\right] , \end{aligned}$$
(11)

which is linear in design, i.e. singular designs (that result in a singular moment matrix) or all permissible designs may be optimal. The optimality conditions simplify to the following inequality:

$$\begin{aligned} \textbf{g}(x)^\top \tilde{\textbf{L}}\textbf{g}(x)\le \text {tr}\left( \tilde{\textbf{L}}\textbf{M}_{1,\xi ^*}\right) ,\quad \forall x \in \mathcal {X}. \end{aligned}$$
(12)

Behavior of such designs will be illustrated by Example 4.1.

4 Examples

We consider the linear regression model

$$\begin{aligned} {Y}_{i}=\varvec{\beta }_{i1}+ \varvec{\beta }_{i2}x_i+\varepsilon _{i}, \end{aligned}$$
(13)

which is the particular case of model (1) with \(\textbf{f}(x)=(1,x)^\top \), and we assume the diagonal structure of the covariance matrix of random effects: \(\textbf{D}=\textrm{diag}(d_1,d_2)\).

Further we distinguish between the two different cases: the same design region for all individuals (Example 4.1) and different design regions for different groups (Example 4.2). We focus on the A-optimality criterion, i.e. \(\textbf{L}=\mathbb {I}_p\), where \(\mathbb {I}_p\) denotes the \(p\times p\) identity matrix.

4.1 Example 1

In this example we consider the situation where there are no particular restrictions for the designs for the selected individuals (that are of our main interest) or for all other individuals, i.e. the design region is the same for both groups: \(\mathcal {X}_1=\mathcal {X}_2=\mathcal {X}\). In particular, we consider symmetric design regions: \(\mathcal {X}=[-a,a]\), for \(a>0\). Further we focus on the designs that are the same for both groups: \(\xi _1=\xi _2\) with all observations at the endpoints:

$$\begin{aligned} \xi _1=\left( \begin{array}{cc} -a &{} a \\ 1-w &{} w \end{array}\right) , \end{aligned}$$
(14)

where w is the weight of observations at point a. The total number of individuals n and the number of selected individuals k have no influence on designs in the present case and do not need to be specified. If optimal designs of form (14) exist, they assign \(w=0\) or \(w=1\) (and make the moment matrix singular) or all values of w are equally good.

For the present model we obtain

$$\begin{aligned} \textbf{g}(x)=\frac{1}{\sqrt{d_2x^2+d_1+1}}(1,x)^\top , \end{aligned}$$

and

$$\begin{aligned} \textbf{M}_{1,\xi }=\textbf{M}_{2,\xi }=\frac{1}{d_2a^2+d_1+1}\left( \begin{array}{cc} 1 &{} a(1-2w) \\ a(1-2w) &{} a^2 \end{array}\right) . \end{aligned}$$

The simplified linear criterion (11) is given by

$$\begin{aligned} \Phi _L(\xi )=-\frac{d_2a^2+d_1^2}{d_2a^2+d_1+1}, \end{aligned}$$

which is independent of the designs, i.e. all designs (all \(w\in (0,1)\)) are optimal or there is no optimal designs of form (14) at all. The optimality condition (12) is given by

$$\begin{aligned} \frac{d_2(d_1^2-d_2(d_1+1))(a^2-x^2)}{(d_2a^2+d_1+1)(d_2x^2+d_1+1)}\le 0,\quad \forall \, x\in [-a,a], \end{aligned}$$

which is satisfied only in case

$$\begin{aligned} \{d_2\ge \frac{d_1^2}{d_1+1}\}\cup \{d_2=0\}. \end{aligned}$$
(15)

Hence, under condition (15) all design are equally good. Otherwise, there is no solution with \(\xi _1=\xi _2\), where \(\xi _1\) is of form (14).

4.2 Example 2

In this example we assume different design regions \(\mathcal {X}_1\) and \(\mathcal {X}_2\). In particular, we concentrate on symmetric design regions of different lengths: \(\mathcal {X}=[-1,1]\) and \(\mathcal {X}=[-a,a]\), for \(a>0\). We consider the endpoints-designs

$$\begin{aligned} \xi _1=\left( \begin{array}{cc} -1 &{} 1 \\ 1-w_1 &{} w_1 \end{array}\right) , \end{aligned}$$
(16)

and

$$\begin{aligned} \xi _2=\left( \begin{array}{cc} -a &{} a \\ 1-w_2 &{} w_2 \end{array}\right) , \end{aligned}$$
(17)

where \(w_1\) and \(w_2\) are the weights of observations at points 1 and a for the first and the second group, respectively. We simplified the notations \(w_{12}\) and \(w_{22}\) to \(w_1\) and \(w_2\), as there are only two support points for each design. Further we consider two different cases for the covariance matrix of random effects: random intercept and random slope.

Case 1 \(d_1=d\) and \(d_2=0\)

In the case of random intercept we have

$$\begin{aligned}{} & {} \textbf{g}(x)=\frac{1}{\sqrt{d+1}}(1,x)^\top , \\{} & {} \textbf{M}_{1,\xi }=\frac{1}{d+1}\left( \begin{array}{cc} 1 &{} 1-2w \\ 1-2w &{} 1 \end{array}\right) , \end{aligned}$$

and

$$\begin{aligned} \textbf{M}_{2,\xi }=\frac{1}{d+1}\left( \begin{array}{cc} 1 &{} a(1-2w) \\ a(1-2w) &{} a^2 \end{array}\right) . \end{aligned}$$

Linear criterion (8) for this model will not be presented here because of its complexity. We use software Maple2020 for further computations. For the following points the first derivative of the criterion function is zero:

$$\begin{aligned} w_1=aw_2-\frac{a}{2}+\frac{1}{2},\quad (w_1,w_2)\in [0,1]^2. \end{aligned}$$
(18)

All designs with property (18) turn out to be optimal, i.e. they satisfy optimality conditions (9) and (10) for all possible values of a. Note that the designs are independent of the variance parameter d.

Case 2 \(d_1=0\) and \(d_2=d\)

For the random slope we obtain

$$\begin{aligned}{} & {} \textbf{g}(x)=\frac{1}{\sqrt{da^2+1}}(1,x)^\top , \\{} & {} \textbf{M}_{1,\xi }=\frac{1}{da^2+1}\left( \begin{array}{cc} 1 &{} 1-2w \\ 1-2w &{} 1 \end{array}\right) , \end{aligned}$$

and

$$\begin{aligned} \textbf{M}_{2,\xi }=\frac{1}{da^2+1}\left( \begin{array}{cc} 1 &{} a(1-2w) \\ a(1-2w) &{} a^2 \end{array}\right) . \end{aligned}$$

Using the same approach as for the random intercept case, it can be verified that all designs with

$$\begin{aligned} w_1=\frac{a+2w_2-1}{2a},\quad (w_1,w_2)\in [0,1]^2, \end{aligned}$$
(19)

are optimal for all \(a>0\).

Note that in both cases: random intercept and random slope, the obtained designs are optimal for all values of the total number of individuals n and the number of selected individuals \(k\in [p,n-p]\). Note also that \(a=1\) would lead to \(w_1=w_2\), which is in accordance with Example 4.1.

5 Discussion

We considered RCR models in which only one observation per individual is possible. We focused on individual random effects for a group of selected individuals, in particular on the mean random effect in the group. The solution for optimal designs is proposed in form of optimality conditions. The number of selected individuals k is assumed to be between the number of parameters p and \(n-p\), where n is the total number of individuals, which is possible only for models with \(n\ge 2p\). If the main interest is in a very small, i.e. \(k<p\), or a very large number of selected individuals, another approach is, however, needed. Optimal designs are determined for the BLUP of the random effects, which depends on the variance parameters (covariance matrices). The variance parameters are assumed to be known, which is in general not the case in practice. In a practical situation, where the covariance matrices have to be estimated, we deal with estimated BLUP (EBLUP). The obtained optimal designs in general depend on the variance parameters and are, therefore, locally optimal. The problem of local optimality may be solved for particular models by considering robust design criteria, for example minimax-criterion. For specific covariance structure, optimal designs may be independent of variance parameters, which has been illustrated by the example of the linear regression models with random intercept and random slope. All observational errors were assumed to have the same variance. Heteroscedastic errors, especially depending on the experimental settings (as considered by Graßhoff et al. (2012)), are analytically more challenging. However, an extension may be considered in a future research. Moreover optimal designs have been obtained for individual effects (individual deviations from the population parameter) via their arithmetic mean. Design optimization for the prediction of the individual deviations or individual parameters themselves turned out to be more challenging. It may also be a subject of future investigations, especially for particular models.