1 Introduction

Over the last decade, technological progress has vastly increased the capacity to record and store data. In particular, many time series are nowadays recorded at a very high frequency, for instance intraday stock prices or temperature records. Data of this type are often viewed as functional observations. Due to this development, the field of functional data analysis has been very active recently (see the monographs by Bosq 2000; Ferraty and Vieu 2006; Horváth and Kokoszka 2012; Hsing and Eubank 2015, among others).

The statistical analysis of functional data simplifies substantially if the observations are serially uncorrelated (or even serially independent). In fact, a huge amount of methodology has been proposed solely for this scenario, whence it is important to validate or reject this assumption in applications. Moreover, in the context of (univariate) financial return data, the absence or insignificance of serial correlation is commonly interpreted as a sign of efficient market prices (Fama 1970). Likewise, investors may be interested in knowing whether functional counterparts like cumulative intraday returns exhibit significant autocorrelation.

If the observations are not only serially uncorrelated, but also centred and homoscedastic, then the time series is referred to as a functional white noise. Testing the functional white noise hypothesis has attracted considerable interest in the recent literature. Portmanteau-type methodology in univariate or multivariate time series analysis was developed, for instance, by Box and Pierce (1970); Hosking (1980); Hong (1996); Lobato (2001); Peña and Rodríguez (2002). Other authors study robust tests for white noise and serial correlation, which are also valid for non-i.i.d. data (see, for example, Robinson 1991; Horowitz et al. 2006; Dalla et al. 2022, among others). Motivated by the results for multivariate data, Gabrys and Kokoszka (2007) propose to apply a multivariate portmanteau test to vectors of a few principal components from the functional time series. Horváth et al. (2013) and Kokoszka et al. (2017) investigate a portmanteau-type test which is based on estimates of the norms of the autocovariance operators. Alternatively, tests in the frequency domain have been proposed as well, which are based on the fact that the spectral density operator of a functional white noise time series is constant. Zhang (2016) proposes a Cramér-von Mises type test based on the functional periodogram, and Bagchi et al. (2018) suggest a test based on an estimate of the minimum distance between the spectral density operator and its best approximation by a constant. While this approach estimates the minimal distance directly, thereby avoiding estimation of the spectral density operator, Characiejus and Rice (2020) suggest a test which is based on the distance between the estimated spectral density operator and an estimator of the operator calculated under the assumption of an uncorrelated time series.

A common feature of all aforementioned references consists in the fact that the proposed methodology is only applicable under the assumption of a (second order) stationary time series (the latter is for instance met for classical Gaussian functional time series; see Górecki et al. 2018 for a test on Gaussianity). If the functional data is non-stationary, then the tests from the aforementioned papers may yield spurious results, both under the null hypothesis as well as under the alternative. This paper hence goes a step further and investigates the problem of testing for uncorrelatedness in possibly non-stationary functional data; in particular, for certain forms of heteroscedasticity. More precisely, we propose a portmanteau type test for locally stationary functional time series, whose critical values may be obtained by a multiplier block bootstrap. As a by-product, if accompanied by a test for constancy of the variance (see, e.g., Bücher et al. 2020), we straightforwardly obtain a test for the null hypothesis of functional white noise as well. Finally, we propose a generalized procedure to test for so-called relevant serial correlations, see Sect. 3.2 for a rigorous definition.

The paper is organized as follows: mathematical preliminaries, including a precise description of the hypotheses, are collected in Sect. 2. Suitable test statistics are introduced in Sect. 3, where we also prove weak convergence and validate a bootstrap approximation to obtain suitable critical values. Finite sample results are collected in Sect. 4, a case study is presented in Sect. 5 and all proofs are deferred to Sect. 6.

2 Mathematical preliminaries

Throughout this document, we deal with objects in \(L^p([0,1]^d)\), for different choices of \(p\ge 1\) and \(d\in \mathbb {N}\). We denote the respective \(L^2\)-norms by \(\Vert \cdot \Vert _{2}\). Further, for functions \(f,g\in L^p([0,1])\), we write \((f\otimes g)(x,y)=f(x)g(y)\).

2.1 Locally stationary time series

For \(t\in \mathbb {Z}\), let \(X_t:[0,1]\times \Omega \rightarrow \mathbb {R}\) denote a \((\mathcal {B}|_{[0,1]} \otimes \mathcal {A})\)-measurable function with \(X_t(\cdot ,\omega )\in \mathcal {L}^2([0,1])\) for any \(\omega \in \Omega \). We can regard \([X_t]\) as a random variable in \(L^2([0,1])\) and will denote it by \(X_t\) as well. The expected value of \([X_t]\) in \(L^2([0,1])\) coincides with the equivalence class of \( \tau \mapsto \mu _t(\tau )=\mathbb {E}[X_t(\tau )]\). Similarly, the kernel of the (auto-)covariance operator of \([X_t]\) has a representation in \(\mathcal {L}^2([0,1]^2)\) with \(c_{X_t,X_{t+h}}(\tau ,\sigma )=\text {Cov}\big (X_t(\tau ),X_{t+h}(\sigma )\big )\). More precisely, the lag-h auto-covariance operator \(\text {Cov}(X_{t},X_{t+h}):L^2([0,1]) \rightarrow L^2([0,1])\) is given by the integral operator

$$\begin{aligned} \text {Cov}(X_{t},X_{t+h}) (f) ( \tau ) = \int _{[0,1]} \text {Cov}\big (X_t(\tau ),X_{t+h}(\sigma )\big ) f(\sigma ) {\,\textrm{d}}\sigma . \end{aligned}$$

We refer to Section 2.1 in Bücher et al. (2020) for further technical details.

The sequence \((X_t)_{t\in \mathbb {Z}}\) is called a functional time series in \(L^2([0,1])\). The sequence is called stationary if, for all \(q\in \mathbb {N}\) and \(h,t_1,\dots ,t_q\in \mathbb {Z}\),

$$\begin{aligned} (X_{t_1+h},\dots ,X_{t_q+h})\overset{d}{=}\ (X_{t_1},\dots ,X_{t_q}) \end{aligned}$$

in \(L^2([0,1])^q\). For the definition of a locally stationary functional time series we use a concept introduced by van Delft and Eichler (2018) (see also van Delft et al. 2021; van Delft and Dette 2021; Bücher et al. 2020). To be precise, we call a sequence of functional time series \((X_{t,T})_{t\in \mathbb {Z}}\) indexed by \(T\in \mathbb {N}\) a locally stationary functional time series of order \(\rho >0\) if there exists, for any \(u\in [0,1]\), a stationary functional time series \((X_t^{\scriptscriptstyle (u)})_{t\in \mathbb {Z}}\) in \(L^2 ([0,1])\) and an array of real-valued random variables \(\{P_{t,T}^{(u)}:t=1,\dots ,T,T\in \mathbb {N}\}\) with \(\mathbb {E}|P_{t,T}^{\scriptscriptstyle (u)}|^\rho <\infty \) uniformly in \(t\in \{1,\dots ,T\}, T\in \mathbb {N}\) and \(u\in [0,1]\), such that

$$\begin{aligned} \Vert X_{t,T}-X_t^{(u)}\Vert _2 = \bigg \{ \int _{0}^1 \{X_{t,T}(\tau ) - X_{t}^{(u)}(\tau )\}^2 {\,\textrm{d}}\tau \bigg \}^{1/2} \le \bigg (\bigg |\frac{t}{T}-u\bigg |+\frac{1}{T}\bigg )P_{t,T}^{(u)}, \end{aligned}$$

for any \(t\in \{1,\dots ,T\}, T\in \mathbb {N}\) and \(u\in [0,1]\). Note that in the case \(\rho \ge 1\) the approximating family \(\{ (X_t^{\scriptscriptstyle (u)})_{t\in \mathbb {Z}} : u\in [0,1]\}\) is \(L^2\)-Lipschitz continuous in the sense that

$$\begin{aligned} \mathbb {E}\Vert X_t^{(u)}-X_t^{(v)}\Vert _2\le C|u-v|, \end{aligned}$$
(2.1)

for some constant \(C>0\), by local stationarity of \(X_{t,T}\). In the following discussion we assume that \(X_{t,T}\) (and hence \(X_{t}^{\scriptscriptstyle (u)}\)) is centred, i. e. \( \mu _{t,T}= {\mathbb {E}}[X_{t,T}] =0\) for all \(t \in \{1,\ldots , T\} \).
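For concreteness, the defining inequality can be checked numerically in a simple model. The following Python sketch (a toy example of our own, not part of the paper) simulates \(X_{t,T}(\tau )=\sigma (t/T)\,\varepsilon _t(\tau )\) with a Lipschitz volatility curve \(\sigma \) and i.i.d. Brownian-motion errors; the approximating family is \(X_t^{(u)}=\sigma (u)\,\varepsilon _t\), and the bound holds with \(P_{t,T}^{(u)}\) proportional to \(\Vert \varepsilon _t\Vert _2\).

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_grid = 200, 101
grid = np.linspace(0, 1, n_grid)

def sigma(u):
    # smooth volatility curve with Lipschitz constant L = 1/2
    return 1.0 + 0.5 * u

# i.i.d. error curves: standard Brownian motions evaluated on the grid
eps = np.cumsum(rng.normal(size=(T, n_grid)) / np.sqrt(n_grid), axis=1)

t_over_T = np.arange(1, T + 1) / T
X = sigma(t_over_T)[:, None] * eps        # observed locally stationary X_{t,T}

def l2_norm(f):
    # Riemann approximation of the L^2([0,1])-norm along the last axis
    return np.sqrt((f ** 2).mean(axis=-1))

# defining inequality at a fixed rescaled time u, with P_{t,T}^{(u)} = L * ||eps_t||_2
u = 0.3
lhs = l2_norm(X - sigma(u) * eps)         # ||X_{t,T} - X_t^{(u)}||_2 for t = 1, ..., T
P = 0.5 * l2_norm(eps)
rhs = (np.abs(t_over_T - u) + 1.0 / T) * P
```

Indeed, \(\Vert X_{t,T}-X_t^{(u)}\Vert _2 = |\sigma (t/T)-\sigma (u)|\,\Vert \varepsilon _t\Vert _2 \le \tfrac{1}{2}|t/T-u|\,\Vert \varepsilon _t\Vert _2\), so the inequality holds for every t, and the model is locally stationary of any order \(\rho \) for which \(\mathbb {E}\Vert \varepsilon _0\Vert _2^\rho <\infty \).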

2.2 Serial correlation in locally stationary time series

In classical (functional) time series analysis, a time series is called uncorrelated if its autocovariances are zero for any lag \(h>0\). In the locally stationary setup, a slightly more subtle definition is required to obtain meaningful asymptotic results. Note that the discussion in Sect. 2.1 suggests the approximation

$$\begin{aligned} \text {Cov}(X_{t,T},X_{t+h,T}) \approx \text {Cov}(X_{t}^{(t/T)},X_{t+h}^{((t+h)/T)}) \approx \text {Cov}(X_{t}^{(t/T)},X_{t+h}^{(t/T)})~, \end{aligned}$$
(2.2)

for sufficiently large T, where we used (2.1) in the second step. As “no serial correlation at lag h” of a non-stationary process implies that the left hand side of (2.2) vanishes for all \(t=1, \ldots , T\) as \(T \rightarrow \infty \), it is reasonable to formulate the null hypotheses in terms of the approximating sequences \(\{X_{t}^{\scriptscriptstyle (u)}: t \in \mathbb {Z}\}\). Note that the formulation of hypotheses in terms of approximating processes is rather common in this field (see, for example, Paparoditis 2009; Dette et al. 2011; Hidalgo and Souza 2019; van Delft et al. 2021).

In view of the preceding paragraph, we call a centred locally stationary functional time series of order \(\rho \ge 2\) with approximating family of square-integrable stationary time series \(\{ (X_t^{\scriptscriptstyle (u)})_{t\in \mathbb {Z}} : u\in [0,1]\}\) (i. e., \(\mathbb {E}[\Vert X_0^{\scriptscriptstyle (u)}\Vert _2^2]<\infty \) for all u) serially uncorrelated if the hypothesis

$$\begin{aligned} {\bar{H}}_0 :=H_0^{(1)} \cap H_0^{(2)} \cap \dots \end{aligned}$$
(2.3)

holds, where the individual hypothesis \(H_0^{\scriptscriptstyle (h)}\) at lag \(h \in \mathbb {N}\) is defined by

$$\begin{aligned} H_0^{(h)}:~ \Vert \text {Cov}(X_{0}^{(u)},X_{h}^{(u)})\Vert _{2} = 0 \quad \text { for all } u\in [0,1], \end{aligned}$$
(2.4)

where

$$\begin{aligned} \Vert \text {Cov}(X_{0}^{(u)},X_{h}^{(u)})\Vert _{2} = \bigg \{ \int _{[0,1]^2} | \text {Cov}(X_0^{(u)}(\tau _1), X_h^{(u)}(\tau _2))|^2 {\,\textrm{d}}(\tau _1, \tau _2) \bigg \}^{1/2}. \end{aligned}$$

If, additionally, \(u \mapsto \text {Var}(X_0^{\scriptscriptstyle (u)})\) is constant, then the locally stationary time series will be called functional white noise. As in Remark 1 in Bücher et al. (2020), it may be shown that these definitions are independent of the choice of the approximating family.

Throughout this paper, we will develop suitable tests for certain hypotheses related to \({\bar{H}}_0\) and \(H_0^{\scriptscriptstyle (h)}\) in (2.3) and (2.4), respectively. Following the main principle of classical portmanteau-type tests for detecting serial correlation, we start by fixing a maximum lag \(H \in \mathbb {N}\) and testing the hypothesis

$$\begin{aligned} {\bar{H}}_0^{(H)}: \Vert \text {Cov}(X_{0}^{(u)},X_{h}^{(u)})\Vert _{2} = 0 \quad \text { for all } h\in \{1,\dots ,H\}~\text { and } u\in [0,1] . \end{aligned}$$
(2.5)

Note that \({\bar{H}}_0= \bigcap _{H\in \mathbb {N}} {\bar{H}}_0^{\scriptscriptstyle (H)}\).

2.3 Regularity conditions on the observation scheme

In order to obtain meaningful asymptotic results, the following regularity conditions will be imposed. We refer to Aue and van Delft (2020), Appendix B, for the definition of cumulants.

Condition 2.1

(Assumptions on the observations)  

(A1):

Local Stationarity. The observations \(X_{1,T}, \dots , X_{T,T}\) are an excerpt from a centred locally stationary functional time series \(\{(X_{t,T})_{t\in \mathbb {Z}}:T\in \mathbb {N}\}\) of order \(\rho =4\) in \(L^2([0,1],\mathbb {R})\), with approximating family of stationary time series \(\{ (X_t^{\scriptscriptstyle (u)})_{t\in \mathbb {Z}}: u\in [0,1]\}\).

(A2):

Moment Condition. For any \(k\in \mathbb {N}\), there exists a constant \(C_k<\infty \) such that \(\mathbb {E}\Vert X_{t,T}\Vert _2^k\le C_k\) and \(\mathbb {E}\Vert X_0^{\scriptscriptstyle (u)}\Vert _2^k\le C_k\) uniformly in \(t\in \mathbb {Z},T\in \mathbb {N}\) and \(u\in [0,1]\).

(A3):

Cumulant Condition. For any \(j\in \mathbb {N}\) there is a constant \(D_j <\infty \) such that

$$\begin{aligned} \sum _{t_1,\dots ,t_{j-1}=-\infty }^{\infty } \big \Vert {{\,\textrm{cum}\,}}(X_{t_1,T},\dots ,X_{t_j,T})\big \Vert _{2} \le D_j<\infty , \end{aligned}$$

for any \(t_j\in \mathbb {Z}\) (for \(j=1\) the condition is to be interpreted as \(\Vert \mathbb {E}X_{t_1,T}\Vert _2\le D_1\) for all \(t_1 \in \mathbb {Z}\)). Further, for \(k\in \{2,3,4\}\), there exist functions \(\eta _k:\mathbb {Z}^{k-1}\rightarrow \mathbb {R}\) satisfying

$$\begin{aligned} \sum _{t_1,\dots ,t_{k-1}=-\infty }^{\infty } (1+|t_1|+\dots +|t_{k-1}|)\eta _k(t_1,\dots ,t_{k-1}) < \infty \end{aligned}$$

such that, for any \(T\in \mathbb {N}, 1 \le t_1 , \dots , t_k \le T, v, u_1, \dots , u_k \in [0,1],h_1,h_2\in \mathbb {Z}\), \(Z_{t,T}^{\scriptscriptstyle (u)}\in \{X_{\scriptscriptstyle t, T},X_{t}^{\scriptscriptstyle (u)}\}\), and any \(Y_{t,h,T}(\tau _1,\tau _2)\in \{ X_{t,T}(\tau _1), X_{t,T}(\tau _1)X_{t+h,T}(\tau _2) \}\), we have

(i):

\(\Vert {{\,\textrm{cum}\,}}(X_{t_1,T}-X_{t_1}^{(t_1/T)},Z_{t_2,T}^{(u_2)},\cdots ,Z_{t_k,T}^{(u_k)})\Vert _{2} \le \frac{1}{T} \eta _k(t_2-t_1,\dots ,t_k-t_1)\),

(ii):

\(\Vert {{\,\textrm{cum}\,}}(X_{t_1}^{(u_1)}-X_{t_1}^{(v)},Z_{t_2,T}^{(u_2)},\cdots ,Z_{t_k,T}^{(u_k)})\Vert _{2} \le |u_1-v| \eta _k(t_2-t_1,\dots ,t_k-t_1)\),

(iii):

\(\Vert {{\,\textrm{cum}\,}}(X_{t_1,T},\dots ,X_{t_k,T})\Vert _{2} \le \eta _k(t_2-t_1,\cdots ,t_k-t_1)\),

(iv):

\(\int _{[0,1]^{2}} |{{\,\textrm{cum}\,}}\big (Y_{t_1,h_1,T}(\tau ),Y_{t_2,h_2,T}(\tau ) \big )|{\,\textrm{d}}\tau \le \eta _2(t_2-t_1)\).

Assumption (A1) restricts the non-stationary behaviour of the observations to smooth changes, while the moment condition ensures existence of the cumulants. The cumulant condition originates from classical multivariate time series analysis (see, e. g., Brillinger 1981). Similar assumptions were made by Lee and Rao (2017) and Aue and van Delft (2020) in the context of non-stationary functional data. Lemma 2 in Bücher et al. (2020) shows that (A3) follows from (A1), (A2) and an additional moment condition, provided that a certain strong mixing condition is met. Finally, note that Condition 2.1 does not exclude (nonlinear) dependence. Consequently, the testing procedures developed below allow for (nonlinear) dependence under the null.

3 Testing for serial correlation in locally stationary functional data

3.1 A test statistic for detecting serial correlation

In this section, we propose a test statistic for detecting deviations from hypothesis (2.5) and prove a corresponding weak convergence result. A bootstrap device for deriving suitable critical values will be discussed in Sect. 3.3 below.

The test statistic is based on the following observation: as \(X_t^{\scriptscriptstyle (u)}\) is centred, we may rewrite the hypotheses (2.4) and (2.5), observing (2.1), as

$$\begin{aligned} H_0^{(h)}: \Vert M_h\Vert _{2}=0 \quad \text { and } \quad {\bar{H}}_0^{(H)} : \max _{h=1}^{H}\Vert M_h\Vert _{2}=0, \end{aligned}$$
(3.1)

where

$$\begin{aligned} M_h(u,\tau _1,\tau _2)=\int _0^u \mathbb {E}[X_0^{(w)}(\tau _1)X_h^{(w)}(\tau _2)]{\,\textrm{d}}w \end{aligned}$$
(3.2)

denotes the integrated auto-covariance at \((\tau _1,\tau _2)\) up to time u. By a measure theoretic argument this function vanishes on \([0,1]^3\) if and only if \(\mathbb {E}[X_0^{(w)}(\tau _1)X_h^{(w)}(\tau _2)] \equiv 0\) on \([0,1]^3\) (a.e.), and \( \Vert M_h\Vert _{2}\) can be interpreted as a measure of the deviation from this equality.

An empirical version of \(M_h\), based on the observations \(X_{1,T}, \dots , X_{T,T}\), is provided by the statistic

$$\begin{aligned} {\hat{M}}_{h,T}(u,\tau _1,\tau _2)=\frac{1}{T}\sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)}X_{t,T}(\tau _1)X_{t+h,T}(\tau _2). \end{aligned}$$
(3.3)

Note that we form cumulative sums in (3.3) instead of calculating local averages of the form

$$\begin{aligned} \frac{1}{2m+1}\sum _{s=-m}^{m}X_{t+s,T}(\tau _1)X_{t+s+h,T}(\tau _2) \end{aligned}$$

to estimate \(\mathbb {E}[X_0^{(t/T)}(\tau _1)X_h^{(t/T)}(\tau _2)]\) directly. This approach avoids the choice of tuning parameters, and it results in a procedure which can detect alternatives converging to the null hypothesis at a rate \(1/\sqrt{T}\) (see Remark 3.7 for details). We also emphasize that this approach differs from the method introduced in Lobato (2001), where cumulative sums were used for self-normalization to avoid long-run variance estimation in tests for white noise in (multivariate) time series. In contrast to that work, we use sequential estimates to avoid local estimation, and we employ a suitable bootstrap device instead of estimating the long-run variance (see Sect. 3.3 for more details).

The next theorem shows the consistency of the empirical versions. It implies that it is sensible to reject the null hypotheses in (2.4) and (2.5) for large values of the statistics

$$\begin{aligned} \mathcal {S}_{h,T} = \sqrt{T} \Vert {{\hat{M}}}_{h,T}\Vert _{2} \quad \text { and } \quad {\bar{\mathcal {S}}}_{H,T} = \sqrt{T} \max _{h=1}^H \Vert {{\hat{M}}}_{h,T}\Vert _{2}, \end{aligned}$$
(3.4)

respectively. For the statement of our first main result we define \(\langle \cdot , \cdot \rangle \) as the standard inner product on \(L^2([0,1]^3)\).
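On an equidistant grid in \(\tau \), with \(u\) evaluated at the points t/T, the estimator (3.3) and the statistic \(\mathcal {S}_{h,T}\) can be computed via cumulative sums. The following sketch is a minimal illustration under these discretization assumptions (all function names are our own); the \(L^2([0,1]^3)\)-norm is approximated by a grid average.

```python
import numpy as np

def M_hat(X, h):
    """Sequential estimator (3.3) on the grid: entry [k] approximates
    M_hat_{h,T}(u, tau1, tau2) for u = (k+1)/T."""
    T = X.shape[0]
    prods = np.einsum('ti,tj->tij', X[:T - h], X[h:])   # X_t(tau1) X_{t+h}(tau2)
    csum = np.cumsum(prods, axis=0) / T
    # for floor(uT) > T - h the sum no longer grows
    return np.concatenate([csum, np.repeat(csum[-1:], h, axis=0)], axis=0)

def S_stat(X, h):
    """S_{h,T} = sqrt(T) ||M_hat_{h,T}||_2, with the L2-norm on [0,1]^3
    approximated by the mean over the (u, tau1, tau2) grid."""
    T = X.shape[0]
    return np.sqrt(T) * np.sqrt((M_hat(X, h) ** 2).mean())

rng = np.random.default_rng(1)
X_wn = rng.normal(size=(300, 25))      # functional white noise on a 25-point grid
S_wn = S_stat(X_wn, 1)                 # small: no serial correlation at lag 1
```

The maximum statistic \(\bar{\mathcal {S}}_{H,T}\) is then simply the maximum of `S_stat(X, h)` over h = 1, …, H.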

Theorem 3.1

Under Condition 2.1, we have, for any \(h\in \mathbb {N}\), as \(T \rightarrow \infty \),

$$\begin{aligned} {1 \over \sqrt{T}} \mathcal {S}_{h,T} \rightarrow \Vert M_h\Vert _{2} \end{aligned}$$

in probability. Moreover, for any \(H\in \mathbb {N}\), as \(T \rightarrow \infty \),

$$\begin{aligned} \big (\sqrt{T}({\hat{M}}_{h,T}-M_h)\big )_{h=1,\dots ,H} \rightsquigarrow {\tilde{B}} \quad \text {in } L^2([0,1]^3)^H, \end{aligned}$$

where \({\tilde{B}}=({{\tilde{B}}}_1,\dots ,{{\tilde{B}}}_H)\) denotes a centred Gaussian variable in \(L^2([0,1]^3)^H\), whose covariance operator \(C_\mathbb {B}:L^2([0,1]^3)^H \rightarrow L^2([0,1]^3)^H\) is defined by

$$\begin{aligned} C_{\mathbb {B}} \left( \begin{array}{c} f_1 \\ \vdots \\ f_{H} \end{array} \right) \left( \begin{array}{c} (u_1, \tau _{11}, \tau _{12}) \\ \vdots \\ (u_H, \tau _{H1}, \tau _{H2}) \end{array} \right) = \left( \begin{array}{c} \sum _{h=1}^H \langle r_{1,h} ((u_1,\tau _{11}, \tau _{12}) , \cdot ), f_h \rangle \\ \vdots \\ \sum _{h=1}^H \langle r_{H,h} ((u_H,\tau _{H1}, \tau _{H2}) , \cdot ), f_h \rangle \end{array} \right) . \end{aligned}$$
(3.5)

Here, the kernel function \(r_{h,h'}\) is given by

$$\begin{aligned} r_{h,h'}((u,\tau _1, \tau _2), (v,\varphi _1, \varphi _2))&= \text {Cov}\big ({\tilde{B}}_h(u,\tau _1,\tau _2),{\tilde{B}}_{h'}(v,\varphi _1,\varphi _2)\big ) \nonumber \\&= \sum _{k=-\infty }^{\infty }\int _{0}^{u\wedge v}c_{k,h,h'}^{(w)}(\tau _1, \tau _2, \varphi _1, \varphi _2) {\,\textrm{d}}w, \end{aligned}$$
(3.6)

with

$$\begin{aligned} c_{k,h,h'}^{(w)}&= c_{k,h,h'}^{(w)}(\tau _1, \tau _2, \varphi _1, \varphi _2) = \text {Cov}\big (X_0^{(w)}(\tau _1)X_h^{(w)}(\tau _2),X_k^{(w)}(\varphi _1)X_{k+h'}^{(w)}(\varphi _2)\big ), \end{aligned}$$

for any \(1\le h,h'\le H\). In particular, the infinite sum in (3.6) converges.

Note that the infinite sum in the kernel function in Eq. (3.6) reduces to the single summand \(k=0\) in the serially independent case, that is, if the approximating family \((X_t^{\scriptscriptstyle (u)})_t\) is serially independent for any u. Moreover, in the stationary case, that is, if the law of the approximating family \((X_t^{\scriptscriptstyle (u)})_t\) does not depend on u, the integral in Eq. (3.6) reduces to \((u \wedge v)\,c_{k,h,h'}(\tau _1, \tau _2, \varphi _1, \varphi _2) \).

It is worthwhile to mention that the distributions of the limiting variables in the previous theorem are not pivotal under the null hypotheses. As a consequence, critical values for respective statistical tests must be estimated, for instance by a plug-in approach or by a suitable bootstrap device. Throughout this paper, we propose a bootstrap approach which will be worked out in Sect. 3.3 below.

3.2 Detecting relevant serial correlations

In the previous section, we considered “classical” hypotheses in the sense that we were testing whether the covariance operators up to lag H are exactly equal to zero. However, in concrete applications, hypotheses of this type might rarely be satisfied exactly, and it may be more reasonable to reformulate the null hypothesis in the form that “the norm of the autocovariance operator is small”, but not exactly equal to 0. More precisely, given thresholds \(\Delta _{h} > 0\), which may vary with the lag \(h \in \{1, \ldots , H\} \), we propose to consider the following relevant hypotheses

$$\begin{aligned} \nonumber H_0^{(h,\Delta )}&: \Vert M_h\Vert _{2} \le \Delta _h,\\ {\bar{H}}_0^{(H,\Delta )}&: \Vert M_h\Vert _{2} \le \Delta _h \quad \text {for all}~h\in \{1,\dots ,H\}, \end{aligned}$$
(3.7)

where \(H\in \mathbb {N}\) is some fixed constant representing the maximal lag under consideration. Note that the hypotheses in (3.1) are obtained for \(\Delta _h=0\); in this section, however, we consider the case of strict inequality, that is, \( \Delta _h >0\) (\(h=1, \ldots , H\)). These so-called relevant hypotheses may be motivated by the fact that, in many applications, it is clear from the scientific background that the exact equality \(\Vert M_h\Vert _{2} =0\) is unreasonable. However, one might be interested in working under the white noise assumption if the deviations are sufficiently small. We also emphasize that testing relevant hypotheses avoids the consistency problem mentioned by Berkson (1938): any consistent test will detect any arbitrarily small deviation from the null hypothesis if the sample size is sufficiently large.

In practice, the choice of the threshold \(\Delta _h\) is a difficult problem; it must depend on the specific statistical problem and has to be discussed with the practitioner in each concrete application. The selection can be simplified by calculating Monte Carlo approximations of \(\Vert M_h\Vert _{2}\) in different models that are regarded as potential candidates for the data generating process. We also refer to Dette and Wied (2016), where similar challenges are discussed for change point tests in the context of portfolio analysis. Moreover, the results of the following discussion can also be used to construct confidence intervals for the quantities \(\Vert M_h \Vert _{2}\) (see Remark 3.6(b) below).
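To sketch such a Monte Carlo calibration (the model and all names below are hypothetical choices of ours, not prescriptions from the paper): for a stationary candidate model, \(M_h(u,\tau _1,\tau _2)=u\,\mathbb {E}[X_0(\tau _1)X_h(\tau _2)]\), so \(\Vert M_h\Vert _{2} = \Vert \mathbb {E}[X_0\otimes X_h]\Vert _{2}/\sqrt{3}\) (as \(\int _0^1 u^2{\,\textrm{d}}u = 1/3\)), and the expectation can be approximated by averaging over simulated replications.

```python
import numpy as np

def mc_norm_M(simulate, h, reps=200, seed=0):
    """Monte Carlo approximation of ||M_h||_2 for a stationary candidate model."""
    rng = np.random.default_rng(seed)
    acc = 0.0
    for _ in range(reps):
        X = simulate(rng)                              # array of shape (T, n_tau)
        acc = acc + np.einsum('ti,tj->ij', X[:-h], X[h:]) / (X.shape[0] - h)
    c_h = acc / reps                                   # estimate of E[X_0 ⊗ X_h]
    return np.sqrt((c_h ** 2).mean() / 3.0)            # grid L2-norm, times 1/sqrt(3)

def far1(rng, T=200, n=20, a=0.5):
    # pointwise functional AR(1): X_t = a * X_{t-1} + eps_t
    e = rng.normal(size=(T, n))
    X = np.empty_like(e)
    X[0] = e[0]
    for t in range(1, T):
        X[t] = a * X[t - 1] + e[t]
    return X

delta_scale = mc_norm_M(far1, h=1)   # a magnitude guide for choosing Delta_1
```

Comparing such approximations across a few plausible data-generating processes gives a feeling for the order of magnitude at which \(\Delta _h\) should be placed.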

Consistency of \({{\hat{M}}}_{h,T}\) for \(M_h\) suggests rejecting the above hypotheses for large values of \(\Vert {{\hat{M}}}_{h,T}\Vert _{2}\). We propose to consider the “normalized” test statistics

$$\begin{aligned} {\mathcal {S}}_{h,\Delta _h,T}&= \sqrt{T}(\Vert {\hat{M}}_{h,T}\Vert _{2}-\Delta _h)\Vert {\hat{M}}_{h,T}\Vert _{2} ~, \\ \bar{{\mathcal {S}}}_{H, \Delta , T}&= \max _{h=1}^H \sqrt{T}(\Vert {\hat{M}}_{h,T}\Vert _{2}-\Delta _h)\Vert {\hat{M}}_{h,T}\Vert _{2} ~, \end{aligned}$$

whose asymptotic properties are described in the following result. It is worthwhile to mention that related test statistics such as \(\sqrt{T}(\Vert {{\hat{M}}}_{h,T}\Vert _{2}^2-\Delta _h^2)\) or \(\sqrt{T}(\Vert {\hat{M}}_{h,T}\Vert _{2}-\Delta _h)\Delta _h\) may be treated similarly; however, the corresponding tests exhibited worse finite-sample performance in an unreported Monte Carlo simulation study.

Corollary 3.2

Under Condition 2.1, we have, for any fixed \(H\in \mathbb {N}\) and for \(T\rightarrow \infty \),

$$\begin{aligned} \Big (\sqrt{T}\big (\Vert {\hat{M}}_{h,T}\Vert _{2}-\Vert M_h\Vert _{2}\big )\Vert {\hat{M}}_{h,T}\Vert _{2}\Big )_{h=1,\dots ,H} \rightsquigarrow \big (\langle M_h,{{\tilde{B}}}_h\rangle \big )_{h=1,\dots ,H}, \end{aligned}$$

where \({{\tilde{B}}}_1, \dots , {{\tilde{B}}}_H\) are defined in Theorem 3.1 and \(\langle f,g \rangle = \int _{[0,1]^3} f(x) g(x) {\,\textrm{d}}x\). As a consequence,

$$\begin{aligned} {\mathcal {S}}_{h,\Delta _h,T} \rightsquigarrow \langle M_h,{{\tilde{B}}}_h\rangle \quad \text { if } \Vert M_h\Vert _{2}=\Delta _h, \end{aligned}$$

while \({\mathcal {S}}_{h,\Delta _h,T}\rightarrow -\infty \) in probability if \(\Vert M_h\Vert _{2}<\Delta _h\) and \({\mathcal {S}}_{h,\Delta _h,T}\rightarrow \infty \) in probability if \(\Vert M_h\Vert _{2}>\Delta _h\). Moreover, if \(\Vert M_h\Vert _{2}\le \Delta _h\) for all \(h\in \{1,\dots ,H\}\),

$$\begin{aligned} \bar{{\mathcal {S}}}_{H,\Delta ,T} \rightsquigarrow \max \Big \{ \max _{h\in N_H}\langle M_h,{{\tilde{B}}}_h\rangle ,\, \max _{h\in O_H}\big (-\Delta _h\Vert {{\tilde{B}}}_h\Vert _{2}\big )\Big \}, \end{aligned}$$

where \(N_H=\{h \in \{1, \dots , H\}: \Vert M_h\Vert _{2} = \Delta _h\}\), \(O_H=\{h\in \{1, \dots , H\}: \Vert M_h\Vert _{2}= 0\}\) and where the maximum over the empty set is interpreted as \(-\infty \).

As in Sect. 3.1, the limiting distributions under the null hypotheses are not pivotal, whence a bootstrap procedure will be introduced next.

3.3 Critical values based on bootstrap approximations

The limiting distributions of the test statistics derived in the previous sections depend in a complicated way on the higher order serial dependence of the underlying approximating family \(\{ (X_t^{\scriptscriptstyle (u)})_{t\in \mathbb {Z}}: u\in [0,1]\}\) and are rather difficult to estimate. To avoid the estimation, we propose a multiplier block bootstrap procedure which is akin to the dependent wild bootstrap in Shao (2010).

Following Bücher et al. (2020), the bootstrap scheme will be defined in terms of i.i.d. standard normally distributed random variables \(\{R_i^{(\scriptscriptstyle k)}\}_{i,k\in \mathbb {N}}\) which are independent of \(\{(X_{t,T})_{t\in \mathbb {Z}}:T\in \mathbb {N}\}\). Further, let \(m=m_T\) and \(n=n_T\) denote two block length sequences satisfying one of the following two conditions.

Condition 3.3

  1. (B1)

    The block length \(m=m_T \in \{1, \dots , T\}\) tends to infinity and satisfies \(m=o(T)\) as \(T\rightarrow \infty \).

  2. (B2)

    The block length \(n=n_T\in \{1, \dots , T\}\) satisfies \(m/n=o(1)\) and \(m n^2=o(T^2)\) as \(T\rightarrow \infty \).

Next, let \(K\in \mathbb {N}\) denote the number of bootstrap replications. For \(k\in \{1, \dots ,K\}\) and \(h \in \{1, \dots , H\}\), define multiplier bootstrap approximations for

$$\begin{aligned} B_{h,T}(u, \tau _1, \tau _2) = \sqrt{T} \{ {\hat{M}}_{h,T}(u, \tau _1, \tau _2)- M_{h}(u, \tau _1, \tau _2)\} \end{aligned}$$

as

$$\begin{aligned} {\hat{B}}_{h,n,T}^{(k)}(u,\tau _1,\tau _2) = \frac{1}{\sqrt{T}} \sum _{i=1}^{\lfloor uT\rfloor \wedge (T-h)}\frac{R_i^{(k)}}{\sqrt{m}} \sum _{t=i}^{(i+m-1)\wedge (T-h)} \big \{ X_{t,T}(\tau _1)&X_{t+h,T}(\tau _2) \\&- {\hat{\mu }}_{t,h,n,T}(\tau _1,\tau _2) \big \}, \end{aligned}$$

where

$$\begin{aligned} {\hat{\mu }}_{t,h,n,T}(\tau _1,\tau _2) = \frac{1}{{\tilde{n}}_{t,h}}\sum _{j=\underline{n}_t}^{\bar{n}_{t,h}}X_{t+j,T}(\tau _1)X_{t+j+h,T}(\tau _2) \end{aligned}$$

denotes the local empirical product moment of lag h with

$$\begin{aligned} \bar{n}_{t,h}=n\wedge (T-t-h), \quad \underline{n}_t=-n\vee (1-t), \quad {\tilde{n}}_{t,h} = \bar{n}_{t,h}-\underline{n}_t+1. \end{aligned}$$

Note that, in contrast to classical multiplier bootstrap procedures, each multiplier acts on an entire block of \(m\) successive summands, which allows the procedure to capture the serial dependence within the data and eventually yields consistency (see also Shao 2010). Further, note that for \(n=T\) we obtain \({\hat{\mu }}_{t,h,T,T} = {\hat{\mu }}_{h,T}\) for all \(t\in \{1, \dots , T\}\), where

$$\begin{aligned} {\hat{\mu }}_{h,T}(\tau _1,\tau _2) = \frac{1}{T-h}\sum _{t=1}^{T-h}X_{t,T}(\tau _1)X_{t+h,T}(\tau _2) \end{aligned}$$

denotes the global empirical product moment. Let \(\hat{\mathbb {B}}_{n,T}^{\scriptscriptstyle (k)} = ( {\hat{B}}_{1,n,T}^{\scriptscriptstyle (k)},\dots , {\hat{B}}_{H,n,T}^{\scriptscriptstyle (k)})\) and \(\hat{\mathbb {B}}_T = ( B_{1,T}, \dots , B_{H,T})\); note that the scaling \(\sqrt{T}\) is already contained in the definition of \(B_{h,T}\). The following result shows that this multiplier bootstrap is consistent (the provided unconditional formulation of bootstrap consistency is equivalent to alternative formulations in terms of conditional ‘distributions’, see Bücher and Kojadinovic 2019).
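The displayed formulas translate directly into code. The following sketch (our own illustration; the discretization and all function names are assumptions) computes the local centring \({\hat{\mu }}_{t,h,n,T}\), the norms of the bootstrap replicates \({\hat{B}}_{h,n,T}^{(k)}\) and, for \(H=1\), the bootstrap p-value of the decision rule (3.10).

```python
import numpy as np

def mu_local(X, h, n):
    """Local empirical product moments mu_hat_{t,h,n,T}, t = 1, ..., T-h."""
    T, p = X.shape
    out = np.empty((T - h, p, p))
    for t in range(T - h):                       # 0-based t corresponds to t+1
        js = np.arange(max(-n, -t), min(n, T - h - 1 - t) + 1)
        out[t] = np.einsum('ti,tj->ij', X[t + js], X[t + js + h]) / js.size
    return out

def boot_norms(X, h, m, n, K, rng):
    """K replicates of ||B_hat^{(k)}_{h,n,T}||_2 (grid approximation)."""
    T = X.shape[0]
    centred = np.einsum('ti,tj->tij', X[:T - h], X[h:]) - mu_local(X, h, n)
    cs = np.concatenate([np.zeros((1,) + centred.shape[1:]),
                         np.cumsum(centred, axis=0)])
    # block sums over t = i, ..., (i+m-1) ∧ (T-h) of the centred products
    blocks = np.stack([cs[min(i + m, T - h)] - cs[i] for i in range(T - h)])
    norms = np.empty(K)
    for k in range(K):
        R = rng.normal(size=T - h)               # i.i.d. N(0,1) multipliers
        csum = np.cumsum(blocks * (R / np.sqrt(m))[:, None, None], axis=0)
        B = np.concatenate([csum, np.repeat(csum[-1:], h, axis=0)]) / np.sqrt(T)
        norms[k] = np.sqrt((B ** 2).mean())
    return norms

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 15))                   # toy data satisfying the null
T = X.shape[0]
M = np.cumsum(np.einsum('ti,tj->tij', X[:-1], X[1:]), axis=0) / T
S_obs = np.sqrt(T) * np.sqrt((np.concatenate([M, M[-1:]]) ** 2).mean())
S_k = boot_norms(X, h=1, m=10, n=40, K=50, rng=rng)
p_value = float((S_k >= S_obs).mean())           # reject when p_value < alpha
```

For \(H>1\), one would take the maximum of the lag-wise quantities over h in both the observed statistic and each bootstrap replicate, using the same multipliers across lags.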

Theorem 3.4

Suppose that Condition 2.1 is met and let \({\tilde{B}}^{(1)},{\tilde{B}}^{(2)}, \dots \) denote independent copies of \({\tilde{B}}\). Fix \(K,H\in \mathbb {N}\).

  1. (i)

    If Condition 3.3 (B1) and (B2) are met, then, as \(T\rightarrow \infty \),

    $$\begin{aligned} \big (\hat{\mathbb {B}}_T, \hat{\mathbb {B}}_{n,T}^{(1)},\dots ,\hat{\mathbb {B}}_{n,T}^{(K)}\big ) \rightsquigarrow \big ({\tilde{B}},{\tilde{B}}^{(1)},\dots ,{\tilde{B}}^{(K)}\big ) \quad \text {in } \big (L^2([0,1]^3)^H\big )^{K+1}. \end{aligned}$$

  2. (ii)

    If Condition 3.3 (B1) is met and if \(\text {Cov}(X_0^{\scriptscriptstyle (0)}, X_h^{\scriptscriptstyle (0)})=\text {Cov}(X_0^{\scriptscriptstyle (w)}, X_h^{\scriptscriptstyle (w)})\) for any \(w\in [0,1]\) and \(h\in \mathbb {Z}\), then the same convergence holds with \(n=T\), that is, as \(T\rightarrow \infty \),

    $$\begin{aligned} \big (\hat{\mathbb {B}}_T, \hat{\mathbb {B}}_{T,T}^{(1)},\dots ,\hat{\mathbb {B}}_{T,T}^{(K)}\big ) \rightsquigarrow \big ({\tilde{B}},{\tilde{B}}^{(1)},\dots ,{\tilde{B}}^{(K)}\big ) \quad \text {in } \big (L^2([0,1]^3)^H\big )^{K+1}. \end{aligned}$$

It is worthwhile to mention that the assumption on \(\text {Cov}(X_0^{\scriptscriptstyle (w)}, X_h^{\scriptscriptstyle (w)})\) in Theorem 3.4(ii) is met provided that \(X_{t,T}=X_t\) for some stationary time series \((X_t)_{t\in \mathbb {Z}}\). In such a situation (in practice to be validated, for instance, by a stationarity test), using the bootstrap scheme with \(n=T\) instead of one with \(n\) satisfying Condition 3.3 (B2) typically yields better finite-sample performance; see Sect. 4 for more details.

Subsequently, we reconsider the problem of testing for serial uncorrelatedness of a locally stationary time series using classical and relevant hypotheses. For the sake of brevity, we only treat the hypotheses \({\bar{H}}_0^{\scriptscriptstyle (H)}\) and \({\bar{H}}_{0}^{\scriptscriptstyle (H,\Delta )}\), which are defined in (2.5) and (3.7), respectively, and involve multiple lags. For this purpose, we consider the following bootstrap approximations of the respective test statistics

$$\begin{aligned} \bar{\mathcal {S}}_{H,n,T}^{(k)}&= \max _{h=1}^H \Vert {\hat{B}}_{h,n,T}^{(k)} \Vert _{2} \end{aligned}$$
(3.8)

for the classical hypotheses and

$$\begin{aligned} \bar{\mathcal {S}}_{H,n,T, \text {rel}}^{(k)}&= \max _{h=1}^H \langle {{\hat{M}}}_{h,T}, {\hat{B}}_{h,n,T}^{(k)} \rangle \end{aligned}$$
(3.9)

for the relevant hypotheses. Finally, we propose to reject the classical hypothesis (2.5) whenever

$$\begin{aligned} {\bar{p}}_{H,n,K,T} = \frac{1}{K}\sum _{k=1}^{K}\mathbbm {1}\Big (\bar{\mathcal {S}}_{H,n,T}^{(k)} \ge \bar{{\mathcal {S}}}_{H,T} \Big ) < \alpha ~. \end{aligned}$$
(3.10)

Similarly, the relevant hypothesis (3.7) is rejected whenever

$$\begin{aligned} {\bar{p}}_{H,n,K,T, \text {rel}} = \frac{1}{K}\sum _{k=1}^{K}\mathbbm {1}\Big (\bar{\mathcal {S}}_{H,n,T, \text {rel}}^{(k)} \ge \bar{{\mathcal {S}}}_{H,\Delta , T} \Big ) < \alpha . \end{aligned}$$
(3.11)

Corollary 3.5

Fix \(\alpha \in (0,1)\), suppose that Condition 2.1 is met and let \(K=K_T\rightarrow \infty \).

  1. (i)

    If Condition 3.3 (B1) and (B2) hold, then the decision rule (3.10) defines a consistent asymptotic level \(\alpha \) test for the classical hypotheses (2.5), that is

    $$\begin{aligned} \lim _{T \rightarrow \infty } \mathbb {P}({\bar{p}}_{H,n,K,T} < \alpha ) = {\left\{ \begin{array}{ll} \alpha &{} \text { under } {\bar{H}}_0^{(H)}, \\ 1 &{} \text { else.} \end{array}\right. } \end{aligned}$$

    Similarly, for \(\alpha <1/2\), the decision rule (3.11) for the relevant hypotheses (3.7) satisfies

    $$\begin{aligned} \nonumber \lim _{T\rightarrow \infty } \mathbb {P}({\bar{p}}_{H,n,K,T, \text {rel}}< \alpha ) = 0&\quad \text { if } \Vert M_h\Vert _{2}< \Delta _h \text { for all } h\in \{1, \dots , H\}, \\ \limsup _{T \rightarrow \infty }\mathbb {P}({\bar{p}}_{H,n,K,T, \text {rel}}< \alpha ) \le \alpha&\quad \text { if } {\bar{H}}_0^{(H,\Delta )} \cap R \text { is met,} \\ \nonumber \lim _{T \rightarrow \infty }\mathbb {P}({\bar{p}}_{H,n,K,T, \text {rel}} < \alpha )=1&\quad \text { else}, \end{aligned}$$
    (3.12)

    where R denotes the set of all models from the null hypothesis \( {\bar{H}}_0^{(H,\Delta )} \) for which \(\Vert M_h\Vert _{2} = \Delta _h\) for some \(h\in \{1, \dots , H\}\) and for which \(\text {Var}(\langle M_h, \tilde{B}_h\rangle ) >0\) for each such h. In (3.12), the value \(\alpha \) is attained if \(\Vert M_h\Vert _{2} = \Delta _h\) for all \(h\in \{1, \dots , H\}\).

  2. (ii)

    If Condition 3.3 (B1) is met and if \(\text {Cov}(X_0^{\scriptscriptstyle (0)}, X_h^{\scriptscriptstyle (0)})=\text {Cov}(X_0^{\scriptscriptstyle (w)}, X_h^{\scriptscriptstyle (w)})\) for any \(w\in [0,1]\) and \(h\in \mathbb {N}_0\), then the same assertions as in (i) are met for \(n=T\).

Remark 3.6

(a) The restriction to \(\alpha <1/2\) for the test defined by (3.11) is needed to make sure that the contribution from \(\max _{h\in O_H} - \Delta _h \Vert {{\tilde{B}}}_h\Vert _{2}\) in Corollary 3.2 is negligible (see Sect. 6 for details).

(b) If the specification of the thresholds \(\Delta _h\) in the hypotheses (3.7) is difficult, one can use the results presented so far to construct confidence intervals for the quantities \(\Vert M_h \Vert _{2}\). To be precise, consider the case \(h=1\) and let \(q_{\alpha ,n}^* \) denote the \(\alpha \)-quantile, \(\alpha <1/2\), of the bootstrap distribution defined in (3.9) (which can be estimated by the empirical \(\alpha \)-quantile of \(\bar{\mathcal {S}}_{1,n,T, \text {rel}}^{\scriptscriptstyle (1)} , \ldots , \bar{\mathcal {S}}_{1,n,T, \text {rel}}^{\scriptscriptstyle (K)} \)). Then, an asymptotic \((1-\alpha )\)-confidence interval for the quantity \(\Vert M_1 \Vert _{2}\) is given by

$$\begin{aligned} {\hat{I}}_T:=\Big [0,\Vert {\hat{M}}_{1,T}\Vert _{2}-\frac{q_{\alpha ,n}^* }{\sqrt{T}\Vert {\hat{M}}_{1,T}\Vert _{2}}\Big ]. \end{aligned}$$

Indeed, it follows from Corollary 3.2 that

$$\begin{aligned} {\mathbb {P}}(\Vert M_1\Vert _{2}\in {\hat{I}}_T)= {\mathbb {P}} \big ( \sqrt{T}\big ((\Vert {\hat{M}}_{1,T}\Vert _{2}-\Vert M_1\Vert _{2})\Vert {\hat{M}}_{1,T}\Vert _{2}\big ) \ge q_{\alpha ,n}^* \big ) \rightarrow 1-\alpha \end{aligned}$$

if \(\Vert M_1\Vert _{2}>0\), as \(T\rightarrow \infty \). On the other hand, if \(\Vert M_1\Vert _{2}=0\) the same calculation shows that

$$\begin{aligned} {\mathbb {P}}(\Vert M_1\Vert _{2}\in {\hat{I}}_T)={\mathbb {P}} (\sqrt{T} \Vert {\hat{M}}_{1,T}\Vert ^2_{2} \ge q_{\alpha ,n}^* ) = 1 \end{aligned}$$

as \(q^*_{\alpha ,n} \le 0\) by symmetry whenever \(\alpha < 1/2\).
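
For concreteness, the interval \({\hat{I}}_T\) can be computed directly from the K bootstrap statistics. The following minimal sketch (all variable names hypothetical, inputs purely illustrative) takes the estimated norm \(\Vert {\hat{M}}_{1,T}\Vert _{2}\), the sample size T and the vector of bootstrap statistics:

```python
import numpy as np

def confidence_interval(M_hat_norm, boot_stats, T, alpha=0.05):
    """Upper confidence bound for ||M_1||_2 (cf. the interval hat(I)_T):
    uses the empirical alpha-quantile of the bootstrap statistics, alpha < 1/2."""
    q_star = np.quantile(boot_stats, alpha)            # empirical alpha-quantile
    upper = M_hat_norm - q_star / (np.sqrt(T) * M_hat_norm)
    return 0.0, upper                                  # interval [0, upper]

# purely illustrative inputs (not taken from the paper)
rng = np.random.default_rng(0)
boot = rng.normal(size=1000)    # stand-in for the K bootstrap statistics
lo, hi = confidence_interval(M_hat_norm=0.4, boot_stats=boot, T=200)
```

Since \(q^*_{\alpha ,n}\le 0\) for \(\alpha <1/2\), the upper bound never falls below the point estimate \(\Vert {\hat{M}}_{1,T}\Vert _{2}\).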

Remark 3.7

  It was pointed out by a referee that it is of interest to investigate the properties of the proposed tests under local alternatives. Roughly speaking, the tests can detect local alternatives converging to the null hypothesis at a rate \(1/\sqrt{T}\), that is, \(\max _{h=1, \ldots , H} \Vert M_h\Vert _{2} = c/\sqrt{T} \) for some constant \(c > 0\), and we briefly illustrate this fact for hypotheses of the form (2.5). Similar results can be obtained for the relevant hypotheses defined in (3.7).

In view of the discussion at the beginning of this section, the hypothesis can be rewritten as \({\bar{H}}_0^{\scriptscriptstyle (H)}: \max _{h=1, \ldots , H} \Vert M_h\Vert _{2}=0 \). Let \(X_{t,T} = Y_{t,T} + \frac{1}{T^{1/4}}Z_{t,T}\), where the processes \(\{ Y_{t,T} \}_{t=1,\ldots , T} \) and \(\{Z_{t,T}\}_{t=1,\ldots , T}\) are independent and satisfy the assumptions of Theorem 3.1 with approximating processes \(\{ Y_{t}^{\scriptscriptstyle (u)} \}_{t\in {\mathbb {Z}}} \) and \(\{Z_{t}^{\scriptscriptstyle (u)}\}_{t\in {\mathbb {Z}}}\), respectively. We define \(M_h^X\), \(M_h^Y\) and \(M_h^Z\) by (3.2) for the processes \(X_{t}^{\scriptscriptstyle (u)} =Y_{t}^{\scriptscriptstyle (u)}+ Z_{t}^{\scriptscriptstyle (u)}\), \(Y_{t}^{\scriptscriptstyle (u)} \) and \(Z_{t}^{\scriptscriptstyle (u)} \), respectively, and assume that \(\Vert M_h^Y \Vert _{2} = 0 \) for all \({h\in \{1, \ldots , H\}}\) and that \(\Vert M_h^Z \Vert _{2} \ne 0 \) for at least one \({h \in \{1, \ldots , H \} }\). Note that

$$\begin{aligned} \Vert M_h^X \Vert _{2} = \frac{1}{\sqrt{T} }\Vert M_h^Z \Vert _{2} \qquad (h=1, \ldots , H) . \end{aligned}$$

It now follows from the arguments in the proof of Theorem 3.1 that the statistic \({\bar{\mathcal {S}}}_{H,T}\) defined in (3.4) converges weakly, that is,

$$\begin{aligned} {\bar{\mathcal {S}}}_{H,T} \rightsquigarrow \max _{h=1, \ldots , H} \Vert {{\tilde{B}}}_h + M_h^Z \Vert _{2}, \end{aligned}$$

where \(\Vert M_h^Z \Vert _{2} \ne 0 \) for at least one \(h\in \{1, \ldots , H\} \). On the other hand, by Theorem 3.4, the bootstrap approximations \(\bar{\mathcal {S}}_{H,n,T}^{\scriptscriptstyle (k)}\) defined in (3.8) converge weakly to \(\max _{h=1, \ldots , H} \Vert {{\tilde{B}}}_h^{\scriptscriptstyle (k)} \Vert _{2}\), where \( {{\tilde{B}}}^{\scriptscriptstyle (1)} , \ldots , {{\tilde{B}}}^{\scriptscriptstyle (K)}\) are independent copies of \( \tilde{B} = ({{\tilde{B}}}_1, \ldots , {{\tilde{B}}}_H)\). As a consequence, the test has non-trivial asymptotic power. The same arguments can be used to show that local alternatives converging to the null at a rate that is strictly slower than \(T^{-1/2}\) can be detected with asymptotic power 1.

4 Monte Carlo simulations

A large-scale Monte Carlo simulation study was performed to analyse the finite-sample properties of the proposed tests. The major goal of the study was to assess the level approximation and the power of the tests for hypotheses of the form \(\bar{H}_0^{\scriptscriptstyle (H)}\) and \(\bar{H}_0^{\scriptscriptstyle (H,\Delta )}\), with \(H \in \{1,\dots , 4\}\). Moreover, we also provide a comparison with existing tests for white noise/no serial correlation in the stationary setup, both for tests in the time domain (Kokoszka et al. 2017) and in the frequency domain (Zhang 2016; Bagchi et al. 2018; Characiejus and Rice 2020).

4.1 Models

We start by employing the same (stationary) models as in Zhang (2016) and Bagchi et al. (2018). In particular, for the null hypothesis of no serial correlation at any lag h, we consider: Model (\(\textrm{N}_1\)), an i.i.d. sequence of Brownian motions; Model (\(\textrm{N}_2\)), an i.i.d. sequence of Brownian bridges; and Model (\(\textrm{N}_3\)), data from a FARCH(1) process defined by

$$\begin{aligned} X_{t}(\tau )=B_t(\tau ) \sqrt{\tau + \int _0^1 c_\psi \exp \Big (\frac{\tau ^2+\sigma ^2}{2}\Big )X_{t-1}^2(\sigma ){\,\textrm{d}}\sigma }, \end{aligned}$$
(4.1)

where \((B_t)_{t\in \mathbb {Z}}\) denotes an i.i.d. sequence of Brownian motions and \(c_\psi = 0.3418\). Under the alternative, we consider the FAR(1) model given by

$$\begin{aligned} X_t = \rho (X_{t-1}-\mu )+{\varepsilon }_t, \end{aligned}$$

where \(\rho \) denotes an integral operator \(\rho (f) = \int _0^1 K(\cdot ,\sigma )f(\sigma ){\,\textrm{d}}\sigma ,~f\in L^2([0,1])\), for a given kernel \(K \in L^2([0,1]^2)\) and a sequence of centred, i.i.d. innovations \(({\varepsilon }_t)_{t\in \mathbb {Z}}\) in \(L^2([0,1])\). We consider the following choices for K and \({\varepsilon }_t\):

$$\begin{aligned} (\textrm{A}_1) \quad&~ K(\tau ,\sigma )=c_g \exp \big ((\tau ^2+\sigma ^2)/2\big ),~ {\varepsilon }_t~\text {i.i.d. Brownian motions},\\ (\textrm{A}_2) \quad&~ K(\tau ,\sigma )=c_g \exp \big ((\tau ^2+\sigma ^2)/2\big ),~ {\varepsilon }_t~\text {i.i.d. Brownian bridges}, \\ (\textrm{A}_3) \quad&~ K(\tau ,\sigma )=c_w \min (\tau ,\sigma ),~ {\varepsilon }_t~\text {i.i.d. Brownian motions}, \\ (\textrm{A}_4) \quad&~ K(\tau ,\sigma )=c_w \min (\tau ,\sigma ),~ {\varepsilon }_t~\text {i.i.d. Brownian bridges}, \end{aligned}$$

where \(c_g\) and \(c_w\) are chosen such that the Hilbert-Schmidt norm of \(\rho \) equals 0.3.
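
For illustration, model \((\textrm{A}_1)\) can be simulated on an equidistant grid, with the integral operator discretized by a Riemann sum and the constant \(c_g\) obtained numerically from the norm constraint. This is only a sketch under our own discretization choices (grid size, burn-in, helper names), not the implementation used in the study:

```python
import numpy as np

def brownian_motion(rng, m):
    """Standard Brownian motion evaluated on a grid of m points in [0, 1]."""
    return np.cumsum(rng.normal(scale=np.sqrt(1.0 / m), size=m))

def simulate_far1(T, m=100, burn=50, seed=0):
    """FAR(1) with Gaussian kernel K(tau, sigma) = c_g exp((tau^2 + sigma^2)/2)
    and i.i.d. Brownian-motion innovations, as in model (A_1)."""
    rng = np.random.default_rng(seed)
    grid = (np.arange(m) + 0.5) / m                     # midpoint grid on [0, 1]
    K = np.exp((grid[:, None] ** 2 + grid[None, :] ** 2) / 2.0)
    K *= 0.3 / np.sqrt(np.mean(K ** 2))                 # rescale: ||rho||_HS = 0.3
    X = np.zeros((T + burn, m))
    for t in range(1, T + burn):
        # rho(X_{t-1}) approximated by a Riemann sum over the grid
        X[t] = K @ X[t - 1] / m + brownian_motion(rng, m)
    return grid, X[burn:]

grid, X = simulate_far1(T=200)
```

Since \(\Vert \rho \Vert _{HS}=0.3<1\), the recursion is stable and a short burn-in suffices to reach approximate stationarity.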

Note that the above models are stationary. Since our proposed methodology allows for smooth changes in the distribution of the underlying stochastic processes as well, we additionally consider the following heteroscedastic locally stationary models:

$$\begin{aligned} (\textrm{N}_4) \quad&~ X_{t,T} = \sigma (t/T) B_t, \\ (\textrm{A}_5) \quad&~ X_{t,T} = \rho (X_{t-1,T})+\sigma (t/T) B_t, \\ (\textrm{A}_6) \quad&~ X_{t,T} = \sigma (t/T)\rho (X_{t-1,T})+ B_t, \end{aligned}$$

where \((B_t)_{t\in \mathbb {Z}}\) denotes an i.i.d. sequence of Brownian motions, \(\sigma (x)=x+1/2\) and \(\rho \) is defined as in model (\(\textrm{A}_1\)). For model (\(\textrm{N}_4\)), the null hypothesis holds true, whereas the alternative is true for models (\(\textrm{A}_5\)) and (\(\textrm{A}_6\)).
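
The locally stationary models can be generated along the same lines; below is a sketch for \((\textrm{N}_4)\) and \((\textrm{A}_5)\), again under our own (hypothetical) discretization choices:

```python
import numpy as np

def brownian_motion(rng, m):
    """Standard Brownian motion evaluated on a grid of m points in [0, 1]."""
    return np.cumsum(rng.normal(scale=np.sqrt(1.0 / m), size=m))

def simulate_local(T, m=100, seed=0):
    """Heteroscedastic models (N_4): X_{t,T} = sigma(t/T) B_t and
    (A_5): X_{t,T} = rho(X_{t-1,T}) + sigma(t/T) B_t, with sigma(x) = x + 1/2
    and rho the Gaussian-kernel operator from (A_1) with ||rho||_HS = 0.3."""
    rng = np.random.default_rng(seed)
    grid = (np.arange(m) + 0.5) / m
    K = np.exp((grid[:, None] ** 2 + grid[None, :] ** 2) / 2.0)
    K *= 0.3 / np.sqrt(np.mean(K ** 2))
    sigma = lambda x: x + 0.5
    X_n4 = np.zeros((T, m))
    X_a5 = np.zeros((T, m))
    for t in range(T):
        u = (t + 1) / T                                 # rescaled time t/T
        X_n4[t] = sigma(u) * brownian_motion(rng, m)
        prev = X_a5[t - 1] if t > 0 else np.zeros(m)
        X_a5[t] = K @ prev / m + sigma(u) * brownian_motion(rng, m)
    return X_n4, X_a5

X_n4, X_a5 = simulate_local(T=150)
```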

Finally, in order to study both the robustness of the method with respect to the existence of higher order moments (which is technically required in Condition 2.1) and the influence of the strength of the serial dependence on the power, we also study a FAR(1)-FARCH(1)-model with various selected parameter choices. More precisely, we consider the FAR(1)-model from \((\textrm{A}_1)\), but with \({\varepsilon }_t\) replaced by the FARCH(1)-model from (4.1), with constants \(c_\psi \in \{0.2, 0.3418, 0.45\}\) and \(c_g\in \{0.05, 0.1, 0.15\}\). Note that increasing values of \(c_\psi \) imply a heavier tail of the norm of \({\varepsilon }_t\), with \(c_\psi =0.2, 0.3418, 0.45\) roughly corresponding to existing moments up to the order 5, 4 or 3, respectively, as was derived from a preliminary simulation experiment based on an application of the Hill estimator.

4.2 Details on the implementation

For the comparison with the tests by Zhang (2016) and Bagchi et al. (2018) (results in Table 1) and the evaluation of the finite-sample properties under non-stationarity (results in Tables 3 and 4), the data was simulated on an equidistant grid of size 1000 on the interval [0, 1]. For the comparison with the tests by Kokoszka et al. (2017) and Characiejus and Rice (2020) (results in Table 2), the size of the grid was chosen as 100 to accommodate the computational complexity of the tests. For the latter two tests, we relied on their implementation in the R-package wwntests by Petoukhov (2020).

For computational reasons, before calculating the proposed test statistic we reduced the dimension by projecting the generated data onto the subspace of \(L^2([0,1])\) spanned by the first \(D=17\) functions of the Fourier basis \(\{\psi _n\}_{n\in \mathbb {N}_0}\), where \(\psi _0\equiv 1\) and, for \(n\in \mathbb {N}\),

$$\begin{aligned} \psi _{2n-1}(\tau )=\sqrt{2}\sin (2\pi n\tau ),\quad \psi _{2n}(\tau )=\sqrt{2}\cos (2\pi n\tau ). \end{aligned}$$
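
In practice, this projection amounts to computing numerical inner products of each discretized curve with the basis functions; a small sketch (quadrature rule, grid size and helper names are our own):

```python
import numpy as np

def fourier_basis(grid, D=17):
    """First D functions psi_0, ..., psi_{D-1} of the Fourier basis on [0, 1]."""
    Psi = np.empty((D, grid.size))
    Psi[0] = 1.0
    for j in range(1, D):
        n = (j + 1) // 2
        if j % 2 == 1:                                  # psi_{2n-1}
            Psi[j] = np.sqrt(2.0) * np.sin(2 * np.pi * n * grid)
        else:                                           # psi_{2n}
            Psi[j] = np.sqrt(2.0) * np.cos(2 * np.pi * n * grid)
    return Psi

def project(X, grid, D=17):
    """Fourier coefficients <X_t, psi_j> via a Riemann sum; X has shape (T, m)."""
    return X @ fourier_basis(grid, D).T / grid.size

grid = (np.arange(1000) + 0.5) / 1000.0
psi1 = np.sqrt(2.0) * np.sin(2 * np.pi * grid)          # psi_1 itself as a test curve
coeffs = project(psi1[None, :], grid)
```

Projecting \(\psi _1\) itself recovers a coefficient of 1 at position 1 and (numerically) 0 elsewhere, which provides a quick sanity check of the quadrature.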

For the calculation of the bootstrap quantiles, we employed the data-driven choice of the block length m explained in Bücher et al. (2020). In the context of stationary processes (models \((\textrm{N}_1)\)–\((\textrm{N}_3)\) and \((\textrm{A}_1)\)–\((\textrm{A}_4)\)), it is natural to consider global estimators in the bootstrap procedure, and we chose the bandwidth \(n=T\). In fact, preliminary simulations suggested that this choice of n leads to better finite-sample behavior. For the non-stationary models, however, this choice is not reasonable, and we used local estimators in order to avoid a possible bias. In this setting, we chose the bandwidth \(n=\lfloor T^{2/3}\rfloor \), satisfying Condition 3.3 (B2). The number of bootstrap replicates was chosen as 200, and each model was simulated 1000 times.
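
To convey the flavour of the procedure, here is a deliberately simplified scalar analogue of the multiplier bootstrap (real-valued data, a single lag, bandwidth \(n=T\), a fixed block length instead of the data-driven choice). It is a sketch only, with hypothetical names, and not the functional implementation used in the paper:

```python
import numpy as np

def bootstrap_pvalue(x, m, K=200, seed=0):
    """Schematic multiplier bootstrap for sqrt(T)*|gamma_hat(1)|: moving block
    sums of the centred lag-1 products are multiplied by i.i.d. N(0,1) weights."""
    rng = np.random.default_rng(seed)
    T = x.size
    prod = x[:-1] * x[1:]                               # lag-1 products X_t X_{t+1}
    gamma_hat = prod.mean()
    stat = np.sqrt(T) * abs(gamma_hat)
    centred = prod - gamma_hat
    blocks = np.array([centred[i:i + m].sum() for i in range(centred.size - m + 1)])
    boot = np.empty(K)
    for k in range(K):
        R = rng.normal(size=blocks.size)                # multipliers R_i^(k)
        boot[k] = abs((R * blocks).sum()) / (np.sqrt(m) * np.sqrt(T))
    return stat, (boot >= stat).mean()                  # statistic and p-value

rng = np.random.default_rng(1)
stat, pval = bootstrap_pvalue(rng.normal(size=300), m=8)
```

The block length m plays the same role as in the functional procedure: it must grow with T to capture the serial dependence of the centred products.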

Table 1 Empirical rejection rates of test (3.10) for the classical hypotheses (2.5) in the case of stationary models, for various values of the maximal lag H in \(\bar{H}_0^{\scriptscriptstyle (H)}\)

4.3 Results for the classical hypotheses

In the following, we denote by (B) and (Z) the tests proposed by Bagchi et al. (2018) and Zhang (2016), respectively. \((\textrm{M}_H)\), \(H\in \{1,2,3\}\), denotes the multiple-lag test at lag H proposed by Kokoszka et al. (2017). Finally, \((\textrm{Spec}_s)\) and \((\textrm{Spec}_a)\) denote the spectral test as proposed by Characiejus and Rice (2020), with static and adaptive bandwidth, respectively. The empirical rejection rates of test (3.10) for the stationary models \((\textrm{N}_1)\)–\((\textrm{N}_3)\) and \((\textrm{A}_1)\)–\((\textrm{A}_4)\) are shown in Tables 1 and 2. We observe that the level approximation of the new test (3.10) is very accurate for all scenarios under consideration, and that the power is larger than for the competitors from the literature, in particular for small samples. A partial explanation for this observation is that tests formulated in the frequency domain express the white noise hypothesis in terms of the spectral density operator and therefore implicitly consider the autocovariance operators at every lag h. Although the power of test (3.10) slightly decreases with increasing H, it decreases more slowly than the power of the multiple-lag time domain test by Kokoszka et al. (2017). The type I errors of the tests \((\textrm{Spec}_s)\) and \((\textrm{Spec}_a)\) seem to exceed the level of \(5\%\) for model \((\textrm{N}_3)\). This difficulty might arise from the fact that the data is uncorrelated but dependent. In contrast, the level approximation of the proposed tests seems to be more accurate.

The empirical rejection rates of test (3.10) for the locally stationary models \((\textrm{N}_4)\), \((\textrm{A}_5)\) and \((\textrm{A}_6)\) are shown in Table 3, for different sample sizes. We observe a reasonable approximation of the nominal level and high power under the non-stationary alternatives.

Table 2 Empirical rejection rates of test (3.10) for the classical hypotheses (2.5) in the case of stationary models, for various values of the maximal lag H in \(\bar{H}_0^{\scriptscriptstyle (H)}\)
Table 3 Empirical rejection rates of test (3.10) for the classical hypotheses (2.5) in the case of locally stationary models, for various values for the maximal lag H in \(\bar{H}_0^{\scriptscriptstyle (H)}\)

Finally, the results for the FAR(1)-FARCH(1)-model can be found in Table 4. Under the null hypothesis (\(c_g=0\)), the tests can be seen to become more conservative when the tail of the innovations becomes heavier (larger values of \(c_\psi \)), but they do hold their level approximately. As a consequence, the tests are also less powerful for heavier tails. As expected, the power increases with the strength of the serial dependence (larger values of \(c_g\)) and with the sample size T. Concerning the choice of the number of lags, the best power is observed at the smallest lag \(H=1\). For other models, however, the power may well be largest at some intermediate value of H (for instance, for certain MA(2)-type models). As a practical recommendation, we follow the respective arguments from Bücher et al. (2019) for the univariate case and suggest restricting attention either to a choice of H that is suggested by the particular problem at hand for external reasons, or to some rather small value \(H \approx 4\). The latter typically guarantees sufficient power for a reasonably large class of alternatives.

Table 4 Empirical rejection rates of test (3.10) for the classical hypotheses (2.5) in the FAR(1)-FARCH(1)-model described at the end of Sect. 4.1

4.4 Results for relevant hypotheses

We conclude this section with a brief discussion of the performance of the proposed test (3.11) for the relevant hypotheses (3.7). For this purpose, we calculated the quantities \(\Vert M_h\Vert _{2}\) for the models \((\textrm{A}_1)\)–\((\textrm{A}_4)\) by numerical simulation (specifically, we simulated 10,000 time series of length \(T=2000\), projected them onto a Fourier basis of dimension \(D=101\), calculated for each time series the quantity \(\Vert {\hat{M}}_{h,T}\Vert \) for \(h\in \{1,\dots ,4\}\), and used the respective means as approximations for \(\Vert M_h\Vert \)). The results can be found in Table 5. For the simulation experiment, we chose hypotheses corresponding to \(\Delta =\Delta _{h,w} = w\Vert M_h\Vert _{2}\) with \(w\in \{0.4+i/10: i=1,\dots , 11\}\) and \(h=1,\dots , 4\), such that the null hypotheses are met for \(w \ge 1\) and the alternative hypotheses are met for \(w < 1\). The results can be found in Table 6, where we omit the results for \(H\in \{2,3\}\), since they are qualitatively similar to those for \(H\in \{1,4\}\). Again, we observe convincing level approximations and good power properties.

Table 5 Theoretical values of \(\Vert M_h\Vert _{2}\), obtained by simulation. The numbers in brackets correspond to the empirical variance of the simulation
Table 6 Empirical rejection rates of the test (3.11) for the relevant hypotheses (3.7) in the case of stationary models

5 Case study

Functional data arises naturally when time series are recorded at a very high frequency. To illustrate the proposed methodology, we consider intraday prices of various stocks over the time span from February 2016 to January 2020, where each observation corresponds to the intraday price curve of a given day. In particular, let \(P_t(x_j)\), \(t\in \{1,\dots , T\}, j\in \{1,\dots , m\}\), denote the price of a share observed at time point \(x_j\) on day t. The length T of the time series varies across stocks, as observations are missing for some days.

Gabrys et al. (2010) define intradaily cumulative returns as

$$\begin{aligned} R_t(x_j)= 100 \{ \log P_t(x_j) - \log P_t(x_1) \} ,\quad j\in \{1,\dots , m\},~ t\in \{1,\dots ,T\}. \end{aligned}$$

Throughout, we consider \(R_t(\cdot )\) as an \(L^2\)-function. Some exemplary intradaily cumulative return curves are displayed in Fig. 1. The results of our testing procedure for detecting possible serial correlation can be found in Table 7, where we employed \(K=1000\) bootstrap replicates and considered up to \(H=4\) lags. The null hypotheses of no serial correlation cannot be rejected at level \(\alpha =0.05\), as the p-values clearly exceed \(\alpha \). Thus, our results match the common assumption of uncorrelatedness in the literature.
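
For completeness, computing the curves \(R_t\) from a matrix of intraday prices takes only a few lines; the price matrix below is synthetic and purely illustrative:

```python
import numpy as np

def cidr(prices):
    """Intradaily cumulative returns R_t(x_j) = 100 (log P_t(x_j) - log P_t(x_1))
    from a (T, m) matrix of intraday prices."""
    logp = np.log(prices)
    return 100.0 * (logp - logp[:, [0]])

# synthetic price paths (geometric random walk within each day), for illustration
rng = np.random.default_rng(0)
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 1e-3, size=(5, 390)), axis=1))
R = cidr(prices)
```

By construction, each curve starts at \(R_t(x_1)=0\), so the curves are directly comparable across days.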

Fig. 1
figure 1

Intradaily cumulative returns of Boeing and Blackrock from 8th to 12th of February 2016, where the x-axis corresponds to rescaled time and the y-axis denotes returns

Table 7 p-values of the (combined) tests for the respective null hypotheses in percent

6 Proofs

Proof of Theorem 3.1

We prove that for any \(H\in \mathbb {N}\) and as \(T \rightarrow \infty \),

$$\begin{aligned} \sqrt{T}\big ({\hat{M}}_{1,T}-M_1,\dots , {\hat{M}}_{H,T}-M_H\big ) \rightsquigarrow {\tilde{B}}, \end{aligned}$$
    (6.1)

where \({\tilde{B}}\) denotes a centred Gaussian variable in \(L^2([0,1]^3)^H\), with covariance operator given by (3.5). The statement is then a consequence of the continuous mapping theorem.

By Theorem 1 of Bücher et al. (2020), the vector \(\sqrt{T}({\hat{M}}_{1,T}-\mathbb {E}{\hat{M}}_{1,T},\dots , {\hat{M}}_{H,T}-\mathbb {E}{\hat{M}}_{H,T})\) converges weakly to a vector of centred Gaussian variables \({{\tilde{B}}} = ({{\tilde{B}}}_1,\dots ,{{\tilde{B}}}_H)\) in \(L^2([0,1]^3)^H\). Thus, (6.1) follows from Slutsky’s lemma, once we have shown that \(\lim _{T \rightarrow \infty } \sqrt{T}\Vert \mathbb {E}{\hat{M}}_{h,T}-M_h\Vert _{2}=0\) for any \(h\in \mathbb {N}\). For the latter purpose, invoke the triangle inequality to obtain

$$\begin{aligned} \sqrt{T}\Vert \mathbb {E}{\hat{M}}_{h,T}-M_h\Vert _{2}&= \sqrt{T} \bigg (\int _0^1 \bigg \Vert \frac{1}{T}\sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)} \mathbb {E}[X_{t,T}\otimes X_{t+h,T}]\\&\quad -\int _0^u \mathbb {E}[X_0^{(w)}\otimes X_h^{(w)}]{\,\textrm{d}}w\bigg \Vert _{2}^2 {\,\textrm{d}}u \bigg )^{1/2}\\&= \sqrt{T} \bigg (\int _0^1 \bigg \Vert \sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)} \int _{\frac{t-1}{T}}^{\frac{t}{T}}\mathbb {E}[X_{t,T}\otimes X_{t+h,T}]-\mathbb {E}[X_t^{(w)}\otimes X_{t+h}^{(w)}] {\,\textrm{d}}w \\&\quad - \int _{T^{-1}\{\lfloor uT\rfloor \wedge (T-h)\}}^u \mathbb {E}[X_0^{(w)}\otimes X_h^{(w)}] {\,\textrm{d}}w\bigg \Vert _{2}^2 {\,\textrm{d}}u\bigg )^{1/2} \\&\le \sqrt{T} \bigg (\int _0^1 \bigg \{\sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)} \bigg \Vert \int _{\frac{t-1}{T}}^{\frac{t}{T}}\mathbb {E}[X_{t,T}\otimes X_{t+h,T}-X_t^{(w)}\otimes X_{t+h}^{(w)}]{\,\textrm{d}}w\bigg \Vert _{2} \\&\quad + \bigg \Vert \int _{T^{-1}\{\lfloor uT\rfloor \wedge (T-h)\}}^u \mathbb {E}[X_0^{(w)}\otimes X_h^{(w)}] {\,\textrm{d}}w\bigg \Vert _{2} \bigg \}^2 {\,\textrm{d}}u \bigg )^{1/2}. \end{aligned}$$

The integral from \(T^{-1}\{\lfloor uT\rfloor \wedge (T-h)\}\) to u on the right-hand side is of order 1/T. Further, by Jensen’s inequality and local stationarity,

$$\begin{aligned}&\bigg \Vert \int _{\frac{t-1}{T}}^{\frac{t}{T}} \mathbb {E}[X_{t,T}\otimes X_{t+h,T}-X_t^{(w)}\otimes X_{t+h}^{(w)}]{\,\textrm{d}}w\bigg \Vert _{2}\\&\quad \le \int _{\frac{t-1}{T}}^{\frac{t}{T}}\Vert \mathbb {E}[X_{t,T}\otimes X_{t+h,T}-X_t^{(w)}\otimes X_{t+h}^{(w)}]\Vert _{2} {\,\textrm{d}}w \le \frac{C}{T^2} \end{aligned}$$

for some constant \(C>0\). Thus, it follows

$$\begin{aligned} \sqrt{T}\Vert \mathbb {E}{\hat{M}}_{h,T}-M_h\Vert _{2} = O(T^ {-1/2}), \end{aligned}$$

which completes the proof of the theorem. \(\square \)

Proof of Corollary 3.2

If \(\Vert M_h\Vert _{2} = 0\) for some \(h \in \{1, \dots , H\}\), then \(\sqrt{T}(\Vert {\hat{M}}_{h,T}\Vert _{2}-\Vert M_h\Vert _{2})\Vert {\hat{M}}_{h,T}\Vert _{2}\) converges to zero in probability by Theorem 3.1 and Slutsky’s lemma. Hence, it is sufficient to assume that \(\Vert M_h\Vert _{2} \ne 0\) for all \(h \in \{1, \dots , H\}\). We then obtain

$$\begin{aligned} \Big (\sqrt{T}\big (\Vert {\hat{M}}_{h,T}\Vert _{2}-\Vert M_h\Vert _{2}\big )\Big )_{h=1,\dots ,H} \rightsquigarrow \Big (\frac{\langle M_h,{{\tilde{B}}}_h\rangle }{\Vert M_h\Vert _{2}}\Big )_{h=1,\dots ,H} \end{aligned}$$

from the functional delta method (Theorem 3.9.4 in van der Vaart and Wellner 1996), applied to the functional in Proposition 6.1 below. Apply Slutsky’s lemma to conclude. \(\square \)

Proposition 6.1

The function \(\Phi :=\Vert \cdot \Vert _{2}\) from \(L^2([0,1]^3)\) to \(\mathbb {R}\) is Hadamard-differentiable in any M with \(\Vert M\Vert _{2}>0\), with derivative \(\Phi '_M(h)=\tfrac{\langle M,h\rangle }{\Vert M\Vert _{2}}\) in direction \(h\in L^2([0,1]^3)\).

Proof

For any sequences \(h_n\rightarrow h\) with \(h_n \in L^2([0,1]^3)\) and \(t_n\rightarrow 0\) with \(t_n \in \mathbb {R}\setminus \{0\}\), it holds

$$\begin{aligned} \frac{\Vert M+t_nh_n\Vert _{2}^2-\Vert M\Vert _{2}^2}{t_n}&= \frac{1}{t_n}\int _{[0,1]^3} 2 M(x)t_nh_n(x)+t_n^2h_n^2(x) {\,\textrm{d}}x\\&= \int _{[0,1]^3} 2M(x) h_n(x){\,\textrm{d}}x +t_n\int _{[0,1]^3}h_n^2(x) {\,\textrm{d}}x, \end{aligned}$$

which converges to \(2\int _{[0,1]^3}M(x)h(x) {\,\textrm{d}}x=2\langle M,h\rangle \). The square root function in \(\mathbb {R}\) is Hadamard-differentiable at \(x>0\) with derivative \((\sqrt{x})'=\frac{1}{2\sqrt{x}}\). By the chain rule for Hadamard-differentiable functions (Lemma 3.9.3 in van der Vaart and Wellner 1996), the Hadamard-derivative of \(\Phi \) is given by \(\Phi '_M(h)=\tfrac{\langle M,h\rangle }{\Vert M\Vert _{2}}\). \(\square \)
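
The expansion behind Proposition 6.1 is easy to check numerically on a discretized cube: for small t, \(\Phi (M+th)-\Phi (M)\approx t\,\langle M,h\rangle /\Vert M\Vert _{2}\). A sketch (grid size and test functions are our own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(8, 8, 8))         # discretized element of L^2([0,1]^3)
h = rng.normal(size=(8, 8, 8))         # direction of differentiation
vol = 1.0 / M.size                     # Riemann weight of one grid cell

def norm2(f):
    """L^2-norm on [0,1]^3, approximated by a Riemann sum."""
    return np.sqrt((f ** 2).sum() * vol)

t = 1e-6
fd = (norm2(M + t * h) - norm2(M)) / t           # finite-difference quotient
deriv = (M * h).sum() * vol / norm2(M)           # Phi'_M(h) = <M, h> / ||M||_2
```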

Proof of Theorem 3.4

(i) can be deduced directly from Theorem 2 of Bücher et al. (2020). For (ii) note that by Theorem C.3 of the supplementary material of the latter article, it holds \(\mathbb {B}_T^{(k)} \rightsquigarrow ({\tilde{B}}_1^{\scriptscriptstyle (k)},\dots ,{\tilde{B}}_H^{\scriptscriptstyle (k)})\), where \(\mathbb {B}_T^{(k)} = ({\tilde{B}}_{T,1}^{(k)},\dots ,{\tilde{B}}_{T,H}^{(k)})\) and

$$\begin{aligned}&{\tilde{B}}_{T,h}^{(k)}(u,\tau _1,\tau _2) \\&= \frac{1}{\sqrt{T}}\sum _{i=1}^{\lfloor uT \rfloor \wedge (T-h)} \frac{R_i^{(k)}}{\sqrt{m}} \sum _{t=i}^{(i+m-1)\wedge (T-h)}\big (X_{t,T}(\tau _1)X_{t+h,T}(\tau _2)-\mathbb {E}[X_{t,T}(\tau _1)X_{t+h,T}(\tau _2)]\big ). \end{aligned}$$

Note that for \(u<1\) it holds \(\lfloor uT\rfloor +m-1 \le T-h\), for any sufficiently large \(T\in \mathbb {N}\). Thus, rewrite

$$\begin{aligned}&{\hat{B}}_{h,T,T}^{(k)}(u,\tau _1,\tau _2)={\tilde{B}}_{T,h}^{(k)}(u,\tau _1,\tau _2) \\&\quad + \sqrt{\frac{m}{T}} \sum _{i=1}^{\lfloor uT \rfloor }R_i^{(k)} \bigg (\frac{1}{T-h}\sum _{t=1}^{T-h} \mathbb {E}[X_{t,T}(\tau _1)X_{t+h,T}(\tau _2)]-X_{t,T}(\tau _1)X_{t+h,T}(\tau _2) \bigg )\\&\quad +\mathcal {O}_\mathbb {P}\Big (\sqrt{\tfrac{m}{T}}\Big ). \end{aligned}$$

For the second term on the right-hand side of the latter display, it holds by independence of the random variables \(R_i^{(k)}\),

$$\begin{aligned}&\mathbb {E}\bigg \Vert \sqrt{\frac{m}{T}} \sum _{i=1}^{\lfloor \cdot T \rfloor \wedge (T-h)}R_i^{(k)} \bigg (\frac{1}{T-h}\sum _{t=1}^{T-h} \mathbb {E}[X_{t,T}\otimes X_{t+h,T}]-X_{t,T}\otimes X_{t+h,T} \bigg ) \bigg \Vert _{2}^2\\&\le \int _{[0,1]^2}m \mathbb {E}\bigg [\bigg (\frac{1}{T-h}\sum _{t=1}^{T-h} \mathbb {E}[X_{t,T}(\tau _1) X_{t+h,T}(\tau _2)]-X_{t,T}(\tau _1) X_{t+h,T}(\tau _2) \bigg )^2\bigg ] {\,\textrm{d}}(\tau _1,\tau _2)\\&= \frac{m}{(T-h)^2} \sum _{t_1,t_2=1}^{T-h} \int _{[0,1]^2} \text {Cov}\big (X_{t_1,T}(\tau _1)X_{t_1+h,T}(\tau _2),X_{t_2,T}(\tau _1)X_{t_2+h,T}(\tau _2)\big ) {\,\textrm{d}}(\tau _1,\tau _2), \end{aligned}$$

which is of order O(m/T) by the same arguments as in the proof of Theorem 2 of Bücher et al. (2020). Thus, \(\hat{\mathbb {B}}_{T,T}^{(k)} = \mathbb {B}_T^{(k)} + O_\mathbb {P}(\sqrt{m/T})\) and (ii) follows. \(\square \)

Proof of Corollary 3.5

The assertions for the null hypothesis \(H_0^{\scriptscriptstyle (H)}\) follow from Theorem 3.4 and Corollary 4.3 in Bücher and Kojadinovic (2019). The null hypothesis \(H_0^{\scriptscriptstyle (H, \Delta )}\) may be treated by similar arguments as in the last-named corollary, observing that the weak limit of \(\bar{{\mathcal {S}}}_{H, \Delta ,T}\) is stochastically bounded by \(\max _{h=1}^H \langle M_h, {{\tilde{B}}}_h\rangle \) on the positive real line. The assertions regarding the alternative hypotheses follow from divergence to infinity of the test statistics and stochastic boundedness of the bootstrap statistics. \(\square \)