1 Introduction

Dynamical systems within the category of skew products have a long history. They receive attention for many different reasons: in earlier ergodic theory, they were studied as mild generalizations of suspensions which cannot be factored [17, Chapter 10], providing examples of simple partially hyperbolic systems; more recently, they have often been used to model real-life situations where fast-slow dynamics can be observed, such as in the study of climate [4, 25]. As their name suggests, they are products of a base dynamics and a fiber dynamics. Here, we are concerned with the statistical properties of these systems.

One of the simplest examples of a partially hyperbolic skew product is given by circle extensions of Anosov diffeomorphisms on the two-dimensional torus \({\mathbb {T}}^2\), defined as follows. Given an Anosov diffeomorphism \(A:{\mathbb {T}}^2 \rightarrow {\mathbb {T}}^2\) and a smooth map \(f :{\mathbb {T}}^2 \rightarrow {\mathbb {T}}\), a skew product \(F :{\mathbb {T}}^3 \rightarrow {\mathbb {T}}^3\) over A induced by f is defined by \(F(x,r)= (Ax, r+f(x))\). The torus \({\mathbb {T}}^3\) is equipped with a product measure \(\mu _u \times \text {Leb}\), where \(\mu _u\) is any Gibbs measure with Hölder potential u and \(\text {Leb}\) is the Lebesgue measure on the fibers. In this case, Dolgopyat [18] proved that generic functions f induce skew products F with rapid decay of correlations, or rapid mixing, i.e., decorrelation of \(\mathscr {C}^{\infty }\)-observables is faster than any given polynomial. The speed of mixing may, in fact, be exponential, but this is still an open problem ([18, Problem 2]). Dolgopyat’s result holds in general for compact group extensions. The interested reader can also check the introductions of [11, 24, 41] for an overview of old and new results on skew products.

In this paper, we are interested in skew products with non-compact fibers. In particular, we will consider \({\mathbb {R}}\)-extensions of topologically mixing Anosov diffeomorphisms via their symbolic counterparts. For an introduction to infinite ergodic theory, we refer the reader to Aaronson’s book [1]. In our setting, Guivarc’h showed that any Hölder function f with zero integral which is not cohomologous to a constant induces an ergodic skew product [28] (see also [16]).

Concerning stronger statistical properties, a historical perspective on the various possible definitions of mixing in the infinite ergodic setting can be found in [34]. We will be interested in global-local mixing, a notion introduced by Lenci in [34], namely we will study the correlations between global and local observables (see Definition 2.1). Local observables are akin to compactly supported observables, while global observables are supported over most of the phase space. One possible concern about the notion of global-local mixing may be the seemingly arbitrary choice of the averaging involved in the definition of global observables. In our setting, the infinite volume average is analogous to statistical infinite volume limits introduced and refined along the years by Van Hove, Fisher [37, Section 3.3] and Ruelle [46, Section 3.9], which built on the inspirational work of Bogoliubov [6]. Global-local mixing has been studied in different situations, for example random walks [34, 35], mechanical systems [19, 20] and one-dimensional parabolic systems [12].

The main ingredient needed to study our class of skew products is accessibility (see, among others, [13, 14, 47]). The skew product F is accessible if, roughly speaking, it is possible to reach any point in the space by moving along segments of stable and unstable manifolds.

From the measure-theoretic point of view, using Markov partitions, we can reduce the problem to the study of a skew product over a subshift of finite type, keeping the same \({\mathbb {R}}\)-fibers. The question whether accessibility is preserved passing to symbolic dynamics appears to be delicate, and will be discussed in “Appendix A.” Our main result, Theorem 3.9, provides quantitative estimates for the decay of correlations of global and local observables for an accessible \({\mathbb {R}}\)-extension of a symbolic shift. To the best of our knowledge, this is the first quantitative result in the context of global-local mixing.

Contrary to the case of compact group extensions, we cannot expect exponential mixing in general, since, taking Fourier transforms, we have to deal with arbitrarily low frequencies. Indeed, we will show in Theorem 3.9 that the speed of convergence of correlations depends on the behavior near zero of the spectral measure associated to the global observable (namely, its inverse Fourier–Stieltjes transform): if the support of this measure intersects a neighborhood of 0 only at 0, then mixing is rapid (Theorem 2.2) as we expect from Dolgopyat’s result; in other cases we obtain polynomial estimates (Theorems 2.3, 2.4), which correspond to the expected behavior (see Remark 2.5). Note that since the infinite volume average equals the value of the associated spectral measure at zero, our choice of infinite volume average is natural.

The main novelty of this work lies in a careful choice of the function spaces involved. The accessibility hypothesis, coupled with a standard central limit theorem for the underlying symbolic dynamics on the base, yields transfer operator bounds after a suitable splitting into low and high frequency modes, and allows us to exploit the cancellation effects due to accessibility.

1.1 Krickeberg Mixing

Another independent notion of mixing for infinite measure preserving transformations is known as Krickeberg (or local) mixing. An infinite measure preserving system \((X, \mu , T)\) is said to be Krickeberg mixing if there exists a sequence of positive numbers \(\rho _n \rightarrow \infty \) such that, for any pair of “nice” finite measure sets A, B (precisely, bounded sets whose boundary has zero measure), the rescaled correlations converge, that is

$$\begin{aligned} \lim _{n \rightarrow \infty } \rho _n \, \mu (T^{-n}A \cap B) = \mu (A) \mu (B). \end{aligned}$$
(1)

The first example of a system satisfying this property dates back to Hopf in 1937 [31], but it was overlooked for several years, until the 60s when Krickeberg proposed (1) as a definition of infinite measure mixing [33]. Since then, it has received considerable attention and Krickeberg mixing has been proved in several situations, see, e.g., [21, 27, 38, 39, 40, 42] to name a few.

In the language of this paper, Krickeberg mixing can be seen as a remarkably strong form of “local-local mixing”: the correlations of local observables converge to zero and the first-order term is the same up to a constant for any two sufficiently smooth compactly supported functions. On the other hand, global-local mixing is a softer property, which however gives information on the correlations for a much wider class of observables, in particular for those which are not supported on a compact subset but “see” the whole phase space.

Exploiting some variations on the methods developed in this paper, we are able to establish a strong form of Krickeberg mixing for the accessible skew products we consider here, namely we can prove a full asymptotic expansion of the correlations of Schwartz observables, in the spirit of the main theorem of [22]. This result will appear in a separate paper.

1.2 Outline of the Paper

The rest of the paper is organized as follows. In Sect. 2, we rigorously introduce our framework and state our main results. In Sect. 3, we describe in detail the classes of global and local observables we consider and we state our core result, Theorem 3.9. We then deduce Theorems 2.2, 2.3 and 2.4 from Theorem 3.9.

In Sect. 4, we present a preliminary result in the non-invertible case of skew products over one-sided subshifts, Theorem 4.5. We also describe a “collapsed accessibility” property, which constitutes the main working assumption on the skew product in this setting.

The main tool to prove Theorem 4.5 is a family of twisted transfer operators. In Sect. 5, we show how the collapsed accessibility property can be exploited to obtain some cancellations in the expression for the twisted transfer operators, as in the work of Dolgopyat [18]. In Sect. 6, we prove some estimates on the norm of the twisted transfer operators. For large twisting parameters, the estimates are obtained exploiting the results in Sect. 5; for small parameters, we apply some standard results in the theory of analytic perturbations of bounded linear operators.

Section 7 contains some technical results that will be applied to prove the main theorems. Section 8 is devoted to the proof of Theorem 4.5. In order to deduce Theorem 3.9 from Theorem 4.5, in Sect. 9 we deduce the collapsed accessibility property for a one-sided skew product from the accessibility property of the corresponding two-sided skew product. In Sect. 10, we prove Theorem 3.9.

In “Appendix A,” we discuss the problem whether the accessibility property for a skew product over an Anosov diffeomorphism is equivalent to the accessibility of the associated symbolic system. Appendices B and C contain the proofs of several technical results.

2 Setup and Main Results

Let \(\sigma :\Sigma \rightarrow \Sigma \) be a topologically mixing two-sided subshift of finite type, equipped with a Gibbs measure \(\mu = \mu _u\) with respect to a Hölder potential u (see, e.g., [10, §1] or [41, §3]). Explicitly, for \( x \in \Sigma \), \(x= \{ x_i \}_{i \in \mathbb {Z}}\), we have \((\sigma x)_i = x_{i+1}\). For \(0<\theta < 1\), define the distance

$$\begin{aligned} d_\theta (x,y) = \theta ^{\max \{j \in {\mathbb {N}}\ :\ x_i = y_i \ \text { for all}\ |i|<j\}}. \end{aligned}$$
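To make the definition concrete, the following is a minimal sketch of how \(d_\theta \) could be evaluated numerically. It assumes a point of \(\Sigma \) is encoded as a Python callable `i -> symbol`, and it truncates the search at a hypothetical horizon `max_j` (two points that agree up to the horizon are treated as equal); the names `agreement_length`, `d_theta` and `max_j` are illustrative and do not come from the paper.

```python
from typing import Callable

# Hypothetical encoding of a two-sided sequence: index i -> symbol x_i.
Point = Callable[[int], int]


def agreement_length(x: Point, y: Point, max_j: int = 1000) -> int:
    """Largest j <= max_j such that x_i == y_i for all |i| < j."""
    j = 0
    # Checking the coordinates +j and -j extends the agreement to |i| < j + 1.
    while j < max_j and x(j) == y(j) and x(-j) == y(-j):
        j += 1
    return j


def d_theta(x: Point, y: Point, theta: float = 0.5, max_j: int = 1000) -> float:
    """theta ** agreement_length, with 0 when no difference is seen up to max_j."""
    j = agreement_length(x, y, max_j)
    return 0.0 if j == max_j else theta ** j
```

For instance, two sequences that first differ at the coordinate \(i=3\) are at distance \(\theta ^3\), while sequences that differ at \(i=0\) are at distance 1.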

Let us denote by \(\mathscr {F}_{\theta }\) the space of Lipschitz continuous functions \(w:\Sigma \rightarrow {\mathbb {C}}\) equipped with the norm

$$\begin{aligned} \left\Vert w\right\Vert _{\theta } = \left\Vert w\right\Vert _{\infty } + |w|_{\theta }, \quad \text { where }\quad |w|_{\theta } = \sup _{x\ne y} \frac{|w(x)-w(y)|}{d_{\theta }(x,y)}. \end{aligned}$$
(2)

We consider the skew product

$$\begin{aligned} F:\Sigma \times {\mathbb {R}}\rightarrow \Sigma \times {\mathbb {R}}, \quad F(x, r) = (\sigma x, r + f(x)), \end{aligned}$$
(3)

where \(f:\Sigma \rightarrow {\mathbb {R}}\) is a Lipschitz continuous function with zero average, \(\int _\Sigma f \mathop {}\!\mathrm {d}\mu _u = 0\).
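As an illustration of (3), the sketch below iterates a skew product over the full shift on two symbols. The choice of \(f\) (depending only on \(x_0\), equal to \(+1\) on the symbol 0 and \(-1\) on the symbol 1) is a hypothetical example; it has zero average if \(\mu \) is taken to be the Bernoulli \((\tfrac{1}{2},\tfrac{1}{2})\) measure, which is an assumption made only for this example. Points are encoded as callables `i -> symbol`.

```python
from typing import Callable, Tuple

# Hypothetical encoding of a two-sided sequence: index i -> symbol x_i.
Point = Callable[[int], int]


def shift(x: Point) -> Point:
    """Left shift: (sigma x)_i = x_{i+1}."""
    return lambda i: x(i + 1)


def f(x: Point) -> float:
    """Illustrative function depending only on x_0: +1 on symbol 0, -1 on symbol 1."""
    return 1.0 if x(0) == 0 else -1.0


def F(x: Point, r: float) -> Tuple[Point, float]:
    """One step of the skew product: F(x, r) = (sigma x, r + f(x))."""
    return shift(x), r + f(x)


def iterate(x: Point, r: float, n: int) -> Tuple[Point, float]:
    """Compute F^n(x, r); the fiber coordinate becomes r plus the n-th Birkhoff sum of f."""
    for _ in range(n):
        x, r = F(x, r)
    return x, r
```

The fiber coordinate of \(F^n(x,r)\) is \(r\) plus the Birkhoff sum \(\sum _{i=0}^{n-1} f(\sigma ^i x)\), so on the alternating sequence \(x_i = i \bmod 2\) it oscillates between \(r\) and \(r+1\).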

As mentioned in the introduction, we will assume that F is accessible: roughly speaking, this means that any two points can be connected by a path consisting of pieces of stable and unstable manifolds; see Sect. 3 for the precise definition.

We are interested in the mixing properties of the map F with respect to the infinite measure \(\nu =\mu \times {{\,\mathrm{Leb}\,}}\), where \({{\,\mathrm{Leb}\,}}\) is the Lebesgue measure on \({\mathbb {R}}\).

Definition 2.1

A local observable is any function \(\psi \in L^1(\nu )\). A global observable is any function \(\Phi \in L^{\infty }(\nu )\) such that the following limit exists

$$\begin{aligned} \nu _{{{\,\mathrm{av}\,}}}(\Phi ) := \lim _{R \rightarrow \infty } \frac{1}{2R} \int _{\Sigma \times [-R,R]} \Phi (x,r) \mathop {}\!\mathrm {d}\nu (x,r). \end{aligned}$$
(4)
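As a simple worked example (not taken from the source), consider an observable which does not depend on \(x\), say \(\Phi (x,r) = c + \cos (r)\) for a constant \(c \in {\mathbb {C}}\). Using that \(\mu \) is a probability measure, the average in (4) can be computed explicitly:

$$\begin{aligned} \frac{1}{2R} \int _{\Sigma \times [-R,R]} \big ( c + \cos (r) \big ) \mathop {}\!\mathrm {d}\nu (x,r) = c + \frac{1}{2R} \int _{-R}^{R} \cos (r) \mathop {}\!\mathrm {d}r = c + \frac{\sin (R)}{R} \longrightarrow c \quad \text {as } R \rightarrow \infty . \end{aligned}$$

Hence \(\nu _{{{\,\mathrm{av}\,}}}(\Phi ) = c\): the oscillating part averages out and only the constant term survives. Similarly, any bounded \(\Phi \) supported in \(\Sigma \times [-M,M]\) has \(\nu _{{{\,\mathrm{av}\,}}}(\Phi ) = 0\), since the integral stays bounded while the normalization \(2R\) diverges.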

If \(\psi \) is a local observable, we will write \(\nu (\psi ) = \int _{\Sigma \times {\mathbb {R}}} \psi \mathop {}\!\mathrm {d}\nu \). We will show in Lemma 3.2 that if \(\Phi \) is a global observable, then so is \(\Phi \circ F\), and the average \(\nu _{{{\,\mathrm{av}\,}}}\) defined in (4) is invariant under F.

For any pair of global and local observables \((\Phi ,\psi )\), let us denote by \({{\,\mathrm{cov}\,}}(\Phi , \psi )\) the covariance

$$\begin{aligned} {{\,\mathrm{cov}\,}}(\Phi , \psi ) := \nu ( \Phi \cdot {\overline{\psi }} ) - \nu _{{{\,\mathrm{av}\,}}}(\Phi ) \nu ( \psi ). \end{aligned}$$

We are interested in showing “global-local mixing,” namely in proving that the correlations \({{\,\mathrm{cov}\,}}(\Phi \circ F^n, \psi )\) satisfy

$$\begin{aligned} \begin{aligned} \lim _{n \rightarrow \infty } {{\,\mathrm {cov}\,}}(\Phi \circ F^n, \psi ) = 0 \end{aligned} \end{aligned}$$
(5)

and in studying the rate of convergence to such limit (also known as the rate of decay of correlations).

The main result of this paper, Theorem 3.9, establishes quantitative global-local mixing estimates. Since some preliminary work is needed, the statement of the main theorem is postponed to Sect. 3. We state here some corollaries which should give the reader a rather complete picture of the possible scenarios. Theorem 2.2 states that, for a dense class of almost periodic global observables, we have rapid mixing, namely the decay of correlations is faster than any given polynomial, in analogy to Dolgopyat’s result [18] in the case of circle extensions. On the other hand, for a dense class of global observables which vanish at infinity, Theorems 2.3 and 2.4 state that the decay is polynomial. The bound in Theorem 2.4 is generically optimal, as we show in §3.6.

Let us fix some notation. Let \(\mathscr {C}^k({\mathbb {R}})\) be the space of k-times differentiable functions on \({\mathbb {R}}\). We will denote by \(\mathscr {P}\) the subspace of \(\mathscr {C}^2({\mathbb {R}})\) consisting of \(2\pi \)-periodic functions; by \(\mathscr {C}_0({\mathbb {R}})\) the space of continuous functions on \({\mathbb {R}}\) which vanish at infinity, and by \(\mathscr {C}^\infty _c({\mathbb {R}})\) the subspace of infinitely differentiable functions with compact support. The space \(\mathscr {C}^\infty _c({\mathbb {R}})\) has the structure of a Fréchet space, induced by the family of seminorms \(\Vert \cdot \Vert _{\mathscr {C}^k}\), for \(k \in {\mathbb {N}}\). We will say that a map \(\psi :\Sigma \rightarrow \mathscr {C}^\infty _c({\mathbb {R}})\) is Lipschitz if it is Lipschitz with respect to \(\Vert \cdot \Vert _{\mathscr {C}^k}\), for all \(k \in {\mathbb {N}}\).

In the rest of the paper, we will implicitly identify maps a from \(\Sigma \) to some space of complex-valued measurable functions over \({\mathbb {R}}\) with complex-valued measurable functions on \(\Sigma \times {\mathbb {R}}\) by setting \(a(x,r) = [a(x)](r)\).

Theorem 2.2

(Rapid global-local mixing) Assume that F, defined as in (3), is accessible. For any Lipschitz map \(\psi :\Sigma \rightarrow \mathscr {C}^\infty _c({\mathbb {R}})\), for any Lipschitz map \(\Phi :\Sigma \rightarrow \mathscr {P}\), and for every \(\ell \in {\mathbb {N}}\), there exists a constant \(C=C(\ell , \psi , \Phi )\ge 0\) such that for all \(n \in {\mathbb {N}}\),

$$\begin{aligned} \left|{{\,\mathrm{cov}\,}}(\Phi \circ F^n, \psi ) \right|\le C n^{-\ell }. \end{aligned}$$

Rapid mixing holds in many more situations besides the one in Theorem 2.2. In a precise sense that will become clear later, the key property of the global observable that ensures rapid mixing is the absence of “low-frequency components” in almost every fiber, a property which is clearly satisfied by smooth functions which are periodic on the fibers.

The situation is different when the global observable vanishes at infinity. Note that, in this case, the average \(\nu _{{{\,\mathrm{av}\,}}}\) defined in (4) is zero.

Theorem 2.3

(Polynomial global-local mixing I) Assume that F, defined as in (3), is accessible. There exists a space of bounded continuous functions \(\mathscr {D} \subset \mathscr {C}_0({\mathbb {R}})\), which is dense in \(\mathscr {C}_0({\mathbb {R}})\) with respect to \(\Vert \cdot \Vert _{\infty }\), and there exists \(\alpha >0\) such that the following holds. For any Lipschitz map \(\psi :\Sigma \rightarrow \mathscr {C}^\infty _c({\mathbb {R}})\), and for any map \(\Phi :\Sigma \rightarrow \mathscr {D}\) which satisfies some explicit Lipschitz condition, there exists a constant \(C=C(\psi , \Phi )\ge 0\) such that for all \(n \in {\mathbb {N}}\),

$$\begin{aligned} \left|{{\,\mathrm{cov}\,}}(\Phi \circ F^n, \psi ) \right|\le C n^{-\alpha }. \end{aligned}$$

The Lipschitz conditions for the global observable in Theorem 2.3 will be stated explicitly in Sect. 3, as well as a bound on \(\alpha \) (see the second paragraph of the proofs at §3.5). If we further assume that the global observable takes values in \(W^1({\mathbb {R}})\), where \(W^1({\mathbb {R}})\) is the Sobolev space of \(L^2\) functions with weak derivative in \(L^2\), then the statement reads as follows.

Theorem 2.4

(Polynomial global-local mixing II). For any Lipschitz map \(\psi :\Sigma \rightarrow \mathscr {C}^\infty _c({\mathbb {R}})\), for any Lipschitz map \(\Phi :\Sigma \rightarrow W^1({\mathbb {R}})\), and for any \(\varepsilon >0\), there exists a constant \(C=C(\psi , \Phi , \varepsilon )\ge 0\) such that for all \(n \in {\mathbb {N}}\),

$$\begin{aligned} \left|{{\,\mathrm{cov}\,}}(\Phi \circ F^n, \psi ) \right|\le C n^{-\frac{1}{4}+\varepsilon }. \end{aligned}$$

If moreover \(\Phi :\Sigma \rightarrow W^1({\mathbb {R}}) \cap L^p({\mathbb {R}})\) for some \(1 \le p \le 2\), then

$$\begin{aligned} \left|{{\,\mathrm{cov}\,}}(\Phi \circ F^n, \psi ) \right|\le C n^{-\frac{1}{2p}+\varepsilon }. \end{aligned}$$

Remark 2.5

The bound in Theorem 2.4 is optimal: we will provide an example in §3.6 of a pair of global and local observables \(\Phi , \psi \) for which the correlations are bounded below by \(| {{\,\mathrm{cov}\,}}(\Phi \circ F^n, \psi ) | \ge B n^{-\frac{1}{2}}\) for some constant \(B>0\), and, on the other hand, Theorem 2.4 implies that for any \(\varepsilon >0\) there exists a constant \(C>0\) such that \(| {{\,\mathrm{cov}\,}}(\Phi \circ F^n, \psi ) | \le C n^{-\frac{1}{2}+\varepsilon }\).

We stated our results on the symbolic systems, as it makes the statements easier to read. Thanks to the semiconjugacy between the diffeomorphism and the symbolic dynamics, analogous classes of observables give analogous results on the original dynamics, as their regularity is preserved (see Lemma 3.2 for the details).

3 The Main Result

In this section, we first recall the definition of accessibility for F as in (3); then, we describe the classes of global and local observables we consider, and we state our main result. We deduce Theorems 2.2, 2.3 and 2.4 from Theorem 3.9. To help the reader in following the flow of the proofs, some of the lemmas stated in this section are proved in “Appendix B.”

3.1 Accessibility

For each point \(x\in \Sigma \), we define the stable and unstable set at x by, respectively,

$$\begin{aligned} \begin{aligned}&W^s(x) = \{ y \in \Sigma : \text { there exists }n \in {\mathbb {Z}}\text { such that } y_i=x_i \text { for all } i \ge n\}, \\&W^u(x) = \{ y \in \Sigma : \text { there exists }n \in {\mathbb {Z}}\text { such that } y_i=x_i \text { for all } i \le n \}. \end{aligned} \end{aligned}$$

By definition, for any \(y \in W^s(x)\), \(d_\theta (\sigma ^nx, \sigma ^ny) \rightarrow 0 \) exponentially fast and, similarly, for \(y \in W^u(x)\), \(d_\theta (\sigma ^{-n}x, \sigma ^{-n}y) \rightarrow 0 \); moreover, note that \(d_\theta \) attains a discrete set of values \(\{ \theta ^{i} \}_{i \in {\mathbb {N}}}\).

The skew product (3) is partially hyperbolic in the following sense. Let us denote by \(f_n(x) = \sum _{i=0}^{n-1} f \circ \sigma ^i(x)\) the n-th Birkhoff sum at x. Let us define for any \((x,r) \in \Sigma \times {\mathbb {R}}\)

$$\begin{aligned} \begin{aligned}&W^s(x,r) = \{ (y,s) \in \Sigma \times {\mathbb {R}}: y \in W^s(x) \text { and } s-r = \lim _{n \rightarrow \infty } f_n(x) - f_n(y)\}, \\&W^u(x,r) = \{ (y,s) \in \Sigma \times {\mathbb {R}}: y \in W^u(x) \text { and } s-r = \lim _{n \rightarrow \infty } f_n(\sigma ^{-n}y) - f_n(\sigma ^{-n}x)\}. \end{aligned} \end{aligned}$$

We equip \( \Sigma \times \mathbb {R}\) with the product distance given by

$$\begin{aligned} {\text {dist}}\big ((x,r), (y,s) \big ) = d_\theta (x,y) + |s-r|; \end{aligned}$$

it is easy to see that

$$\begin{aligned} \begin{aligned}&\lim _{n \rightarrow \infty } {\text {dist}}(F^n(x,r), F^n(y,s)) = 0 \text { exponentially fast, if }(y,s) \in W^s(x,r), \\&\lim _{n \rightarrow -\infty } {\text {dist}}(F^n(x,r), F^n(y,s)) = 0 \text { exponentially fast, if }(y,s) \in W^u(x,r). \end{aligned} \end{aligned}$$

The sets \(W^s(x,r)\) and \(W^u(x,r)\) are called the (strong) stable and (strong) unstable manifold at \((x,r) \in \Sigma \times {\mathbb {R}}\). Vertical lines \(\{x\} \times {\mathbb {R}}\) constitute the center manifolds, namely they form an invariant fibration and the action of F on each line is isometric.

We now define the accessibility property. An su-path from \((x,r)\) to \((y,s)\) is a finite sequence \((x^i,r_i) \in \Sigma \times {\mathbb {R}}\), for \(0 \le i \le m\) for some \(m \in {\mathbb {N}}\), such that \((x^0,r_0) = (x,r)\), \((x^m,r_m)= (y,s)\), and \((x^i,r_i) \in W^s(x^{i-1},r_{i-1})\) or \((x^i,r_i) \in W^u(x^{i-1},r_{i-1})\) for all \(1 \le i \le m\). We say that F is accessible if for any two points \((x,r), (y,s) \in \Sigma \times {\mathbb {R}}\) there is an su-path from \((x,r)\) to \((y,s)\).

A consequence of the accessibility property is the following fact, which will be proved in “Appendix B.”

Lemma 3.1

If F is accessible, then f is not cohomologous to zero.

This lemma is used in Sect. 6 when analyzing the analytic perturbation of the transfer operator. However, we still need to directly use accessibility of F in other parts of the proof.

3.2 The Classes of Global and Local Observables

We now describe the classes of global and local observables we consider, which we will call good global and local observables. Let us start by observing that the average defined in (4) is invariant under F.

Lemma 3.2

If \(\Phi \) is a global observable according to Definition 2.1, then \(\Phi \circ F\) is a global observable and \(\nu _{{{\,\mathrm{av}\,}}}(\Phi \circ F)=\nu _{{{\,\mathrm{av}\,}}}(\Phi )\).

The proof of Lemma 3.2, which can be found in “Appendix B,” is a consequence of the invariance of the Gibbs measure with respect to the dynamics.

We will denote by \(\mathscr {S}\) the Fréchet space of Schwartz functions on \({\mathbb {R}}\), with the family of seminorms

$$\begin{aligned} \Vert g\Vert _{a,\ell } := \sup _{r \in {\mathbb {R}}} |r|^a \left|\frac{\mathop {}\!\mathrm {d}^\ell }{(\mathop {}\!\mathrm {d}r)^\ell } g(r)\right|. \end{aligned}$$

We will say that a function \(\psi :\Sigma \rightarrow \mathscr {S}\) is Hölder if it is Hölder with respect to \(\Vert \cdot \Vert _{a,\ell }\) for all \(a,\ell \in {\mathbb {N}}\). Starting from Definition 2.1, we restrict ourselves from now on to smaller classes of observables.

Definition 3.3

(Good local observables) We denote by \(\mathscr {L}\subset L^1(\nu )\) the space of Hölder functions \(\psi :\Sigma \rightarrow \mathscr {S}\).

Let \(\eta \) be a complex measure over \({\mathbb {R}}\). We will denote by \(|\eta |\) the variation of \(\eta \) and by \(\Vert \eta \Vert _{{{\,\mathrm{TV}\,}}} = |\eta |({\mathbb {R}})\) its total variation. We recall that the Fourier–Stieltjes transform \({\widehat{\eta }}(r)\) of a complex measure \(\eta \) of finite total variation is the \(L^\infty \) function defined by

$$\begin{aligned} {\widehat{\eta }}(r):= \int _{\mathbb {R}}e^{-ir\xi } \mathop {}\!\mathrm {d}\eta (\xi ). \end{aligned}$$

Let \(\mathscr {A}\) be the set of such \({\widehat{\eta }}\)’s: the space \(\mathscr {A}\) of all Fourier–Stieltjes transforms is an algebra of functions, called the Fourier–Stieltjes algebra. We equip \(\mathscr {A}\) with the total variation norm, namely, for \({\widehat{\eta }}_1, {\widehat{\eta }}_2 \in \mathscr {A}\), we set

$$\begin{aligned} \Vert {\widehat{\eta }}_1 - {\widehat{\eta }}_2\Vert := \Vert \eta _1 - \eta _2\Vert _{{{\,\mathrm{TV}\,}}}. \end{aligned}$$
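For orientation, here are a few elementary instances of these definitions, given as an illustration and not taken from the source. Writing \(\delta _a\) for the Dirac measure at \(a \in {\mathbb {R}}\), the definition of the Fourier–Stieltjes transform gives

$$\begin{aligned} \widehat{\delta _a}(r) = e^{-ira}, \qquad \widehat{\tfrac{1}{2}(\delta _1 + \delta _{-1})}(r) = \cos (r), \qquad \Vert e^{-ira} - e^{-irb}\Vert = \Vert \delta _a - \delta _b\Vert _{{{\,\mathrm{TV}\,}}} = 2 \quad \text {for } a \ne b. \end{aligned}$$

In particular, the constant function 1 corresponds to \(\delta _0\), and distinct pure frequencies are always at distance 2 from each other in \(\mathscr {A}\), no matter how close \(a\) and \(b\) are.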

Definition 3.4

(Good global observables). We denote by \(\mathscr {G}\subset L^\infty (\nu )\) the space of Hölder functions \(\Phi :\Sigma \rightarrow \mathscr {A}\) which satisfy the following tightness condition:

[Displayed formula: tightness condition (TC) on the measures \(\eta _x\)]

where \(\widehat{\eta _x} = \Phi (x)\).

The tightness condition (TC) above ensures that one can control the large frequency behavior of \(\Phi (x,\cdot )\) uniformly in the point \(x \in \Sigma \). In particular, it will be exploited in the proofs of Lemma 10.2 and Proposition 10.3 in “Appendix C” to have some compactness property.

Remark 3.5

Any Hölder function from \(\Sigma \) into a metric space can be made Lipschitz by choosing a larger \(\theta \) and hence changing the metric \(d_\theta \) on \(\Sigma \). Since we are not imposing any condition on \(\theta \), here and henceforth we will assume that good local observables are Lipschitz maps \(\psi :\Sigma \rightarrow \mathscr {S}\) and good global observables are Lipschitz maps \(\Phi :\Sigma \rightarrow \mathscr {A}\) satisfying (TC).

The following lemma shows that elements of \(\mathscr {G}\) are indeed global observables; more precisely, the average \(\nu _{{{\,\mathrm{av}\,}}}\) of \(\Phi \) is the average of the values of the associated measures \(\eta _x\) at 0. See “Appendix B” for the short complex integration computation which leads to the result.

Lemma 3.6

If \(\Phi \in \mathscr {G}\), then \(\Phi \) is a global observable according to Definition 2.1 and

$$\begin{aligned} \nu _{{{\,\mathrm{av}\,}}}(\Phi ) = \int _{\Sigma } \eta _x(\{0\}) \mathop {}\!\mathrm {d}\mu (x), \end{aligned}$$

where as before, \(\widehat{\eta _x} = \Phi (x)\).

We conclude this section by providing a useful criterion to determine whether a given function is the Fourier–Stieltjes transform of a finite complex measure which satisfies (TC). A positive definite function is any function \(g :{\mathbb {R}}\rightarrow {\mathbb {C}}\) such that

$$\begin{aligned} \sum _{i,j=1}^ng(x_i-x_j)z_i\overline{z_j} \ge 0, \end{aligned}$$

for all \(n \ge 1\), \(x_i,x_j \in {\mathbb {R}}\) and \(z_i, z_j \in {\mathbb {C}}\). By Bochner’s theorem, a function g is continuous and positive definite if and only if it is the Fourier–Stieltjes transform \({\widehat{\eta }}\) of a finite positive measure \(\eta \) on \({\mathbb {R}}\) (see, e.g., [45, Theorem IX.9]). For example, it is easy to check that \(g(x) = e^{ix}\) or \(g(x)=\cos (x)\) are positive definite functions. A less trivial example is the function \(g(x)=\frac{1}{|x|+1}\); the fact that g is positive definite follows from Pólya’s Criterion: any positive, continuous, even function which, for positive x, is non-increasing, convex and tends to 0 for \(x\rightarrow \infty \) is the Fourier–Stieltjes transform of an \(L^1\) function, thus positive definite. We refer the reader to [36] and [49, Chapter 6] for more results concerning the Fourier–Stieltjes transform of measures.
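The defining inequality can be spot-checked numerically. The sketch below evaluates the quadratic form \(\sum _{i,j} g(x_i-x_j) z_i \overline{z_j}\) on randomly drawn points and coefficients; passing such a test is only a necessary condition and proves nothing, while a single negative value disproves positive definiteness. The function names, the sampling ranges and the tolerance are arbitrary choices made for this illustration.

```python
import cmath
import math
import random


def quadratic_form(g, xs, zs):
    """Evaluate sum_{i,j} g(x_i - x_j) * z_i * conj(z_j)."""
    total = 0j
    for xi, zi in zip(xs, zs):
        for xj, zj in zip(xs, zs):
            total += g(xi - xj) * zi * zj.conjugate()
    return total


def spot_check(g, n=6, trials=200, seed=0, tol=1e-9):
    """Return False as soon as one random sample violates the inequality."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.uniform(-10.0, 10.0) for _ in range(n)]
        zs = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
        q = quadratic_form(g, xs, zs)
        # The form must be real and non-negative, up to rounding errors.
        if q.real < -tol or abs(q.imag) > tol:
            return False
    return True
```

Applied to \(g(x)=e^{ix}\), \(g(x)=\cos (x)\) and \(g(x)=\frac{1}{|x|+1}\) the check is expected to pass, whereas \(g(x) = -\cos (x)\) fails immediately because its quadratic form is non-positive.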

Lemma 3.7

Any linear combination of Lipschitz positive definite functions is the Fourier–Stieltjes transform of a complex measure of finite total variation which satisfies (TC).

Since any complex measure of finite total variation is a linear combination of positive finite measures, the proof of the lemma above follows immediately from the following tail estimate, whose proof can be found in “Appendix B.”

Lemma 3.8

Let \(\eta \) be a finite positive measure on \({\mathbb {R}}\), let \(\Phi (r)\) be its Fourier–Stieltjes transform. Then, if \(\Phi (r)\) is Lipschitz of constant L, for all \(r >0\) we have

$$\begin{aligned} \eta ( {\mathbb {R}}\setminus [-r,r]) \le \frac{2L}{r}. \end{aligned}$$
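As a sanity check of the tail estimate on one explicit example (chosen here for illustration, not taken from the source), consider the Cauchy probability measure \(\mathop {}\!\mathrm {d}\eta (\xi ) = \frac{\mathop {}\!\mathrm {d}\xi }{\pi (1+\xi ^2)}\). Its Fourier–Stieltjes transform is \(e^{-|r|}\), which is Lipschitz with constant \(L=1\), and its tail is \(1 - \frac{2}{\pi }\arctan (r)\). The short script below compares this tail with the bound \(2L/r\); the function names are hypothetical.

```python
import math


def cauchy_tail(r: float) -> float:
    """Mass that the standard Cauchy measure gives to the complement of [-r, r]."""
    return 1.0 - (2.0 / math.pi) * math.atan(r)


def lemma_bound(r: float, lipschitz: float = 1.0) -> float:
    """Right-hand side 2L / r of the tail estimate."""
    return 2.0 * lipschitz / r


def check(radii) -> bool:
    """True if the tail is below the bound at every radius in the list."""
    return all(cauchy_tail(r) <= lemma_bound(r) for r in radii)
```

In this example the tail behaves like \(\frac{2}{\pi r}\) for large \(r\), so the bound is attained up to the constant factor \(\pi \).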

3.3 Statement of the Main Result

For any good global observable \(\Phi \in \mathscr {G}\) and for any \(r >0\), let us define the “low frequency variation” as

$$\begin{aligned} {{\,\mathrm{LF}\,}}(\Phi , r) := \int _{\Sigma } |\eta _x| \big ( (-r, r) \setminus \{0\} \big ) \mathop {}\!\mathrm {d}\mu (x). \end{aligned}$$
(6)

Notice that \({{\,\mathrm{LF}\,}}(\Phi , \cdot )\) is monotone and \({{\,\mathrm{LF}\,}}(\Phi , r) \rightarrow 0\) for \(r \rightarrow 0\). We are now ready to state our main result.

Theorem 3.9

(Quantitative global-local mixing). Assume that F, defined as in (3), is accessible. Then, for every \(\psi \in \mathscr {L}\), for every \(\Phi \in \mathscr {G}\), for any \(k \in {\mathbb {N}}\), and for every \(\varepsilon >0\), there exists a constant \(C=C(\Phi , \psi , k, \varepsilon )>0\) such that for every \(n \in {\mathbb {N}}\),

$$\begin{aligned} |{{\,\mathrm{cov}\,}}(\Phi \circ F^n, \psi )| \le C \left( {{\,\mathrm{LF}\,}}\Big (\Phi , n^{-\frac{1}{2}+ \varepsilon } \Big ) + n^{-k} \right) . \end{aligned}$$

The bound in Theorem 3.9 is the sum of two terms, namely a superpolynomial term and the contribution given by the measures \(|\eta _x|\) close to 0. In particular, if the support of the measures \(\eta _x\) does not intersect some neighborhood of 0, then the decay of correlations is superpolynomial. On the other hand, for example under the assumptions of Theorem 2.4, the measures \(|\eta _x|\) are absolutely continuous and the decay is polynomial.
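The two regimes can be seen on observables which do not depend on \(x\). The following computations are a sketch added for illustration; they assume that constant maps \(\Sigma \rightarrow \mathscr {A}\) satisfy (TC), which is plausible for a single finite measure but is not checked against the precise statement of the condition. For \(\Phi (x,r) = \cos (r)\) we have \(\eta _x = \frac{1}{2}(\delta _1 + \delta _{-1})\), while for \(\Phi (x,r) = e^{-|r|}\) we have \(\mathop {}\!\mathrm {d}\eta _x(\xi ) = \frac{\mathop {}\!\mathrm {d}\xi }{\pi (1+\xi ^2)}\). Hence, by (6),

$$\begin{aligned} {{\,\mathrm{LF}\,}}(\cos , r) = 0 \quad \text {for } 0< r \le 1, \qquad {{\,\mathrm{LF}\,}}\big (e^{-|\cdot |}, r\big ) = \int _{-r}^{r} \frac{\mathop {}\!\mathrm {d}\xi }{\pi (1+\xi ^2)} = \frac{2}{\pi } \arctan (r) \le \frac{2r}{\pi }. \end{aligned}$$

Substituting \(r = n^{-\frac{1}{2}+\varepsilon }\) with \(0< \varepsilon < \frac{1}{2}\), the bound of Theorem 3.9 reduces to \(C n^{-k}\) in the first case, and to \(C \big ( \frac{2}{\pi } n^{-\frac{1}{2}+\varepsilon } + n^{-k} \big )\) in the second case, in agreement with the exponent obtained in Theorem 2.4 for \(p=1\).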

In the rest of the section, we prove Theorems 2.2, 2.3 and 2.4 from the result above.

3.4 Proof of Theorem 2.2

We deduce Theorem 2.2 from Theorem 3.9.

By the theory of Fourier series, any \(p \in \mathscr {P} \subset \mathscr {C}^2({\mathbb {R}})\), by periodicity, is the Fourier–Stieltjes transform of a discrete measure \(\eta \) of the form \(\eta =\sum _{n \in {\mathbb {Z}}} a_n \delta _n\), where \(a_n \in {\mathbb {C}}\) and \(\delta _n\) is the Dirac measure at n. We claim that any Lipschitz map \(\Phi :\Sigma \rightarrow \mathscr {P}\) is contained in \(\mathscr {G}\). Theorem 3.9 then immediately implies the result, since \(|\eta _x| ( (-1,1) \setminus \{0\} )=0\) (where, as usual, we write \(\Phi (x) = \widehat{\eta _x}\)).

We first check the Lipschitz condition. For \(x \in \Sigma \), let us write \(\eta _x = \sum _{n \in {\mathbb {Z}}} a_n(x) \delta _n\). Since \(\Phi (x) \in \mathscr {C}^2({\mathbb {R}})\), it follows that \(\lim _{n \rightarrow \infty }|n^2 a_n(x)| = 0\). In particular, the sequence \(|a_n(x)| \cdot (1+i|n|)\) is square-summable (notice that \(ina_n(x)\) are the Fourier coefficients of the derivative \(\Phi (x)'\)). Thus, for any \(x,y \in \Sigma \), by Cauchy–Schwarz, we have

$$\begin{aligned} \begin{aligned} \Vert \Phi (x) - \Phi (y)\Vert&= \Vert \eta _x - \eta _y\Vert _{{{\,\mathrm{TV}\,}}} = \sum _{n \in {\mathbb {Z}}} |a_n(x)-a_n(y)| = \sum _{n \in {\mathbb {Z}}} |a_n(x)-a_n(y)| \cdot \frac{1+i|n|}{1+i|n|} \\&\le \left( \sum _{n \in {\mathbb {Z}}} \frac{1}{1+n^2} \right) ^{\frac{1}{2}} \cdot \left( \sum _{n \in {\mathbb {Z}}} |a_n(x)-a_n(y)|^2 + |a_n(x)-a_n(y)|^2 \cdot n^2 \right) ^{\frac{1}{2}}. \end{aligned} \end{aligned}$$

Hence, by the Plancherel formula, there exists a constant \(C >0\) such that

$$\begin{aligned} \Vert \Phi (x) - \Phi (y)\Vert\le & {} C \left( \Vert \Phi (x) - \Phi (y)\Vert _\infty + \Vert \Phi (x)' - \Phi (y)'\Vert _\infty \right) \\\le & {} C \Vert \Phi (x) - \Phi (y)\Vert _{\mathscr {C}^2}. \end{aligned}$$

This shows that \(\Phi :\Sigma \rightarrow \mathscr {A}\) is Lipschitz.

We now show that \(\Phi \) satisfies the tightness condition (TC). Since \(\Phi (x) \in \mathscr {C}^2({\mathbb {R}})\), we can bound \(|a_n(x)| \le \Vert \Phi (x)''\Vert _\infty n^{-2}\). Thus, for any \(r \ge 2\) we have

$$\begin{aligned} |\eta _x| ({\mathbb {R}}\setminus [-r,r]) = \sum _{|n|> r} |a_n(x)|\le \Vert \Phi (x)''\Vert _\infty \sum _{|n| > r} n^{-2}\le \Vert \Phi (x)\Vert _{\mathscr {C}^2} \, r^{-1}, \end{aligned}$$

which concludes the proof.

3.5 Proof of Theorems 2.3 and 2.4

Let us first prove Theorem 2.3. To this end, fix any \(p>1\) and consider as \(\mathscr {D}\) the space of Fourier transforms of functions \(f \in L^1 \cap L^p\) with power decay, namely, for which there exist constants \(A,a >0\) such that \(|f(\xi )| \le A |\xi |^{-a}\) for all \(|\xi | \ge 1\). Then, since \(\mathscr {S} \subset \mathscr {D} \subset \mathscr {G}\), it is clear that \(\mathscr {D}\) is dense in \(\mathscr {C}_0({\mathbb {R}})\).

Consider \(\Phi :\Sigma \rightarrow \mathscr {D}\), and write \(\Phi (x) = \widehat{\eta _x}\) where \(\mathop {}\!\mathrm {d}\eta _x = f_x(\xi ) \mathop {}\!\mathrm {d}\xi \), with \(f_x \in L^1 \cap L^p\). The Lipschitz condition we need to impose on \(\Phi \) to be a good global observable is \(\Vert \eta _x - \eta _y\Vert _{{{\,\mathrm{TV}\,}}} = \Vert f_x - f_y\Vert _{L^1} \le L(\Phi ) \, d_{\theta }(x,y)\) for some constant \(L(\Phi ) >0\). In order to conclude, we show that \(|\eta _x|\big ( -n^{-\frac{1}{2}+ \varepsilon }, n^{-\frac{1}{2}+ \varepsilon }\big )\) decays as a power of n for all \(x \in \Sigma \). Let \(0<{{\tilde{\alpha }}} = \frac{1}{2} - \varepsilon < \frac{1}{2}\). Then, using Hölder's inequality, with \(\frac{1}{p}+\frac{1}{q} = 1\),

$$\begin{aligned} |\eta _x|\big ( -n^{-{{\tilde{\alpha }}}}, n^{-{{\tilde{\alpha }}}}\big ) = \int _{-n^{-{{\tilde{\alpha }}}}}^{n^{-{{\tilde{\alpha }}}}} |f_x(\xi )| \mathop {}\!\mathrm {d}\xi \le \Vert f_x\Vert _{L^p} \left( 2 n^{-{{\tilde{\alpha }}}} \right) ^{\frac{1}{q}} \le 2 \Vert f_x\Vert _{L^p} \, n^{-{{\tilde{\alpha }}} (1-\frac{1}{p})}. \end{aligned}$$

Therefore, Theorem 2.3 holds for any \(\alpha \) of the form \({{\tilde{\alpha }}} (1\!-\!\frac{1}{p}) \!=\! (\frac{1}{2} \!-\! \varepsilon )(1\!-\!\frac{1}{p}) \), with \(\varepsilon >0\).

Let us now prove Theorem 2.4. It follows from [5, Theorem 4.2] that any function \(f \in W^1\) is the Fourier transform of a function \(g \in L^1 \cap L^2\), which satisfies \(\Vert g\Vert _2 = \Vert f\Vert _2\) and \(\Vert g\Vert _1 \le \Vert f\Vert _{W^1}\). This implies that any Lipschitz map \(\Phi :\Sigma \rightarrow W^1\) is Lipschitz also with respect to the total variation norm.

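Let us check that \(\Phi \) satisfies the tightness condition (TC). Denote \(\mathop {}\!\mathrm {d}\eta _x(\xi ) = f_x \mathop {}\!\mathrm {d}\xi \), with \(f_x \in L^1 \cap L^2\). Again, it follows from [5, Theorem 4.2] that \(\xi f_x(\xi ) \in L^2\) and \( \Vert \xi f_x(\xi )\Vert _2 \le \Vert \Phi (x)'\Vert _2\). For any \(r \ge 2\), by the Cauchy–Schwarz inequality and by Plancherel's formula, we have

$$\begin{aligned} |\eta _x| ({\mathbb {R}}\setminus [-r,r]) = \int _{|\xi |> r} |\xi f_x(\xi )| \cdot |\xi |^{-1} \mathop {}\!\mathrm {d}\xi \le \Vert \xi f_x(\xi )\Vert _2 \left( \int _{|\xi | > r} \xi ^{-2} \mathop {}\!\mathrm {d}\xi \right) ^{\frac{1}{2}} \le \sqrt{2}\, \Vert \Phi (x)'\Vert _2 \, r^{-\frac{1}{2}}. \end{aligned}$$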

The estimate \(|\eta _x|\big ( -n^{-\frac{1}{2}+ \varepsilon }, n^{-\frac{1}{2}+ \varepsilon } \big ) = O \big ( n^{-\frac{1}{4}+ \varepsilon }\big )\) follows from the Cauchy–Schwarz inequality exactly as above. If in addition \(\Phi \) has range in \(W^1 \cap L^p\), then the functions \(f_x\) belong to \(L^q\), where \(\frac{1}{p} + \frac{1}{q} = 1\), and one can conclude using Hölder's inequality again. This finishes the proof.

3.6 Example

We discuss a simple example, which shows that the bound in Theorem 2.4 cannot, in general, be improved. As a good local observable, let us consider any nonnegative \(\psi (x,r) = \psi (r) \in \mathscr {C}^\infty _c({\mathbb {R}})\) which equals 1 in the interval \(\left[ -\frac{1}{2},\frac{1}{2} \right] \) and, as a good global observable, let \(\Phi (x,r)= \Phi (r) = \frac{1}{1+|r|}\). Then, \(\Phi \in W^1({\mathbb {R}}) \cap L^p({\mathbb {R}})\) for any \(p >1\), so that Theorem 2.4 implies that for any \(\varepsilon >0\) there exists a constant \(C\ge 0\) such that

$$\begin{aligned} \left|{{\,\mathrm{cov}\,}}(\Phi \circ F^n, \psi ) \right|= \int _{\Sigma \times {\mathbb {R}}} (\Phi \circ F^n)\cdot \psi \mathop {}\!\mathrm {d}\nu \le C n^{-\frac{1}{2}+\varepsilon }. \end{aligned}$$

Let us show that there is a lower bound of order \(n^{-\frac{1}{2}}\).

Lemma 3.1 implies that f is not cohomologous to zero. Moreover, by the Central Limit Theorem, there exists a constant \(C'>0\) such that for any \(n \in {\mathbb {N}}\) sufficiently large, on a subset \(Y_n \subset \Sigma \) of measure at least 1/2, the Birkhoff sums \(f_n (x) = f(x) + \cdots + f(\sigma ^{n-1}x)\) are bounded by \(|f_n(x)| \le C' \sqrt{n}\). In particular, for any \(x \in Y_n\) and \( r \in \left[ -\frac{1}{2},\frac{1}{2} \right] \), we have

$$\begin{aligned} \Phi \circ F^n(x,r) = \Phi (r+ f_n(x)) = \frac{1}{1+|r+f_n(x)|} \ge \frac{1}{1+|r|+|f_n(x)|} \ge \frac{1}{2 C' \sqrt{n}}. \end{aligned}$$

Thus, for any \(n \in {\mathbb {N}}\) sufficiently large, it follows that

$$\begin{aligned} \begin{aligned} \int _{\Sigma \times {\mathbb {R}}} (\Phi \circ F^n)\cdot \psi \mathop {}\!\mathrm {d}\nu&\ge \int _{Y_n \times \left[ -\frac{1}{2},\frac{1}{2} \right] } (\Phi \circ F^n)\cdot \psi \ \mathop {}\!\mathrm {d}\nu \ge \nu \left( Y_n \times \left[ -\frac{1}{2},\frac{1}{2} \right] \right) \frac{1}{2 C' \sqrt{n}} \\&\ge \frac{1}{4C' \sqrt{n}}. \end{aligned} \end{aligned}$$

We have shown that there exists a constant \(C = (4C')^{-1}\) and, for any \(\varepsilon >0\), there exists a constant \(C_\varepsilon >0\) such that we can bound the correlations by \(C n^{-\frac{1}{2}} \le {{\,\mathrm{cov}\,}}(\Phi \circ F^n, \psi ) \le C_\varepsilon n^{-\frac{1}{2}+\varepsilon }\), hence the bound of Theorem 2.4 is, in this case, optimal.
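As a rough numerical illustration of the lower bound, the following sketch replaces the Birkhoff sums \(f_n\) by a simple symmetric random walk \(S_n\). This is an assumption made only for the toy computation: it corresponds to a hypothetical function taking the values \(\pm 1\) according to the first symbol of a full 2-shift with the uniform Bernoulli measure, and it is not claimed to satisfy the hypotheses of Theorem 2.4. The sketch computes \({\mathbb {E}}[\Phi (S_n)]\) exactly and rescales it by \(\sqrt{n}\).

```python
import math


def phi(r):
    """Global observable Phi(r) = 1 / (1 + |r|) from the example."""
    return 1.0 / (1.0 + abs(r))


def expected_phi(n):
    """Exact value of E[Phi(S_n)] for a simple symmetric random walk S_n.

    S_n = 2K - n with K ~ Binomial(n, 1/2). The walk is a toy stand-in
    (an assumption) for the Birkhoff sums f_n of the text.
    """
    total = 0.0
    for k in range(n + 1):
        prob = math.comb(n, k) / 2**n  # P(K = k)
        total += prob * phi(2 * k - n)
    return total


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The rescaled quantity should stay bounded away from zero.
        print(n, math.sqrt(n) * expected_phi(n))
```

In this toy model, Chebyshev's inequality gives \({\mathbb {P}}(|S_n| \le 2\sqrt{n}) \ge \frac{3}{4}\), which plays the role of the set \(Y_n\) above and yields \(\sqrt{n}\, {\mathbb {E}}[\Phi (S_n)] \ge \frac{1}{4}\) for all \(n \ge 1\).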

4 Skew Products Over One-Sided Subshifts

To prove Theorem 3.9, we have to first prove analogous statements for one-sided subshifts. In this section, we discuss the case of skew products over topologically mixing one-sided subshifts of finite type.

Let \(\sigma :X \rightarrow X\) be a topologically mixing one-sided subshift of finite type, equipped with a Gibbs measure \(\mu = \mu _u\) with respect to the potential u. For \(0<\theta < 1\), the distance \(d^+_\theta \) and the space of Lipschitz functions \(\mathscr {F}^+_{\theta }\) are defined analogously to the case of the two-sided shift. Let \(f^+ \in \mathscr {F}^+_{\theta }\) be a real-valued Lipschitz function with zero average and consider the skew shift

$$\begin{aligned} F^+:X \times {\mathbb {R}}\rightarrow X \times {\mathbb {R}}, \quad F^+(x, r) = (\sigma x, r + f^+(x)). \end{aligned}$$
(7)

Denote by \(\nu \) the infinite measure \(\mu \times {{\,\mathrm{Leb}\,}}\) on \(X \times {\mathbb {R}}\). For any pair of global and local observables \(\Phi , \psi \) over \(X \times {\mathbb {R}}\), define the analogous correlation function

$$\begin{aligned} {{\,\mathrm{cov}\,}}(\Phi \circ (F^+)^n, \psi ) := \int _{X \times {\mathbb {R}}} ( \Phi \circ (F^+)^n) (x,r) \cdot \overline{\psi (x,r)} \mathop {}\!\mathrm {d}\nu (x,r). \end{aligned}$$

4.1 Good Global and Local Observables for Skew Shifts Over One-Sided Subshifts

The classes of global and local observables we consider in this case are described below. In this setting, we require less regularity of the observables than in the case of two-sided shifts.

Definition 4.1

(Good local observables—one-sided case). Let \(\mathscr {L}^+\subset L^1(\nu )\) be the space of functions \(\psi :X \rightarrow \mathscr {S}\) such that, for every \(\ell \in {\mathbb {N}}\), the function \(x \mapsto \partial ^{\ell } \psi (x)\) from X to \(L^1({\mathbb {R}})\) is Lipschitz. For every \(\psi \in \mathscr {L}^+\), denote by \({{\,\mathrm{Max}\,}}_\ell (\psi )\) and \({{\,\mathrm{Lip}\,}}_\ell (\psi )\) the minimum constants such that

$$\begin{aligned} \left\Vert \partial ^{\ell } \psi (x) \right\Vert _{L^1({\mathbb {R}})} \le {{\,\mathrm{Max}\,}}_\ell (\psi ) \text { and } \left\Vert \partial ^{\ell }\psi (x) -\partial ^{\ell }\psi (y)\right\Vert _{L^1({\mathbb {R}})} \le {{\,\mathrm{Lip}\,}}_\ell (\psi ) d^+_{\theta }(x,y).\qquad \end{aligned}$$
(8)

Let us remark that, if \(\psi \in \mathscr {L}^+\), then, for every fixed \(x \in X\), the Fourier transform \(\widehat{\psi (x)}\) of \(\psi (x) \in \mathscr {S}\) is a Schwartz function as well. For any fixed \(\xi \in {\mathbb {R}}\), we denote by \({\widehat{\psi }}_\xi :X \rightarrow {\mathbb {C}}\) the function \({\widehat{\psi }}_\xi (x)=\widehat{\psi (x)}(\xi )\).

Lemma 4.2

Let \(\psi \in \mathscr {L}^+\). For every \(\xi \in {\mathbb {R}}\), we have \({\widehat{\psi }}_\xi \in \mathscr {F}^+_\theta \). Moreover, for every \(\ell \ge 0\), and for all \(\xi \ne 0\) we have

$$\begin{aligned} \left\Vert {\widehat{\psi }}_\xi \right\Vert _{\infty } \le {{\,\mathrm{Max}\,}}_\ell (\psi ) |\xi |^{-\ell } \quad \text { and }\quad |{\widehat{\psi }}_\xi |_{\theta } \le {{\,\mathrm{Lip}\,}}_\ell (\psi ) |\xi |^{-\ell }. \end{aligned}$$

Proof

For any \(\xi \ne 0\), \(x \in X\), and \(\ell \ge 0\) we have, by assumption in equation (8),

$$\begin{aligned} |\xi ^{\ell }\widehat{\psi (x)} (\xi )| = |\widehat{\partial ^{\ell }\psi (x)} (\xi )|\le \left\Vert \widehat{\partial ^{\ell }\psi (x)}\right\Vert _{\infty } \le \left\Vert \partial ^{\ell }\psi (x)\right\Vert _{L^1} \le {{\,\mathrm{Max}\,}}_\ell (\psi ), \end{aligned}$$

hence \(\sup _x |{\widehat{\psi }}_\xi (x)| \le {{\,\mathrm{Max}\,}}_\ell (\psi ) |\xi |^{-\ell }\). Similarly, for any \(x \ne y \in X\),

$$\begin{aligned} \begin{aligned} |\xi ^{\ell }[\widehat{\psi (x)} (\xi )- \widehat{\psi (y)} (\xi )]|&= |\widehat{\partial ^{\ell }\psi (x)} (\xi ) - \widehat{\partial ^{\ell }\psi (y)} (\xi )|\le \left\Vert \partial ^{\ell }\psi (x) - \partial ^{\ell }\psi (y)\right\Vert _{L^1} \\&\le {{\,\mathrm{Lip}\,}}_\ell (\psi ) d^+_{\theta }(x,y), \end{aligned} \end{aligned}$$

so that, for any fixed \(\xi \ne 0\), we have \(|{\widehat{\psi }}_\xi |_{\theta } \le {{\,\mathrm{Lip}\,}}_\ell (\psi ) |\xi |^{-\ell }\). \(\square \)
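The mechanism behind Lemma 4.2 is the classical estimate \(|\xi |^{\ell } |{\widehat{\psi }}(\xi )| \le \Vert \partial ^{\ell } \psi \Vert _{L^1}\). The sketch below checks it for a single Gaussian \(\psi (r) = e^{-r^2/2}\), which is a hypothetical choice made for illustration only and does not depend on \(x\); the \(L^1\) norms of its first two derivatives are computed by hand and hard-coded.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def psi_hat(xi):
    """Fourier transform of psi(r) = exp(-r^2 / 2).

    Convention: psi_hat(xi) = integral of exp(-i r xi) psi(r) dr.
    """
    return SQRT_2PI * math.exp(-xi * xi / 2.0)


# L^1 norms of psi, psi' and psi'' for the Gaussian, computed by hand.
L1_NORMS = {
    0: SQRT_2PI,              # integral of exp(-r^2 / 2)
    1: 2.0,                   # integral of |r| exp(-r^2 / 2)
    2: 4.0 * math.exp(-0.5),  # integral of |r^2 - 1| exp(-r^2 / 2)
}


def bound_holds(xi, ell):
    """Check |xi|^ell * |psi_hat(xi)| <= || d^ell psi ||_{L^1}."""
    return abs(xi) ** ell * abs(psi_hat(xi)) <= L1_NORMS[ell] + 1e-12


if __name__ == "__main__":
    grid = [k / 10.0 for k in range(-50, 51)]
    print(all(bound_holds(xi, ell) for xi in grid for ell in (0, 1, 2)))
```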

Let us recall that \(\mathscr {A} \subset L^{\infty }\) denotes the space of Fourier–Stieltjes transforms of complex measures with finite total variation.

Definition 4.3

(Good global observables—one-sided case). Let \(\mathscr {G}^+ \subset L^{\infty }(\nu )\) be the space of bounded functions \(\Phi :X \rightarrow \mathscr {A}\). For \(\Phi \in \mathscr {G}^+\), we define

$$\begin{aligned} \Vert \Phi \Vert _{\mathscr {G}^+} := \sup _{x \in X} \left\Vert \eta _x\right\Vert _{{{\,\mathrm{TV}\,}}}, \end{aligned}$$

where as usual, \(\widehat{\eta _x} = \Phi (x)\).

In the next sections, we will deal only with the non-invertible case of the skew product \(F^+\) and we will often suppress the \(+\) in the notations introduced above, as it should not generate confusion. In Sect. 10, we will return to the invertible setting.

4.2 Collapsed Accessibility

In the case of one-sided shifts, the accessibility assumption is replaced by the following notion of collapsed accessibility.

Definition 4.4

A Lipschitz function \(f :X \rightarrow {\mathbb {R}}\) has the collapsed accessibility property if there are constants C and N such that the following holds: for any \(x \in X, t \in [0,1],\) and \(n \ge 2 N\), there is a sequence of points

$$\begin{aligned} x_1, y_1, x_2, y_2, \ldots , y_m, x_{m+1} \end{aligned}$$

such that

  (1) \(m \le N\) and \(x_1 = x_{m+1} = x\);

  (2) \(\sigma ^n x_i = \sigma ^n y_i\);

  (3) \(d(y_i, x_{i+1}) \le C \theta ^n\); and

  (4) \(t = \sum _{k=1}^m \left[ f_n(x_k) - f_n(y_k) \right] \).

The adjective “collapsed” refers to the fact that local stable manifolds are collapsed to points when going from \(\Sigma \times {\mathbb {R}}\) to \(X \times {\mathbb {R}}.\)

In order to prove Theorem 3.9, we will see in Sect. 10 that we can reduce an accessible skew product F to a skew product \(F^+\) over a one-sided shift such that \(f^+\) enjoys the collapsed accessibility property.

4.3 The One-Sided Version of the Main Theorem

We state our main theorem in the case of skew products over one-sided subshifts which have the collapsed accessibility property. In Sect. 10, we will deduce Theorem 3.9 from Theorem 4.5.

Theorem 4.5

(Quantitative global-local mixing for one-sided subshifts). Assume that \(f^+\), defined as in (7), has the collapsed accessibility property. Then, for every \(\psi \in \mathscr {L}^+\), for every \(\Phi \in \mathscr {G}^+\), for any \(k \in {\mathbb {N}}\), and for every \(\varepsilon >0\), there exists a constant \(C=C(\Phi , \psi , k, \varepsilon )>0\) such that for every \(n \in {\mathbb {N}}\),

$$\begin{aligned} |{{\,\mathrm{cov}\,}}(\Phi \circ F^n,\psi )| \le C \left( {{\,\mathrm{LF}\,}}(\Phi , n^{-\frac{1}{2}+ \varepsilon }) + n^{-k} \right) . \end{aligned}$$

The “low frequency” term \({{\,\mathrm{LF}\,}}(\Phi , \cdot )\) in Theorem 4.5 is defined exactly as in (6), except that the integral is on X instead of \(\Sigma \).

4.4 An Expression for the Correlation Function

The main tool to study the correlations is the transfer operator. We recall the relevant definitions.

We denote by \(L = L_\sigma :L^1(\mu ) \rightarrow L^1(\mu )\) the transfer operator for the base dynamics \(\sigma :X \rightarrow X\), namely the operator on \( L^1(\mu )\) defined implicitly by

$$\begin{aligned} \int _{X} (v \circ \sigma ) w \mathop {}\!\mathrm {d}\mu = \int _{X} v \cdot (Lw) \mathop {}\!\mathrm {d}\mu , \end{aligned}$$

for \(v \in L^{\infty }(\mu )\) and \(w \in L^1(\mu )\). Similarly, we denote by \(L_{F^+}:L^1(\nu ) \rightarrow L^1(\nu )\) the transfer operator associated to \(F^+\), that is, the operator which, for every \(\Phi \in L^{\infty }(\nu )\) and \(\psi \in L^1(\nu )\), satisfies

$$\begin{aligned} \int _{X\times {\mathbb {R}}} (\Phi \circ F^+) \psi \mathop {}\!\mathrm {d}\nu = \int _{ X \times {\mathbb {R}}} \Phi \cdot (L_{F^+}\psi ) \mathop {}\!\mathrm {d}\nu . \end{aligned}$$

Explicitly, for any \(n \in {\mathbb {N}}\), we can write

$$\begin{aligned} (L^nw)(x) = \sum _{\sigma ^n y=x}e^{u_n(y)}w(y) \quad \text { and } \quad (L_{F^+}^n\psi ) (x,r) = \sum _{\sigma ^ny=x}e^{u_n(y)}\psi (y, r-f_n(y)).\nonumber \\ \end{aligned}$$
(9)

For any \(z \in {\mathbb {C}}\), let us further define the twisted transfer operator \({\mathcal {L}}_{z} :L^1(\mu ) \rightarrow L^1(\mu )\) by

$$\begin{aligned} (\mathcal {L}_{z}^nw)(x) = \sum _{\sigma ^ny=x}e^{u_n(y) - i z f_n(y)} w(y), \end{aligned}$$

where u is the potential defining the Gibbs measure and \(u_n\) its cocycle. Notice that all the operators described above restrict to operators acting on \(\mathscr {F}^+_\theta \).
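To make the formula for \(\mathcal {L}_z^n\) concrete, here is a minimal sketch on a toy model chosen for illustration only: the full one-sided shift on two symbols, with hypothetical weights \(e^{u}\) and a hypothetical zero-mean function f that both depend only on the first symbol. In this model every word is admissible, and \((\mathcal {L}_z^n 1)(x)\) reduces to the n-th power of a one-step sum, which the code compares with the brute-force sum over preimages.

```python
import cmath
import itertools
import math

# Hypothetical toy data: weights p_a = exp(u) and a function f, both
# depending only on the first symbol of a point in the full 2-shift.
P = {0: 0.3, 1: 0.7}
F = {0: 0.7, 1: -0.3}  # zero mean: 0.3 * 0.7 + 0.7 * (-0.3) = 0


def twisted_power_of_one(z, n):
    """(L_z^n 1)(x) as a sum over all n-step preimages y = a_1 ... a_n x.

    The summand exp(u_n(y) - i z f_n(y)) only depends on the word
    a_1 ... a_n, so the result does not depend on x in this toy model.
    """
    total = 0j
    for word in itertools.product(P, repeat=n):
        u_n = sum(math.log(P[a]) for a in word)
        f_n = sum(F[a] for a in word)
        total += cmath.exp(u_n - 1j * z * f_n)
    return total


def closed_form(z, n):
    """Same quantity, via the one-step sum raised to the n-th power."""
    one_step = sum(P[a] * cmath.exp(-1j * z * F[a]) for a in P)
    return one_step**n


if __name__ == "__main__":
    print(abs(twisted_power_of_one(1.3, 8) - closed_form(1.3, 8)))
```

For \(z = 0\) the twist disappears and one recovers the normalization \(L^n 1 = 1\); for real \(z\) the modulus is at most 1, in line with the pointwise bound \(|\mathcal {L}_z^n w| \le L^n |w|\).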

Proposition 4.6

Let \(\psi \in \mathscr {L}^+\) and \(\Phi \in \mathscr {G}^+\). Then, for every \(n \in {\mathbb {N}}\) we have

$$\begin{aligned} \int _{X \times {\mathbb {R}}} ( \Phi \circ (F^+)^n) \cdot {\overline{\psi }} \mathop {}\!\mathrm {d}\nu = \int _X \int _{-\infty }^\infty (\mathcal {L}_\xi ^n {\widehat{\psi }}_\xi )(x) \mathop {}\!\mathrm {d}\eta _x(\xi )\, \mathop {}\!\mathrm {d}\mu (x). \end{aligned}$$

Proof

By definition of the transfer operator \(L_{F^+}\), we can write

$$\begin{aligned} \begin{aligned} \int _{X \times {\mathbb {R}}} \Phi \circ (F^+)^n (x,r) \cdot \overline{\psi (x,r)} \mathop {}\!\mathrm {d}\nu (x,r)&= \int _{X \times {\mathbb {R}}} \Phi (x,r) \cdot L_{F^+}^n\overline{ \psi (x,r) }\mathop {}\!\mathrm {d}\nu \\&= \int _X \int _{-\infty }^\infty \Phi (x,r) \cdot L_{F^+}^n\overline{ \psi (x,r) }\mathop {}\!\mathrm {d}r \mathop {}\!\mathrm {d}\mu , \end{aligned} \end{aligned}$$

where the applicability of the Fubini–Tonelli Theorem follows immediately from the definition of \(\mathscr {G}^+\) and \(\mathscr {L}^+\). Since \(\Phi (x)\) is the Fourier–Stieltjes transform of a measure \(\eta _x\) we get

$$\begin{aligned} \begin{aligned} \int _{X \times {\mathbb {R}}} ( \Phi \circ (F^+)^n) \cdot {\overline{\psi }} \mathop {}\!\mathrm {d}\nu&= \int _X \int _{-\infty }^\infty \left( \int _{-\infty }^{\infty } e^{-ir\xi } \mathop {}\!\mathrm {d}\eta _x(\xi ) \right) \overline{L_{F^+}^n\psi (x,r)}\mathop {}\!\mathrm {d}r \mathop {}\!\mathrm {d}\mu \\&= \int _X \int _{-\infty }^\infty \int _{-\infty }^{\infty } \overline{ e^{ir\xi } L_{F^+}^n\psi (x,r)} \mathop {}\!\mathrm {d}\eta _x(\xi ) \mathop {}\!\mathrm {d}r \mathop {}\!\mathrm {d}\mu . \end{aligned} \end{aligned}$$

For every \(x \in X\), we have

$$\begin{aligned} \begin{aligned} \int _{-\infty }^\infty \int _{-\infty }^{\infty } |L_{F^+}^n\psi (x,r)| \mathop {}\!\mathrm {d}|\eta _x|(\xi ) \mathop {}\!\mathrm {d}r&\le \Vert \Phi \Vert _{\mathscr {G}^+} \int _{-\infty }^\infty |L_{F^+}^n\psi (x,r)| \mathop {}\!\mathrm {d}r \le \Vert \Phi \Vert _{\mathscr {G}^+} \Vert \psi (x)\Vert _1 \\&\le \Vert \Phi \Vert _{\mathscr {G}^+} {{\,\mathrm{Max}\,}}_0(\psi ), \end{aligned} \end{aligned}$$

thus we can again apply the Fubini–Tonelli Theorem to get

$$\begin{aligned} \begin{aligned}&\int _{X \times {\mathbb {R}}} ( \Phi \circ (F^+)^n) \cdot {\overline{\psi }} \mathop {}\!\mathrm {d}\nu = \int _X \int _{-\infty }^\infty \overline{ \left( \int _{-\infty }^{\infty } e^{ir\xi } L_{F^+}^n\psi (x,r) \mathop {}\!\mathrm {d}r\right) } \mathop {}\!\mathrm {d}\eta _x(\xi ) \mathop {}\!\mathrm {d}\mu \\&\quad = \int _X \int _{-\infty }^\infty \overline{ \widehat{L_{F^+}^n\psi } (x,-\xi ) } \mathop {}\!\mathrm {d}\eta _x(\xi ) \mathop {}\!\mathrm {d}\mu = \int _X \int _{-\infty }^\infty \widehat{L_{F^+}^n\psi } (x,\xi ) \mathop {}\!\mathrm {d}\eta _x(\xi ) \mathop {}\!\mathrm {d}\mu . \end{aligned} \end{aligned}$$

The conclusion follows by construction due to the equality

$$\begin{aligned} \widehat{L_{F^+}^n\psi } (x,\xi ) = (\mathcal {L}_{\xi }^n{\widehat{\psi }}_\xi )(x). \end{aligned}$$

\(\square \)

5 Cancellations for Twisted Transfer Operators

From Proposition 4.6, it is clear that, in order to estimate the correlations, we need to study the twisted transfer operators \(\mathcal {L}_\xi \), for real frequencies \(\xi \in {\mathbb {R}}\). The aim of this section is to show that the collapsed accessibility property can be exploited to obtain some cancellations in the expression of \(\mathcal {L}_\xi \).

Let us fix a complex-valued Lipschitz function \(\tilde{g}:X \rightarrow {\mathbb {C}}\), and let \(|\tilde{g}| = g\). In this section, we use a tilde to denote a complex-valued or “twisted” function, and the same letter without a tilde to denote its absolute value. We denote by \(\mathcal {L} :\mathscr {F}_\theta ^+ \rightarrow \mathscr {F}_\theta ^+\) the operator defined by

$$\begin{aligned} (\mathcal {L} \tilde{v})(x) = \sum _{\sigma y = x} \tilde{g}(y) \cdot \tilde{v}(y), \end{aligned}$$

and by \(L:\mathscr {F}_\theta ^+ \rightarrow \mathscr {F}_\theta ^+\) the positive “untwisted” operator

$$\begin{aligned} (L v)(x) = \sum _{\sigma y = x} g(y) \cdot v(y). \end{aligned}$$

Up to conjugating L with a suitable multiplication operator, we can assume that \(L 1 = 1\), namely

$$\begin{aligned} \sum _{\sigma y = x} g(y) = 1, \end{aligned}$$

for all \(x \in X\).

Notice that, for the operator \(\mathcal {L}_\xi \) defined in the previous section, we have \(\tilde{g}= \exp (u - i \xi f)\), where u is the potential for the Gibbs measure \(\mu \).

One can easily see that \(|\mathcal {L} \tilde{v}(x)| \le Lv(x)\). Moreover, recall that \(| \cdot |_\theta \) is the Lipschitz seminorm defined in (2). Then, the following Lasota–Yorke inequality holds, see [41, Proposition 2.1].

Lemma 5.1

(Basic inequality). There exists a constant \(C_0>0\) such that

$$\begin{aligned} |\mathcal {L} \tilde{v}|_{\theta } \le \theta |\tilde{v}|_{\theta } + R\left\Vert \tilde{v}\right\Vert _{\infty } , \end{aligned}$$

where \(R= C_0 |\tilde{g}|_\theta \).

By induction,

$$\begin{aligned} (L^n v)(x) = \sum _{\sigma ^n y = x} g_n(y) v(y) \quad \text { and }\quad (\mathcal {L}^n \tilde{v})(x) = \sum _{\sigma ^n y = x} \tilde{g}_n(y) \tilde{v}(y) \end{aligned}$$

where \(g_n\) and \(\tilde{g}_n\) are the cocycles

$$\begin{aligned} g_n(x) = g(x) g(\sigma (x)) \cdots g(\sigma ^{n-1}(x)) \quad \text { and }\quad \tilde{g}_n(x) = \tilde{g}(x) \tilde{g}(\sigma (x)) \cdots \tilde{g}(\sigma ^{n-1}(x)). \end{aligned}$$

It follows that for all \(n \ge 1\) we have

$$\begin{aligned} |\mathcal {L}^n \tilde{v}|_{\theta } \le \theta ^n |\tilde{v}|_{\theta } + \frac{R}{1-\theta } \left\Vert \tilde{v}\right\Vert _{\infty }. \end{aligned}$$
(10)
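The passage from Lemma 5.1 to (10) is a geometric-series argument, which uses that \(\Vert \mathcal {L}^k \tilde{v}\Vert _\infty \le \Vert \tilde{v}\Vert _\infty \) since \(L1 = 1\). The sketch below iterates the worst case permitted by Lemma 5.1 for scalar quantities and compares it with the right-hand side of (10); all numerical values are arbitrary illustrative inputs.

```python
def iterate_bound(theta, R, sup_norm, seminorm0, n):
    """Iterate a_{k+1} = theta * a_k + R * sup_norm for n steps.

    This is the largest growth of the seminorm allowed by Lemma 5.1,
    given that the sup norm does not increase under the operator.
    """
    a = seminorm0
    for _ in range(n):
        a = theta * a + R * sup_norm
    return a


def closed_bound(theta, R, sup_norm, seminorm0, n):
    """Right-hand side of (10)."""
    return theta**n * seminorm0 + R / (1.0 - theta) * sup_norm


if __name__ == "__main__":
    for n in (1, 5, 50):
        print(n, iterate_bound(0.6, 2.0, 1.0, 10.0, n),
              closed_bound(0.6, 2.0, 1.0, 10.0, n))
```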

5.1 Collapsed Accessibility and Cancellation Pairs

Let us fix a positive constant \(\varepsilon > 0\), and an integer \(n \ge 1\). We assume \(\varepsilon < \frac{1}{2}\) and \(\varepsilon < 1 - \theta \). Define

$$\begin{aligned} H := \max \left\{ 1, \frac{2 R}{1-\theta } \right\} . \end{aligned}$$

A Lipschitz function \(\tilde{v}: X \rightarrow {\mathbb {C}}\) is a nice observable if \(|\tilde{v}|_\theta \le H\) and \(1 - \varepsilon< v(x) < 1\) for all \(x \in X\) (as always, \(|\tilde{v}| = v\)).

We say \(\mathcal {L}\) has \((\varepsilon ,n)\)-cancellation if for any observable \(\tilde{v}\) with \(|\tilde{v}|_\theta \le H\) and \(0 \le v(x) < 1\) for all \(x \in X\), there is an integer \(0 \le k \le n\) and a point \(x \in X\) such that \( |\mathcal {L}^k \tilde{v}(x)| \le 1 - \varepsilon \). We say \(\mathcal {L}\) has strong \((\varepsilon ,n)\)-cancellation if for every nice observable \(\tilde{v}\), there is a point \(x \in X\) such that \( |\mathcal {L}^n \tilde{v}(x)| \le 1 - \varepsilon \). One can see that strong \((\varepsilon ,n)\)-cancellation implies \((\varepsilon ,n)\)-cancellation.

A pair of points \((x, y)\) in X is a stable pair if \(\sigma ^n x = \sigma ^n y\). We say a stable pair \((x, y)\) is a cancellation pair for a nice observable \(\tilde{v}\) if

$$\begin{aligned} |\tilde{g}_n(x) \tilde{v}(x) + \tilde{g}_n(y) \tilde{v}(y)| \le g_n(x) v(x) + g_n(y) v(y) - \varepsilon . \end{aligned}$$

Lemma 5.2

If \((x, y)\) is a cancellation pair for \(\tilde{v}\), then

$$\begin{aligned} |\mathcal {L}^n \tilde{v}(p)| \le 1 - \varepsilon , \end{aligned}$$

where \(p = \sigma ^n x = \sigma ^n y\).

Proof

By definition of cancellation pair, we have

$$\begin{aligned} \begin{aligned} |\mathcal {L}^n \tilde{v}(p)|&\le |\tilde{g}_n(x) \tilde{v}(x) + \tilde{g}_n(y) \tilde{v}(y)| + \left|\sum _{\sigma ^n q = p, \, q \ne x,y} \tilde{g}_n(q) \tilde{v}(q) \right|\\&\le g_n(x) v(x) + g_n(y) v(y) - \varepsilon + \sum _{\sigma ^n q = p,\, q \ne x,y} g_n(q) v(q) \\&\le \left( \sum _{\sigma ^n q = p} g_n(q) \right) - \varepsilon = 1-\varepsilon , \end{aligned} \end{aligned}$$

where we used that \(L^n1=1\). \(\square \)

For a stable pair \((x, y)\), define the phase of \((x, y)\) as

$$\begin{aligned} \arg \left( \frac{\tilde{g}_n(y)}{\tilde{g}_n(x)} \right) . \end{aligned}$$

Here, arg is the complex argument and so the phase is the angle between \(\tilde{g}_n(x)\) and \(\tilde{g}_n(y)\) in the complex plane. For the most part, we can just think of this value as an angle. However, if we include it in an inequality, we will assume it is a real number between \(-\pi \) and \(\pi \).

Define the stable tolerance of \((x, y)\) as the number \(0< \delta < \pi \) which satisfies

$$\begin{aligned} 1 - \cos (\delta ) = \varepsilon \left( \frac{1}{g_n(x)} + \frac{1}{g_n(y)} \right) . \end{aligned}$$

Note that the right-hand side must be less than two for this to be well defined. In practice, we will always choose \(\varepsilon \) small enough so that this is the case.

Proposition 5.3

Let \((x, y)\) be a stable pair, and \(\tilde{v}\) a nice observable. If \((x, y)\) is not a cancellation pair for \(\tilde{v}\), then

$$\begin{aligned} -\delta \le \arg \left( \frac{\tilde{g}_n(x)}{\tilde{g}_n(y)} \frac{\tilde{v}(x)}{\tilde{v}(y)} \right) \le \delta , \end{aligned}$$

where \(\delta \) is the stable tolerance of \((x, y)\).

In other words, if s is the phase of \((x, y)\), then

$$\begin{aligned} s - \delta \le \arg \left( \frac{\tilde{v}(x)}{\tilde{v}(y)} \right) \le s + \delta \end{aligned}$$

ignoring issues of the angle only being defined up to a multiple of \(2\pi \).

To prove the proposition, we first establish the following lemma.

Lemma 5.4

Let \(z_1\) and \(z_2\) be non-zero complex numbers with \(\alpha = \arg (\frac{z_1}{z_2})\). If

$$\begin{aligned} \varepsilon \left( \frac{1}{|z_1|} + \frac{1}{|z_2|} \right) \le 2(1 - \cos (\alpha )) \end{aligned}$$

then

$$\begin{aligned} |z_1 + z_2| \le |z_1| + |z_2| - \varepsilon . \end{aligned}$$

Proof

Write \(z_0 = z_1 + z_2\) and \(r_k = |z_k|\) for \(k = 0,1,2\). We wish to show that \(r_0^2 \le (r_1 + r_2 - \varepsilon )^2\). The cosine rule implies that

$$\begin{aligned} r_0^2 = r_1^2 + r_2^2 + 2 r_1 r_2 \cos (\alpha ) \end{aligned}$$

and so it is enough to show

$$\begin{aligned} 2 r_1 r_2 \cos (\alpha ) \le 2 r_1 r_2 - \varepsilon (r_1 + r_2) + \varepsilon ^2. \end{aligned}$$

This may be rewritten as

$$\begin{aligned} \varepsilon \left( \frac{1}{r_1} + \frac{1}{r_2} \right) = \varepsilon \frac{r_1 + r_2}{r_1 r_2} \le 2 (1 - \cos (\alpha ) ) + \frac{\varepsilon ^2}{r_1 r_2}. \end{aligned}$$

\(\square \)

Proof of Proposition 5.3

We show the contrapositive. Define \(z_1 = \tilde{g}_n(x) \tilde{v}(x)\) and \(z_2 = \tilde{g}_n(y) \tilde{v}(y)\). Then

$$\begin{aligned} |z_1| = g_n(x)v(x) \ge (1 - \varepsilon ) g_n(x), \end{aligned}$$

and a similar estimate holds for \(|z_2|\). Using \(\varepsilon < \frac{1}{2}\) and the definition of the stable tolerance, one sees that

$$\begin{aligned} \varepsilon \left( \frac{1}{|z_1|} + \frac{1}{|z_2|} \right) \le \frac{\varepsilon }{1 - \varepsilon } \left( \frac{1}{g_n(x)} + \frac{1}{g_n(y)} \right) \le 2 (1 - \cos (\delta ) ). \end{aligned}$$

Let \(\alpha \) be the angle between \(z_1\) and \(z_2\). If \(\delta < \alpha \), then \(1 - \cos (\delta ) < 1 - \cos (\alpha )\) and Lemma 5.4 shows that \((x, y)\) is a cancellation pair for \(\tilde{v}\). \(\square \)

For an arbitrary pair \((x, y)\) of points in X, define the unstable tolerance as \(0 \le \delta < \frac{\pi }{2}\) such that

$$\begin{aligned} \sin (\delta ) = 2 H d(x, y). \end{aligned}$$

Note that x and y must be reasonably close for this to be well defined.

Proposition 5.5

If \((x, y)\) is a pair with unstable tolerance \(\delta \) and \(\tilde{v}\) is a nice observable, then

$$\begin{aligned} -\delta \le \arg \left( \frac{\tilde{v}(x)}{\tilde{v}(y)} \right) \le \delta . \end{aligned}$$

Again, we rely on a trigonometric lemma.

Lemma 5.6

Let \(z_1\) and \(z_2\) be non-zero complex numbers with angle \(\alpha = \arg (\frac{z_1}{z_2})\). If \(0< \alpha < \frac{\pi }{4}\) and \(1 - \varepsilon \le |z_i| \le 1\), then

$$\begin{aligned} (1 - \varepsilon ) \sin (\alpha ) < |z_1 - z_2|. \end{aligned}$$

Proof

Assume \(|z_1| > |z_2|\) and consider the acute triangle defined by the points 0, \(z_1\) and \(z_2\) in the complex plane. Split this triangle into two right triangles by adding a line segment from \(z_2\) to the opposite side of the triangle. This new segment has length \(|z_2| \sin (\alpha ) \ge (1-\varepsilon ) \sin (\alpha )\) and so the line segment from \(z_1\) to \(z_2\) has length at least \((1 - \varepsilon ) \sin (\alpha )\). \(\square \)
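The estimate of Lemma 5.6 is easy to test numerically. The following sketch samples random pairs \(z_1, z_2\) satisfying the hypotheses and checks the non-strict form of the inequality; the sampling scheme, the seed and the floating-point tolerance are illustrative choices, not part of the argument.

```python
import cmath
import math
import random


def check_lemma(eps, trials=10_000, seed=0):
    """Sample z_1, z_2 with 1 - eps <= |z_i| <= 1 and angle in (0, pi/4).

    Returns True if (1 - eps) * sin(alpha) <= |z_1 - z_2| in every trial,
    up to a small floating-point tolerance.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        alpha = rng.uniform(1e-6, math.pi / 4)
        r1 = rng.uniform(1 - eps, 1)
        r2 = rng.uniform(1 - eps, 1)
        base = rng.uniform(-math.pi, math.pi)  # common rotation of both points
        z2 = cmath.rect(r2, base)
        z1 = cmath.rect(r1, base + alpha)
        if (1 - eps) * math.sin(alpha) > abs(z1 - z2) + 1e-12:
            return False
    return True


if __name__ == "__main__":
    print(check_lemma(0.1), check_lemma(0.49))
```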

Proof of Proposition 5.5

Let \(z_1 = \tilde{v}(x)\) and \(z_2 = \tilde{v}(y)\) and let \(\alpha \) be the angle between them. The above lemma and the definition of “nice” together imply that

$$\begin{aligned} (1 - \varepsilon ) \sin (\alpha ) \le |z_1 - z_2| \le H d(x, y). \end{aligned}$$

Since \(\varepsilon < \frac{1}{2}\) by assumption, the result follows. \(\square \)

A us-cycle is a (finite) sequence of points in X:

$$\begin{aligned} x_1, y_1, x_2, y_2, \ldots , y_m, x_{m+1} \end{aligned}$$

where \(x_1 = x_{m+1}\) and each pair \((x_k, y_k)\) is a stable pair. The tolerance of the cycle is the sum of the stable tolerances of the pairs

$$\begin{aligned} (x_1, y_1), \ (x_2, y_2), \ \ldots , \ (x_m, y_m) \end{aligned}$$

and the unstable tolerances of the pairs

$$\begin{aligned} (y_1, x_2), \ (y_2, x_3), \ \ldots , \ (y_m, x_{m+1}). \end{aligned}$$

We only consider us-cycles for which this tolerance is well defined. The phase of the cycle is

$$\begin{aligned} \arg \left( \frac{\tilde{g}_n(y_1)}{\tilde{g}_n(x_1)} \frac{\tilde{g}_n(y_2)}{\tilde{g}_n(x_2)} \cdots \frac{\tilde{g}_n(y_m)}{\tilde{g}_n(x_m)} \right) . \end{aligned}$$

That is, the phase of the cycle is the sum of the phases of the individual stable pairs (up to a multiple of \(2\pi \)). As defined, the phase is a number in \((-\pi , \pi ]\). We will only consider cycles where the phase is positive.

Proposition 5.7

If there is a us-cycle where the phase is greater than the tolerance, then \(\mathcal {L}\) has strong \((\varepsilon , n)\)-cancellation.

Proof

Let \(\tilde{v}\) be a nice observable. Our goal is to show that one of the stable pairs in the cycle is a cancellation pair for \(\tilde{v}\). We assume none of them is a cancellation pair and derive a contradiction.

Let S be the phase of the cycle and \(\delta = \delta _s + \delta _u\) be the tolerance, where \(\delta _s\) is the sum of the stable tolerances and \(\delta _u\) is the sum of the unstable tolerances. We are assuming \(0< \delta < S.\)

Proposition 5.3 implies that

$$\begin{aligned} S - \delta _s< \arg \left( \frac{\tilde{v}(x_1)}{\tilde{v}(y_1)} \frac{\tilde{v}(x_2)}{\tilde{v}(y_2)} \cdots \frac{\tilde{v}(x_m)}{\tilde{v}(y_m)} \right) < S + \delta _s \end{aligned}$$

and Proposition 5.5 implies that

$$\begin{aligned} - \delta _u< \arg \left( \frac{\tilde{v}(x_2)}{\tilde{v}(y_1)} \frac{\tilde{v}(x_3)}{\tilde{v}(y_2)} \cdots \frac{\tilde{v}(x_{m+1})}{\tilde{v}(y_m)} \right) < \delta _u. \end{aligned}$$

Since \(x_1 = x_{m+1}\), the complicated product in the middle of each inequality is actually the same complex number and so we get \(S - \delta _s < \delta _u\), a contradiction. \(\square \)

5.2 Cancellation by Frequency

We now apply the results above to the specific case of the operators \(\mathcal {L}_\xi \) defined in the previous section, namely to the case

$$\begin{aligned} (\mathcal {L}_\xi \tilde{v})(x) = \sum _{\sigma y = x} \tilde{g}_\xi (y) \cdot \tilde{v}(y), \end{aligned}$$

where \(\tilde{g}_\xi = \exp (u - i \xi f)\). To simplify the presentation we only consider positive \(\xi \), but analogous results hold for negative frequencies.

One can show that the Lipschitz seminorm of \(\tilde{g}_\xi \) satisfies \(|\tilde{g}_\xi |_\theta \le |g|_\theta + \xi |f|_\theta \). By Lemma 5.1, each twisted operator therefore satisfies a Lasota–Yorke inequality \( |\mathcal {L}_\xi \tilde{v}|_\theta \le \theta |\tilde{v}|_\theta \ + R_\xi \Vert \tilde{v}\Vert _\infty \), where \(R=R_\xi \) grows at most linearly in \(\xi \).

The notion of a “nice observable” will also depend on the frequency. In particular, the value H from the previous section depends on \(\xi \) and so \(H = H_\xi = \max \left\{ 1, \frac{2 R_\xi }{1-\theta } \right\} \), which also grows at most linearly in \(\xi \). Define a constant \(G = \sup \{ \frac{1}{g(x)} : x \in X \}\), so that \(\frac{1}{g_n(x)} \le G^n\) for all \(x \in X\), and an exponent \(\alpha > 0\) determined by \(\theta ^\alpha G = 1\). Note that G and \(\alpha \) are independent of the frequency.

We now show that accessibility of the skew product leads to cancellation of these twisted operators and we give quantitative estimates of the amount of cancellation.

Proposition 5.8

Suppose \(f^+\) has the collapsed accessibility property and \(\xi _0 > 0\) is given. Then there are positive constants A and B such that if \(\xi \ge \xi _0\) and

$$\begin{aligned} \varepsilon _\xi = \frac{1}{A G^{n_\xi }} \quad \text {where}\, n_\xi \, \hbox {is the smallest integer which satisfies} \quad \theta ^{n_\xi } < \frac{1}{B \xi }, \end{aligned}$$

then \(\mathcal {L}_\xi \) has strong \((\varepsilon _\xi ,n_\xi )\)-cancellation.

Remark 5.9

One can see from the definitions of \(\varepsilon _\xi \) and \(n_\xi \) that \(\varepsilon _\xi = \frac{1}{A} \theta ^{\alpha n_\xi } \ge \frac{\theta ^\alpha }{A B^\alpha } \xi ^{-\alpha }\).

Proof

Assume without loss of generality that \(0< \xi _0 < \pi \). The overall strategy of the proof is to use collapsed accessibility to show that, for any frequency \(\xi \ge \xi _0\), there is a us-cycle with phase equal to \(\xi _0\) and tolerance less than \(\xi _0\). Proposition 5.7 then gives cancellation.

Let C and N be as given in Definition 4.4 of collapsed accessibility. Then there is a constant \(0< a < 1\) such that any angle \(0< \delta < \pi \) which satisfies either \(1 - \cos (\delta ) \le a\) or \(\sin (\delta ) \le a\) also satisfies \(\delta < \frac{1}{2N} \xi _0\).

Since \(H_\xi \) grows linearly in \(\xi \), there is \(B > 0\) such that \(2 C H_\xi \le a B \xi \), for all \(\xi \ge \xi _0\).

Up to increasing the value of B, we can also ensure that \(n > 2N\) for any integer n which satisfies \(\theta ^n < \frac{1}{B \xi _0}\). Define \(A = \frac{2}{a}\).

Now consider a specific frequency \(\xi \ge \xi _0\) and use \(n = n_\xi \) and \(\varepsilon = \varepsilon _\xi \) defined as in the statement of the proposition. Using this n and \(t = \frac{\xi _0}{\xi } \in [0,1]\), there is a sequence of points \(x_1, y_1, x_2, y_2, \ldots , y_m, x_{m+1}\) satisfying Definition 4.4. This sequence is a us-cycle for \(\mathcal {L}_\xi \) and has phase equal to \(\xi t = \xi _0\). If \(\delta \) is the stable tolerance of a pair \((x_k, y_k),\) then

$$\begin{aligned} 1 - \cos (\delta ) = \varepsilon \left( \frac{1}{g_n(x_k)} + \frac{1}{g_n(y_k)} \right) \le 2 \varepsilon G^n = a. \end{aligned}$$

If instead \(\delta \) is the unstable tolerance of a pair \((y_k, x_{k+1}),\) then

$$\begin{aligned} \sin (\delta ) = 2 H_\xi d(y_k, x_{k+1}) \le 2 C H_\xi \theta ^n \le a B \xi \theta ^n \le a. \end{aligned}$$

Together, these estimates show that the total tolerance of the us-cycle is less than \(\xi _0\) and so Proposition 5.7 gives cancellation. \(\square \)

6 Contraction

In this section, we show how to obtain some estimates on the norm of the operator \(\mathcal {L}_\xi \). For high frequencies, we exploit the cancellations obtained in the previous section, while, for low frequencies, we apply some standard results from the perturbation theory of bounded linear operators.

6.1 High Frequencies

Recall that we defined \(H := \max \left\{ 1, \frac{2 R}{1-\theta } \right\} \). It will be convenient to define the following norm on \(\mathscr {F}^+_\theta \): let

$$\begin{aligned} \Vert {{\tilde{v}}}\Vert _{H} := \max \left\{ \Vert {{\tilde{v}}}\Vert _\infty , \frac{| {{\tilde{v}}} |_\theta }{H}\right\} . \end{aligned}$$

Notice that the norms \(\Vert \cdot \Vert _{H}\) and \(\Vert \cdot \Vert _\theta \) are equivalent, namely

$$\begin{aligned} \Vert {{\tilde{v}}}\Vert _{H} \le \Vert {{\tilde{v}}}\Vert _{\theta } \le 2H \Vert {{\tilde{v}}}\Vert _{H}. \end{aligned}$$
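The equivalence of the two norms can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: X is replaced by binary words of length 6, the distance is taken to be \(\theta ^k\) with k the first index of disagreement, and \(\Vert \cdot \Vert _\theta \) is assumed to be the sum of the sup norm and the semi-norm \(|\cdot |_\theta \). The constants theta and R are hypothetical.

```python
import itertools
import random

theta, R = 0.5, 3.0                       # illustrative constants (assumptions)
H = max(1.0, 2.0 * R / (1.0 - theta))     # H as defined in the text

# Toy model of X: binary words of length 6, with d(x, y) = theta**k where
# k is the first index at which x and y differ (an assumed convention).
points = list(itertools.product((0, 1), repeat=6))

def dist(x, y):
    k = next(i for i in range(len(x)) if x[i] != y[i])
    return theta ** k

def sup_norm(v):
    return max(abs(v[x]) for x in points)

def seminorm(v):
    return max(abs(v[x] - v[y]) / dist(x, y)
               for x, y in itertools.combinations(points, 2))

def norm_H(v):
    return max(sup_norm(v), seminorm(v) / H)

def norm_theta(v):
    # Assumption: ||v||_theta = ||v||_inf + |v|_theta.
    return sup_norm(v) + seminorm(v)

rng = random.Random(0)
samples = [{x: rng.uniform(-1, 1) for x in points} for _ in range(20)]
checks = [norm_H(v) <= norm_theta(v) <= 2 * H * norm_H(v) for v in samples]
print(all(checks))
```

The upper bound uses only \(H \ge 1\), which is why the factor 2H suffices.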

In this section, we will prove the following result.

Proposition 6.1

Suppose that \(f^+\) has the collapsed accessibility property, and let \(\xi _0>0\) be given. Then, there exist positive constants \(A, B\) and an exponent \(\beta > 0\) such that for all \(\xi \ge \xi _0\) we have

$$\begin{aligned} \Vert \mathcal {L}_\xi ^{N}\Vert _{H} \le 1- A \xi ^{-\beta }, \end{aligned}$$

for all \(N \ge B |\log \xi |\).

We start by proving some simple preliminary results.

Lemma 6.2

For any given \(\xi >0\), if \({{\tilde{v}}} \in \mathscr {F}_\theta ^+\), then \( \Vert \mathcal {L}_\xi {{\tilde{v}}} \Vert _H \le \Vert {{\tilde{v}}} \Vert _H\).

Proof

Clearly, \(\Vert \mathcal {L}_\xi {{\tilde{v}}} \Vert _\infty \le \Vert {{\tilde{v}}} \Vert _\infty \le \Vert {{\tilde{v}}} \Vert _H\). From the Basic Inequality in Lemma 5.1 we also get

$$\begin{aligned} \frac{|\mathcal {L}_\xi {{\tilde{v}}}|_\theta }{H} \le \theta \frac{|{{\tilde{v}}}|_\theta }{H} + \frac{R}{H} \Vert {{\tilde{v}}}\Vert _\infty \le \left( \theta + \frac{R}{H} \right) \Vert {{\tilde{v}}} \Vert _H \le \Vert {{\tilde{v}}} \Vert _H, \end{aligned}$$

since \(H > {R}/(1-\theta )\). This completes the proof. \(\square \)

Let us recall that, from the definition of Gibbs measure, it follows that there exist constants \(C_u, d\) such that for any ball \(B(x,r)\) centered at \(x \in X\) with radius \(r \ge 0\) we can bound

$$\begin{aligned} \mu (B(x,r)) \ge C_u r^{d}. \end{aligned}$$
(11)

We will also use the fact that the untwisted transfer operator L on \(\mathscr {F}^+_\theta \) has a spectral gap, namely the following well-known result, see, e.g., [41, Theorem 2.2].

Lemma 6.3

There exist a bounded operator \(\mathcal {N}:\mathscr {F}_{\theta }^+ \rightarrow \mathscr {F}_{\theta }^+\), a real number \(0<\delta <1\), and a constant \(C>0\) such that for all \(n \in {\mathbb {N}}\) we have \(\left\Vert \mathcal {N}^n\right\Vert _{\theta } \le C \delta ^n\), and for all \(\tilde{v}\in \mathscr {F}_{\theta }^+ \),

$$\begin{aligned} L^n(\tilde{v}) = \int _X \tilde{v}\mathop {}\!\mathrm {d}\mu + \mathcal {N}^n(\tilde{v}). \end{aligned}$$

We have the following result.

Lemma 6.4

There exists a constant \(C_1>0\) such that the following holds. For any \(\varepsilon >0\), \(\ell \ge 1\), and \({\tilde{v}} \in \mathscr {F}^+_\theta \) with \(|{{\tilde{v}}}|_\theta \le \ell \), if \(| {{\tilde{v}}} ({{\bar{x}}})| \le 1-\varepsilon \) for a point \({{\bar{x}}} \in X\), then

$$\begin{aligned} \int _X v \mathop {}\!\mathrm {d}\mu \le 1- C_1 \left( \frac{\varepsilon }{\ell } \right) ^{d}\varepsilon . \end{aligned}$$

Proof

Define \(w = 1-v\) and note that \(w({{\bar{x}}}) \ge \varepsilon \) and the Lipschitz semi-norm of w satisfies \(|w|_\theta = |v|_\theta \le |{{\tilde{v}}}|_\theta \le \ell \). If B is the ball centered at \({{\bar{x}}}\) of radius \(r = \frac{\varepsilon }{2 \ell }\), then \(w(x) \ge \frac{\varepsilon }{2}\) for all \(x \in B\), so

$$\begin{aligned} \int _X w \mathop {}\!\mathrm {d}\mu \ge \frac{\varepsilon }{2} \mu (B) \ge \frac{\varepsilon }{2} C_u \left( \frac{\varepsilon }{2\ell } \right) ^{d}. \end{aligned}$$

Since \(\int _X v \mathop {}\!\mathrm {d}\mu = 1- \int _X w \mathop {}\!\mathrm {d}\mu \), the result follows. \(\square \)

Lemma 6.5

There exist constants \({{\bar{A}}}, {{\bar{B}}} >0\) such that the following holds. Assume that \(\mathcal {L}= \mathcal {L}_\xi \) has \((\varepsilon , n)\)-cancellation. Then, for every \({{\tilde{v}}} \in \mathscr {F}_{\theta }^+\) with \(\Vert {{\tilde{v}}}\Vert _{H} \le 1\), and for any \(N \ge N_0 := \lfloor -{{\bar{B}}} \log (\varepsilon /H) \rfloor \), we have

$$\begin{aligned} \Vert \mathcal {L}^{N+n}{{\tilde{v}}}\Vert _{H} \le 1-{{\bar{A}}} \frac{\varepsilon ^{d +1}}{H^{d}}. \end{aligned}$$

Proof

Let \(N_1\) and \(N_2\) be the minimum integers which satisfy

$$\begin{aligned} 3 C H \delta ^{N_1} \le \frac{C_1}{3} \frac{\varepsilon ^{d +1}}{H^{d}} \quad \text { and }\quad 2 \theta ^{N_2} \le \frac{C_1}{3} \frac{\varepsilon ^{d +1}}{H^{d}}, \end{aligned}$$

and let \(N_0 = N_1 + N_2\). It is clear from the definition that there exists a constant \({ {\bar{B}}} >0\) such that \(N_0 = \lfloor -{ {\bar{B}}} \log (\varepsilon /H) \rfloor \).

For every \(x \in X\),

$$\begin{aligned} | \mathcal {L}^{N_1+n} {{\tilde{v}}} (x) | \le |L^{N_1}( | \mathcal {L}^{n} {{\tilde{v}}}|) (x) | \le \int _X | \mathcal {L}^n {{\tilde{v}}} | \mathop {}\!\mathrm {d}\mu + |\mathcal {N}^{N_1}( | \mathcal {L}^{n} {{\tilde{v}}}|) (x) |, \end{aligned}$$

where by Lemma 6.3 and (10),

$$\begin{aligned} |\mathcal {N}^{N_1}( | \mathcal {L}^{n} {{\tilde{v}}}|) (x) | \le C \delta ^{N_1} \Vert \mathcal {L}^{n} {{\tilde{v}}} \Vert _\theta \le C \delta ^{N_1} (1 + \theta ^n |{{\tilde{v}}}|_\theta +H) \le 3C H \delta ^{N_1} \le \frac{C_1}{3} \frac{\varepsilon ^{d +1}}{H^{d}}, \end{aligned}$$

where the last inequality follows from the definition of \(N_1\). From Lemma 6.2, we have \(| \mathcal {L}^n {{\tilde{v}}} |_\theta \le H \Vert {{\tilde{v}}}\Vert _H \le H\). Thus, Lemma 6.4 applied to \(| \mathcal {L}^{n} {{\tilde{v}}}|\) gives us

$$\begin{aligned} | \mathcal {L}^{N_1+n} {{\tilde{v}}} (x) | \le 1- C_1 \left( \frac{\varepsilon }{H} \right) ^{d}\varepsilon +\frac{C_1}{3} \frac{\varepsilon ^{d +1}}{H^{d}}. \end{aligned}$$

Therefore, we obtain

$$\begin{aligned} \Vert \mathcal {L}^{N+n} {{\tilde{v}}} \Vert _\infty \le \Vert \mathcal {L}^{N_1+n} {{\tilde{v}}} \Vert _\infty \le 1- \frac{2C_1}{3} \frac{\varepsilon ^{d +1}}{H^{d}}. \end{aligned}$$
(12)

Finally, the inequality (10) gives us

$$\begin{aligned} | \mathcal {L}^{N+n} {{\tilde{v}}}|_\theta \le \theta ^{N_2} | \mathcal {L}^{N-N_2+n} {{\tilde{v}}}|_\theta + H \Vert \mathcal {L}^{N-N_2+n} {{\tilde{v}}} \Vert _\infty \le H \left( 2 \theta ^{N_2} + \Vert \mathcal {L}^{N_1+n} {{\tilde{v}}} \Vert _\infty \right) . \end{aligned}$$

By the definition of \(N_2\) and (12), we conclude

$$\begin{aligned} | \mathcal {L}^{N+n} {{\tilde{v}}}|_\theta \le H \left( 2\theta ^{N_2} + \Vert \mathcal {L}^{N_1+n} {{\tilde{v}}} \Vert _\infty \right) \le H \left( 1- \frac{C_1}{3} \frac{\varepsilon ^{d +1}}{H^{d}}\right) . \end{aligned}$$

The inequality above and (12) conclude the proof. \(\square \)

We are now in a position to complete the proof of Proposition 6.1.

Proof of Proposition 6.1

By Proposition 5.8 and Remark 5.9, \(\mathcal {L}_\xi \) has \((\varepsilon _\xi , n_\xi )\)-cancellation, with \(\varepsilon _\xi \ge A_0 \xi ^{-\alpha }\) and \(n_\xi \le B_0 |\log \xi |\), for some positive constants \(A_0, B_0\). Therefore, by Lemma 6.5 applied with \(\varepsilon = \varepsilon _\xi \), for every \(N \ge N_0 + n_\xi \) we have

$$\begin{aligned} \Vert \mathcal {L}_\xi ^{N}\Vert _{H} \le 1-{{\bar{A}}} \frac{\varepsilon ^{d +1}}{H^{d}} \le 1-A \xi ^{-(\alpha d + \alpha +d)}, \end{aligned}$$

for some constant \(A>0\). By the definitions of \(N_0\) and \(n_\xi \), there exists a constant \(B>0\) such that \(N_0 + n_\xi \le B |\log \xi |\). \(\square \)

6.2 Low Frequencies

We now want to estimate the norm of \(\mathcal {L}_\xi \) for small \(\xi \in {\mathbb {R}}\). Let us notice that there exists \(\xi _0 >0\) such that for all \( 0 \le \xi \le \xi _0\) we have \(H = \max \left\{ 1, \frac{2 R}{1-\theta } \right\} = 1\), so that \(\Vert \cdot \Vert _H \le \Vert \cdot \Vert _\theta \le 2 \Vert \cdot \Vert _H\). We will prove the following bound.

Proposition 6.6

There exist \(\kappa >0\) and a constant \(A_\kappa >0\) such that, for all \(0< \xi < \kappa \) and for all \(n \ge 0\), we have

$$\begin{aligned} \Vert \mathcal {L}_\xi ^n \Vert _H \le 4 (1-A_\kappa \xi ^2)^n. \end{aligned}$$

Let us recall that the family of operators \(z \mapsto \mathcal {L}_z\) is analytic for \(z \in {\mathbb {C}}\). This ensures that we can apply classical results from analytic perturbation theory to study the spectrum of bounded linear operators, see in particular [32, Theorem VII.1.8]. In our case, since the operator \(\mathcal {L}_0 = L\) has a spectral gap, we can deduce the following result, see [41, Chapter 4], [26, Proposition 2.3] and [48, p.15].

Theorem 6.7

There exists a \(\kappa >0\) such that the twisted transfer operator \(\mathcal {L}_z\) on \(\mathscr {F}^+_\theta \) has a spectral gap for all \(|z|<\kappa \). Moreover, there exist \(\lambda _z \in {\mathbb {C}}\) and linear operators \(\mathcal {P}_z\) and \(\mathcal {N}_z\) such that \(\mathcal {L}_z = \lambda _z \mathcal {P}_z + \mathcal {N}_z\) and which satisfy the following properties:

  1. (1)

    \(\lambda _z\), \(\mathcal {P}_z\) and \(\mathcal {N}_z\) are analytic on the disk \(\{|z| < \kappa \}\),

  2. (2)

    \(\mathcal {P}_z\) is a projection and its range has dimension 1,

  3. (3)

    \(\mathcal {P}_z\mathcal {N}_z = \mathcal {N}_z\mathcal {P}_z = 0\),

  4. (4)

    the spectral radius \(\rho (\mathcal {N}_z)\) of \(\mathcal {N}_z\) satisfies \(\rho (\mathcal {N}_z) < |\lambda _z| - \delta \), for some \(\delta > 0\) independent of z.

In our case, we restrict to real frequencies \(0< \xi < \kappa \). For the proof of the following lemma, see [41, Chapter 4] and [48, Section 4].

Lemma 6.8

With the notation of Theorem 6.7, there exist constants \(A_\kappa , B_\kappa >0\) such that for all \(0< \xi < \kappa \) we have

$$\begin{aligned} \left|\lambda _\xi - (1 - 2A_\kappa \xi ^2 ) \right|\le B_\kappa \xi ^3. \end{aligned}$$

The fact that \(A_\kappa \) is strictly positive follows from the fact that \(f^+\) is not cohomologous to zero, see Lemma 3.1.

Proof of Proposition 6.6

By Theorem 6.7 and Lemma 6.8, up to choosing a smaller \(\kappa \), we can assume that \(\rho (\mathcal {N}_\xi ) < |\lambda _\xi | \le 1- A_\kappa \xi ^2\) for all \(0 < \xi \le \kappa \). Therefore, for any \({\tilde{v}} \in \mathscr {F}^+_\theta \), we have

$$\begin{aligned} \Vert \mathcal {L}_\xi ^n {{\tilde{v}}} \Vert _H \le \Vert \mathcal {L}_\xi ^n {{\tilde{v}}} \Vert _\theta \le (|\lambda _\xi |^n + \rho (\mathcal {N}_\xi )^n) \Vert {{\tilde{v}}} \Vert _\theta \le 4(1-A_\kappa \xi ^2)^n \Vert {{\tilde{v}}} \Vert _H, \end{aligned}$$

which proves the result. \(\square \)

7 Rapid Decay

In Sect. 8, we will use the contraction results established for the twisted transfer operator \(\mathcal {L}_\xi \) in the previous section in order to prove rapid mixing. In this section, we give several technical propositions in an abstract setting which encapsulate most of the difficult inequalities involved in the proof.

Definition 7.1

Consider a function \(w : A \subseteq (0,\infty ) \rightarrow {\mathbb {R}}\). We say \(w(\xi )\) decays rapidly in \(\xi \) if for each \(\ell \ge 1,\) there is a constant C such that \(|w(\xi )| \le C \xi ^{-\ell }\) for all \(\xi \). We say a sequence \(\{s_n\}\) decays rapidly in n if for each \(\ell \ge 1\), there is a constant C such that \(|s_n| \le C n^{-\ell }\) for all n.

Proposition 7.2

Suppose A, B, \(\beta \), and \(\xi _0\) are positive constants and that \(\{w_n\}\) is a decreasing sequence of nonnegative functions of the form \(w_n :[\xi _0, \infty ) \rightarrow [0, 1]\). If \(w_0(\xi )\) decays rapidly in \(\xi \) and

$$\begin{aligned} w_{n+N}(\xi ) \le (1 - A \xi ^{-\beta }) w_n(\xi ) \quad \text {for all} \quad N > B \log (\xi ), \end{aligned}$$

then the sequence \(\{s_n\}\) defined by \(s_n = \sup _\xi w_n(\xi )\) decays rapidly in n.

In order to prove this, we first give a lemma which establishes for each fixed \(\xi \) an exponential rate of decay of the sequence \(\{w_n(\xi )\}\).

Lemma 7.3

In the setting of Proposition 7.2, there are constants D and \(\gamma \) such that \(w_{n+K}(\xi ) < \frac{1}{e} w_n(\xi )\) for all \(K > D \xi ^\gamma \).

Proof

Consider a specific \(\xi \) and let k and N be the smallest integers such that \(k > \frac{1}{A} \xi ^\beta \) and \(N > B \log (\xi )\). Then \((1 - A \xi ^{-\beta })^k< \exp (-k A \xi ^{-\beta }) < \exp (-1)\) and so

$$\begin{aligned} w_{n+ k N}(\xi ) \le (1 - A \xi ^{-\beta })^k w_n(\xi ) \le \frac{1}{e} w_n(\xi ). \end{aligned}$$

If we choose an exponent \(\gamma > \beta \), then there is a constant D such that

$$\begin{aligned} k N \le \left( \frac{1}{A} \xi ^\beta + 1\right) (B \log \xi + 1) \le D \xi ^\gamma . \end{aligned}$$

Moreover, this constant D may be chosen uniformly for all \(\xi \). If \(K > D \xi ^\gamma \), then \(K > k N\) and \(w_{n+K}(\xi ) \le w_{n + k N}(\xi )\). \(\square \)
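The counting in this proof can be illustrated numerically. The following sketch uses hypothetical constants A, B, beta and gamma (with gamma > beta) and checks that k blocks of length N contract by at least a factor 1/e, while the ratio \(kN / \xi ^\gamma \) stays bounded over the sampled frequencies.

```python
import math

def block_counts(xi, A, B, beta):
    """Smallest integers k > xi**beta / A and N > B * log(xi)."""
    k = math.floor(xi ** beta / A) + 1
    N = math.floor(B * math.log(xi)) + 1
    return k, N

# Illustrative constants only (assumptions); gamma must exceed beta.
A, B, beta, gamma = 0.5, 2.0, 1.5, 2.0

results = []
for xi in (2.0, 5.0, 20.0, 100.0, 1000.0):
    k, N = block_counts(xi, A, B, beta)
    contraction = (1.0 - A * xi ** (-beta)) ** k   # should be below 1/e
    ratio = k * N / xi ** gamma                    # should stay bounded by D
    results.append((xi, k, N, contraction, ratio))
    print(xi, k, N, contraction, ratio)
```

Any D larger than the maximum of the printed ratios works for these sample frequencies; the proof shows that such a D exists uniformly in \(\xi \).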

Proof of Proposition 7.2

We will show for any \(q > 0\) that there is a constant Q such that \(s_n < \varepsilon \) for all \(0< \varepsilon < 1\) and \(n > Q \varepsilon ^{-q}\). One can see that this condition implies that \(\{s_n\}\) decays rapidly. Let D and \(\gamma \) be as in the above lemma and choose an integer \(\ell > \gamma / q\). As \(w_0(\xi )\) decays rapidly in \(\xi \), there is C such that \(w_0(\xi ) < C \xi ^{-\ell }\) for all \(\xi \) in the domain. For a given \(\varepsilon > 0\):

  • let \(a > 0\) be such that \(C a^{-\ell } = \varepsilon \),

  • let j be the smallest integer such that \(e^{-j} < \varepsilon \), and

  • let K be the smallest integer such that \(K > D a^\gamma \).

Now consider a frequency \(\xi \). If \(\xi > a\), then \(w_{j K}(\xi ) \le w_0(\xi ) < C a^{-\ell } = \varepsilon \). If instead \(\xi \le a\), then \(K > D \xi ^\gamma \), which implies \(w_{j K}(\xi ) \le e^{-j} w_0(\xi ) < \varepsilon \). Together, these imply that \(s_n < \varepsilon \) for all \(n > j K\). Since \(a^\gamma = C^{\gamma /\ell } \varepsilon ^{-\gamma /\ell }\) and \(q > \gamma /\ell \), there is a constant Q such that

$$\begin{aligned} j K \le ( \log (\varepsilon ^{-1}) + 1 )( D a^\gamma + 1 ) < Q \varepsilon ^{-q} \end{aligned}$$

holds uniformly for all \(\varepsilon \). \(\square \)
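To see the statement of Proposition 7.2 at work, one can consider a toy family satisfying its hypotheses. In the sketch below, the family \(w_n(\xi ) = e^{-\xi } (1 - A \xi ^{-\beta })^{\lfloor n / N(\xi ) \rfloor }\), the constants, and the finite grid used to approximate the supremum are all assumptions made for illustration; they are not taken from the paper.

```python
import math

A, B, beta, xi0 = 0.5, 1.0, 1.0, 2.0   # illustrative constants (assumptions)

def w(n, xi):
    """Toy model: w_0(xi) = exp(-xi), contracted once every N(xi) steps."""
    N = math.floor(B * math.log(xi)) + 1          # smallest integer > B log(xi)
    return math.exp(-xi) * (1.0 - A * xi ** (-beta)) ** (n // N)

# Finite grid in [2, 101.75]; the maximum over it approximates sup_xi.
grid = [xi0 + 0.25 * i for i in range(400)]

def s(n):
    return max(w(n, xi) for xi in grid)

for n in (100, 1000, 4000):
    print(n, s(n), n ** 3 * s(n))
```

For this toy family, the rescaled values \(n^3 s_n\) decrease along the sampled values of n, which is consistent with rapid decay but of course does not prove it.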

Proposition 7.4

Suppose A and \(\xi _0\) are positive constants, \(0< \alpha < \frac{1}{2}\), and \(\{w_n\}\) is a decreasing sequence of nonnegative functions of the form \(w_n :(0, \xi _0] \rightarrow [0, 1]\). If \(w_{n+k}(\xi ) \le 4(1 - A \xi ^{2})^k w_n(\xi )\) for all \(\xi \), k and n, then the sequence \(\{s_n\}\) defined by

$$\begin{aligned} s_n = \sup \ \{ w_n(\xi ) \ : \ n^{-\alpha } \le \xi \le \xi _0 \} \end{aligned}$$

decays rapidly in n.

Proof

If \(n^{-\alpha } \le \xi \), then \(w_n(\xi ) \le 4(1 - A \xi ^2)^n \le 4\exp (- n A \xi ^2) \le 4\exp (- n^{1-2\alpha } A)\), and, since \(1 - 2\alpha > 0\), one can show that \(4\exp (- n^{1-2\alpha } A)\) decays rapidly in n. \(\square \)
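The stretched-exponential bound in this proof can be compared numerically with a polynomial rate. The sketch below uses hypothetical values of A, alpha and the test exponent ell; it evaluates the bound \(4\exp (-A n^{1-2\alpha })\) and the worst case \(\xi = n^{-\alpha }\) of \(4(1 - A\xi ^2)^n\).

```python
import math

def low_frequency_bound(n, A, alpha):
    """Upper bound 4 * exp(-A * n**(1 - 2*alpha)), valid for xi >= n**(-alpha)."""
    return 4.0 * math.exp(-A * n ** (1.0 - 2.0 * alpha))

def worst_case(n, A, alpha):
    """Value of 4 * (1 - A * xi**2)**n at the endpoint xi = n**(-alpha)."""
    xi = n ** (-alpha)
    return 4.0 * (1.0 - A * xi * xi) ** n

# Illustrative values (assumptions) with 0 < alpha < 1/2.
A, alpha, ell = 1.0, 0.25, 5

for n in (10 ** 3, 10 ** 4, 10 ** 5):
    print(n, worst_case(n, A, alpha), n ** ell * low_frequency_bound(n, A, alpha))
```

With these values, \(n^{\ell } \cdot 4\exp (-A n^{1-2\alpha })\) already decreases quickly along the sampled n, reflecting that \(1 - 2\alpha > 0\).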

Propositions 7.2 and 7.4 are enough to establish rapid mixing in the setting of skew products over one-sided shifts. However, to handle two-sided shifts, we will need the following more technical results.

Proposition 7.5

Let A, B, \(\beta \), \(\xi _0\) and \(\theta \) be positive constants. Let \(v_{n,m} :[\xi _0, \infty ) \rightarrow [0, \infty )\) be a collection of functions defined for all integers \(n \ge 0\) and \(m \ge 0\) and let \(w :[\xi _0, \infty ) \rightarrow [0, \infty )\) be a bounded function. Suppose for all \(n,m \ge 0\) and \(\xi \ge \xi _0\) that

  1. (1)

    \(v_{n+1,m}(\xi ) \le v_{n,m}(\xi )\),

  2. (2)

    \(v_{n+N,m}(\xi ) \le (1 - A \xi ^{-\beta })v_{n,m}(\xi )\) when \(N > B \log (\xi )\),

  3. (3)

    \(v_{0,m}(\xi ) \le \theta ^{-m} w(\xi )\), and

  4. (4)

    \(w(\xi )\) decays rapidly in \(\xi \).

Then, for any \(c > 0,\) the sequence \(\{t_n\}\) defined by

$$\begin{aligned} t_n = \sup \ \{ v_{n,m}(\xi ) \ : \ \xi > \xi _0 \quad \text {and} \quad m < c \log (n) \} \end{aligned}$$

decays rapidly in n.

Proof

As \(w(\xi )\) is bounded, we may without loss of generality assume that \(w(\xi ) \le 1\) for all \(\xi \). Define \(w_n(\xi ) = \sup _m \theta ^m v_{n,m}(\xi )\). One can verify that \(\{w_n\}\) satisfies the hypotheses of Proposition 7.2. Hence \(\sup _\xi w_n(\xi )\) decays rapidly in n, meaning that for a given \(\ell \), there is C such that \(v_{n,m}(\xi ) < C \theta ^{-m} n^{-\ell }\) for all m and n. If \(m < c \log (n)\), then

$$\begin{aligned} \theta ^{-m}< \theta ^{-c \log (n)} = n^{-c \log (\theta )} \, \Rightarrow \, v_{n,m}(\xi ) < C n^{-\ell - c \log (\theta )}. \end{aligned}$$

From this, one can see that \(\{t_n\}\) decays rapidly in n. \(\square \)
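The exponent bookkeeping in the last step can be checked with a few lines of arithmetic. The values of theta and c below are hypothetical, and the helper `ell_for_target` is introduced only for this illustration: it returns the exponent \(\ell \) needed in Proposition 7.2 in order to obtain a target decay rate \(n^{-p}\) for \(t_n\).

```python
import math

theta, c = 0.5, 3.0          # illustrative values (assumptions), 0 < theta < 1

def ell_for_target(p):
    """Exponent ell such that -ell - c * log(theta) equals -p."""
    return p - c * math.log(theta)

for n in (10, 1000, 10 ** 6):
    lhs = theta ** (-c * math.log(n))   # theta^{-c log n}
    rhs = n ** (-c * math.log(theta))   # n^{-c log theta}
    print(n, lhs, rhs)

print(ell_for_target(4))
```

Since \(\log (\theta ) < 0\), the required \(\ell \) is larger than the target p by \(c |\log (\theta )|\), which is harmless because \(\ell \) is arbitrary.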

Proposition 7.6

Let A, \(\beta \), \(\xi _0\) and \(\theta \) be positive constants, and \(0< \alpha < \frac{1}{2}\). Let \(v_{n,m} :(0, \xi _0] \rightarrow [0, \infty )\) be a collection of functions defined for all integers \(n \ge 0\) and \(m \ge 0\), and let \(w :(0, \xi _0] \rightarrow [0, \infty )\) be a bounded function. Suppose for all \(n,m,k \ge 0\) and \(\xi \le \xi _0\) that

  1. (1)

    \(v_{n+1,m}(\xi ) \le v_{n,m}(\xi )\),

  2. (2)

    \(v_{n+k,m}(\xi ) \le 4(1 - A \xi ^{2})^k v_{n,m}(\xi )\),

  3. (3)

    \(v_{0,m}(\xi ) \le \theta ^{-m} w(\xi )\).

Then for any \(c > 0,\) the sequence \(\{t_n\}\) defined by

$$\begin{aligned} t_n = \sup \ \{ v_{n,m}(\xi ) \ : \ n^{-\alpha }< \xi \le \xi _0 \ \ \text {and} \ \ m < c \log (n) \} \end{aligned}$$

decays rapidly in n.

Proof

This follows from Proposition 7.4 using the proof of Proposition 7.5.

\(\square \)

8 Proof of Theorem 4.5

This section is devoted to the proof of Theorem 4.5.

Let \(\psi \in \mathscr {L}^+\) and \(\Phi \in \mathscr {G}^+\) be given, and fix \(k \in {\mathbb {N}}\) and \(0< \alpha < 1/2\). Recall that the good global observable \(\Phi \) defines a complex measure \(\eta _x\) for each \(x \in X\) and that there is a uniform constant \(M = \Vert \Phi \Vert _{\mathscr {G}^+}\) such that \( \Vert \eta _x \Vert _{{{\,\mathrm{TV}\,}}} \le M\) for all \(x \in X\). The Fourier transform of the good local observable \(\psi \) is a function of the form \({\widehat{\psi }} :X \times {\mathbb {R}}\rightarrow {\mathbb {C}}\) where, for each frequency \(\xi \), the function \({\widehat{\psi }}_\xi :X \rightarrow {\mathbb {C}}\) defined by \({\widehat{\psi }}_\xi (x) = \widehat{\psi (x)}(\xi )\) is Hölder and lies in \(\mathscr {F}^+_\theta \).

By Proposition 4.6, we have

$$\begin{aligned} {{\,\mathrm{cov}\,}}(\Phi \circ F^n,\psi ) = \int _X \int _{-\infty }^\infty (\mathcal {L}_\xi ^n {\widehat{\psi }}_\xi )(x) \mathop {}\!\mathrm {d}\eta _x(\xi )\, \mathop {}\!\mathrm {d}\mu (x) - \nu _{{{\,\mathrm{av}\,}}}(\Phi )\nu (\psi ). \end{aligned}$$

We will estimate the correlations by splitting the frequencies \(\xi \in {\mathbb {R}}\) into the cases \(\xi = 0\), \(0< |\xi | < n^{-\alpha }\), and \(|\xi | \ge n^{-\alpha }\). In fact, we only consider \(\xi \ge 0\) as the estimates for \(\xi < 0\) are analogous.

The proof of Theorem 4.5 follows from the next three lemmas.

Lemma 8.1

There exist constants \(C >0\) and \(0< \delta <1\) (depending on L, as given in Lemma 6.3) such that

$$\begin{aligned} \left|\int _X \int _{\{0\}}(\mathcal {L}_\xi ^n {\widehat{\psi }}_\xi )(x) \mathop {}\!\mathrm {d}\eta _x(\xi )\, \mathop {}\!\mathrm {d}\mu (x) - \nu _{{{\,\mathrm{av}\,}}}(\Phi )\nu (\psi ) \right|\le C M ({{\,\mathrm{Max}\,}}_0(\psi ) + {{\,\mathrm{Lip}\,}}_0(\psi ) ) \delta ^n, \end{aligned}$$

for all n.

Proof

Recalling that \(\mathcal {L}_0 = L\) is the transfer operator associated to \(\sigma \), we have

$$\begin{aligned} \int _{\{0\}}(\mathcal {L}_\xi ^n {\widehat{\psi }}_\xi )(x) \mathop {}\!\mathrm {d}\eta _x(\xi ) = \eta _x(\{0\}) (L^n{\widehat{\psi }}_0)(x). \end{aligned}$$

By Lemma 6.3, there exists a constant \(C>0\) and \(0<\delta <1\) such that

$$\begin{aligned} \left|(L^n{\widehat{\psi }}_0)(x) - \int _X {\widehat{\psi }}_0(x) \mathop {}\!\mathrm {d}\mu (x) \right|= \left|(L^n{\widehat{\psi }}_0)(x) - \nu (\psi )\right|\le C \delta ^n \Vert {\widehat{\psi }}_0 \Vert _\theta , \end{aligned}$$

where we used that \({\widehat{\psi }}_0(x) = \int _{\mathbb {R}}\psi (x,r) \mathop {}\!\mathrm {d}r\). Hence, by Lemmas 3.6 and 4.2, we conclude

$$\begin{aligned} \begin{aligned}&\left|\int _X \int _{\{0\}}(\mathcal {L}_\xi ^n {\widehat{\psi }}_\xi )(x) \mathop {}\!\mathrm {d}\eta _x(\xi )\, \mathop {}\!\mathrm {d}\mu (x) - \nu _{{{\,\mathrm{av}\,}}}(\Phi )\nu (\psi ) \right|\\&\quad \le C \left( \int _X |\eta _x|(\{0\}) \mathop {}\!\mathrm {d}\mu (x) \right) \Vert {\widehat{\psi }}_0 \Vert _\theta \delta ^n \le C M ({{\,\mathrm{Max}\,}}_0(\psi ) + {{\,\mathrm{Lip}\,}}_0(\psi ) ) \delta ^n. \end{aligned} \end{aligned}$$

\(\square \)

Lemma 8.2

We have that

$$\begin{aligned} \left|\int _X \int _{(0,n^{-\alpha })}(\mathcal {L}_\xi ^n {\widehat{\psi }}_\xi )(x) \mathop {}\!\mathrm {d}\eta _x(\xi )\, \mathop {}\!\mathrm {d}\mu (x) \right|\le {{\,\mathrm{Max}\,}}_0(\psi ) {{\,\mathrm{LF}\,}}(\Phi , n^{-\alpha }), \end{aligned}$$

for all n.

Proof

Since, by Lemma 4.2, we have \(\Vert \mathcal {L}_\xi ^n {\widehat{\psi }}_\xi \Vert _\infty \le \Vert {\widehat{\psi }}_\xi \Vert _\infty \le {{\,\mathrm{Max}\,}}_0(\psi )\), we obtain

$$\begin{aligned} \begin{aligned} \left|\int _X \int _{(0,n^{-\alpha })}(\mathcal {L}_\xi ^n {\widehat{\psi }}_\xi )(x) \mathop {}\!\mathrm {d}\eta _x(\xi )\, \mathop {}\!\mathrm {d}\mu (x) \right|&\le {{\,\mathrm{Max}\,}}_0(\psi ) \int _X |\eta _x| \Big ( (0, n^{-\alpha })\Big ) \mathop {}\!\mathrm {d}\mu (x) \\&\le {{\,\mathrm{Max}\,}}_0(\psi ) {{\,\mathrm{LF}\,}}(\Phi , n^{-\alpha }), \end{aligned} \end{aligned}$$

which settles the proof. \(\square \)

Lemma 8.3

The sequence

$$\begin{aligned} \left\{ \int _{X} \int _{[n^{-\alpha }, \infty )} \mathcal {L}_\xi ^n {\widehat{\psi }}_\xi \, \mathop {}\!\mathrm {d}\eta _{x} (\xi )\, \mathop {}\!\mathrm {d}\mu (x) \right\} _{n \ge 0} \end{aligned}$$

decays rapidly in n.

Proof

For each n, define a function \(w_n :[0, \infty ) \rightarrow [0, \infty )\) by

$$\begin{aligned} w_n(\xi ) = \Vert \mathcal {L}_\xi ^n {\widehat{\psi }}_\xi \Vert _H. \end{aligned}$$

By Lemma 6.2, \(w_n\) is a decreasing sequence of functions, and by Lemma 4.2, \(w_0(\xi )\) is a bounded function which (in the notation of Sect. 7) decays rapidly in \(\xi \). Up to rescaling \(\psi \), we may freely assume that \(w_0\) takes values in [0, 1].

Proposition 6.6 implies that \(w_n\) restricted to \((0,\kappa ]\) satisfies the hypotheses of Proposition 7.4. We then fix \(\xi _0 = \kappa \), so that Proposition 6.1 implies that \(w_n\) restricted to \([\xi _0, \infty )\) satisfies the hypotheses of Proposition 7.2. Hence, the sequence defined by

$$\begin{aligned} s_n = \sup \ \{ w_n(\xi ) \ : \ n^{-\alpha } \le \xi < \infty \} \end{aligned}$$

decays rapidly in n. Note that \( \Vert \mathcal {L}_\xi ^n {\widehat{\psi }}_\xi \Vert _\infty \le \Vert \mathcal {L}_\xi ^n {\widehat{\psi }}_\xi \Vert _H\) and so, for each n,

$$\begin{aligned} \left|\int _{[n^{-\alpha }, \infty )} \mathcal {L}_\xi ^n {\widehat{\psi }}_\xi \, \mathop {}\!\mathrm {d}\eta _{x} (\xi )\right|\le \Vert \eta _x \Vert _{{{\,\mathrm{TV}\,}}} \, s_n. \end{aligned}$$

As \(\mu \) is a probability measure, it follows that

$$\begin{aligned} \left|\int _X \int _{[n^{-\alpha }, \infty )} \mathcal {L}_\xi ^n {\widehat{\psi }}_\xi \, \mathop {}\!\mathrm {d}\eta _{x} (\xi ) \mathop {}\!\mathrm {d}\mu (x) \right|\le M s_n, \end{aligned}$$

where M is the uniform bound on \( \Vert \eta _x \Vert _{{{\,\mathrm{TV}\,}}}\). \(\square \)

9 From Accessibility to Collapsed Accessibility

In this section, we relate the notion of accessibility for a skew product F as in (3) to the property of collapsed accessibility defined in Sect. 4.

For a two-sided shift \(\sigma :\Sigma \rightarrow \Sigma \), let X be the corresponding one-sided shift and let \(\pi :\Sigma \rightarrow X\) be the projection. Note that \(\pi \) is a continuous, surjective, open map. We also write \(x^+\) for \(\pi (x).\)

For \(x \in \Sigma \), define \(W^s_0(x) = \pi ^{-1} \pi (x)\). In other words, \(y \in W^s_0(x)\) if and only if \(x^+ = y^+\). For \(n \in {\mathbb {Z}}\), define \(W^s_n(x) = \sigma ^{-n} W^s_0( \sigma ^n x)\) and note that

$$\begin{aligned} W^s_0(x) \subset W^s_1(x) \subset W^s_2(x) \subset \cdots \end{aligned}$$

is an increasing sequence whose union is \(W^s(x)\). For a subset \(U \subset \Sigma \), write

$$\begin{aligned} W^s_n(U) = \bigcup _{x \in U} W^s_n(x). \end{aligned}$$

Lemma 9.1

If \(U \subset \Sigma \) is open, then \(W^s_n(U)\) is open for all n. If \(K \subset \Sigma \) is compact, then \(W^s_n(K)\) is compact for all n.

Proof

Since \(\pi \) is an open map, \(W^s_0(U) = \pi ^{-1} \pi (U)\) is an open set. Since \(\sigma ^n\) is a homeomorphism, \(W^s_n(U) = \sigma ^{-n}(W^s_0(\sigma ^n U))\) is also open. A similar proof holds for compact sets. \(\square \)

Lemma 9.2

For points x and y in \(\Sigma \) and \(n \in {\mathbb {Z}},\) the following are equivalent:

  1. (1)

    \(y \in W^s_n(x)\),

  2. (2)

    \({\text {dist}}(\sigma ^{n+k} x, \sigma ^{n+k} y) \le \theta ^k\) for all \(k \ge 0\).

Proof

One can show that each of these conditions is equivalent to the sequences of symbols for x and y satisfying \(x_m = y_m\) for all \(m \ge n\). \(\square \)
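The symbolic characterization used in this proof is easy to experiment with. The sketch below is a finite-window illustration only: sequences are modelled as Python functions from integers to symbols, membership \(y \in W^s_n(x)\) is tested as \(x_m = y_m\) for \(n \le m \le \) `window`, and the truncation at `window` is an assumption needed to make the check finite. The helper names are hypothetical.

```python
def shift(x, j):
    """Return sigma^j x, i.e. the sequence m -> x(m + j)."""
    return lambda m: x(m + j)

def in_local_stable(x, y, n, window=50):
    """Finite-window test of y in W^s_n(x): x_m == y_m for n <= m <= window."""
    return all(x(m) == y(m) for m in range(n, window + 1))

x = lambda m: m % 2                          # a two-sided sequence over {0, 1}
y = lambda m: x(m) if m >= 3 else 1 - x(m)   # agrees with x exactly for m >= 3

members = [n for n in range(-2, 8) if in_local_stable(x, y, n)]
print(members)  # [3, 4, 5, 6, 7]
```

The printed list reflects the nesting \(W^s_n(x) \subset W^s_{n+1}(x)\): once y belongs to \(W^s_3(x)\), it belongs to \(W^s_n(x)\) for all larger n. The same helpers can be used to check the definition \(W^s_n(x) = \sigma ^{-n} W^s_0(\sigma ^n x)\) by comparing `in_local_stable(shift(x, n), shift(y, n), 0, window - n)` with `in_local_stable(x, y, n, window)`.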

Instead of projecting onto the future \(x \mapsto x^+\), we can analogously project onto the past \(x \mapsto x^-\). Define local unstable manifolds by \(y \in W^u_0(x)\) if and only if \(x^- = y^-\), and for \(n \in {\mathbb {Z}}\) define \(W^u_n(x) = \sigma ^n W^u_0( \sigma ^{-n} x)\). Analogous versions of the above lemmas hold for these manifolds.

Let us now consider the skew product (3). Writing \(p = (x,s)\) and \(q = (y,t)\), we define local stable manifolds by \(p \in W^s_n(q)\) if and only if \(p \in W^s(q)\) and \(x \in W^s_n(y)\). Define local unstable manifolds analogously. For points p and q and an integer \(N > 0\), a us-N-path from p to q is a sequence

$$\begin{aligned} p = p_0, p_1, \ldots , p_n = q \end{aligned}$$

such that \(n \le N\) and for each \(0 \le k < n\) either \(p_{k+1} \in W^s_N(p_k)\) or \(p_{k+1} \in W^u_N(p_k)\).

For a point p, define \(AC_N(p)\) by \(q \in AC_N(p)\) if and only if there is a us-N-path from p to q. Note that the sets \(AC_N(p)\) form an increasing sequence whose union is AC(p).

For a subset \(U \subset \Sigma \times {\mathbb {R}},\) define

$$\begin{aligned} AC_N(U) = \bigcup _{x \in U} AC_N(x). \end{aligned}$$

Lemma 9.3

If \(U \subset \Sigma \times {\mathbb {R}}\) is open, then \(AC_N(U)\) is open for all \(N \ge 0\). If \(K \subset \Sigma \times {\mathbb {R}}\) is compact, then \(AC_N(K)\) is compact for all \(N \ge 0\).

Proof

This follows directly from Lemma 9.1. \(\square \)

Proposition 9.4

Let K be a compact subset of \(\Sigma \times {\mathbb {R}}\) such that \(\overline{{\text {int}}(K)} = K\). If \(p \in \Sigma \times {\mathbb {R}}\) is such that \(K \subset AC(p)\), then there is N such that \(K \subset AC_N(p)\).

Proof

Since \(AC_N(p)\) is an increasing sequence of compact sets and K is a Baire space, there is \(N_1\) such that \(AC_{N_1}(p)\) contains a non-empty open subset \(U \subset K\). Since \(AC_N(U)\) is an increasing sequence of open sets whose union contains the compact set K, there is \(N_2\) such that \(K \subset AC_{N_2}(U)\). Then \(K \subset AC_{N_1+N_2}(p)\). \(\square \)

We have the following result.

Proposition 9.5

Let \(f :X \rightarrow {\mathbb {R}}\) be a Lipschitz function. If the skew product

$$\begin{aligned} F : \Sigma \times {\mathbb {R}}\rightarrow \Sigma \times {\mathbb {R}}, \quad (x, t) \mapsto (\sigma x, t + f(x^+) ) \end{aligned}$$

is accessible, then f has the collapsed accessibility property.

Proof

Let \(K = \Sigma \times [0,1]\). By Proposition 9.4, there is a uniform constant N such that \(AC_N(p)\) contains K for any \(p \in K\). With N fixed, let \(x \in X, t \in [0,1],\) and \(n \ge 1\) be given. Then there is a sequence of points \( p_1, q_1, p_2, q_2, \ldots , q_m, p_{m+1} \) such that

  1. (1)

    \(m \le N, p_1 = F^{n-N}(x,0)\), and \(p_{m+1} = F^{n-N}(x,t)\);

  2. (2)

    \(p_k \in W^s_N(q_k)\); and

  3. (3)

    \(p_{k+1} \in W^u_N(q_k)\).

Applying \(F^{N-n}\) to this sequence, we define \((a_k, s_k) = F^{N-n}(p_k)\) and \((b_k, t_k) = F^{N-n}(q_k)\) which satisfy \(b_k \in W^s_n(a_k)\) and \(a_{k+1} \in W^u_{2N-n}(b_k)\).

As \((a_k, s_k)\) and \((b_k, t_k)\) are on the same stable manifold in \(\Sigma \times {\mathbb {R}}\), it follows that

$$\begin{aligned} t_k - s_k = f_n(b_k^+) - f_n(a_k^+). \end{aligned}$$

By the unstable analogue of Lemma 9.2, one can show that \(d(b_k, a_{k+1}) \le \theta ^{n-2N+1}\). That is, \(d(b_k, a_{k+1}) \le C \theta ^n\) where \(C = \theta ^{1-2N}\). One can then check that \(x_k = a_k^+\) and \(y_k = b_k^+\) satisfy all of the conditions in the definition of collapsed accessibility. \(\square \)

10 Proof of Theorem 3.9

We now prove Theorem 3.9. The strategy of the proof is to reduce the problem to the setting of Theorem 4.5.

10.1 Step 1: f Only Depends on Future Coordinates

Let us start with a preliminary step: we show that we can assume that the function f in (3) only depends on the future coordinates. From [41, Proposition 1.2], we inherit the following result.

Lemma 10.1

There exist \(h \in \mathscr {F}_{\sqrt{\theta }}\) and \(f^+ \in \mathscr {F}^+_{\sqrt{\theta }}\) such that \(f = f^+ + h - h\circ \sigma \).

When reducing to a one-sided shift, we will encounter some loss in regularity as in the previous lemma: the functions h and \(f^+\) are Hölder with exponent 1/2. We can, however, replace \(\theta \) with \(\sqrt{\theta }\) in the definition of the distance \(d_\theta \) to make them Lipschitz. This is not an issue, and we will freely replace \(\theta \) with a suitable choice that makes the functions Lipschitz.

For any \(\Phi \in \mathscr {G}\) and \(\psi \in \mathscr {L}\), using Lemma 10.1, we can write

$$\begin{aligned} \begin{aligned} {{\,\mathrm{cov}\,}}(\Phi \circ F^n,\psi )&= \int _\Sigma \int _{-\infty }^\infty \Phi (\sigma ^n x, r+f_n(x)) \cdot \overline{\psi (x,r)} \mathop {}\!\mathrm {d}r \mathop {}\!\mathrm {d}\mu (x) \\&= \int _\Sigma \int _{-\infty }^\infty \Phi (\sigma ^n x, r+f^+_n(x) +h(x) -h(\sigma ^nx)) \cdot \overline{\psi (x,r)} \mathop {}\!\mathrm {d}r \mathop {}\!\mathrm {d}\mu (x). \end{aligned} \end{aligned}$$

Let us define \(\Phi _h(x,r) = \Phi (x, r-h(x))\) and \(\psi _h(x,r)=\psi (x,r-h(x))\). We change variable \(s=r+h(x)\) and we get

$$\begin{aligned} \begin{aligned} {{\,\mathrm{cov}\,}}(\Phi \circ F^n,\psi )&= \int _\Sigma \int _{-\infty }^\infty \Phi (\sigma ^n x, s+f^+_n(x) -h(\sigma ^nx)) \cdot \overline{\psi (x,s-h(x))} \mathop {}\!\mathrm {d}s \mathop {}\!\mathrm {d}\mu (x) \\&= \int _\Sigma \int _{-\infty }^\infty (\Phi _h \circ F_1^n)(x,r) \cdot \overline{\psi _h(x,r)} \mathop {}\!\mathrm {d}r \mathop {}\!\mathrm {d}\mu (x), \end{aligned} \end{aligned}$$

where the skew product \(F_1\) is defined by \(F_1(x,r) = (\sigma x, r + f^+(x))\). The map \(H(x,r) = (x,r+h(x))\) used in the change of variable above is a conjugacy between F and \(F_1\), namely \(H \circ F = F_1 \circ H\). Moreover, H is uniformly continuous (more precisely, it is Lipschitz with respect to the distance \(d_{\sqrt{\theta }}\), exactly as h), hence it preserves stable and unstable manifolds. In particular, \(F_1\) is accessible.
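The conjugacy relation \(H \circ F = F_1 \circ H\) is a purely algebraic consequence of \(f = f^+ + h - h \circ \sigma \), and can be checked on a toy example. In the sketch below, points of \(\Sigma \) are replaced by periodic sequences stored as one period (so that `x[-1]` plays the role of the coordinate \(x_{-1}\)), and the functions `h` and `f_plus` are hypothetical choices made only for illustration.

```python
import math

def sigma(x):
    """Left shift on a periodic sequence stored as one period (a tuple)."""
    return x[1:] + x[:1]

def h(x):
    """Hypothetical transfer function; it also depends on the past via x[-1]."""
    return 0.3 * x[0] - 0.7 * x[-1] + 0.1 * x[1]

def f_plus(x):
    """Hypothetical function depending only on future coordinates x_0, x_1."""
    return math.sin(x[0] + 2 * x[1])

def f(x):
    # Cohomology relation of Lemma 10.1: f = f^+ + h - h o sigma.
    return f_plus(x) + h(x) - h(sigma(x))

def F(x, r):
    return sigma(x), r + f(x)

def F1(x, r):
    return sigma(x), r + f_plus(x)

def H(x, r):
    return x, r + h(x)

def conjugacy_defect(x, r):
    """Distance between the fiber coordinates of H(F(x, r)) and F1(H(x, r))."""
    (xa, ra), (xb, rb) = H(*F(x, r)), F1(*H(x, r))
    assert xa == xb          # both maps act as sigma on the base
    return abs(ra - rb)

print(conjugacy_defect((0, 1, 1, 0, 1), 0.25))
```

Up to floating-point rounding, the printed defect is zero, in agreement with the identity \(f + h \circ \sigma = f^+ + h\).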

The initial claim follows from the following lemma, whose proof is contained in “Appendix C.”

Lemma 10.2

With the notation above, \(\psi _h \in \mathscr {L}\) with \(\nu (\psi _h) = \nu (\psi )\), and \(\Phi _h \in \mathscr {G}\) with \(\nu _{{{\,\mathrm{av}\,}}}(\Phi _h) = \nu _{{{\,\mathrm{av}\,}}}(\Phi )\). Moreover, for every \(x \in \Sigma \), we have \(|(\eta _h)_x| = |\eta _x|\), where \(\widehat{(\eta _h)_x} = \Phi _h(x)\) and \(\widehat{\eta _x}=\Phi (x)\).

10.2 Step 2: Observables Only Depend on Future Coordinates

In the previous subsection, we have seen that we can assume that \(f=f^+ \in \mathscr {F}^+_{\theta }\) (up to replacing \(\theta \) with \(\sqrt{\theta }\)). We now show that we can replace the observables \(\Phi = \Phi _h \in \mathscr {G}\) and \(\psi = \psi _h \in \mathscr {L}\) with observables in \(\mathscr {G}^+\) and in \(\mathscr {L}^+\), respectively: this is the content of Proposition 10.3. The proof follows the same lines as in [18]; in our case, however, there are some additional difficulties in showing that the functions defined belong to \(\mathscr {G}^+\) and \(\mathscr {L}^+\). In particular, we will need to use the assumption (TC) to ensure some compactness property in \(\mathscr {A}\). We postpone the proof to “Appendix C.”

Proposition 10.3

Let \(\Phi \in \mathscr {G}\) and \(\psi \in \mathscr {L}\). There exist constants \(K, M(\Phi ) \ge 0\), sequences \(\{\Phi _m\}_{m \in {\mathbb {N}}} \subset \mathscr {G}^+\), \(\{\psi _m\}_{m \in {\mathbb {N}}} \subset \mathscr {L}^+\), and, for every \(\ell \in {\mathbb {N}}\), constants \(M(\psi , \ell )\) and \(L(\psi , \ell )\) such that the following properties hold for all \(\ell , m, n \in {\mathbb {N}}\) and \(x \in X\):

  1. (i)

    \(\nu _{{{\,\mathrm{av}\,}}}(\Phi _m)=\nu _{{{\,\mathrm{av}\,}}}(\Phi )\) and \(\nu (\psi _m) = \nu (\psi )\),

  2. (ii)

    \(\Vert \Phi \circ F^m(x, \cdot ) - \Phi _m(x,\cdot ) \Vert \le M(\Phi ) \theta ^m\), and \(\Vert \Phi (x)\Vert \le M(\Phi )\),

  3. (iii)

    \({{\,\mathrm{Max}\,}}_\ell (\psi _m) \le M(\psi , \ell )\) and \({{\,\mathrm{Lip}\,}}_\ell (\psi _m) \le \theta ^{-m} L(\psi , \ell )\),

  4. (iv)

    \(|{{\,\mathrm{cov}\,}}(\Phi \circ F^n,\psi ) - {{\,\mathrm{cov}\,}}(\Phi _m \circ (F^+)^n,\psi _m)| \le K \theta ^m\).

From Proposition 9.5, it follows that the function \(f^+\) in the definition of the one-sided skew product \(F^+\) has the collapsed accessibility property.

10.3 Step 3: End of the Proof

We are now ready to prove Theorem 3.9. Let \(\Phi \in \mathscr {G}\) and \(\psi \in \mathscr {L}\), and fix \(k \in {\mathbb {N}}\) and \(0<\alpha <1/2\). Consider the sequence of functions \(\{\psi _m\}_{m \in {\mathbb {N}}} \subset \mathscr {L}^+\) given by Proposition 10.3. By Lemma 4.2, for every \(\ell \in {\mathbb {N}}\), their Fourier transforms \((\widehat{\psi _m})_\xi \) satisfy

$$\begin{aligned} \Vert (\widehat{\psi _m})_\xi \Vert _\infty \le M(\psi , \ell ) \xi ^{-\ell } \quad \text { and }\quad |(\widehat{\psi _m})_\xi |_\theta \le \theta ^{-m} L(\psi , \ell ) \xi ^{-\ell }. \end{aligned}$$

If we define a function \(w :(0, \infty ) \rightarrow [0, \infty )\) by \(w(\xi ) = \sup _m \theta ^{m}\Vert (\widehat{\psi _m})_\xi \Vert _H\), then these estimates imply that \(w(\xi )\) decays rapidly in \(\xi \) in the sense of Sect. 7. We further define functions \(v_{n,m}:(0, \infty ) \rightarrow [0, \infty )\) by \(v_{n,m}(\xi )=\Vert \mathcal {L}_\xi ^n (\widehat{\psi _m})_\xi \Vert _H\), and we notice that \(v_{n,m}\) and w satisfy the hypotheses of Proposition 7.5 and Proposition 7.6. Consequently, for any \(c>0\), the sequence \(\{t_n\}_{n \in {\mathbb {N}}}\) defined by

$$\begin{aligned} t_n = \sup \ \{v_{n,m}(\xi )\ :\ n^{-\alpha } \le \xi< \infty \quad \text { and }\quad m < c\log (n) \} \end{aligned}$$

decays rapidly in n.

Fix \(n \in {\mathbb {N}}\) and let m be the largest integer such that \(m < c \log (n)\), where \(c = k/(-\log (\theta ))\); in particular

$$\begin{aligned} n^{-k} = \theta ^{c \log (n)} < \theta ^m \le \theta ^{c \log (n)-1} = \theta ^{-1} n^{-k}. \end{aligned}$$
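
For completeness, the first equality unwinds as follows, using only \(0<\theta <1\) and the choice \(c = k/(-\log (\theta ))\):

$$\begin{aligned} \theta ^{c \log (n)} = \exp \big ( c \log (n) \log (\theta ) \big ) = \exp \big ( -k \log (n) \big ) = n^{-k}. \end{aligned}$$

The two inequalities then follow from \(c\log (n) - 1 \le m < c \log (n)\) and the fact that \(t \mapsto \theta ^t\) is decreasing. In particular, \(\theta ^{-m} < n^{k}\), which is how the factor \(\theta ^{-m}\) appearing in Proposition 10.3-(iii) is controlled.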

By Proposition 10.3-(iv), we get

$$\begin{aligned} | {{\,\mathrm{cov}\,}}(\Phi \circ F^n,\psi ) | \le | {{\,\mathrm{cov}\,}}(\Phi _m \circ (F^+)^n,\psi _m)| + K \theta ^{-1} n^{-k}, \end{aligned}$$

hence it suffices to bound the first summand in the right-hand side above. By Proposition 4.6, we have

$$\begin{aligned} \begin{aligned} | {{\,\mathrm{cov}\,}}(\Phi _m \circ (F^+)^n,\psi _m)| \le&\left|\int _X \int _{\{0\}}(\mathcal {L}_\xi ^n (\widehat{\psi _m})_\xi )(x) \mathop {}\!\mathrm {d}(\eta _m)_x(\xi )\, \mathop {}\!\mathrm {d}\mu (x) - \nu _{{{\,\mathrm{av}\,}}}(\Phi _m)\nu (\psi _m) \right|\\&+ \left|\int _X \int _{(0,n^{-\alpha })}(\mathcal {L}_\xi ^n (\widehat{\psi _m})_\xi )(x) \mathop {}\!\mathrm {d}(\eta _m)_x(\xi )\, \mathop {}\!\mathrm {d}\mu (x) \right|\\&+ \left|\int _X \int _{[n^{-\alpha }, \infty )}(\mathcal {L}_\xi ^n (\widehat{\psi _m})_\xi )(x) \mathop {}\!\mathrm {d}(\eta _m)_x(\xi )\, \mathop {}\!\mathrm {d}\mu (x) \right|. \end{aligned} \end{aligned}$$

The last summand in the right-hand side above is bounded by \(M t_n\), hence decays rapidly. The first term, by Lemma 8.1, is bounded by

$$\begin{aligned} C \Vert \Phi _m \Vert _{\mathscr {G}^+} ({{\,\mathrm{Max}\,}}_0(\psi _m) + {{\,\mathrm{Lip}\,}}_0(\psi _m) ) \delta ^n \le C M(\Phi ) (M(\psi ,0)+L(\psi ,0)) n^k \delta ^n, \end{aligned}$$

which decays rapidly as well. Finally, for the second term, Lemma 8.2 implies

$$\begin{aligned} \left|\int _X \int _{(0,n^{-\alpha })}(\mathcal {L}_\xi ^n (\widehat{\psi _m})_\xi )(x) \mathop {}\!\mathrm {d}(\eta _m)_x(\xi )\, \mathop {}\!\mathrm {d}\mu (x) \right|\le M(\psi ,0) {{\,\mathrm{LF}\,}}(\Phi _m, n^{-\alpha }). \end{aligned}$$

In order to conclude the proof of Theorem 3.9, it suffices to establish the following lemma.

Lemma 10.4

With the notation above, for any \(r>0\) we have

$$\begin{aligned} |{{\,\mathrm{LF}\,}}(\Phi _m, r) - {{\,\mathrm{LF}\,}}(\Phi , r) | \le M(\Phi ) \theta ^{-1} n^{-k}. \end{aligned}$$

Proof

Note that the measure associated to \((\Phi \circ F^m)(x,\cdot )\) is \(e^{-i \xi f_m(x)} \mathop {}\!\mathrm {d}\eta _{\sigma ^m x}(\xi )\), whose variation is \(|\eta _{\sigma ^m x}|\). Let us denote \(R= (-r,r)\setminus \{0\} \subset {\mathbb {R}}\). Then, by Proposition 10.3-(ii),

$$\begin{aligned} \begin{aligned}&|{{\,\mathrm{LF}\,}}(\Phi _m, r) - {{\,\mathrm{LF}\,}}(\Phi , r) | = \left|\int _X |(\eta _m)_x|(R) \mathop {}\!\mathrm {d}\mu (x) - \int _\Sigma |\eta _x|(R) \mathop {}\!\mathrm {d}\mu (x) \right|\\&\quad \le \left|\int _\Sigma |\eta _{\sigma ^m x}|(R) \mathop {}\!\mathrm {d}\mu (x) - \int _\Sigma |\eta _x|(R) \mathop {}\!\mathrm {d}\mu (x) \right|\\&\qquad + \max _{x \in X} \Vert (\eta _m)_x - e^{-i \xi f_m(x)} \eta _{\sigma ^m x} \Vert _{{{\,\mathrm{TV}\,}}} \\&\quad = \max _{x \in X} \Vert \Phi _m(x, \cdot ) - \Phi \circ F^m(x,\cdot ) \Vert \le M(\Phi ) \theta ^m \le M(\Phi ) \theta ^{-1} n^{-k}. \end{aligned} \end{aligned}$$
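
The last two steps of the display rely on two standard facts, spelled out here as a sketch. It is assumed that \(\mu \) is a \(\sigma \)-invariant probability measure and that the norm of an element of \(\mathscr {A}\) is the total variation of its associated measure, as the notation suggests. For finite complex measures \(\nu _1, \nu _2\) on \({\mathbb {R}}\),

$$\begin{aligned} \begin{aligned} \int _\Sigma |\eta _{\sigma ^m x}|(R) \mathop {}\!\mathrm {d}\mu (x)&= \int _\Sigma |\eta _x|(R) \mathop {}\!\mathrm {d}\mu (x), \\ \big | |\nu _1|(R) - |\nu _2|(R) \big |&\le |\nu _1 - \nu _2|(R) \le \Vert \nu _1 - \nu _2 \Vert _{{{\,\mathrm{TV}\,}}}. \end{aligned} \end{aligned}$$

The first identity makes the first difference in the estimate vanish. The second is applied with \(\nu _1 = (\eta _m)_x\) and \(\nu _2 = e^{-i \xi f_m(x)} \eta _{\sigma ^m x}\), together with \(|\nu _2| = |\eta _{\sigma ^m x}|\), and then integrated against \(\mu \).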

\(\square \)