1 Introduction

Let \((E_1,\mathcal{B}_1,\mu_1)\) and \((E_2,\mathcal{B}_2,\mu_2)\) be two probability spaces. A coupling of the probability measures \(\mu_1\) and \(\mu_2\) is a probability measure \(\mu\) on the product measurable space \((E_1\times E_2,\mathcal{B}_1\times\mathcal{B}_2)\) whose marginal probabilities are \(\mu_1\) and \(\mu_2\), respectively. We denote the set of couplings of \(\mu_1\) and \(\mu_2\) by \(\mathcal{C}(\mu_1,\mu_2)\). Accordingly, a coupling of two Euclidean Brownian motions on \({\mathbb{R}}^{n}\) starting from \(x_1\) and \(x_2\), respectively, is a \(C({\mathbb{R}}_{+}, {\mathbb{R}}^{n}\times{\mathbb{R}}^{n})\)-valued random variable \((X_1,X_2)\) on a probability space \((\Omega,\mathcal{F},{\mathbb{P}})\) such that the components \(X_1\) and \(X_2\) have the law of Brownian motion starting from \(x_1\) and \(x_2\), respectively. In this case, we say simply that \((X_1,X_2)\) is a coupling of Brownian motions from \((x_1,x_2)\).

In the present work we discuss the uniqueness problem for maximal couplings of Euclidean Brownian motion. As usual, a coupling is called maximal if the coupling inequality (see below) becomes an equality. It is well known that the mirror coupling is a maximal coupling. We show by an example that in general a maximal coupling need not be unique. To prove a uniqueness result, we consider a more restricted class of couplings, that of Markovian couplings. In this class we show that the mirror coupling is the unique maximal coupling. This will be done by two methods. The first method is a martingale argument. This method, although the simpler of the two, depends on the linear structure of the Euclidean state space and thus has rather limited scope for application to more general settings. In the second method, we use the Markovian hypothesis to reduce the problem to a mass transportation problem on the state space. In the Euclidean case under consideration this mass transportation problem has a well-known solution. This second method demonstrates an interesting connection between maximal Markovian couplings and mass transportation with a cost function defined by the transition density function (the heat kernel), and has the potential to be generalized to other settings (e.g., Brownian motion on a Riemannian manifold). For this and other closely related works, carried out shortly after the initial work on the current paper, see Kuwada [6, 7] and Kuwada and Sturm [8, 9].

2 Maximal Coupling

Let

$$p (t,x,y) = \biggl(\frac{1}{2\pi t} \biggr)^{n/2} e^{-\vert x-y\vert^2/2t} $$

be the transition density function of the standard Brownian motion on \({\mathbb{R}}^{n}\). Here |x| is the Euclidean length of a vector \(x\in{\mathbb{R}}^{n}\). Define the function

$$\phi_t(r) = \frac{2}{\sqrt{2\pi t}}\int_0^{r/2}e^{-\rho^2/2t} \,d\rho. $$

When t=0, we use the convention

$$\phi_0(r)= \begin{cases} 0, & r = 0,\\ 1,& r>0. \end{cases} $$

The probabilistic significance of this function is that it is the tail probability of the first passage time of a one-dimensional standard Brownian motion from 0 to r/2:

$$ {\mathbb{P}} \{ \tau_{r/2}\ge t \} = \phi_t(r). $$
(2.1)

It is easy to verify that

$$ \phi_t\bigl(\vert x_1- x_2\vert\bigr)=\frac{1}{2}\int _{{\mathbb{R}}^n}\big\vert p (t, x_1, y) - p (t, x_2, y)\big\vert \,dy. $$
(2.2)
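The identity (2.2) can be checked numerically in one dimension. The following is a minimal sketch, not part of the original argument: the helper names `p`, `phi` and `half_l1_distance` are ours, the closed form \(\phi_t(r)=\operatorname{erf}(r/(2\sqrt{2t}))\) is obtained by evaluating the defining integral, and the right side of (2.2) is approximated by a trapezoidal rule on a truncated interval.

```python
import math

def p(t, x, y):
    """One-dimensional heat kernel: density of N(x, t) evaluated at y."""
    return math.exp(-(x - y) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def phi(t, r):
    """phi_t(r); for t > 0 the defining integral equals erf(r / (2 sqrt(2t)))."""
    if t == 0:
        return 0.0 if r == 0 else 1.0
    return math.erf(r / (2.0 * math.sqrt(2.0 * t)))

def half_l1_distance(t, x1, x2, half_width=12.0, n=200_000):
    """Trapezoidal approximation of (1/2) * int |p(t,x1,y) - p(t,x2,y)| dy.

    The integral is truncated to half_width standard deviations beyond
    the two means; this truncation is an assumption of the sketch.
    """
    lo = min(x1, x2) - half_width * math.sqrt(t)
    hi = max(x1, x2) + half_width * math.sqrt(t)
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        y = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * abs(p(t, x1, y) - p(t, x2, y))
    return 0.5 * total * h

t, x1, x2 = 0.7, -0.4, 1.1
lhs = phi(t, abs(x1 - x2))          # left side of (2.2)
rhs = half_l1_distance(t, x1, x2)   # right side of (2.2), numerically
print(lhs, rhs)
```

The two printed values should agree up to the discretization error of the quadrature.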

The significance of this equality will be clear once we identify \(\tau_{r/2}\) as the smallest (in the sense of distribution) coupling time of two Brownian motions on \({\mathbb{R}}^{n}\) starting at distance r apart.

Fix two distinct points \(x_1\) and \(x_2\) in \({\mathbb{R}}^{n}\). Let \(X=(X_1,X_2)\) be a coupling of Euclidean Brownian motions from \((x_1,x_2)\). This simply means that \(X_1=\{X_1(t)\}\) and \(X_2=\{X_2(t)\}\) are Brownian motions starting from \(x_1\) and \(x_2\), respectively. The coupling time \(T(X_1,X_2)\) is the earliest time after which the two Brownian motions coincide:

$$T(X_1, X_2) = \inf \bigl\{ t>0: X_1(s) =X_2(s)\text{ for all } s\ge t \bigr\} . $$

Note that T(X 1,X 2) in general is not the first time the two processes meet and therefore is not a stopping time.

The following well-known coupling inequality gives a lower bound for the tail probability of the coupling time. Similar inequalities hold in more general settings, and the proof is well known (see Lindvall [10]). We include it for the sake of completeness.

Proposition 2.1

Let (X 1,X 2) be a coupling of Brownian motions from (x 1,x 2). Then

$${\mathbb{P}} \bigl\{ T(X_1, X_2)\ge t \bigr\} \ge \phi_t\bigl(\vert x_1-x_2\vert\bigr). $$

Proof

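For any Borel set \(A\subset{\mathbb{R}}^{n}\), we have

$${\mathbb{P}} \bigl\{ T(X_1, X_2)\ge t \bigr\} \ge{\mathbb{P}} \bigl\{ X_1(t)\neq X_2(t) \bigr\} \ge{\mathbb{P}} \bigl\{ X_1(t)\in A, X_2(t)\notin A \bigr\} \ge{\mathbb{P}} \bigl\{ X_1(t)\in A \bigr\} -{\mathbb{P}} \bigl\{ X_2(t)\in A \bigr\} . $$

Hence,

$${\mathbb{P}} \bigl\{ T(X_1, X_2)\ge t \bigr\} \ge\sup_{A}\int_A \bigl[ p (t, x_1, y) - p (t, x_2, y) \bigr] \,dy =\frac{1}{2}\int_{{\mathbb{R}}^n}\big\vert p (t, x_1, y) - p (t, x_2, y)\big\vert \,dy =\phi_t\bigl(\vert x_1-x_2\vert\bigr). $$

In the last step we have used (2.2). □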

In view of the coupling inequality, a coupling for which the coupling inequality is an equality for all time clearly has a great significance. Such a coupling is called a maximal coupling. More precisely, a coupling (X 1,X 2) of Brownian motions from (x 1,x 2) is called maximal if

$${\mathbb{P}} \bigl\{ T(X_1, X_2)\ge t \bigr\} = \phi_{t}\bigl(\vert x_1-x_2\vert\bigr), $$

for all t>0.

Maximal couplings for Markov chains and more general discrete stochastic processes have been studied in the literature (see Goldstein [4], Griffeath [5], and the discussion in Lindvall [10]). For Euclidean Brownian motion, it is well known (Lindvall and Rogers [11]) that the mirror coupling, which we will define shortly, is a maximal coupling.

Let H be the hyperplane bisecting the line segment [x 1,x 2]:

$$H = \bigl\{ x\in{\mathbb{R}}^n: \langle x-x_0, n\rangle= 0 \bigr\} , $$

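where \(x_0=(x_1+x_2)/2\) is the middle point and \(n=(x_1-x_2)/\vert x_1-x_2\vert\) is the unit vector in the direction of the line segment. Let \(R:{\mathbb{R}}^{n}\to{\mathbb{R}}^{n}\) be the mirror reflection with respect to the hyperplane H:

$$Rx = x - 2\langle x-x_0, n\rangle n. $$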

We now describe the mirror coupling. Let

$$\tau= \inf \bigl\{ t\ge0: X_1(t)\in H \bigr\} $$

be the first hitting time of H by X 1. From (2.1) we know that

$$ {\mathbb{P}} \{ \tau\ge t \} = \phi_t\bigl(\vert x_1-x_2 \vert \bigr). $$
(2.3)

A coupling \((X_1,X_2)\) of Brownian motions from \((x_1,x_2)\) is a mirror coupling (or \(X_2\) is the mirror coupling of \(X_1\)) if \(X_2\) is the mirror reflection of \(X_1\) with respect to H before time τ and coincides with \(X_1\) afterwards; namely,

$$X_2(t) = \begin{cases} X_1(t) - 2\langle X_1(t)-x_0, n\rangle n, &t\in[0,\tau);\\ X_1(t),&t\in[\tau, \infty). \end{cases} $$

In this case the coupling time is \(T(X_1,X_2)=\tau\). By (2.3) the mirror coupling is indeed a maximal coupling, a well-known fact (see Lindvall [10] and Lindvall and Rogers [11]).
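The construction of the mirror coupling can be sketched on a time grid in one dimension. This is only an illustration under stated assumptions: the function name `mirror_coupled_paths`, the Euler discretization with step `dt`, and the use of the first grid time at which \(X_1\) has crossed the midpoint as a stand-in for τ are ours. In particular, the discrete path of \(X_2\) has a small jump at the approximate hitting time, which disappears only in the continuous limit.

```python
import random

def mirror_coupled_paths(x1, x2, dt, n_steps, seed=0):
    """Discretized one-dimensional mirror coupling started from (x1, x2), x1 != x2.

    Returns the two paths and the first grid index at which X1 has reached
    or crossed the midpoint x0 = (x1 + x2) / 2 (None if this never happens).
    """
    rng = random.Random(seed)
    mid_sum = x1 + x2                      # equals 2 * x0
    side = 1.0 if x1 > x2 else -1.0        # sign of X1(0) - x0
    X1, X2 = [x1], [x2]
    hit_index = None
    for k in range(1, n_steps + 1):
        X1.append(X1[-1] + rng.gauss(0.0, dt ** 0.5))
        if hit_index is None and side * (2.0 * X1[-1] - mid_sum) <= 0.0:
            hit_index = k                  # approximate hitting time of H
        # Reflect X1 about x0 before the hit, copy X1 from the hit onwards.
        X2.append(mid_sum - X1[-1] if hit_index is None else X1[-1])
    return X1, X2, hit_index

X1, X2, hit = mirror_coupled_paths(0.5, -0.5, dt=1e-3, n_steps=5000)
print("approximate coupling time:", None if hit is None else hit * 1e-3)
```

Before the approximate hitting time the two paths are symmetric about \(x_0\), and afterwards they coincide, so the coupling time of the discretized pair is the (approximate) hitting time itself.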

It was believed that the mirror coupling is the unique maximal coupling of Euclidean Brownian motion. This, however, is not the case, as has been recently discovered by the authors and others (including Pat Fitzsimmons and Wilfrid Kendall). Here we describe Fitzsimmons’ counterexample in one dimension. Let

$$l = \sup \bigl\{ t\le\tau: X_1(t) = x_1 \bigr\} $$

be the last time the Brownian motion \(X_1\) is at \(x_1\) before it hits the middle plane H (i.e., before time τ). We let \(X_2\) be the time reversal of \(X_1\) before time l, the mirror reflection of \(X_1\) between l and τ, and \(X_1\) after τ; namely,

$$X_2(t) = \begin{cases} x_2-x_1+X_1(l-t), &t\in[0,l];\\ x_1+x_2 - X_1(t),&t\in[l, \tau];\\ X_1(t),&t\in[\tau, \infty). \end{cases} $$

Of course \(X_2\) is not the mirror coupling of \(X_1\). On the other hand, by Williams’ decomposition of the Brownian path \(\{X_1(t), 0\le t\le\tau\}\) (see Revuz and Yor [12, pp. 244–245 and pp. 304–305]), \(X_2\) is a Brownian motion starting from \(x_2\). The coupling time for \((X_1,X_2)\) is again τ, which shows that the coupling is indeed maximal.

In order to recover the uniqueness, we need to consider a smaller class of couplings.

Definition 2.2

Let \(X=(X_1,X_2)\) be a coupling of Brownian motions. Let \(\mathcal{F}^{X}=\{\mathcal{F}^{X}_{t}\}_{t\ge0}\) be the filtration of σ-algebras generated by X. We say that X is a Markovian coupling if for each s≥0, conditioned on the σ-algebra \(\mathcal{F}^{X}_{s}\), the shifted process

$$\bigl\{ \bigl(X_1(t+s), X_2(t+s)\bigr),\ t\ge0 \bigr\} $$

is still a coupling of Brownian motions (now from (X 1(s),X 2(s))).

A few comments about this definition are in order. The condition that \(X=(X_1,X_2)\) is a Markovian coupling only requires that, conditioned on the σ-algebra \(\mathcal{F}^{X}_{s}\) generated by X up to time s, the law of each time-shifted component is that of a Brownian motion. In particular, \((X_1,X_2)\) is a Markovian coupling as soon as each component is separately a Brownian motion with respect to a common filtration. This is the case if, for instance, \(\mathcal{F}^{X_2}_{t}\subseteq\mathcal{F}^{X_1}_{t}\) for all t≥0 (where \(\mathcal{F}^{X_i}\) is the filtration generated by \(X_i\)), i.e., the second Brownian motion is defined progressively (without looking forward) by the first Brownian motion. It should be pointed out that the definition does not imply that \((X_1,X_2)\) is a Markov process. It is instructive to compare our maximal Markovian coupling with other types of coupling, e.g., the efficient coupling of Burdzy and Kendall [1] and the ρ-optimal coupling of Chen [2].

The main result of this paper is the following.

Theorem 2.3

Let \(x_{1}, x_{2}\in{\mathbb{R}}^{n}\). The mirror coupling is the only maximal Markovian coupling of n-dimensional Brownian motions starting from (x 1,x 2).

We will give two proofs of this theorem. The first one is based on the fact that the Markovian condition implies that the joint process is a martingale. The second method uses the Markovian condition to reduce the problem to the uniqueness of a mass transportation problem, whose solution is well known. This proof is more interesting from an analytic point of view. From the second proof it will be clear that a stronger result holds, namely if a Markovian coupling is maximal at one fixed time t then it must be the mirror coupling up to time t.

3 Proof of the Uniqueness Using Martingales

Without loss of generality we assume that the space dimension is one. Let \(\mathcal{F}^{X}=\{\mathcal{F}^{X}_{t}\}_{t\ge0}\) be the filtration generated by the joint process \(X=(X_1,X_2)\). The Markovian hypothesis implies that each component is a Brownian motion with respect to \(\mathcal{F}^{X}\). Therefore, X is a continuous \(\mathcal{F}^{X}\)-martingale and so is the process \(X_1-X_2\). Let

$$\sigma_t = \frac{1}{4} \langle X_1-X_2 \rangle(t), $$

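where 〈Z〉 denotes the quadratic variation process of a martingale Z. By Lévy’s criterion, applied after a time change (the Dambis–Dubins–Schwarz theorem), there is a Brownian motion W starting from \((x_1-x_2)/2\), defined on a possibly enlarged probability space, such that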

$$X_1(t) - X_2(t) = 2W(\sigma_t). $$

Since both X 1 and X 2 are Brownian motions, by the Kunita–Watanabe inequality,

$$ \bigl \vert \langle X_1, X_2\rangle(t)\bigr \vert \le \int_0^t \sqrt{d\langle X_1 \rangle_s d \langle X_2\rangle_s} = t. $$

Hence,

$$ \sigma_t = \frac{\langle X_1\rangle(t) +\langle X_2\rangle(t) - 2\langle X_1, X_2\rangle(t)}{4}\le t. $$
(3.1)
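The bound (3.1) can be illustrated numerically for a simple family of couplings. The sketch below is our own illustration, not part of the proof: it assumes that \(X_2\) is driven by increments with constant correlation ρ to those of \(X_1\) (a co-adapted coupling), for which \(\langle X_1,X_2\rangle(t)=\rho t\) and hence \(\sigma_t=(1-\rho)t/2\). The function name `realized_sigma` and the grid parameters are hypothetical.

```python
import math
import random

def realized_sigma(rho, t=1.0, n=20_000, seed=1):
    """Realized value of (1/4) <X1 - X2>(t) on a grid of n steps.

    The increments of X2 are rho * dB1 + sqrt(1 - rho^2) * dB', with B'
    independent of B1, so that X2 is again a Brownian motion.
    """
    rng = random.Random(seed)
    sd = math.sqrt(t / n)
    total = 0.0
    for _ in range(n):
        db1 = rng.gauss(0.0, sd)
        db_indep = rng.gauss(0.0, sd)
        db2 = rho * db1 + math.sqrt(1.0 - rho * rho) * db_indep
        total += (db1 - db2) ** 2
    return total / 4.0

for rho in (1.0, 0.0, -1.0):
    # Compare the realized value with the theoretical value (1 - rho) * t / 2.
    print(rho, realized_sigma(rho), (1.0 - rho) / 2.0)
```

Within this family, only ρ=−1, which corresponds to reflected increments as in the mirror coupling, makes \(\sigma_t\) as large as t.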

Now let

$$\tau_1 = \inf \bigl\{ t\ge0: X_1(t)= X_2(t) \bigr\} \quad\text{and}\quad \tau_2 = \inf \bigl\{ t\ge0: W(t) = 0 \bigr\} . $$

It is clear that \(T(X_1,X_2)\ge\tau_1\) and that \(\tau_2\) is the first passage time of the Brownian motion W from \(\vert x_1-x_2\vert/2\) to 0. The maximality of the coupling \((X_1,X_2)\) means that \(T(X_1,X_2)\) and \(\tau_2\) have the same distribution. On the other hand, by definition \(\sigma_{\tau_{1}} =\tau_{2}\), hence by (3.1),

$$T(X_1, X_2)\ge\tau_1\ge \sigma_{\tau_1}=\tau_2. $$

Since T(X 1,X 2) and τ 2 have the same distribution, we must have

$$T(X_1, X_2) = \tau_2 = \sigma_{\tau_1}= \tau_1. $$

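Therefore, the coupling time coincides with the first meeting time of \(X_1\) and \(X_2\), and before they meet equality must hold in (3.1), i.e., \(\langle X_1,X_2\rangle(t)=-t\). Consequently \(\langle X_1+X_2\rangle(t)=2t+2\langle X_1,X_2\rangle(t)=0\), so the martingale \(X_1+X_2\) is constant up to time \(\tau_1\). It follows that for \(0\le t\le\tau_1\),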

$$X_2(t) = X_2(0) +X_1(0) - X_1(t) = 2x_0 - X_1(t), $$

which simply means that X 2 is the mirror coupling of X 1.

4 Optimal Coupling of Gaussian Distributions

The second proof we give in the next section, although longer than the first one, can potentially be applied to a more general state space without a linear structure (a Riemannian manifold, for example). The basic idea is to use the Markovian hypothesis to reduce the problem to the uniqueness of a very special mass transportation problem on the state space with a cost function determined by the transition density function. In this section we discuss this mass transportation problem. For general theory see, e.g., Gangbo and McCann [3, Theorem 1.4] and Villani [14, Sect. 4.3, Theorem 3].

Given t≥0 and \(x \in{\mathbb{R}}\) we use N(x,t) to denote the Gaussian distribution of mean x and variance t. Its density function is p(t,x,z). A probability measure μ on \({\mathbb{R}}^{2}\) is called a coupling of \(N(x_1,t)\) and \(N(x_2,t)\) if they are the marginal distributions of μ. We use \(\mathcal{C}(x_1,x_2;t)\) to denote the set of such couplings. The mirror coupling \(m(x_1,x_2;t)\), which we define shortly, is a distinguished member of \(\mathcal{C}(x_1,x_2;t)\).

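We may regard a coupling as the joint distribution of an \({\mathbb{R}}^{2}\)-valued random variable \(Z=(Z_1,Z_2)\). Intuitively, in the mirror coupling \(Z_2\) coincides with \(Z_1\) as much as possible, and if this cannot be done then \(Z_2=RZ_1\), where \(Rz=x_1+x_2-z\) is the mirror image of z with respect to \(x_0=(x_1+x_2)/2\). Thus the mirror coupling \(m(x_1,x_2;t)\) can be described as follows:

$${\mathbb{P}} \{ Z_2=Z_1 \vert Z_1 \} =\frac{p (t, x_1,Z_1)\wedge p (t, x_2,Z_1)}{p (t, x_1,Z_1)},\qquad{\mathbb{P}} \{ Z_2=RZ_1 \vert Z_1 \} = 1-\frac{p (t, x_1,Z_1)\wedge p (t, x_2,Z_1)}{p (t, x_1,Z_1)}. $$

Equivalently, we can write

$$m(x_1,x_2;t) (dy_1dy_2)=\delta_{y_1}(dy_2)h_0(y_1)\,dy_1+\delta_{Ry_1}(dy_2)h_1(y_1)\,dy_1, $$

where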

$$h_0(z) = p (t, x_1,z)\wedge p (t, x_2,z), $$

and

$$h_1(z)= p (t, x_1,z) - h_0(z). $$

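It is clear that \(m(x_1,x_2;t)\) is concentrated on the union of the two lines

$$D = \bigl\{ (z,z): z\in{\mathbb{R}} \bigr\} \quad\text{and}\quad L = \bigl\{ (z,x_1+x_2-z): z\in{\mathbb{R}} \bigr\} , $$

on which it has the 1-dimensional densities \(h_0(z)\) and \(h_1(z)\), respectively.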
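The fact that the mirror coupling has the correct second marginal can be checked directly from the densities \(h_0\) and \(h_1\): the diagonal part contributes \(h_0(z)\) at z, and the off-diagonal part sends z′ to \(x_1+x_2-z'\) and so contributes \(h_1(x_1+x_2-z)\). The following sketch, with helper names of our own choosing, verifies that these add up to \(p(t,x_2,z)\) and also shows one way to sample from the coupling, assuming the conditional description given above.

```python
import math
import random

def p(t, x, z):
    """Density of N(x, t) evaluated at z."""
    return math.exp(-(x - z) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def h0(t, x1, x2, z):
    """Density of the diagonal part of the mirror coupling."""
    return min(p(t, x1, z), p(t, x2, z))

def h1(t, x1, x2, z):
    """Density (in the first coordinate) of the off-diagonal part."""
    return p(t, x1, z) - h0(t, x1, x2, z)

def second_marginal_density(t, x1, x2, z):
    """Density of Z2: stays put with density h0, or arrives by reflection."""
    return h0(t, x1, x2, z) + h1(t, x1, x2, x1 + x2 - z)

def sample_mirror_coupling(t, x1, x2, rng):
    """Draw (Z1, Z2): keep Z2 = Z1 with probability h0/p, else reflect."""
    z1 = rng.gauss(x1, math.sqrt(t))
    keep = rng.random() * p(t, x1, z1) < h0(t, x1, x2, z1)
    return z1, (z1 if keep else x1 + x2 - z1)

t, x1, x2 = 0.5, 1.0, -0.3
grid = [-4.0 + 0.01 * k for k in range(801)]
max_gap = max(abs(second_marginal_density(t, x1, x2, z) - p(t, x2, z)) for z in grid)
print("largest deviation from p(t, x2, .):", max_gap)
print(sample_mirror_coupling(t, x1, x2, random.Random(0)))
```

The deviation is at the level of floating-point rounding, reflecting the identities \(p(t,x_1,x_1+x_2-z)=p(t,x_2,z)\) and \(h_0(x_1+x_2-z)=h_0(z)\).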

Let ϕ be a nonnegative function on [0,∞) such that ϕ(0)=0. The transportation cost of a coupling μ of \(N(x_1,t)\) and \(N(x_2,t)\) with the cost function ϕ is defined by

$$C_\phi(\mu) = \int_{{\mathbb{R}}^2} \phi\bigl(\vert y_1-y_2\vert\bigr) \mu(dy_1dy_2). $$

The results we will need for studying maximal couplings of Euclidean Brownian motion are contained in the following two theorems.

Theorem 4.1

Let ϕ be a strictly increasing, strictly concave cost function. Let \(m=m(x_1,x_2;t)\) be the mirror coupling. Then \(C_\phi(\mu)\ge C_\phi(m)\) for all couplings μ of \(N(x_1,t)\) and \(N(x_2,t)\), and equality holds if and only if μ=m.

Proof

Let

$$\mu_1 = p (t, x_1, z) dz \quad\mbox{and}\quad \mu_2 = p (t,x_2, z) dz $$

be the probability measures for Z 1 and Z 2. Suppose that μ is a probability measure on \({\mathbb{R}}^{2}\) at which the minimum is attained. Let

$$D = \bigl\{ (x,x): x\in{\mathbb{R}} \bigr\} $$

be the diagonal in \({\mathbb{R}}\times{\mathbb{R}}\). We first show that the restriction of μ to D is

$$ \mu \vert_D (dz) =\nu_0(dz):= h_0(z) dz, $$
(4.1)

where h 0(z):=p(t,x 1,z)∧p(t,x 2,z) as before.

Since the marginal distributions of μ are \(p(t,x_1,z)\,dz\) and \(p(t,x_2,z)\,dz\), we must have \(\mu\vert_D\le\nu_0\). We need to show that equality holds.

We first explain the argument intuitively. We regard μ as a transport from the mass \(\mu_1\) to the mass \(\mu_2\). Suppose that the strict inequality holds at a point \((y_0,y_0)\). Since the first marginal distribution of μ is \(p(t,x_1,z)\,dz\ge h_0(z)\,dz\), which then strictly exceeds \(\mu\vert_D\) at \(y_0\), there must be a point \(y_{2}\not=y_{0}\) such that \((y_0,y_2)\) is in the support of μ. Similarly, there must be a point \(y_{1}\not=y_{0}\) such that \((y_1,y_0)\) is in the support of μ. This means that a positive mass is moved from \(y_1\) to \(y_0\) and then from \(y_0\) to \(y_2\). But then μ cannot be optimal because from the inequality

$$\phi\bigl(\vert y_1-y_0\vert\bigr)+\phi\bigl(\vert y_0-y_2\vert\bigr)> \phi\bigl(\vert y_1-y_2 \vert\bigr), $$

which is a consequence of the strict monotonicity and strict concavity of ϕ, it is more efficient to transport the mass directly from y 1 to y 2.

To proceed rigorously, we write μ in the following forms:

$$ \mu(dy_1dy_2)=k_1(y_1,dy_2) \mu_1(dy_1)=k_2(y_2,dy_1) \mu_2(dy_2) , $$
(4.2)

where k 1 and k 2 are appropriate Markovian kernels on \({\mathbb{R}}\). Let

$$\nu_1=\mu_1-\nu_0,\quad\quad\nu_2= \mu_2-\nu_0 $$

and

Then a straightforward calculation shows that ν is a coupling of μ 1 and μ 2 and the following equality holds:

Since ϕ is strictly increasing and strictly concave, and ϕ(0)=0, the right side is always nonpositive and is equal to zero only if y 0 is equal to either y 1 or y 2 almost surely with respect to the integrating measure. Thus with respect to the measure ν 0, either k 1(y,⋅) is concentrated on {y} or k 2(y,⋅) is concentrated on {y}. Let A be the subset of \({\mathbb{R}}\) on which the former holds. Then we have from (4.2) that

$$\mu\vert_{D\cap A}\ge\mu_1\ge\nu_0\quad\text{and} \quad\mu\vert_{D\cap A^c}\ge\mu_2\ge\nu_0. $$

It follows that \(\mu\vert_D\ge\nu_0\) and therefore \(\mu\vert_D=\nu_0\), which is what we wanted to prove.

We now investigate μ off diagonal. Recall that

$$\mu_1=\nu_0+\nu_1,\qquad \mu_2=\nu_0+\nu_2. $$

From what we have shown, an optimal μ always leaves the part \(\nu_0\) unchanged. Moreover, the measures \(\nu_1\) and \(\nu_2\) are supported, respectively, on the two half lines \(S_1\) and \(S_2\) separated by the point \((x_1+x_2)/2\). Write \(Ry=x_1+x_2-y\) for the mirror image of y. The intuitive idea of the proof is that transporting a mass from a point \(y_1\in S_1\) to \(y_2\in S_2\) costs the same as transporting the same mass from \(Ry_2\) to \(Ry_1\), but the two transports together are more expensive than the two transports of \(y_1\) to \(Ry_1\) and of \(y_2\) to \(Ry_2\).

To make this argument rigorous, we first note that with the notation established in the first part of the proof μ can be written as

$$\mu(dy_1dy_2)=\delta_{y_1}(dy_2) \nu_0(dy_1)+ k_3(y_1,dy_2) \nu_1(dy_1). $$

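Here \(y_2\in S_2\) almost surely with respect to \(k_3(y_1,\cdot)\) whenever \(y_1\in S_1\). Write \(Ry=x_1+x_2-y\) for the mirror image of y. Since \(\nu_2\) is the image of \(\nu_1\) under R, comparing the transportation cost of μ with that of the mirror coupling m yields the equality

$$C_\phi(m)-C_\phi(\mu)=\frac{1}{2}\int_{{\mathbb{R}}}\int_{{\mathbb{R}}} \bigl[\phi\bigl(\vert y_1-Ry_1\vert\bigr)+\phi\bigl(\vert y_2-Ry_2\vert\bigr)-2\phi\bigl(\vert y_1-y_2\vert\bigr) \bigr] k_3(y_1,dy_2) \nu_1(dy_1). $$

Again, by the strict concavity of ϕ, the right side is always nonpositive and vanishes only if \(y_2=Ry_1\) almost surely with respect to the integrating measure. Since μ is optimal, the left side is nonnegative, so the right side must vanish. This means that almost surely with respect to \(\nu_1\), the measure \(k_3(y,\cdot)\) is concentrated on \(\{Ry\}\). Combining the two parts, we see that the optimal coupling μ must be the mirror coupling. □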

Remark 4.2

The main point of the theorem is not the existence of a unique transport map, which can be deduced from general mass transportation theory. Rather it is the fact that the transport map is always the mirror map \(y\mapsto x_1+x_2-y\), independent of the cost function as long as it is strictly increasing and strictly concave. This feature of the transport map will be crucial in our application.
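Theorem 4.1 can be illustrated by comparing the mirror coupling with two other couplings of \(N(x_1,t)\) and \(N(x_2,t)\). The sketch below is our own illustration and makes several assumptions: the cost function is taken to be \(\phi_s(r)=\operatorname{erf}(r/(2\sqrt{2s}))\), which is strictly increasing and strictly concave on [0,∞); the two competitors are the synchronous coupling \(Z_2=Z_1+x_2-x_1\) and the independent coupling; and all integrals are approximated by a trapezoidal rule on truncated intervals. All function names are hypothetical.

```python
import math

def p(t, x, z):
    """Density of N(x, t) evaluated at z."""
    return math.exp(-(x - z) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def phi(s, r):
    """Cost function phi_s(r) = erf(r / (2 sqrt(2s))), for s > 0."""
    return math.erf(r / (2.0 * math.sqrt(2.0 * s)))

def trapezoid(f, lo, hi, n=100_000):
    """Composite trapezoidal rule for f on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h

def mirror_cost(s, t, x1, x2):
    """Cost of the mirror coupling: mass h1(z) dz travels from z to x1 + x2 - z."""
    def integrand(z):
        h1 = max(p(t, x1, z) - p(t, x2, z), 0.0)
        return phi(s, abs(2.0 * z - x1 - x2)) * h1
    pad = 12.0 * math.sqrt(t)
    return trapezoid(integrand, min(x1, x2) - pad, max(x1, x2) + pad)

def synchronous_cost(s, x1, x2):
    """Cost when Z2 = Z1 + (x2 - x1): every unit of mass travels |x1 - x2|."""
    return phi(s, abs(x1 - x2))

def independent_cost(s, t, x1, x2):
    """Cost when Z1, Z2 are independent, so Z1 - Z2 has law N(x1 - x2, 2t)."""
    pad = 12.0 * math.sqrt(2.0 * t)
    return trapezoid(lambda d: phi(s, abs(d)) * p(2.0 * t, x1 - x2, d),
                     (x1 - x2) - pad, (x1 - x2) + pad)

s, t, x1, x2 = 1.0, 1.0, 0.0, 1.0
c_mirror = mirror_cost(s, t, x1, x2)
c_sync = synchronous_cost(s, x1, x2)
c_indep = independent_cost(s, t, x1, x2)
print(c_mirror, c_sync, c_indep)
```

In agreement with the theorem, the mirror coupling should have the smallest of the three costs for these parameters.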

We now turn to the family (2.2) of cost functions {ϕ t ,t>0} defined by the transition density function.

Theorem 4.3

If the cost function is \(\phi_s\), then the cost of the mirror coupling \(m(x_1,x_2;t)\) is \(\phi_{s+t}(\vert x_1-x_2\vert)\); namely,

$$C_{\phi_s}\bigl(m(x_1, x_2; t)\bigr) = \phi_{s+t}\bigl(\vert x_1-x_2\vert\bigr). $$

Proof

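Let B be a standard Brownian motion and τ its hitting time of the point 0. We know that \(\phi_{s}(2\rho) = {\mathbb{P}}_{\rho} \{ \tau\ge s \} \) for all positive ρ. Let \(r=\vert x_1-x_2\vert/2\). Using the Markov property of Brownian motion we have

$$\phi_{s+t}(2r) = {\mathbb{P}}_r \{ \tau\ge s+t \} = {\mathbb{E}}_r \bigl[{\mathbb{P}}_{B_t} \{ \tau\ge s \} ; \tau>t \bigr] = {\mathbb{E}}_r \bigl[\phi_s(2B_t); \tau>t \bigr]. $$

If \((Z_1,Z_2)\) are mirror coupled with the law \(m(x_1,x_2;t)\), then \(\vert Z_{1}-Z_{2}\vert I_{ \{ \vert Z_{1}-Z_{2}\vert> 0 \} }\) and \(2B_t I_{ \{ \tau>t \} }\) (under \({\mathbb{P}}_r\)) have the same law, hence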

$$\phi_{s+t}(2r) = {\mathbb{E}}\phi_s\bigl(\vert Z_1-Z_2\vert\bigr). $$

The right side is precisely the cost of the mirror coupling with the cost function ϕ s . □

For calculations related to the above proof, see Sturm [13, Example 4.6].
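The identity of Theorem 4.3 can also be checked numerically. The following is a minimal sketch under our own assumptions: the cost of the mirror coupling is computed as \(\int\phi_s(\vert 2z-x_1-x_2\vert)h_1(z)\,dz\) by a trapezoidal rule on a truncated interval, \(\phi_t\) is evaluated through the closed form \(\operatorname{erf}(r/(2\sqrt{2t}))\), and the helper names and parameter values are hypothetical.

```python
import math

def p(t, x, z):
    """Density of N(x, t) evaluated at z."""
    return math.exp(-(x - z) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def phi(t, r):
    """phi_t(r) = erf(r / (2 sqrt(2t))), for t > 0."""
    return math.erf(r / (2.0 * math.sqrt(2.0 * t)))

def mirror_cost(s, t, x1, x2, n=200_000):
    """Trapezoidal approximation of the cost of m(x1, x2; t) under phi_s."""
    x0 = 0.5 * (x1 + x2)
    half = 0.5 * abs(x1 - x2) + 12.0 * math.sqrt(t)   # truncation width
    lo, hi = x0 - half, x0 + half
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        z = lo + k * h
        h1 = max(p(t, x1, z) - p(t, x2, z), 0.0)      # off-diagonal density
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * phi(s, abs(2.0 * z - x1 - x2)) * h1
    return total * h

# Each case is (s, t, x1, x2).
cases = [(0.3, 0.5, 0.2, -1.0), (1.0, 1.0, 0.0, 1.0), (2.0, 0.1, 3.0, 2.5)]
errors = [abs(mirror_cost(s, t, x1, x2) - phi(s + t, abs(x1 - x2)))
          for (s, t, x1, x2) in cases]
print(errors)
```

The printed discrepancies should be of the order of the quadrature error only.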

5 Second Proof of the Uniqueness

Let s and t be positive and assume that the coupling X=(X 1,X 2) is a coupling of Brownian motions from (x 1,x 2) which is maximal at time s+t. This means that

$$\phi_{s+t} \bigl(\vert x_1-x_2\vert \bigr) = {\mathbb{P}} \bigl\{ T(X_1, X_2)> s+t \bigr\} . $$

If X 1(s+t)≠X 2(t+s), certainly the coupling time T(X 1,X 2)>s+t, hence

$$\phi_{s+t} \bigl(\vert x_1-x_2\vert \bigr) \ge{\mathbb{P}} \bigl\{ X_1(s+t)\neq X_2(t+s) \bigr\} . $$

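Since the coupling is Markovian, conditioned on the σ-algebra \(\mathcal{F}^{X}_{s}\) generated by X up to time s, the random variables \(X_1(s+t)\) and \(X_2(s+t)\) have Gaussian distributions with variance t and means \(X_1(s)\) and \(X_2(s)\), respectively. The probability of these two random variables being different is at least 1/2 of the total variation of the difference of their distributions (see the proof of Proposition 2.1), hence

$${\mathbb{P}} \bigl\{ X_1(s+t)\neq X_2(s+t) \big\vert \mathcal{F}^{X}_{s} \bigr\} \ge\frac{1}{2}\int_{{\mathbb{R}}}\big\vert p \bigl(t, X_1(s), y\bigr) - p \bigl(t, X_2(s), y\bigr)\big\vert \,dy =\phi_t\bigl(\bigl\vert X_1(s)-X_2(s)\bigr\vert\bigr). $$

Taking expectations, it follows that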

$$ \phi_{s+t}\bigl(\vert x_1-x_2\vert\bigr) \ge {\mathbb{E}} \phi_t\bigl(\vert X_1(s)-X_2(s)\vert\bigr). $$
(5.1)

We recognize that the right side is the cost of a coupling of the two Gaussian random variables \(X_1(s)\) and \(X_2(s)\), whose distributions are \(N(x_1,s)\) and \(N(x_2,s)\), respectively. The cost function is \(\phi_t\). By Theorems 4.1 and 4.3, the minimum cost is attained only when \(X_1(s)\) and \(X_2(s)\) are mirror coupled, and in this case the total cost is equal to \(\phi_{s+t}(\vert x_1-x_2\vert)\). It follows that

$$\phi_{s+t}\bigl(\vert x_1-x_2\vert\bigr) = {\mathbb{E}} \phi_t \bigl(\bigl \vert X_1(s) - X_2(s) \bigr \vert \bigr) $$

and \(X_1(s)\) and \(X_2(s)\) must be mirror coupled. To sum up, we have shown that if the coupling \(X=(X_1,X_2)\) is maximal at a time t, then \(X_1(s)\) and \(X_2(s)\) must be mirror coupled Gaussian random variables for all \(0\le s\le t\).

Now suppose that \(X=(X_1,X_2)\) is a maximal Markovian coupling (for all time). Then by what we have proved, \(X_1(t)\) and \(X_2(t)\) must be mirror coupled for any time t. Thus we must have either \(X_2(t)=X_1(t)\) or \(X_2(t)=x_1+x_2-X_1(t)\), the mirror image of \(X_1(t)\). By continuity of the paths, before the first time they meet we must always have the second alternative. It follows that the first time they meet must be the first passage time of \(X_1\) to the middle point \((x_1+x_2)/2\) and, by the maximality of the coupling \((X_1,X_2)\), they must coincide afterwards. This means exactly that \((X_1,X_2)\) is a mirror coupling. This completes the proof of the main Theorem 2.3.

6 Concluding Remarks

Given the disparity in length and level of sophistication between the two proofs, we need to justify the presentation of the second proof. The first proof uses the linear structure of Euclidean space and becomes meaningless if we try to use the method to discuss maximal couplings of Brownian motions on a Riemannian manifold. We believe that the second method can be explored to deal with a Riemannian manifold with sufficient symmetry, for example, a complete simply connected manifold of constant curvature. The second proof also raises the interesting problem of studying the mass transportation problem associated with the cost function

$$\phi_t (x,y) = \frac{1}{2}\int_M \big\vert p(t,x,z) - p(t,y, z)\big\vert \,dz, $$

where p(t,z 1,z 2) is the transition density function of Brownian motion on an arbitrary Riemannian manifold M, i.e., the heat kernel on M.

However, it is not known how to construct a maximal coupling of Brownian motions on a general Riemannian manifold, even if the coupling starts from two simple points. Such a coupling may not be Markovian. This point of view is supported by several constructions of maximal couplings for Markov chains and more general random sequences or processes in the literature (see, e.g., Griffeath [5], Goldstein [4], and Lindvall [10]).

Finally, neither proof we presented here applies to the case of general initial distributions \((\mu_1,\mu_2)\) on \({\mathbb{R}}^{n}\). It is not clear if a maximal Markovian coupling always exists in this generality. It may happen that the tail probability \({\mathbb{P}} \{ T(X_{1}, X_{2})\ge t \} \) of the coupling time can be minimized for each fixed t but not by the same coupling simultaneously for all t. In this respect, we can obtain some positive results by taking advantage of certain situations in which the unique minimizers are independent of the choice of the strictly concave function ϕ. This is the case, for example, if \((\mu_1-\mu_2)^{+}\) is supported on a half space and \((\mu_1-\mu_2)^{-}\) is the reflection of \((\mu_1-\mu_2)^{+}\) in the other half space, or if \((\mu_1-\mu_2)^{+}\) is supported on an open ball and \((\mu_1-\mu_2)^{-}\) is the spherical image of \((\mu_1-\mu_2)^{+}\) (see Gangbo and McCann [3, Example 1.5]). Suppose that there is a measure m which uniquely minimizes the cost \(C_\phi(\mu)\) among all couplings μ of \(\mu_1\) and \(\mu_2\). It can be shown that there is then a unique maximal Markovian coupling of Brownian motions starting from the initial distributions \((\mu_1,\mu_2)\), and it is given by

$${\mathbb{P}}^{m} = \int_{{\mathbb{R}}^n\times{\mathbb{R}}^n} {\mathbb{P}}^{x_1,x_2} \,m(dx_1dx_2), $$

where \({\mathbb{P}}^{x_1,x_2}\) denotes the law of the mirror coupling of Brownian motions from \((x_1,x_2)\).

The second proof can be generalized to certain Riemannian manifolds with symmetry, such as complete simply connected manifolds of constant curvature (space forms).