1 Introduction

In this article we study solutions \(X_t \in {\mathbb { R}}^d\) of the Itô stochastic differential equation

$$\begin{aligned} \mathrm{d}X_t = b(X_t) \,\mathrm{d}t + \sqrt{2}\,\sigma (X_t) \,\mathrm{d}W_t, \end{aligned}$$
(1.1)

where \((W_t, {\mathcal {F}}^W_t)\) is a standard Brownian motion in \({\mathbb { R}}^d\), defined on a probability space \((\Omega , {\mathcal {F}}, {\mathbb { P}})\). This diffusion process in \({\mathbb { R}}^d\) has generator

$$\begin{aligned} L u = {{\mathrm{tr}}}(a \nabla ^2 u) + b \cdot \nabla u, \end{aligned}$$

where \(a:= \sigma \sigma ^{\mathrm {T}}\) is a symmetric matrix. We suppose that \(\sigma (x)\) is smooth and that \(a(x)\) is uniformly positive definite and bounded:

$$\begin{aligned} \lambda |\xi |^2 \le \langle \xi , a(x) \xi \rangle \le \Lambda |\xi |^2, \quad \forall \,\xi \in {\mathbb { R}}^d, \quad \forall \;x \in {\mathbb { R}}^d \end{aligned}$$
(1.2)

holds for some \(\Lambda > \lambda > 0\). Although the vector field \(b\) may not be bounded, we suppose that \(b\) is smooth and satisfies conditions that guarantee the ergodicity of the Markov process \(X_t\) and the existence of a unique invariant probability distribution \(\rho (x) > 0\) satisfying the adjoint equation

$$\begin{aligned} L^* \rho = ( a_{ij}(x) \rho (x))_{x_i x_j} - \nabla \cdot (b(x) \rho (x)) = 0. \end{aligned}$$
(1.3)

We also assume that for some \(\alpha > 1\),

$$\begin{aligned} \sup _{|x| < R} {\mathbb {E}}[\,\tau _1^{\alpha }\;|\; X_0 = x] < +\infty \end{aligned}$$
(1.4)

for all \(R > 0\), where \(\tau _1\) is the first hitting time of \(X_t\) to the unit ball \(\{ z \in {\mathbb { R}}^d \;|\; |z| \le 1\}\). For example, it follows from Theorems 2 and 3 of [34] that these assumptions will hold if

$$\begin{aligned} \limsup _{ m \rightarrow +\infty } \;\sup _{|x| = m} x \cdot b(x) < -r \end{aligned}$$

for some \(r > 1 + (d/2)\).

Suppose that \(A, B \subset {\mathbb { R}}^d\) are two bounded open sets with smooth boundary and such that \(\bar{A}\) and \(\bar{B}\) are disjoint. Because the process is ergodic, \(X_t\) will visit both \(A\) and \(B\) infinitely often. Inspired by the transition path theory developed by Weinan E and Vanden-Eijnden [15, 24] (see also the review article [16]), our main interest is in those segments of the trajectory \(t \mapsto X_t\) which pass from \(A\) to \(B\). These transition paths are defined precisely as follows. First, for \(k \ge 0\), define the hitting times \(\tau _{A,k}^+\) and \(\tau _{B,k}^+\) inductively by

$$\begin{aligned}&\tau _{A,0}^+ = \inf \left\{ t \ge 0 \mid X_t \in \bar{A}\right\} , \\&\tau _{B,0}^+ = \inf \left\{ t > \tau _{A,0}^+ \mid X_t \in \bar{B}\right\} , \end{aligned}$$

and for \( k \ge 0\),

$$\begin{aligned}&\tau _{A,k+1}^+ = \inf \left\{ t > \tau _{B,k}^+ \mid X_t \in \bar{A}\right\} , \\&\tau _{B,k+1}^+ = \inf \left\{ t > \tau _{A,k+1}^+ \mid X_t \in \bar{B}\right\} . \end{aligned}$$

We will call these the entrance times. Then define the exit times

$$\begin{aligned}&\tau _{A,k}^- = \sup \left\{ t < \tau _{B,k}^+ \mid X_t \in \bar{A}\right\} , \\&\tau _{B,k}^- = \sup \left\{ t < \tau _{A,k+1}^+ \mid X_t \in \bar{B}\right\} . \end{aligned}$$

These times are all finite with probability one, and \(\tau _{A,k}^+ \le \tau _{A,k}^- < \tau _{B,k}^+ \le \tau _{B,k}^- < \tau _{A,k+1}^+\) for all \(k \ge 0\) (see Fig. 1). If \(t \in [\tau _{A,k}^-, \tau _{B,k}^+]\) for some \(k\), we say that the path \(X_t\) is \(A \rightarrow B\) reactive. Let \(\Theta = (\bar{A \cup B})^{c}\), and hence \(\partial \, \Theta = \partial A \cup \partial B\). For \(k \in {\mathbb {N}}\), the continuous process \(Y^k:[0,\infty ) \rightarrow \bar{\Theta }\) defined by

$$\begin{aligned} Y_t^k = X_{(t + \tau _{A,k}^-) \wedge \tau _{B,k}^+} \end{aligned}$$
(1.5)

is the \(k{\text {th}}\) \(A \rightarrow B\) reactive trajectory or transition path. Observe that \(Y_0^k = X_{\tau _{A,k}^-} \in \partial A\), that \(Y^k_t = X_{\tau _{B,k}^+} \in \partial B\) for all \(t \ge \tau _{B,k}^+ - \tau _{A,k}^-\), and that \(Y^k_t \in \Theta \) for all \(t \in (0,\tau _{B,k}^+ - \tau _{A,k}^-)\). Unlike the entrance times, the exit times \(\tau _{A, k}^-\) and \(\tau _{B, k}^-\) are not stopping times with respect to the natural filtration. So, one cannot apply the strong Markov property to \(X_t\) at times \(\tau _{A,k}^-\) and \(\tau _{B,k}^-\). Indeed, the law of the process \(Y_t^k\) is very different from that of the process \(X_t\) starting at a point in \(\partial A\).
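To make the entrance/exit bookkeeping above concrete, here is a minimal numerical sketch (our own illustration, not part of the article): it discretizes a one-dimensional double-well diffusion with \(b(x) = x - x^3\) and \(\sigma = 1\), takes \(A = (-\infty ,-1]\) and \(B = [1,\infty )\), and slices the sampled path into \(A \rightarrow B\) reactive segments \([\tau _{A,k}^-, \tau _{B,k}^+]\); the helper name `reactive_segments` and all numerical parameters are our choices.

```python
import numpy as np

def reactive_segments(x, in_A, in_B):
    """Slice a discretized path x[0..n] into A -> B reactive segments.

    Mirrors the entrance/exit times: a segment runs from the last visit
    to A before an entrance to B (the exit time tau_{A,k}^-) up to that
    entrance time tau_{B,k}^+.
    """
    segments = []
    seen_A = False     # entered A since the last entrance to B?
    last_A = None      # index of the most recent visit to A
    for i, xi in enumerate(x):
        if in_A(xi):
            seen_A, last_A = True, i
        elif in_B(xi):
            if seen_A:
                segments.append(x[last_A:i + 1])
            seen_A, last_A = False, None
    return segments

rng = np.random.default_rng(0)
dt, n = 1e-3, 400_000
x = np.empty(n + 1)
x[0] = -1.0
for k in range(n):  # Euler-Maruyama for dX = (X - X^3) dt + sqrt(2) dW
    x[k + 1] = x[k] + (x[k] - x[k]**3) * dt + np.sqrt(2 * dt) * rng.standard_normal()

segs = reactive_segments(x, lambda z: z <= -1.0, lambda z: z >= 1.0)
```

Each extracted segment starts in \(\bar{A}\), ends in \(\bar{B}\), and remains in \(\Theta \) in between, as in the definition of \(Y^k\).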

Fig. 1

Illustration of a trajectory with entrance and exit times. The transition path from \(A\) to \(B\) is marked in red (color figure online)

Our main results describe the probability law of these transition paths in terms of a transition path process, which is a strong solution to an auxiliary stochastic differential equation. In particular, empirical samples of the reactive portions of \(X_t\) may be regarded as samples from the transition path process. The motivation comes from the study of chemical reactions and thermally activated processes, where understanding these reactive trajectories is crucial [6, 12]. In these applications, the domains \(A\) and \(B\) are usually chosen as regions in configurational space corresponding to reactant and product states. Mathematically, our results fit into the framework of the transition path theory [15, 16, 24].

Having identified the transition path process, we can compute statistics of the transition paths by sampling directly from the transition path SDE, rather than using acceptance/rejection methods or very long-time integration of the original SDE. Our theoretical results might be used to analyze numerical methods for sampling reactive trajectories.

We will now describe our main results and their relation to other works. Proofs are deferred to later sections.

1.1 The transition path process

Our definition of the transition path process is motivated by the Doob \(h\)-transform as follows. Let \(\tau _A\) and \(\tau _B\) denote the first hitting times of \(X_t\) to the sets \(\bar{A}\) and \(\bar{B}\), respectively:

$$\begin{aligned} \begin{aligned} \tau _A&= \inf \left\{ t \ge 0 \mid X_t \in \bar{A} \right\} , \\ \tau _B&= \inf \left\{ t \ge 0 \mid X_t \in \bar{B} \right\} . \end{aligned} \end{aligned}$$
(1.6)

Let \(q(x)\ge 0\) be the forward committor function:

$$\begin{aligned} q(x) = {\mathbb { P}}(\tau _A > \tau _B \mid X_0 = x), \end{aligned}$$
(1.7)

which satisfies \(L q(x) = 0\) for \(x \in \Theta = (\bar{A \cup B})^{c}\) and

$$\begin{aligned} q(x) = {\left\{ \begin{array}{ll} 0, &{} x \in \bar{A}, \\ 1, &{} x \in \bar{B}. \end{array}\right. } \end{aligned}$$
(1.8)

By the maximum principle, \(q(x) > 0\) for all \(x \in \Theta \). By the Hopf lemma we also have

$$\begin{aligned} \sup _{x \in \partial A} \widehat{n}(x) \cdot \nabla q(x) < 0, \qquad \inf _{x \in \partial B} \widehat{n}(x) \cdot \nabla q(x) > 0, \end{aligned}$$
(1.9)

where \(\widehat{n}(x)\) will denote the unit normal exterior to \(\Theta \) (pointing into \(A\) and \(B\)). For \(x \in \Theta \), consider the stopped process \(X_{t \wedge \tau _A \wedge \tau _B}\) with \(X_0 = x\), and let \({\mathcal {P}}_x\) denote the corresponding measure on \({\mathcal {X}} = C([0,\infty ), \bar{\Theta })\):

$$\begin{aligned} {\mathcal {P}}_x(U) = {\mathbb { P}}( X \in U \;|\; X_0 = x), \quad \forall \; U \in {\mathcal {B}} \end{aligned}$$

where \({\mathcal {B}}\) is the Borel \(\sigma \)-algebra on \({\mathcal {X}}\). If \(\Lambda _{AB}\) denotes the event that \(\tau _A > \tau _B\), the measure \({\mathcal {Q}}_x^q\) on \(({\mathcal {X}},{\mathcal {B}})\) defined by

$$\begin{aligned} \dfrac{\mathrm{d}{\mathcal {Q}}_x^q}{\mathrm{d}{\mathcal {P}}_x} = \dfrac{{\mathbb {I}}_{\Lambda _{AB}}}{{\mathcal {P}}_x(\Lambda _{AB})} = \dfrac{{\mathbb {I}}_{\Lambda _{AB}}}{q(x)} \end{aligned}$$

is absolutely continuous with respect to \({\mathcal {P}}_x\), if \(x \in \Theta \). By the Doob \(h\)-transform (see e.g. [28, Theorem 7.2.2]), we know that \({\mathcal {Q}}_x^q\) defines a diffusion process \(Y_t\) on \(C([0,\infty ), \bar{\Theta })\) with generator:

$$\begin{aligned} L^q f = \frac{1}{q} L(qf) = {{\mathrm{tr}}}(a \nabla ^2 f) + (b \cdot \nabla f) + \frac{2 a \nabla q}{q} \cdot \nabla f = Lf + \frac{2a \nabla q}{q} \cdot \nabla f.\quad \end{aligned}$$
(1.10)

So, the effect of conditioning on the event \(\tau _B < \tau _A\) is to introduce an additional drift term. For \(x \in \Theta \), the transition probability for \(Y_t\) is

$$\begin{aligned} p^q(t,x,dy) = \frac{1}{q(x)} p(t,x,dy)q(y) \end{aligned}$$
(1.11)

where \(p(t,x,dy)\) is the transition probability for \(X_t\) killed at \(\partial B\) [28, Theorem 4.1.1].

This observation suggests that the \(A\rightarrow B\) reactive trajectories should have the same law as a solution to the SDE

$$\begin{aligned} \mathrm{d}Y_t = \left( b(Y_t) + \dfrac{2a(Y_t)\nabla q(Y_t) }{q(Y_t)}\right) \,\mathrm{d}t + \sqrt{2}\, \sigma (Y_t) \,\mathrm{d}\widehat{W}_t, \end{aligned}$$
(1.12)

originating at a point \(Y_0 = y_0 \in \partial A\) and terminating at a point in \(\partial B\). While the SDE (1.12) admits strong solutions for \(y_0 \in \Theta \) since \(q(x) > 0\) in \(\Theta \), the drift term becomes singular at the boundary of \(A\), where \(q\) vanishes. Our first result is the following theorem, which shows that there is still a unique strong solution to this SDE even for initial conditions lying on \(\partial A\). For convenience, let us define the vector field

$$\begin{aligned} K(y) = \left( b(y) + \frac{2 a(y)\nabla q(y)}{q(y)}\right) . \end{aligned}$$
(1.13)

Theorem 1.1

Let \((\widehat{W}_t,{\mathcal {F}}^{\widehat{W}}_t)\) be a standard Brownian motion in \({\mathbb { R}}^d\), defined on a probability space \((\widehat{\Omega }, \widehat{\mathcal {F}},{\mathbb {Q}})\). Let \(\xi :\widehat{\Omega } \rightarrow \bar{\Theta }\) be a random variable defined on the same probability space and independent of \(\widehat{W}\). There is a unique continuous process \(Y_t:[0,\infty ) \rightarrow \bar{\Theta }\) which is adapted to the augmented filtration \(\widehat{\mathcal {F}}_t\) and satisfies the following, \({\mathbb {Q}}\)-almost surely:

$$\begin{aligned} Y_t = \xi + \int _0^{t \wedge \tau _B} K(Y_s) \,\mathrm{d}s + \int _0^{t \wedge \tau _B} \sqrt{2}\, \sigma (Y_s) \,\mathrm{d}\widehat{W}_s, \quad t \ge 0 \end{aligned}$$
(1.14)

where

$$\begin{aligned} \tau _B = \inf \{ t > 0 \mid Y_t \in \bar{B}\}. \end{aligned}$$

Moreover, \(Y_t \not \in \bar{A}\) for all \(t > 0\).

The augmented filtration is defined in the usual way, \(\widehat{\mathcal {F}}_t\) being the \(\sigma \)-algebra generated by \({\mathcal {F}}_t^{\widehat{W}}\), \(Y_0\), and the appropriate collection of null sets so that \(\widehat{\mathcal {F}}_t\) is both left- and right-continuous. We will use \(\widehat{{\mathbb {E}}}\) to denote expectation with respect to the probability measure \({\mathbb {Q}}\).

Observe that if \(d = 1\), \(\sigma = 1/\sqrt{2}\) is constant, and \(b \equiv 0\), then \(q(x)\) is a linear function, and (1.12) corresponds to a Bessel process of dimension 3. For example, if \(A = (-\infty ,0)\), \(B = (1,\infty )\), we have

$$\begin{aligned} \mathrm{d}Y_t = \frac{1}{Y_t} \,\mathrm{d}t + \,\mathrm{d}\widehat{W}_t, \end{aligned}$$

and the function \(Z_t = (Y_t)^2\) satisfies the degenerate diffusion equation

$$\begin{aligned} \mathrm{d}Z_t = 3 \,\mathrm{d}t + 2 \sqrt{Z_t} \,\mathrm{d}\widehat{W}_t. \end{aligned}$$
(1.15)

In this simple case, existence and uniqueness of a strong solution starting at \(Y_0 = 0\) can be shown using arguments involving Brownian local time (see [23, 30]). However, those arguments are not applicable to the more general setting we consider here. The work most closely related to Theorem 1.1 in a higher-dimensional setting may be that of DeBlassie [14], who proved pathwise uniqueness for certain SDEs having diffusion coefficients that degenerate like \(\sqrt{d(Z_t)}\), where \(d(z)\) is the distance to the domain boundary (as in (1.15)). In an earlier work, Athreya et al. [1] proved uniqueness for the martingale problem associated with a similarly degenerate diffusion in a positive orthant in \({\mathbb { R}}^d\). Nevertheless, those analyses do not apply to the case (1.12) considered here.
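A short numerical illustration of the Bessel(3) connection, assuming the classical representation of the three-dimensional Bessel process as the norm of a standard 3D Brownian motion (a standard fact, not established in the text); this avoids discretizing the singular drift \(1/Y_t\) near \(Y_0 = 0\):

```python
import numpy as np

# The Bessel(3) process Y_t solving dY = (1/Y) dt + dW, Y_0 = 0, can be
# realized as the norm of a standard 3D Brownian motion started at the
# origin (Ito's formula applied to |W_t| away from 0). This sidesteps
# the singular drift when sampling numerically.
rng = np.random.default_rng(1)
dt, n = 1e-3, 20_000
w = np.cumsum(np.sqrt(dt) * rng.standard_normal((n, 3)), axis=0)
y = np.concatenate(([0.0], np.linalg.norm(w, axis=1)))

hit = int(np.argmax(y >= 1.0))   # first index with Y >= 1 (entrance to B)
```

With this representation the sampled path leaves \(0\) immediately and does not return, consistent with the claim \(Y_t \notin \bar{A}\) for all \(t > 0\) in Theorem 1.1.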

The next theorem shows that the law of the reactive trajectories is that of the process \(Y_t\) with appropriate initial condition. For this reason, we will call the process \(Y_t\) the transition path process.

Theorem 1.2

Let \(X_t\) satisfy the SDE (1.1). Let \(Y^k\) denote the \(k{\text {th}}\) \(A \rightarrow B\) reactive trajectory defined by (1.5). Let \(Y\) be defined as in Theorem 1.1. Then for any bounded and continuous functional \(F:C([0,\infty )) \rightarrow {\mathbb { R}}\), we have

$$\begin{aligned} {\mathbb {E}}[F(Y^k)] = \widehat{{\mathbb {E}}}\left[ F(Y) \mid Y_0 \sim X_{\tau _{A,k}^-} \right] . \end{aligned}$$

The processes \(X_t\) and \(Y^k_t\) may be defined on a probability space that is different from the one on which \(Y_t\) is defined. The notation \(Y_0 \sim X_{\tau _{A,k}^-}\) used in Theorem 1.2 means that \(Y_0\) has the same law as \(X_{\tau _{A,k}^-}\), i.e. \({\mathbb {Q}}(Y_0 \in U) = {\mathbb { P}}(X_{\tau _{A,k}^-} \in U)\) for any Borel set \(U \subset {\mathbb { R}}^d\).

1.2 Reactive exit and entrance distributions

The distribution of the random points \(X_{\tau _{A,k}^-}\) will depend on the initial condition \(X_0\). From the point of view of sampling the transition paths, however, there is a very natural distribution to consider for \(Y_0\), which is related to the “equilibrium measure” in the potential theory for diffusion processes [7, 8, 32]. To motivate this distribution formally, let \(h > 0\) and consider the regularized hitting times

$$\begin{aligned} \tau _{A,h}&= \inf \left\{ t \ge h \mid X_t \in \bar{A} \right\} \end{aligned}$$
(1.16)
$$\begin{aligned} \tau _{B,h}&= \inf \left\{ t \ge h \mid X_t \in \bar{B} \right\} , \end{aligned}$$
(1.17)

where \(X_t\) satisfies (1.1). Then define

$$\begin{aligned} q_h(x) = {\mathbb { P}}(\tau _{A,h} > \tau _{B,h} \mid X_0 = x). \end{aligned}$$

This is the probability that at some time \(s \in [0,h]\), the path \(X_t\) starting from \(x \in \partial A\) becomes a transition path, not returning to \(\bar{A}\) before hitting \(\bar{B}\). With this in mind, the quantity

$$\begin{aligned} \eta _{A,h}(x) = h^{-1} \rho (x) {\mathbb { P}}(\tau _{A,h} > \tau _{B,h} \mid X_0 = x) = h^{-1} \rho (x) q_h(x), \end{aligned}$$

may be interpreted as a rate at which transition paths exit \(A\), when the system is in equilibrium. Therefore, a natural choice for an initial distribution for \(Y_0 \in \partial A\) is:

$$\begin{aligned} \eta _{A}(x) = \lim _{h \rightarrow 0} \eta _{A,h}(x). \end{aligned}$$

By the Markov property, we have

$$\begin{aligned} q_h(x) = \int _{{\mathbb { R}}^d} {\mathbb { P}}(\tau _A > \tau _B\;|\; X_0 = y) \rho (h,x,y) \,\mathrm{d}y = {\mathbb {E}}[q(X_h)\mid X_0 = x] \end{aligned}$$
(1.18)

where \(\rho (t,x,\cdot )\) is the density for \(X_t\), given \(X_0 = x\). Therefore, for any \(x \in \partial A\) we have

$$\begin{aligned} \lim _{h \rightarrow 0} h^{-1} q_h(x) = \lim _{h \rightarrow 0} h^{-1}{\mathbb {E}}[q(X_h) - q(X_0)\;|\;X_0 = x] = L q(x), \end{aligned}$$

in the sense of distributions, although \(q\) is not \(C^2\) on \(\partial \Theta = \partial A \cup \partial B\). Hence \(\eta _{A,h}(x) \rightarrow \eta _A(x) = \rho (x)Lq(x)\) for \(x \in \partial A\). The distribution \(Lq\) is supported on \(\partial \Theta \). If \(\phi \) is a smooth test function supported in \(B_r(x)\), a small neighborhood of \(x \in \partial A\), then we have

$$\begin{aligned} \begin{aligned} \langle L q, \phi \rangle&= \int _{{\mathbb { R}}^d} q(x) L^{*} \phi (x) \,\mathrm{d}x \\&= \int _{B_r(x) \cap \Theta } L q(x) \phi (x) \,\mathrm{d}x + \int _{(\partial A) \cap B_r(x)} \bigl ( q \widehat{n} \cdot {{\mathrm{div}}}(a \phi ) - (\widehat{n} \cdot a \nabla q) \phi \\&\quad + q \widehat{n} \cdot b\, \phi \bigr ) \,\mathrm{d}\sigma _A(x) \end{aligned} \end{aligned}$$

where \(\widehat{n}(x)\) is the unit normal vector exterior to \(\Theta \), and \(\,\mathrm{d}\sigma _A\) is the surface measure on \(\partial A\). Since \(q = 0\) on \(\partial A\) and \(Lq = 0\) on \(\Theta \), this implies

$$\begin{aligned} \langle L q, \phi \rangle = - \int _{(\partial A) \cap B_r(x)} \phi \, \widehat{n} \cdot a \nabla q \,\mathrm{d}\sigma _A(x). \end{aligned}$$

That is (after a similar calculation for points on \(\partial B\)),

$$\begin{aligned} L q(x) = - \widehat{n}(x) \cdot a(x)\nabla q(x) \,\mathrm{d}\sigma _A(x) - \widehat{n}(x) \cdot a(x) \nabla q(x) \,\mathrm{d}\sigma _B(x), \end{aligned}$$
(1.19)

in the sense of distributions. Restricting to \(\partial A\), we get

$$\begin{aligned} \eta _A = - \rho (x) \widehat{n}(x) \cdot a(x)\nabla q(x) \,\mathrm{d}\sigma _A(x). \end{aligned}$$
(1.20)

By switching the roles of \(A\) and \(B\) in the above discussion, it is also natural to define a measure on \(\partial B\) as

$$\begin{aligned} \eta _B = \rho (x) \widehat{n}(x) \cdot a(x) \nabla q(x) \,\mathrm{d}\sigma _B(x). \end{aligned}$$
(1.21)

Note that \(1 - q\) gives the forward committor function for the transition from \(B\) to \(A\) and that \(\rho (x) L q(x) = \eta _A(\mathrm{d}x) - \eta _B(\mathrm{d}x)\). Although the distributions \(\eta _A\) and \(\eta _B\) are positive (by (1.9)), they need not be probability distributions. Nevertheless, the mass of the two measures is the same.

Lemma 1.3

The measures \(\eta _A\) and \(\eta _B\) satisfy \(\eta _A(\partial A) = \eta _B(\partial B)\). That is,

$$\begin{aligned} \int _{\partial A} \rho (x) \widehat{n}(x)\cdot a(x) \nabla q(x) \,\mathrm{d}\sigma _A(x) + \int _{\partial B} \rho (x) \widehat{n}(x)\cdot a(x) \nabla q(x) \,\mathrm{d}\sigma _B(x) = 0. \end{aligned}$$
(1.22)

This computation motivates us to define

$$\begin{aligned}&\eta _A^-(\mathrm{d}x) = \frac{1}{\nu } \eta _A(\mathrm{d}x) = - \frac{1}{\nu } \rho (x) \widehat{n}(x) \cdot a(x) \nabla q(x) \,\mathrm{d}\sigma _A(x), \end{aligned}$$
(1.23)
$$\begin{aligned}&\eta _B^-(\mathrm{d}x) = \frac{1}{\nu } \eta _B(\mathrm{d}x) = \frac{1}{\nu } \rho (x) \widehat{n}(x) \cdot a(x) \nabla q(x) \,\mathrm{d}\sigma _B(x). \end{aligned}$$
(1.24)

We call these distributions the reactive exit distribution on \(\partial A\) and on \(\partial B\), respectively. The constant \(\nu \) is a normalizing constant so that \(\eta _A^-\) and \(\eta _B^-\) define probability measures on \(\partial A\) and \(\partial B\). By Lemma 1.3, the normalizing constant is the same for both measures. Our next result relates the reactive exit distribution on \(\partial A\) to the empirical reactive exit distribution on \(\partial A\), defined by

$$\begin{aligned} \mu _{A,N}^- = \frac{1}{N} \sum _{k=0}^{N-1} \delta _{X_{\tau _{A,k}^-}}(x). \end{aligned}$$
(1.25)

Proposition 1.4

Let \(\mu _{A,N}^-\) be the empirical reactive exit distribution on \(\partial A\) defined by (1.25). Then \(\mu _{A,N}^-\) converges weakly to \(\eta _A^-\) as \(N \rightarrow \infty \). That is, for any continuous and bounded \(f:\partial A \rightarrow {\mathbb { R}}\)

$$\begin{aligned} \lim _{N \rightarrow \infty } \int _{\partial A} f(x) \,\mathrm{d}\mu _{A,N}^-(x) = \int _{\partial A} f(x) \,\mathrm{d}\eta _A^-(x) \end{aligned}$$

holds \({\mathbb { P}}\)-almost surely.

A similar statement holds for the reactive exit distribution on \(\partial B\) and the empirical distribution of the points \(X_{\tau _{B,k}^-}\). The reactive exit distribution \(\eta _A^-(\mathrm{d}x)\) is related to the equilibrium measure \(e_{A, B}(\mathrm{d}x)\) in the potential theory for diffusion processes [7, 8], [32, Section 2.3]. In fact, the committor function \(q\) is known as the equilibrium potential in those works, and the equilibrium measure \(e_{A, B}(\mathrm{d}x)\) is given by \(Lq\) restricted to \(\partial A\) (see equation (2.11) of [7]). Specifically, we have

$$\begin{aligned} \eta _A^-(\mathrm{d}x) = \frac{1}{\nu } \rho (x) e_{A, B}(\mathrm{d}x). \end{aligned}$$
(1.26)

The reactive exit distribution was also used in the milestoning algorithm, as in [35]. To the best of our knowledge, Proposition 1.4 is the first characterization of the equilibrium measure from a dynamical perspective. In the case that the drift \(b(x) = - \nabla V(x)\) is a gradient field and \(\sigma = \sqrt{\epsilon } I\) is a multiple of the identity matrix, the constant \(\nu \) is related to the capacity of the sets \(A\) and \(B\):

$$\begin{aligned} \nu = Z^{-1} \text {cap}_{A}(B), \quad Z = \int _{{\mathbb { R}}^d} e^{-V(x)/\epsilon } \,\mathrm{d}x. \end{aligned}$$

(See definition (2.13) of [7] for \(\text {cap}_{A}(B)\).) The results we present here do not require that \(b(x)\) is a gradient field; nevertheless, the constant \(\nu \) still admits the integral representation given below in Proposition 1.8.

We also identify the limit of the empirical reactive entrance distribution on \(\partial B\), defined as

$$\begin{aligned} \mu _{B, N}^{+} = \frac{1}{N} \sum _{k=0}^{N-1} \delta _{X_{\tau _{B, k}^+}}(x). \end{aligned}$$
(1.27)

To describe its limit as \(N \rightarrow \infty \), let us denote by \(\widetilde{L}\) the adjoint of \(L\) in \(L^2({\mathbb { R}}^d, \rho (x) \mathrm{d}x)\), given by

$$\begin{aligned} \widetilde{L} u = - b \cdot \nabla u + \frac{2}{\rho } {{\mathrm{div}}}( a \rho ) \cdot \nabla u + {{\mathrm{tr}}}(a \nabla ^2 u). \end{aligned}$$
(1.28)

This corresponds to the generator of the time-reversed process \(t \mapsto X_{T - t}\) [19]. Note that \(\widetilde{L} = L\) if the SDE (1.1) is reversible, i.e. \(L\) is self-adjoint in \(L^2({\mathbb { R}}^d, \rho (x) \,\mathrm{d}x)\). In addition to the forward committor function \(q(x)\) (recall (1.7)), we also define the backward committor function \(\widetilde{q}(x)\) to be the unique solution of

$$\begin{aligned} \widetilde{L} \widetilde{q} = 0, \quad x \in \Theta \end{aligned}$$

with boundary condition

$$\begin{aligned} \widetilde{q}(x) = {\left\{ \begin{array}{ll} 1, &{} x \in \partial A \\ 0, &{} x \in \partial B. \end{array}\right. } \end{aligned}$$

In terms of \(\widetilde{q}\), we define the reactive entrance distribution on \(\partial B\) as

$$\begin{aligned} \eta _B^+(\mathrm{d}x) = - \frac{1}{\nu } \rho (x) \widehat{n}(x) \cdot a(x) \nabla \widetilde{q}(x) \,\mathrm{d}\sigma _B(x) \end{aligned}$$
(1.29)

and analogously the reactive entrance distribution on \(\partial A\)

$$\begin{aligned} \eta _A^+(\mathrm{d}x) = \frac{1}{\nu } \rho (x) \widehat{n}(x) \cdot a(x) \nabla \widetilde{q}(x) \,\mathrm{d}\sigma _A(x). \end{aligned}$$
(1.30)

Again, \(\nu \) is a normalizing constant so that these are probability measures; \(\nu \) is the same as the constant in (1.23). The following proposition justifies the definition of the reactive entrance distribution.

Proposition 1.5

Let \(\mu _{B,N}^+\) be the empirical reactive entrance distribution on \(\partial B\) defined by (1.27). Then \(\mu _{B,N}^+\) converges weakly to \(\eta _B^+\) as \(N \rightarrow \infty \). That is, for any continuous and bounded \(f:\partial B \rightarrow {\mathbb { R}}\)

$$\begin{aligned} \lim _{N \rightarrow \infty } \int _{\partial B} f(x) \,\mathrm{d}\mu _{B,N}^+(x) = \int _{\partial B} f(x) \,\mathrm{d}\eta _B^+(x) \end{aligned}$$

holds \({\mathbb { P}}\)-almost surely.

A similar statement holds for the reactive entrance distribution on \(\partial A\) and the empirical distribution of the points \(X_{\tau _{A, k}^+}\).

Remark 1.6

If the SDE (1.1) is reversible, we have \(\widetilde{q} = 1 - q\), and hence \(\eta _A^+(\mathrm{d}x) = \eta _A^-(\mathrm{d}x)\) and \(\eta _B^+(\mathrm{d}x) = \eta _B^-(\mathrm{d}x)\).

In view of Proposition 1.4, \(\eta _A^-\) is a natural choice for the distribution of \(Y_0\). With this choice, the transition path process \(Y_t\) characterizes the empirical distribution of \(A \rightarrow B\) reactive trajectories, as the next theorem shows:

Theorem 1.7

Let \(X_t\) satisfy the SDE (1.1). Let \(Y^k\) denote the \(k{\text {th}}\) \(A \rightarrow B\) reactive trajectory defined by (1.5). Let \(Y\) be the unique process defined by Theorem 1.1 with initial distribution \(Y_0 \sim \eta _A^-(\mathrm{d}x)\) on \(\partial A\) defined by (1.23), and let \({\mathcal {Q}}_{\eta _A^-}\) denote the law of this process on \({\mathcal {X}} = C([0,\infty ))\). Then for any \(F \in L^1({\mathcal {X}},{\mathcal {B}},{\mathcal {Q}}_{\eta _A^-})\), the limit

$$\begin{aligned} \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{k=0}^{N-1} F( Y^k ) = \widehat{{\mathbb {E}}}[ F(Y)] \end{aligned}$$

holds \({\mathbb { P}}\)-almost surely.

In particular, the limit \( \widehat{{\mathbb {E}}}[F(Y)]\) is independent of \(X_0\). Using Theorem 1.7, several interesting statistics of the transition paths can be expressed in terms of the quantities we have defined. In fact, Proposition 1.4 is an immediate corollary of Theorem 1.7, obtained by choosing \(F(Y) = f(Y_0)\), so we will not give a separate proof of Proposition 1.4.
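As a rough numerical sanity check of such ergodic averages (a test case of our own choosing, not from the article), one can compare the empirical mean duration of reactive segments with the crossover-time formula \(C_{AB} = \nu ^{-1} \int \rho \, q \widetilde{q} \,\mathrm{d}x\) given below in Proposition 1.8, for the reversible Ornstein–Uhlenbeck dynamics \(\mathrm{d}X = -X\,\mathrm{d}t + \mathrm{d}W\) (so \(a = 1/2\) and \(\widetilde{q} = 1 - q\)) with \(A = (-\infty ,-1]\) and \(B = [1,\infty )\):

```python
import numpy as np

# Test case of our own choosing: 1D Ornstein-Uhlenbeck dynamics
# dX = -X dt + dW  (so a = 1/2), with A = (-inf, -1], B = [1, inf).
# Compare the ergodic average of reactive-segment durations with
# C_AB = (1/nu) int rho q (1 - q) dx  (reversible, so qtilde = 1 - q).
rng = np.random.default_rng(2)
dt, n = 1e-3, 2_000_000
x = np.empty(n + 1)
x[0] = -1.0
for k in range(n):  # Euler-Maruyama
    x[k + 1] = x[k] - x[k] * dt + np.sqrt(dt) * rng.standard_normal()

durations, seen_A, last_A = [], False, None
for i, xi in enumerate(x):
    if xi <= -1.0:
        seen_A, last_A = True, i
    elif xi >= 1.0:
        if seen_A:
            durations.append((i - last_A) * dt)   # tau_B^+ - tau_A^-
        seen_A = False
emp = float(np.mean(durations))

# Quadrature for the closed-form side: rho ~ exp(-x^2), q' ~ exp(x^2).
g = np.linspace(-1.0, 1.0, 20_001)
h = g[1] - g[0]
trap = lambda f: float(((f[:-1] + f[1:]) / 2).sum() * h)
rho = np.exp(-g**2) / np.sqrt(np.pi)
dq = np.exp(g**2)
dq /= trap(dq)                                   # so that q(-1)=0, q(1)=1
q = np.concatenate(([0.0], np.cumsum((dq[:-1] + dq[1:]) / 2) * h))
nu = trap(rho * 0.5 * dq**2)
theory = trap(rho * q * (1 - q)) / nu
```

The two numbers agree only up to Monte Carlo and time-discretization error; the sketch makes no attempt at quantitative accuracy.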

1.3 Reaction rate

Let \(N_T\) be the number of \(A \rightarrow B\) reactive trajectories up to time \(T\):

$$\begin{aligned} N_T = 1 + \max \left\{ k \ge 0 \mid \tau _{B, k}^+ \le T \right\} . \end{aligned}$$

The reaction rate \(\nu _R\) is defined by the limit

$$\begin{aligned} \nu _R = \lim _{T \rightarrow \infty } \frac{N_T}{T} = \lim _{k \rightarrow \infty } \frac{k}{\tau _{B, k}^+}, \end{aligned}$$
(1.31)

and it is the rate of the transition from \(A\) to \(B\). Also, the limits

$$\begin{aligned} T_{AB} := \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{k=0}^{N-1} \left( \tau _{B,k}^+ - \tau _{A,k}^+\right) \end{aligned}$$
(1.32)

and

$$\begin{aligned} T_{BA} := \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{k=0}^{N-1} \left( \tau _{A,k+1}^+ - \tau _{B,k}^+\right) \end{aligned}$$
(1.33)

are the expected reaction times from \(A \rightarrow B\) and \(B \rightarrow A\), respectively. The reaction rates from \(A \rightarrow B\) and from \(B\rightarrow A\) are then given by \(k_{AB} = T_{AB}^{-1}\) and \(k_{BA} = T_{BA}^{-1}\). Another interesting quantity is the expected crossover time from \(A \rightarrow B\)

$$\begin{aligned} C_{AB} := \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{k=0}^{N-1} \left( \tau _{B,k}^+ - \tau _{A,k}^-\right) , \end{aligned}$$
(1.34)

which is the typical duration of the \(A \rightarrow B\) reactive intervals. Observe that \(C_{AB} < T_{AB}\). Similarly, we define

$$\begin{aligned} C_{BA} := \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{k=0}^{N-1} \left( \tau _{A,k+1}^+ - \tau _{B,k}^-\right) . \end{aligned}$$
(1.35)

The next result identifies these limits in terms of the committor functions and the reactive exit and entrance distributions.

Proposition 1.8

The limits (1.31), (1.32), (1.33), (1.34), and (1.35) hold \({\mathbb { P}}\)-almost surely, and

$$\begin{aligned}&\nu _R = \nu = \int _{{\mathbb {R}}^d} \rho (x) \nabla q(x) \cdot a(x) \nabla q(x) \,\mathrm{d}x.\\&T_{AB} = \int _{\partial A} \eta _A^+(\mathrm{d}x) u_B(x) = \frac{1}{\nu _R} \int _{{\mathbb {R}}^d} \rho (x) \widetilde{q}(x) \,\mathrm{d}x. \\&T_{BA} = \int _{\partial B} \eta _B^+(\mathrm{d}x) u_A(x) = \frac{1}{\nu _R} \int _{{\mathbb {R}}^d} \rho (x) ( 1- \widetilde{q}(x)) \,\mathrm{d}x.\\&C_{AB} = \int _{\partial A} \eta _A^-(\mathrm{d}x) v_B(x) = \frac{1}{\nu _R} \int _{{\mathbb {R}}^d} \rho (x)q(x)\widetilde{q}(x) \,\mathrm{d}x. \\&C_{BA} = \int _{\partial B} \eta _B^-(\mathrm{d}x) v_A(x) = \frac{1}{\nu _R} \int _{{\mathbb {R}}^d} \rho (x)(1 - q(x))(1 - \widetilde{q}(x))\,\mathrm{d}x. \end{aligned}$$

Here \(u_B(x) = {\mathbb {E}}[\tau ^X_{B} \;|\; X_0 = x ]\) is the mean first hitting time of \(X_t\) to \(\bar{B}\), and \(v_B(x) = \widehat{{\mathbb {E}}}[ \tau ^Y_B \;|\; Y_0 = x]\) is the mean first hitting time of \(Y_t\) to \(\bar{B}\). Similarly, if \(q\) is replaced by \((1 - q)\) in the definition of \(Y\), then \(v_{A}(x) = \widehat{{\mathbb {E}}}[ \tau ^Y_A \;|\; Y_0 = x]\). Recall that \(\nu \) is the normalizing factor for the reactive exit and entrance distributions.

The formulas for \(\nu _R\), \(T_{AB}\), and \(T_{BA}\) were obtained in [15]. We believe the formulas for \(C_{AB}\) and \(C_{BA}\) are new. We also note that the crossover time for the transition path process in one dimension was recently studied in [4, 10] by other methods.
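In one dimension the identity \(\nu _R = \nu \) and the mass balance of Lemma 1.3 can be verified by quadrature. The sketch below (a test case of our own choosing, not from the article) uses \(b(x) = -x\), \(a = 1/2\), \(\Theta = (-1,1)\), for which \(\rho \propto e^{-x^2}\) and \(q' \propto e^{x^2}\) in closed form:

```python
import numpy as np

# Quadrature check in a 1D Ornstein-Uhlenbeck test case of our own
# choosing: b(x) = -x, a = 1/2, Theta = (-1, 1). Then rho ~ exp(-x^2)
# and q' ~ exp(x^2) in closed form.
x = np.linspace(-1.0, 1.0, 40_001)
h = x[1] - x[0]
trap = lambda f: float(((f[:-1] + f[1:]) / 2).sum() * h)
a = 0.5
rho = np.exp(-x**2) / np.sqrt(np.pi)      # invariant density N(0, 1/2)
dq = np.exp(x**2)
dq /= trap(dq)                            # normalize: q(-1)=0, q(1)=1

nu_volume = trap(rho * a * dq**2)         # int rho grad(q).a grad(q) dx
flux_A = rho[0] * a * dq[0]               # -rho n.a grad(q) at x=-1 (n=-1)
flux_B = rho[-1] * a * dq[-1]             # +rho n.a grad(q) at x=+1 (n=+1)
```

The two boundary fluxes agree, illustrating Lemma 1.3, and their common value matches the volume-integral formula for \(\nu _R\).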

1.4 Density of transition paths

We now consider the distribution \(\rho _R\) as defined in [15]:

$$\begin{aligned} \rho _R(z) = \lim _{T \rightarrow \infty } \frac{1}{T} \int _0^T \delta (z - X_t) {\mathbb {I}}_{R}(t) \,\mathrm{d}t, \quad z \in \Theta , \end{aligned}$$
(1.36)

where \(R\) is the random set of times at which \(X_t\) is reactive:

$$\begin{aligned} R = \bigcup _{k = 0}^\infty [\tau _{A,k}^-,\tau _{B,k}^+]. \end{aligned}$$

This distribution on \(\Theta \) can be viewed as the density of transition paths. By Proposition 1.8 and Theorem 1.7, we can describe \(\rho _R\) in terms of the transition density for \(Y_t\). Specifically, for any continuous and bounded function \(f:{\mathbb { R}}^d \rightarrow {\mathbb { R}}\), we have

$$\begin{aligned} \begin{aligned} \int _{\Theta } f(z) \rho _R(z) \,\mathrm{d}z&= \nu _R \lim _{T \rightarrow \infty } \frac{1}{N_T} \int _0^T f(X_t) {\mathbb {I}}_{R}(t) \,\mathrm{d}t \\&= \nu _R \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{k=0}^{N-1} \int _0^{\tau _{B, k}^+-\tau _{A, k}^-} f\bigl (Y^k_t\bigr ) \,\mathrm{d}t \\&= \nu _R \, \widehat{{\mathbb {E}}}\left[ \int _0^{t_B} f(Y_t) \,\mathrm{d}t \mid Y_0 \sim \eta _A^- \right] \\&= \nu _R \int _0^\infty \int _{\Theta } Q_R(t,\eta _A^-, z) f(z)\,\mathrm{d}z\, \,\mathrm{d}t. \end{aligned} \end{aligned}$$

Here \(Q_R(t,\eta _A^-,z)\) is the density of \(Y_t\), with \(Y_0 \sim \eta _A^-\), killed at \(\partial B\):

$$\begin{aligned} Q_R(t, \eta _A^-, z) = {\mathbb {Q}}(Y_t \in \mathrm{d}z, \, t < t_{B} \mid Y_0 \sim \eta _A^-), \end{aligned}$$
(1.37)

and \(t_{B}\) is the first hitting time of \(Y_t\) to \(\bar{B}\). Hence, for \(z \in \Theta \),

$$\begin{aligned} \begin{aligned} \rho _R(z)&= \nu _R \int _0^\infty Q_R(t,\eta _A^-, z) \,\mathrm{d}t. \end{aligned} \end{aligned}$$
(1.38)

Proposition 1.9

For all \(z \in \Theta \),

$$\begin{aligned} \rho _R(z) = \rho (z) q(z) \widetilde{q}(z). \end{aligned}$$
(1.39)

This formula for \(\rho _R\) was first derived in [15, 22].

1.5 Current of transition paths

The density \(Q_R(t,\eta _A^-,z)\) satisfies the adjoint equation

$$\begin{aligned} \frac{\partial }{\partial t} Q_R(t,\eta _A^-,z) = (L^q)^{*} Q_R(t,\eta _A^-,z), \quad z \in \Theta \end{aligned}$$

where \((L^q)^{*}\) is the adjoint of \(L^q\):

$$\begin{aligned} (L^q)^{*} u = \sum _{i,j} (a_{ij}(z) u(z))_{z_i z_j} - \sum _{i} (K_i(z) u(z))_{z_i} \end{aligned}$$

and \(K\) is defined by (1.13). Integrating from \(t = 0\) to \(t = \infty \) we see that \(\rho _R(z)\) satisfies

$$\begin{aligned} (L^q)^{*} \rho _R(z) = 0, \quad z \in \Theta . \end{aligned}$$

In divergence form, this equation is

$$\begin{aligned} \nabla _z \cdot J_R(z) = 0, \end{aligned}$$
(1.40)

where the vector field

$$\begin{aligned} J_R(z)&= \rho _R(z) \biggl (b(z) + \frac{2 a(z) \nabla q(z)}{q(z)}\biggr ) - {{\mathrm{div}}}( a(z)\rho _R(z)) \nonumber \\&= \Bigl (b(z) \rho (z) - {{\mathrm{div}}}\bigl (a(z) \rho (z)\bigr )\Bigr ) q(z) \widetilde{q}(z) \nonumber \\&\quad + \rho (z) a(z) \Bigl ( \widetilde{q}(z) \nabla q(z) - q(z) \nabla \widetilde{q}(z) \Bigr ) \end{aligned}$$
(1.41)

is continuous over \(\bar{\Theta }\). The vector field \(J_R(z)\), identified in [15], may be regarded as the current of transition paths (see Remark 1.13). Observe that if the SDE (1.1) is reversible, we have \(\widetilde{q} = 1 - q\) and

$$\begin{aligned} b(z) \rho (z) - {{\mathrm{div}}}(a(z) \rho (z)) = 0, \end{aligned}$$

and hence the current given by (1.41) simplifies to

$$\begin{aligned} J_R(z) = \rho (z) a(z) \nabla q(z). \end{aligned}$$

This was observed already in [15]. The current also appears in the potential theory of reversible Markov chains; see e.g. [9].
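As a small worked illustration of the reversible formula (our own example, not taken from [15]): take \(d = 1\), \(a = 1\), \(b = -V'\) for a smooth potential \(V\), so that \(\rho (x) = e^{-V(x)}/Z\), and take \(A = (x_A - 1, x_A)\), \(B = (x_B, x_B + 1)\) with \(x_A < x_B\). On \(\Theta = (x_A, x_B)\) the committor solves \((e^{-V} q')' = 0\) with \(q(x_A) = 0\), \(q(x_B) = 1\), so

$$\begin{aligned} q(x) = \frac{\int _{x_A}^x e^{V(s)} \,\mathrm{d}s}{\int _{x_A}^{x_B} e^{V(s)} \,\mathrm{d}s}, \qquad J_R(x) = \rho (x)\, q'(x) = \frac{1}{Z \int _{x_A}^{x_B} e^{V(s)} \,\mathrm{d}s}. \end{aligned}$$

In particular the current is constant in \(x\), which is what (1.40) requires in one dimension.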

On the boundary, the current (1.41) is related to the reactive exit and entrance distributions.

Proposition 1.10

We have

$$\begin{aligned} J_R = \rho a \nabla q \; \text {on }\partial A, \quad \text {and} \quad J_R = - \rho a \nabla \widetilde{q} \; \text {on }\partial B, \end{aligned}$$

and hence,

$$\begin{aligned} \eta _A^-(\mathrm{d}x) = - \nu _R^{-1} \widehat{n}(x) \cdot J_R(x) \,\mathrm{d}\sigma _A(x) \quad \text {and} \quad \eta _B^+(\mathrm{d}x) = \nu _R^{-1} \widehat{n}(x) \cdot J_R(x) \,\mathrm{d}\sigma _B(x). \end{aligned}$$

As an immediate corollary, we have an additional formula for the reaction rate.

Corollary 1.11

Let \(S\) be a set with smooth boundary that contains \(A\) and separates \(A\) and \(B\). Then we have

$$\begin{aligned} \nu _R = \int _{\partial S} \widehat{n}(x) \cdot J_R(x) \,\mathrm{d}\sigma _{S}(x), \end{aligned}$$
(1.42)

where \(\widehat{n}\) is the unit normal vector exterior to \(S\).
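The rate formula (1.42) can be checked numerically in the simplest reversible setting. The following sketch is our own illustration (the potential \(V\), the cutoff \(c\), and the grid are choices made for this example, not objects from the paper): in \(d = 1\) with \(a = 1\), \(b = -V'\), the committor is an explicit quadrature, and the current \(J_R = \rho \, q'\) should be constant on \(\Theta = (-c, c)\), its constant value being the flux through any separating point.

```python
import numpy as np

# Illustrative only: 1D reversible diffusion dX = -V'(X) dt + sqrt(2) dW,
# with A to the left of -c, B to the right of c, Theta = (-c, c).
V = lambda x: (x**2 - 1.0)**2          # double-well potential (our choice)
c = 0.9
x = np.linspace(-c, c, 2001)

w = np.exp(V(x))                        # integrand e^{V} for the committor
Zq = np.sum((w[1:] + w[:-1]) / 2 * np.diff(x))   # trapezoid rule for Zq
# q(x) = (1/Zq) * int_{-c}^x e^{V(s)} ds, so q(-c) = 0 and q(c) = 1
q = np.concatenate(([0.0], np.cumsum((w[1:] + w[:-1]) / 2 * np.diff(x)))) / Zq

rho = np.exp(-V(x))                     # unnormalized invariant density
J = rho * np.gradient(q, x)             # current J_R = rho * q' (a = 1)

# In d = 1 "divergence-free" means constant; analytically J = 1/Zq here,
# up to the omitted normalization of rho.
print(J.mean(), 1.0 / Zq)
```

Integrating this constant over a separating "surface" (a single point in \(d = 1\)) recovers (1.42), with the reaction rate equal to \(1/Z_q\) up to the normalization of \(\rho \).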

The current \(J_R\) generates a (deterministic) flow in \(\bar{\Theta }\) stopped at \(\partial B\):

$$\begin{aligned} \frac{\,\mathrm{d}Z_t^z}{\,\mathrm{d}t} = J_R(Z_t^z), \quad \text {for } 0 \le t \le t_B, \quad Z_0^z = z \end{aligned}$$
(1.43)

where \(t_B = t_B(z)\) is the time at which \(Z_t\) reaches \(\partial B\). Since \(J_R\) is divergence-free in \(\Theta \), \(J_R \cdot \widehat{n} < 0\) on \(\partial A\), and \(J_R \cdot \widehat{n} > 0\) on \(\partial B\), the time \(t_B(z)\) is finite for any \(z \in \bar{\Theta }\). The flow naturally defines a map \(\Phi _{J_R}: \partial A \rightarrow \partial B\): given any point \(z \in \partial A\), we define

$$\begin{aligned} \Phi _{J_R}(z) = Z_{t_B}^z \in \partial B. \end{aligned}$$
(1.44)
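To make the flow (1.43) and the map \(\Phi _{J_R}\) concrete, here is a toy sketch (our own; the slab geometry and the divergence-free field \(J\) are hand-picked for illustration, not derived from any committor) integrating \(\mathrm{d}Z/\mathrm{d}t = J(Z)\) with a classical RK4 scheme until the path hits \(\partial B\).

```python
import numpy as np

# Toy divergence-free current on the slab Theta = {0 < x < 1}:
# J(x, y) = (1, sin(pi x)), with dA = {x = 0} and dB = {x = 1}.
# div J = d/dx(1) + d/dy(sin(pi x)) = 0, as required by (1.40).
def J(z):
    x, y = z
    return np.array([1.0, np.sin(np.pi * x)])

def flow_map(z0, dt=1e-3):
    """Integrate dZ/dt = J(Z) by RK4 until Z reaches dB = {x = 1}."""
    z = np.array(z0, dtype=float)
    while z[0] < 1.0:
        k1 = J(z); k2 = J(z + dt/2*k1); k3 = J(z + dt/2*k2); k4 = J(z + dt*k3)
        z = z + dt/6*(k1 + 2*k2 + 2*k3 + k4)
    return z

z1 = flow_map([0.0, 0.3])
# Exact endpoint: y(1) = 0.3 + int_0^1 sin(pi x) dx = 0.3 + 2/pi
print(z1)
```

Here \(\Phi _{J_R}\) sends \((0, y_0)\) to \((1, y_0 + 2/\pi )\), a vertical shift; for the actual current (1.41) the map is defined the same way, with \(J\) replaced by \(J_R\).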

Proposition 1.12

For any \(f \in C^1({\mathbb { R}}^d)\),

$$\begin{aligned} \int _{\partial B} f(x) \eta _B^+(\mathrm{d}x) - \int _{\partial A} f(x) \eta _A^-(\mathrm{d}x) = \frac{1}{\nu _R} \int _{\Theta } J_R \cdot \nabla f \,\mathrm{d}x. \end{aligned}$$
(1.45)

In particular,

$$\begin{aligned} \Phi _{J_R, *}(\eta _A^-) = \eta _B^+, \end{aligned}$$

where \(\Phi _{J_R, *}(\eta _A^-)\) is the pushforward of the measure \(\eta _A^-\) by the map \(\Phi _{J_R}\).

Hence, \(J_R\) characterizes “the flow of reactive trajectories” from \(A\) to \(B\).

Remark 1.13

Note that by Propositions 1.4 and 1.5, the left hand side of (1.45) is equal, \({\mathbb { P}}\)-almost surely, to the limit

$$\begin{aligned} \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{n=0}^{N-1} \left( f(X_{\tau _{B, n}^+}) - f(X_{\tau _{A, n}^-}) \right) . \end{aligned}$$

If \(X_t\) were differentiable, we would have

$$\begin{aligned} \begin{aligned}&\lim _{N \rightarrow \infty } \frac{1}{N} \sum _{n=0}^{N-1} \left( f(X_{\tau _{B, n}^+}) - f(X_{\tau _{A, n}^-}) \right) = \lim _{T \rightarrow \infty } \frac{1}{\nu _R} \frac{1}{T} \int _0^{T} 1_R(t) \frac{\mathrm{d}}{\mathrm{d}t} f(X_t) \,\mathrm{d}t \\&\quad \quad \text {``} = \frac{1}{\nu _R} \int _{\Theta } \,\mathrm{d}x \nabla f(x) \cdot \lim _{T \rightarrow \infty } \frac{1}{T} \int _0^T \dot{X}_t \delta (x - X_t) 1_R(t) \,\mathrm{d}t \ \text {''}. \end{aligned} \end{aligned}$$

Combining this with Proposition 1.12, we arrive at a formal characterization of \(J_R\):

$$\begin{aligned} J_R \text {``} = \lim _{T \rightarrow \infty } \frac{1}{T} \int _0^T \dot{X}_t \delta (x - X_t) 1_R(t) \,\mathrm{d}t\ \text {''}. \end{aligned}$$

This formal expression was used in [15] to define \(J_R\).

1.6 Related work

As we have mentioned, our work is closely related to the transition path theory developed by Weinan and Vanden-Eijnden [15, 16, 24], a framework for studying transition paths. In particular, formulas for the reaction rate, density, and current of transition paths, all based on the committor function, were obtained in [15]. Our main motivation is to understand the probability law of the transition paths. The main results, Theorems 1.1, 1.2, and 1.7, identify an SDE which characterizes the law of the transition paths in \(C([0,\infty ))\). Therefore, as an application of these results, we are able to give rigorous proofs of the formulas for the reaction rate, density, and current of transition paths in [15]. We note that in the discrete setting, a generator analogous to (1.10) was also proposed very recently in [33] for Markov jump processes.

Our results may be useful in the design of numerical path-sampling algorithms. Specifically, they indicate that with knowledge of the committor function \(q(x)\) one can bias the sampling of \(X_t\) so as to sample the reactive trajectories directly, without an acceptance/rejection procedure. Of course, obtaining the committor function is non-trivial, as it requires solving a high-dimensional PDE; \(q(x)\) is explicit only in the simplest of cases (such as when \(d=1\)). We refer to [16, 27] and references therein for efforts in the numerical approximation of committor functions. Nevertheless, our theoretical results might be used to analyze methods of sampling reactive trajectories. In particular, it would be important to know what sort of approximation of \(q\) could be used to sample the reactive trajectories efficiently. This issue is related to importance sampling algorithms for rare events (see e.g. [13, 36]). We plan to explore these issues further in future work.
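As a proof of concept for such direct sampling, here is a sketch under strong simplifying assumptions (our own toy example, not an algorithm from the paper, and ignoring that \(A\) and \(B\) were assumed bounded): take \(d = 1\), \(b = 0\), \(\sigma = 1\), \(A = (-\infty , 0)\), \(B = (1, \infty )\), so the committor on \(\Theta = (0,1)\) is simply \(q(x) = x\), and the transition path SDE reads \(\mathrm{d}Y = (2/Y)\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}\widehat{W}\). An Itô computation shows this is \(\sqrt{2}\) times a 3-dimensional Bessel process, so a reactive trajectory can be sampled exactly, with no rejection step.

```python
import numpy as np

# Toy case: Theta = (0,1), committor q(x) = x, transition path SDE
#   dY = (2/Y) dt + sqrt(2) dW.
# Writing Y_t = sqrt(2)|B_t| for a 3D Brownian motion B_t reproduces this
# SDE (Ito's formula), so we simulate B_t exactly and read off Y_t.
rng = np.random.default_rng(1)
dt = 1e-4
y0 = 0.05                                  # start just inside Theta, near dA
B = np.array([y0 / np.sqrt(2.0), 0.0, 0.0])

ys = [y0]
while ys[-1] < 1.0 and len(ys) < 10**6:    # stop on first entry into [1, oo)
    B = B + np.sqrt(dt) * rng.standard_normal(3)
    ys.append(np.sqrt(2.0) * np.linalg.norm(B))

ys = np.array(ys)
print(ys[-1], ys.min())   # path reaches B without ever re-entering A
```

Every run of this loop produces a reactive trajectory: the singular drift \(2/Y\), inherited from the committor, repels the path from \(\partial A\) almost surely.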

The transition paths start at \(\partial A\) and terminate at \(\partial B\), and hence they can be viewed as paths of a bridge process between \(\bar{A}\) and \(\bar{B}\). From this perspective, our work is related to the conditional path sampling for SDEs studied in [20, 21, 29, 31]. In those works, stochastic partial differential equations were proposed to sample SDE paths with fixed endpoints. However, the paths considered there differ from transition paths in that their time duration is fixed a priori. It would be interesting to explore SPDE-based sampling strategies for the transition path process identified in Theorem 1.1.

Let us also point out that in the work we present here we do not assume that the noise \(\sigma \) is small, as is the case in the asymptotic results of [7, 8, 10], which we have mentioned already, and also in some other works, such as the large deviation theory of Freidlin and Wentzell [17].

After this paper was submitted for publication, both Sznitman and one of the editors brought to our attention the relevant work of Meyer et al. [25]. If we define the non-decreasing processes

$$\begin{aligned} V_t^A&= \# \left\{ k \in {\mathbb {Z}}^+ \;|\; \tau _{A,k}^+ \le t \right\} , \nonumber \\ V_t^B&= \# \left\{ k \in {\mathbb {Z}}^+ \;|\; \tau _{B,k}^+ \le t \right\} , \nonumber \end{aligned}$$

where \({\mathbb {Z}}^+\) is the set of non-negative integers, then the triple \((X_t,V_t^A,V_t^B)\) is a Markov process on \({\mathbb { R}}^d \times {\mathbb {Z}}^+ \times {\mathbb {Z}}^+\). Moreover, the exit times \(\tau _{A,k}^-\) defined above coincide with the random times \(L_k = \sup \{ t \ge 0 \;|\; (X_t,V_t^A,V_t^B) \in \bar{A} \times \{ k + 1\} \times \{k\} \}\). Although it is not a stopping time, \(L_k\) is a coterminal time, as defined in [25]. Theorem 5.1 of [25] applied to \((X_{t+ L_k},V_{t+L_k}^A,V_{t + L_k}^B)\) then implies that for \(t > 0\), \(Y_t^k\) is a strong Markov process with transition probability (1.11). In particular, this implies that for any \(t_0 > 0\) and any bounded and continuous functional \(F: C([t_0, \infty )) \rightarrow {\mathbb {R}}\), we have

$$\begin{aligned} {\mathbb {E}}[ F(Y^k_{t}, t \ge t_0)] = \widehat{\mathbb {E}} \bigl [ F(Y_{t}, t \ge t_0) \mid Y_{t_0} = Y^k_{t_0} \bigr ]. \end{aligned}$$

This is similar to but weaker than the statement of Theorem 1.2, which also applies to \(t_0 = 0\). Moreover, the results of [25] do not identify the reactive exit distribution, which plays an important role in Theorem 1.7.

The rest of the paper is organized as follows. Theorems 1.1 and 1.2 are proved in Sect. 2. In Sect. 3 we prove Lemma 1.3, Proposition 1.5 and Theorem 1.7 related to the reactive entrance and exit distributions. As we have mentioned, Proposition 1.4 follows immediately from Theorem 1.7, so we do not give a separate proof of it. Propositions 1.8, 1.9, 1.10, Corollary 1.11, and Proposition 1.12 are proved in Sect. 4.

2 The transition path process

Proof of Theorem 1.1

Without loss of generality, we prove the theorem in the case that \(\xi \equiv y_0\) is a single point in \(\bar{\Theta }\). The interesting aspect of the theorem is that \(y_0\) is allowed to be on \(\partial \Theta \), since the drift term is singular at \(\partial \Theta \). If we assume that \(y_0 \in \Theta \), then existence of a unique strong solution up to the time \(\tau _A \wedge \tau _B\) follows from standard arguments, since \(K(y)\) is Lipschitz continuous in the interior of \(\Theta \). That is, if \(y_0 \in \Theta \), there is a unique, continuous \(\widehat{\mathcal {F}}_t\)-adapted process \(Y_t\) which satisfies

$$\begin{aligned} Y_t = y_0 + \int _0^{t \wedge (\tau _A \wedge \tau _B)} K(Y_s) \,\mathrm{d}s + \int _0^{t \wedge (\tau _A \wedge \tau _B)} \sqrt{2}\, \sigma (Y_s) \,\mathrm{d}\widehat{W}_s, \quad t \ge 0. \end{aligned}$$
(2.1)

Moreover, if \(y_0 \in \Theta \), then we must have \(\tau _A > \tau _B > 0\) almost surely. This follows from an argument similar to the proof of [23, Proposition 3.3.22, p. 161]. Specifically, we consider the process \(z_t = 1/q(Y_t) \in {\mathbb { R}}\), which satisfies

$$\begin{aligned} z_{t \wedge \tau } = z_0 - \int _0^{t \wedge \tau } \sqrt{2} (z_s)^2 \nabla q \cdot \sigma \,\mathrm{d}\widehat{W}_s \end{aligned}$$

where \(\tau = \tau _B \wedge \tau _\epsilon \) with \(\tau _\epsilon = \inf \{ t > 0 \mid q(Y_t) = \epsilon \}\). Since \(\tau < \infty \) with probability one, we have

$$\begin{aligned} z_0 = \widehat{{\mathbb {E}}}[z_{t \wedge \tau }] = \frac{1}{\epsilon } {\mathbb {Q}}( \tau _\epsilon < \tau _B) + {\mathbb {Q}}(\tau _\epsilon > \tau _B). \end{aligned}$$

Hence \({\mathbb {Q}}( \tau _\epsilon < \tau _B) \le \epsilon \, z_0\). So, \({\mathbb {Q}}(\tau _A < \tau _B) \le \lim _{\epsilon \rightarrow 0} {\mathbb {Q}}( \tau _\epsilon < \tau _B)=0\).
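The SDE for \(z_t\) stated above follows from Itô's formula: since \(q(Y_t)\) satisfies \(\mathrm{d}q = (|g|^2/q)\,\mathrm{d}t + g \cdot \mathrm{d}\widehat{W}_t\) with \(g = \sqrt{2}\,\sigma ^{\mathrm {T}} \nabla q\) (this is (2.15) below), the singular drift cancels exactly:

$$\begin{aligned} \mathrm{d}z_t = -\frac{1}{q^2}\,\mathrm{d}q + \frac{1}{q^3}\,|g|^2 \,\mathrm{d}t = -\frac{1}{q^2}\Bigl ( \frac{|g|^2}{q}\,\mathrm{d}t + g \cdot \mathrm{d}\widehat{W}_t \Bigr ) + \frac{|g|^2}{q^3}\,\mathrm{d}t = - \sqrt{2}\, z_t^2\, \nabla q \cdot \sigma \,\mathrm{d}\widehat{W}_t. \end{aligned}$$

This is why \(z_t\) is a (local) martingale.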

Now suppose \(y_0 \in \partial A\). In consideration of the comments above, it suffices to prove the desired result with \(\tau _B\) replaced by \(\tau _r\), the first hitting time to \(\partial B_r(y_0) \cap \Theta \), where \(B_r(y_0)\) is a ball of radius \(r > 0\) centered at \(y_0\). Thus, we want to prove existence and pathwise uniqueness of a continuous \(\widehat{\mathcal {F}}_t\)-adapted process \(Y_t:[0,\infty ) \rightarrow \bar{\Theta }\) satisfying

$$\begin{aligned} Y_t = y_0 + \int _0^{t \wedge \tau _r} K(Y_s) \,\mathrm{d}s + \int _0^{t \wedge \tau _r} \sqrt{2}\, \sigma (Y_s) \,\mathrm{d}\widehat{W}_s, \end{aligned}$$
(2.2)

where

$$\begin{aligned} \tau _r = \inf \left\{ t \ge 0 \mid Y_t \in \partial B_r(y_0) \cap \Theta \, \right\} . \end{aligned}$$

It will be very useful to define a new coordinate system in the set \(B_r^+(y_0) = B_r(y_0) \cap \Theta \) and to consider the problem in these new coordinates. For \(r > 0\) small enough we can define a \(C^3\) map \((h^{(1)}(y),\dots ,h^{(d-1)}(y),q(y)): \overline{B_r^+(y_0)} \rightarrow {\mathbb { R}}^{d-1} \times [0,\infty )\), such that the scalar functions \(h^{(i)}(y): \overline{B_r^+(y_0)} \rightarrow {\mathbb { R}}\) satisfy

$$\begin{aligned} \langle \nabla h^{(i)}(y), a(y) \nabla q(y) \rangle = 0, \quad \forall \;y \in \overline{B_r^+(y_0)}, \qquad i = 1,\dots ,d-1. \end{aligned}$$
(2.3)

Furthermore, the map may be constructed so that it is invertible on its range and so that the inverse is \(C^3\). The existence of such a map follows from the regularity of \(\partial A\), the regularity of \(q\), and the fact that \(\langle \widehat{n}, a \nabla q \rangle \ne 0\) on \(\partial A\) by (1.9).

For two initial points \(x_1, x_2 \in \Theta \), let \(Y^{x_1}_t\) and \(Y^{x_2}_t\) denote the unique solutions to (2.1) with \(Y^{x_1}_0 = x_1\) and \(Y^{x_2}_0 = x_2\) respectively. That is,

$$\begin{aligned} Y^{x}_t = x + \int _0^{t \wedge \tau ^x_{B}} K(Y^x_s) \,\mathrm{d}s + \int _0^{t \wedge \tau ^x_{B}} \sqrt{2}\, \sigma (Y^x_s)\,\mathrm{d}\widehat{W}_s, \quad t \ge 0, \end{aligned}$$
(2.4)

where \(\tau _B^x\) is the first hitting time of \(Y^x_t\) to \(\partial B\). Changing to the coordinate system defined by \((h^{(1)}(y),\dots ,h^{(d-1)}(y),q(y))\), we denote

$$\begin{aligned} (h_{1,t},q_{1,t}) = (h(Y^{x_1}_t), q(Y^{x_1}_t)) \quad \text {and} \quad (h_{2,t},q_{2,t}) = (h(Y^{x_2}_t), q(Y^{x_2}_t)). \end{aligned}$$

Let \(\tau _r^{1}\) and \(\tau _r^{2}\) denote the first hitting times of \(Y^{x_1}_t\) and \(Y^{x_2}_t\) to the set \(\partial B_r(y_0) \cap \Theta \). The processes \((h_{1,t},q_{1,t})\) and \((h_{2,t},q_{2,t})\) are well-defined up to the times \(\tau ^1_r\) and \(\tau ^2_r\), respectively.

We can control the difference between \((h_{1,t},q_{1,t})\) and \((h_{2,t},q_{2,t})\):

Lemma 2.1

There is a constant \(C\) such that for all \(x_1, x_2 \in B_{r/2}(y_0) \cap \Theta \)

$$\begin{aligned} \widehat{{\mathbb {E}}}\left[ \max _{t \in [0,T] }(q_{1,t\wedge \tau } - q_{2,t \wedge \tau })^2 \right] \le C |x_1 - x_2|^{1/2}, \end{aligned}$$

and

$$\begin{aligned} \widehat{{\mathbb {E}}}\left[ \max _{t \in [0,T] } |h_{1,t\wedge \tau } - h_{2,t \wedge \tau } |^2 \right] \le C |x_1 - x_2|, \end{aligned}$$

where \(\tau = \tau ^{1}_r \wedge \tau ^{2}_r\).

The proof of Lemma 2.1 will be postponed. One immediate corollary is the following.

Corollary 2.2

There is a constant \(C\) such that for all \(x_1, x_2 \in B_{r/2}(y_0) \cap \Theta \)

$$\begin{aligned} {\mathbb {Q}}\biggl ( \max _{0 \le t \le (T \wedge \tau )} |Y^{x_1}_t - Y^{x_2}_t| > \alpha \biggr ) \le C \alpha ^{-2} |x_{1} - x_{2}|^{1/2}, \quad \forall \; \alpha > 0, \end{aligned}$$
(2.5)

where \(\tau = \tau ^{1}_r \wedge \tau ^{2}_r\).

Proof

On the closed set \(\{ z \in {\mathbb { R}}^{d} \mid z = (h(y),q(y)), \ y \in \overline{B_r^+(y_0)} \}\), the map \(y \mapsto (h(y),q(y))\) is invertible with a continuously differentiable inverse. Hence there is a constant \(C\), depending only on the map \(y \mapsto (h(y),q(y))\), such that

$$\begin{aligned} |Y^{x_1}_t - Y^{x_2}_t| \le C \left( |h_{1,t} - h_{2,t}| + |q_{1,t} - q_{2,t}|\right) , \quad \forall \; t \in [0,\tau ]. \end{aligned}$$

By combining this bound with Chebyshev’s inequality and Lemma 2.1 we obtain (2.5). \(\square \)

Now suppose \(y_0 \in \partial A\). Let \(\{x_n\}_{n=1}^\infty \subset \Theta \) be a given sequence such that \(x_n \rightarrow y_0\) as \(n \rightarrow \infty \). For each \(n\), define \(Y^{x_n}_t\) by (2.4), and let \(\tau ^n_r\) denote the first hitting time of \(Y^{x_n}_t\) to \(\partial B_r(y_0) \cap \Theta \). We may choose the points \(x_n\) so that \(|x_n - y_0| \le 25^{-n}\). Define \(\widehat{\tau }^n = \tau ^{n+1}_r \wedge \tau _r^{n}\). Applying Corollary 2.2, we conclude

$$\begin{aligned} {\mathbb {Q}}\biggl ( \max _{0 \le t \le (T \wedge \widehat{\tau }^n)} |Y^{x_{n+1}}_t - Y^{x_n}_t| > 2^{-n} \biggr ) \le C 2^{2n} 5^{-n}. \end{aligned}$$
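The right-hand side is summable in \(n\):

$$\begin{aligned} \sum _{n = 1}^\infty C\, 2^{2n} 5^{-n} = C \sum _{n=1}^\infty \Bigl (\frac{4}{5}\Bigr )^{n} < \infty . \end{aligned}$$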

Therefore, by the Borel–Cantelli lemma, the series

$$\begin{aligned} \sum _{n =1}^\infty \max _{0 \le t \le (T \wedge \widehat{\tau }^n)} |Y^{x_{n+1}}_t - Y^{x_n}_t| < \infty \end{aligned}$$
(2.6)

with probability one. Let us define

$$\begin{aligned} \tau _r = \liminf _{n \rightarrow \infty } \tau _r^{n} = \liminf _{n \rightarrow \infty } \widehat{\tau }^n. \end{aligned}$$
(2.7)

We will prove that \(\tau _r\) is positive:

Lemma 2.3

For all \(r > 0\) sufficiently small, \({\mathbb {Q}}(\tau _r > 0 ) = 1\).

In view of (2.6) and Lemma 2.3, we conclude that there must be a continuous process \(Y_t\) such that, with probability one,

$$\begin{aligned} Y^{x_n}_t \rightarrow Y_t \end{aligned}$$

uniformly on compact subsets of \([0,\tau _r)\), as \(n \rightarrow \infty \). Let us define

$$\begin{aligned} \bar{\tau }_{r/2} = \inf \{ t \ge 0 \mid Y_t \in \partial B_{r/2}(y_0) \cap \Theta \}. \end{aligned}$$
(2.8)

Lemma 2.4

For all \(r > 0\) sufficiently small, \({\mathbb {Q}}(\bar{\tau }_{r/2} \in (0,\tau _r)) = 1\), and \(\bar{\tau }_{r/2}\) is a stopping time with respect to \(\widehat{\mathcal {F}}_t\).

We will postpone the proof of Lemmas 2.3 and 2.4. Since \(\bar{\tau }_{r/2} < \tau _r\), \(Y^{x_n}_t \rightarrow Y_t\) uniformly on \([0,\bar{\tau }_{r/2}]\). Let us now replace \(Y_t\) by the stopped process \(Y_{t \wedge \bar{\tau }_{r/2}}\). Since each \(Y^{x_n}_t\) is \(\widehat{\mathcal {F}}_t\)-adapted, so is the limit \(Y_t\). We claim that \(Y_t\) satisfies

$$\begin{aligned} Y_t = y_0 + \int _0^{t \wedge \bar{\tau }_{r/2}} K(Y_s) \,\mathrm{d}s + \int _0^{t \wedge \bar{\tau }_{r/2}} \sqrt{2}\, \sigma (Y_s)\,\mathrm{d}\widehat{W}_s, \quad t \ge 0. \end{aligned}$$
(2.9)

Since \(Y^{x_n}_t \rightarrow Y_t\) uniformly on \([0,\bar{\tau }_{r/2}]\), we have \((q(Y^{x_n}_t), h(Y^{x_n}_t)) \rightarrow (q(Y_t),h(Y_t))\) uniformly on \([0,\bar{\tau }_{r/2}]\), and \((q_t,h_t) = (q(Y_t),h(Y_t))\) satisfies

$$\begin{aligned} h_t = h_0 + \int _0^{t \wedge \bar{\tau }_{r/2}} f(q_s,h_s) \,\mathrm{d}s + \int _0^{t \wedge \bar{\tau }_{r/2}} m(q_s,h_s) \,\mathrm{d}\widehat{W}_s , \end{aligned}$$
(2.10)

and

$$\begin{aligned} q_t - \int _0^{t \wedge \bar{\tau }_{r/2}} g(q_s,h_s) \cdot \,\mathrm{d}\widehat{W}_s = \lim _{n \rightarrow \infty } \int _0^{t \wedge \tau ^{n}_r} \frac{|g(q_s^{x_n},h_s^{x_n})|^2}{q_s^{x_n}} \,\mathrm{d}s \end{aligned}$$
(2.11)

for all \(t \in [0,\bar{\tau }_{r/2}]\), where \((q_t^{x_n},h_t^{x_n}) = (q(Y^{x_n}_t), h(Y^{x_n}_t))\). (Recall \(q_0 = 0\).) Since \(q_s^{x_n} > 0\), the last limit can be bounded below using Fatou’s lemma:

$$\begin{aligned} q_t - \int _0^{t \wedge \bar{\tau }_{r/2}} g(q_s,h_s) \cdot \,\mathrm{d}\widehat{W}_s \ge \int _0^{t \wedge \bar{\tau }_{r/2}} \liminf _{n \rightarrow \infty } \frac{|g(q_s^{x_n},h_s^{x_n})|^2}{q_s^{x_n}} \,\mathrm{d}s = \int _0^{t \wedge \bar{\tau }_{r/2}} \frac{|g(q_s,h_s)|^2}{q_s} \,\mathrm{d}s. \end{aligned}$$
(2.12)

Recall that \(|g(q_s,h_s)|^2 \ge C_r > 0\). In particular, with probability one, the random set \(H = \{ s \in [0,\bar{\tau }_{r/2}] \mid q_s = 0 \}\) must have zero Lebesgue measure; if that were not the case, then we would have

$$\begin{aligned} - \int _0^{t \wedge \bar{\tau }_{r/2}} g(q_s,h_s) \cdot \,\mathrm{d}\widehat{W}_s = +\infty , \end{aligned}$$

for all \(t\) in a set of positive Lebesgue measure, an event which happens with zero probability. Therefore, by Fubini’s theorem,

$$\begin{aligned} 0 = \widehat{{\mathbb {E}}} \int _0^T {\mathbb {I}}_H(s) \,\mathrm{d}s = \int _0^T {\mathbb {Q}}( s < \bar{\tau }_{r/2} \,,\; q_s = 0 ) \,\mathrm{d}s \end{aligned}$$

which implies that \({\mathbb {Q}}( s < \bar{\tau }_{r/2}, \; q_s = 0 ) = 0\) for almost every \(s \ge 0\). Since \(\bar{\tau }_{r/2} > 0\) almost surely, this implies that we may choose a deterministic sequence of times \(t_n \in (0, 1/n]\) such that, almost surely, \(q_{t_n} > 0\) for \(n\) sufficiently large. By then applying the same argument as when \(y_0 \in \Theta \), we conclude that \(q_t > 0\) for all \(t > t_n\). Hence, \(q_t > 0\) for all \(t > 0\) must hold with probability one.

Since \(q_t\) is continuous, we now know that for any \(\epsilon > 0\),

$$\begin{aligned} \min _{ t > \epsilon } q_t > 0 \end{aligned}$$

holds with probability one. In particular,

$$\begin{aligned} \liminf _{n \rightarrow \infty } \min _{ t > \epsilon } q_t^{x_n} > 0, \end{aligned}$$

so that

$$\begin{aligned} \lim _{n \rightarrow \infty } \int _\epsilon ^{t \wedge \tau ^{n}} \frac{|g(q_s^{x_n},h_s^{x_n})|^2}{q_s^{x_n}} \,\mathrm{d}s = \int _\epsilon ^{t \wedge \bar{\tau }_{r/2}} \frac{|g(q_s,h_s)|^2}{q_s} \,\mathrm{d}s, \end{aligned}$$

almost surely. Since \(q_t\) is continuous at \(t = 0\), we also know that

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \lim _{n \rightarrow \infty } \int _0^{t \wedge \tau ^{n} \wedge \epsilon } \frac{|g(q_s^{x_n},h_s^{x_n})|^2}{q_s^{x_n}} \,\mathrm{d}s = \lim _{\epsilon \rightarrow 0} \left( q_\epsilon - \int _0^{t \wedge \bar{\tau }_{r/2} \wedge \epsilon } g(q_s,h_s) \cdot \mathrm{d}\widehat{W}_s \right) = 0 \end{aligned}$$

almost surely. Returning to (2.11) we now conclude that

$$\begin{aligned} q_t - \int _0^{t \wedge \bar{\tau }_{r/2}} g(q_s,h_s) \cdot d\widehat{W}_s&= \lim _{\epsilon \rightarrow 0} \lim _{n \rightarrow \infty } \int _0^{t \wedge \tau ^{n} \wedge \epsilon } \frac{|g(q_s^{x_n},h_s^{x_n})|^2}{q_s^{x_n}} \,\mathrm{d}s \nonumber \\&+ \lim _{\epsilon \rightarrow 0} \lim _{n \rightarrow \infty } \int _\epsilon ^{t \wedge \tau ^{n}} \frac{|g(q_s^{x_n},h_s^{x_n})|^2}{q_s^{x_n}}\,\mathrm{d}s \nonumber \\&= \lim _{\epsilon \rightarrow 0} \int _\epsilon ^{t \wedge \bar{\tau }_{r/2}} \frac{|g(q_s,h_s)|^2}{q_s} \,\mathrm{d}s \nonumber \\&= \int _0^{t \wedge \bar{\tau }_{r/2}} \frac{|g(q_s,h_s)|^2}{q_s} \,\mathrm{d}s \end{aligned}$$
(2.13)

holds with probability one. Equation (2.9) for \(Y_t\) now follows from (2.10) and (2.13) by changing coordinates.

Except for the proofs of Lemmas 2.1, 2.3, and 2.4, we have now established existence of a strong solution \(Y_t\) to (2.2) (with \(r\) replaced by \(r/2\)). The uniqueness of the solution follows by the same arguments. Suppose that \(Y^{1}_t\) and \(Y^{2}_t\) both solve (2.2) with the same Brownian motion and the same initial point \(Y^1_0 = Y^2_0 = y_0\). Then Corollary 2.2 implies that, \({\mathbb {Q}}\)-almost surely, \(Y^{1}_t = Y^{2}_t\) for all \(t \in [0,\tau ^1_r \wedge \tau ^2_r]\) where \(\tau ^1_r\) and \(\tau ^2_r\) are the corresponding hitting times to \(\partial B_r(y_0) \cap \Theta \). In particular, \(\tau ^1_r = \tau ^2_r\). This proves pathwise uniqueness. \(\square \)

We now prove Lemmas 2.1, 2.3 and 2.4 to complete the proof of Theorem 1.1.

Proof of Lemma 2.1

By Itô’s formula the process \((h_1,q_1) = (h_{1,t},q_{1,t})\) satisfies

$$\begin{aligned} \,\mathrm{d}h_1&= f(q_1,h_1) \,\mathrm{d}t + m(q_1,h_1) \,\mathrm{d}\widehat{W}_t, \end{aligned}$$
(2.14)
$$\begin{aligned} \,\mathrm{d}q_1&= \frac{|g(q_1,h_1)|^2}{q_1} \,\mathrm{d}t + g(q_1,h_1) \cdot \mathrm{d}\widehat{W}_t, \end{aligned}$$
(2.15)

for \(0 \le t \le \tau ^{1}_r\), where the functions \(g = \sqrt{2} (\nabla q)^{\mathrm {T}} \sigma \in {\mathbb { R}}^{d}\), \(f = L h \in {\mathbb { R}}^{d-1}\), and \(m = \sqrt{2}(\nabla h)^{\mathrm {T}} \sigma \in {\mathbb { R}}^{(d-1) \times d}\), are all Lipschitz continuous in their arguments over \(\bar{B_r^+}\). Similarly, \((h_2,q_2) = (h_{2,t},q_{2,t})\) satisfies

$$\begin{aligned} \,\mathrm{d}h_2&= f(q_2,h_2) \,\mathrm{d}t + m(q_2,h_2) \,\mathrm{d}\widehat{W}_t \end{aligned}$$
(2.16)
$$\begin{aligned} \,\mathrm{d}q_2&= \frac{|g(q_2,h_2)|^2}{q_2} \,\mathrm{d}t + g(q_2,h_2) \cdot \mathrm{d}\widehat{W}_t, \end{aligned}$$
(2.17)

for \(0 \le t \le \tau ^{2}_r\). Notice that the choice of coordinates satisfying (2.3) has eliminated a potentially singular drift term in the equations for \(h_{1,t}\) and \(h_{2,t}\). On the other hand, the drift term in the equations for \(q_1\) and \(q_2\) blows up near the boundary \(q = 0\). Indeed, if \(r > 0\) is small enough, by (1.9) there is a constant \(C_r > 0\) such that

$$\begin{aligned} \inf _{y \in \bar{B^+_r}} 2 \langle \nabla q(y), a(y) \nabla q(y) \rangle \ge 2 \lambda \inf _{y \in \bar{B^+_r}} |\nabla q(y) |^2 \ge C_r. \end{aligned}$$
(2.18)

Hence,

$$\begin{aligned} \begin{aligned} |g(q_{1,t}, h_{1,t})|^2&= 2 \langle \nabla q(Y^{x_1}_t), a(Y^{x_1}_t) \nabla q(Y^{x_1}_t) \rangle \ge 2 \lambda \inf _{y \in \bar{B^+_r}} |\nabla q(y) |^2 \ge C_r > 0. \end{aligned} \end{aligned}$$
(2.19)

Letting \(\tau = \tau _r^1 \wedge \tau _r^2\) and using (2.14) and (2.16), we compute

$$\begin{aligned} \mathrm{d}\, |h_1 - h_2|^2&= 2 (h_1 - h_2)^{\mathrm {T}} (f(q_1,h_1) - f(q_2,h_2)) \,\mathrm{d}t \\&\quad + 2 (h_1 - h_2)^{\mathrm {T}} (m(q_1,h_1) - m(q_2,h_2)) \,\mathrm{d}\widehat{W}_t \\&\quad + {{\mathrm{tr}}}\left( (m(q_1,h_1) - m(q_2,h_2))(m(q_1,h_1) - m(q_2,h_2))^{\mathrm {T}}\right) \,\mathrm{d}t \end{aligned}$$

for \(0 \le t \le \tau \). In particular,

$$\begin{aligned} \widehat{{\mathbb {E}}}\, \left[ |h_{1,t\wedge \tau } - h_{2,t \wedge \tau }|^2\right]&\le C \int _{0}^{t} \widehat{{\mathbb {E}}}\, \left[ {\mathbb {I}}_{[0,\tau ]}(s) (q_{1,s} - q_{2,s})^2\right] \,\mathrm{d}s \nonumber \\&+ C \int _{0}^{t} \widehat{{\mathbb {E}}}\,\left[ {\mathbb {I}}_{[0,\tau ]}(s)|h_{1,s} - h_{2,s}|^2\right] \,\mathrm{d}s + C|x_1 - x_2|, \nonumber \\&\le C \int _{0}^{t} \widehat{{\mathbb {E}}}\,\left[ (q_{1,s\wedge \tau } - q_{2,s\wedge \tau })^2\right] \,\mathrm{d}s \nonumber \\&+ C \int _{0}^{t} \widehat{{\mathbb {E}}}\,\left[ |h_{1,s\wedge \tau } - h_{2,s\wedge \tau }|^2\right] \,\mathrm{d}s + C|x_1 - x_2|,\nonumber \\ \end{aligned}$$
(2.20)

holds for all \(t \ge 0\).

From (2.15) and (2.17) we also compute

$$\begin{aligned} \mathrm{d}\, (q_1 - q_2)^2&= 2 (q_1 - q_2) \mathrm{d}(q_1 - q_2) + |g_1 - g_2|^2 \,\mathrm{d}t \nonumber \\&= 2 (q_1 - q_2) \left( \frac{|g_1|^2}{q_1} - \frac{|g_2|^2}{q_2}\right) \mathrm{d}t \nonumber \\&+ 2 (q_1 - q_2)(g_1 - g_2) \cdot \mathrm{d}\widehat{W}_t + |g_1 - g_2|^2 \,\mathrm{d}t \end{aligned}$$
(2.21)

for \(0 \le t \le \tau \), where we have used the notation \(g_1 = g(q_1,h_1)\) and \(g_2 = g(q_2,h_2)\). We claim that there is a constant \(C\), depending only on \(r\), such that

$$\begin{aligned} 2 (q_1 - q_2) \left( \frac{|g_1|^2}{q_1} - \frac{|g_2|^2}{q_2}\right) \le C (|q_1 - q_2|^2 + |h_1 - h_2|^2) \end{aligned}$$
(2.22)

holds for all \(t \le \tau \), with probability one. Both sides of (2.22) are invariant when \((q_1,h_1)\) and \((q_2,h_2)\) are interchanged. So, we may assume \(q_1 \le q_2\) without loss of generality. We consider the following two possibilities. First, suppose that

$$\begin{aligned} 0 \le q_1 \bigl ||g_1 |^2 - |g_2 |^2\bigr |\le (q_2 - q_1) |g_1 |^2. \end{aligned}$$
(2.23)

Using this and \(q_1 \le q_2\) we have

$$\begin{aligned} 2 (q_1 - q_2) \left( \frac{|g_1 |^2}{q_1} - \frac{|g_2|^2}{q_2}\right)&= 2 \frac{(q_1 - q_2)}{q_1 q_2} \left( q_2 |g_1|^2 - q_1 |g_2|^2 \right) \nonumber \\&=2 \frac{(q_1 - q_2)}{q_1 q_2} \left( (q_2 - q_1)|g_1|^2 - q_1(|g_2|^2 - |g_1|^2) \right) \nonumber \\&\mathop {\le }\limits ^{(2.23)}0. \end{aligned}$$
(2.24)

The other possibility is

$$\begin{aligned} 0 \le (q_2 - q_1) |g_1 |^2 \le q_1 \bigl ||g_1 |^2 - |g_2 |^2\bigr |. \end{aligned}$$
(2.25)

In this case, we have (also using \(q_1 \le q_2\))

$$\begin{aligned} 2 (q_1 - q_2) \left( \frac{|g_1|^2}{q_1} - \frac{|g_2|^2}{q_2}\right)&= 2 \frac{(q_1 - q_2)}{q_1 q_2} \left( (q_2 - q_1)|g_1|^2 - q_1( |g_2|^2 - |g_1|^2) \right) \nonumber \\&\le - 2 \frac{(q_1 - q_2)}{q_1 q_2} q_1 ( |g_2|^2 - |g_1|^2) \nonumber \\&\le 2 \frac{|q_1 - q_2 |}{|q_2 |} \bigl ||g_2 |^2 - |g_1 |^2 \bigr |\nonumber \\&\le 2 \frac{|q_1 - q_2 |}{|q_1 |} \bigl ||g_2 |^2 - |g_1 |^2 \bigr |\nonumber \\&\mathop {\le }\limits ^{(2.25)} 2 \frac{( |g_2|^2 - |g_1|^2)^2}{|g_1|^2}. \end{aligned}$$
(2.26)

Therefore, since \(|g_1 |^2 \ge C_r > 0\) (by (2.19)), we must have

$$\begin{aligned} 2 (q_1 - q_2) \left( \frac{|g_1|^2}{q_1} - \frac{|g_2|^2}{q_2}\right) \le 2 C_r^{-1} ( |g_2|^2 - |g_1|^2)^2 \le C (|q_1 - q_2|^2 + |h_1 - h_2|^2), \end{aligned}$$

where \(C > 0\) depends only on \(r\). This establishes (2.22).

Returning to (2.21) and controlling the first term on the right hand side of (2.21) with (2.22), we conclude that

$$\begin{aligned} \widehat{{\mathbb {E}}}\, \bigl [(q_{1,t\wedge \tau } - q_{2,t \wedge \tau })^2 \bigr ]&\le C \int _{0}^t \widehat{{\mathbb {E}}}\,\bigl [{\mathbb {I}}_{[0,\tau ]}(s) (q_{1,s} - q_{2,s})^2\bigr ] \,\mathrm{d}s \nonumber \\&+ C \int _{0}^t \widehat{{\mathbb {E}}}\,\bigl [{\mathbb {I}}_{[0,\tau ]}(s) |h_{1,s} - h_{2,s}|^2\bigr ] \,\mathrm{d}s + C |x_1 - x_2|,\nonumber \\&\le C \int _{0}^t \widehat{{\mathbb {E}}}\,\bigl [(q_{1,s \wedge \tau } - q_{2,s\wedge \tau })^2 \bigr ] \,\mathrm{d}s \nonumber \\&+ C \int _{0}^t \widehat{{\mathbb {E}}}\,\bigl [|h_{1,s\wedge \tau } - h_{2,s\wedge \tau }|^2\bigr ] \,\mathrm{d}s + C |x_1 - x_2|.\qquad \end{aligned}$$
(2.27)

By combining (2.20) and (2.27) and applying Gronwall’s inequality, we conclude that

$$\begin{aligned} \widehat{{\mathbb {E}}}\, \bigl [ |h_{1,t \wedge \tau } - h_{2,t\wedge \tau }|^2\bigr ] + \widehat{{\mathbb {E}}}\, \bigl [(q_{1,t\wedge \tau } - q_{2,t\wedge \tau })^2\bigr ] \le C|x_1 - x_2|\left( 1 + t e^{C t}\right) , \quad t \ge 0. \end{aligned}$$
(2.28)

Using (2.21) and (2.22) we also obtain

$$\begin{aligned} \widehat{{\mathbb {E}}}\, \Bigl [\max _{t \in [0,T] }(q_{1,t\wedge \tau } - q_{2,t \wedge \tau })^2\Bigr ]&\le C \int _{0}^T \widehat{{\mathbb {E}}}[(q_{1,s \wedge \tau } - q_{2,s\wedge \tau })^2] \,\mathrm{d}s \nonumber \\&+ C \int _{0}^T \widehat{{\mathbb {E}}}[|h_{1,s\wedge \tau } - h_{2,s\wedge \tau }|^2] \,\mathrm{d}s + C |x_1 - x_2| \nonumber \\&+ \widehat{{\mathbb {E}}}\left[ \max _{t \in [0,T] } V_{t} \right] \end{aligned}$$
(2.29)

where \(V_t\) is the martingale

$$\begin{aligned} V_t = \int _0^{t\wedge \tau } 2 (q_1 - q_2)(g_1 - g_2) \cdot \,\mathrm{d}\widehat{W}_s. \end{aligned}$$
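Its quadratic variation is controlled using the boundedness and Lipschitz continuity of \(g\) on \(\overline{B_r^+}\):

$$\begin{aligned} \langle V \rangle _T = \int _0^{T \wedge \tau } 4 (q_{1,s} - q_{2,s})^2 |g_1 - g_2|^2 \,\mathrm{d}s \le C \int _0^{T} (q_{1,s \wedge \tau } - q_{2,s \wedge \tau })^2 \,\mathrm{d}s. \end{aligned}$$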

By the Burkholder–Davis–Gundy inequality (e.g. [30, Sec IV.4]) and (2.28), we have

$$\begin{aligned} \widehat{{\mathbb {E}}}\left[ \max _{t \in [0,T] } V_t \right] \le C \left( \int _0^T \widehat{{\mathbb {E}}}[ (q_{1,s\wedge \tau } - q_{2,s \wedge \tau })^2 ] \,\mathrm{d}s \right) ^{1/2} \le C_T|x_1 - x_2|^{1/2}. \end{aligned}$$

This, together with (2.28) and (2.29), gives us

$$\begin{aligned} \widehat{{\mathbb {E}}}\left[ \max _{t \in [0,T] }(q_{1,t\wedge \tau } - q_{2,t \wedge \tau })^2 \right] \le C_T |x_1 - x_2|^{1/2}. \end{aligned}$$

Similar arguments for \(h_1 - h_2\) lead to

$$\begin{aligned} \widehat{{\mathbb {E}}}\left[ \max _{t \in [0,T] }|h_{1,t\wedge \tau } - h_{2,t \wedge \tau }|^2 \right] \le C_T |x_1 - x_2|. \end{aligned}$$

\(\square \)

Proof of Lemma 2.3

Suppose \(\tau _r = 0\) holds with probability \(\epsilon > 0\). Because of (2.6) we may choose \(m\) sufficiently large so that

$$\begin{aligned} \sum _{n =m}^\infty \max _{0 \le t \le (T \wedge \widehat{\tau }^n)} |Y^{x_{n+1}}_t - Y^{x_n}_t| < r/4 \end{aligned}$$

holds with probability at least \(1 - \epsilon /2\). Therefore, with probability at least \(\epsilon /2\) we have both \(\tau _r = 0\) and

$$\begin{aligned} \liminf _{n \rightarrow \infty } |Y^{x_{n}}_{\tau _r^{n}} - Y^{x_m}_{\tau _r^{n}} | \le r/4. \end{aligned}$$
(2.30)

Recall that \(|Y^{x_m}_0 - y_0| \le 25^{-m}\). Let \(m\) be larger, if necessary, so that \(25^{-m} \le r/4\). This and (2.30) imply that

$$\begin{aligned} \liminf _{n \rightarrow \infty } |Y^{x_{n}}_{\tau _r^{n}} - y_0| \le \liminf _{n \rightarrow \infty } \left( |Y^{x_{n}}_{\tau _r^{n}} - Y^{x_m}_{\tau _r^{n}}| + | Y^{x_m}_{\tau _r^{n}} - y_0|\right) \le r/4 + 25^{-m} \le r/2 \end{aligned}$$

holds with probability at least \(\epsilon /2\). However, this contradicts the fact that \(Y^{x_n}_{\tau _r^{n}} \in \partial B_r(y_0)\) for all \(n\). Hence, we must have \(\tau _r > 0\) with probability one.\(\square \)

Proof of Lemma 2.4

The fact that \(\bar{\tau }_{r/2} > 0\) with probability one follows from an argument very similar to the proof of Lemma 2.3. The fact that \(\bar{\tau }_{r/2} < \tau _r\) will follow by showing that

$$\begin{aligned} \limsup _{t \nearrow \tau _r} |Y_t - y_0| \ge r \end{aligned}$$
(2.31)

holds with probability one. First, suppose that \(\tau ^{n}_r < \tau _r\) and that

$$\begin{aligned} \tau ^{n}_r = \inf _{k \ge n} \tau ^{k}_r. \end{aligned}$$

Then by (2.6) we have

$$\begin{aligned} |Y_{\tau ^{n}_r} - y_0| \ge |Y^{x_n}_{\tau ^{n}_r} - y_0| - |Y_{\tau ^{n}_r} - Y^{x_n}_{\tau ^{n}_r}| = r - |Y_{\tau ^{n}_r} - Y^{x_n}_{\tau ^{n}_r}| \ge r - R(n), \end{aligned}$$

where \(R(n)\) is the series remainder

$$\begin{aligned} R(n) = \sum _{k = n}^\infty \max _{0 \le t \le \tau ^{n}_r} |Y^{x_{k+1}}_t - Y^{x_k}_t| \end{aligned}$$

which converges to zero, with probability one, as \(n \rightarrow \infty \). So, with probability one, if there is an increasing sequence of such times \(\tau ^{{n_j}}_r \nearrow \tau _r\) as \(j \rightarrow \infty \), we see that (2.31) must hold. On the other hand, suppose there is no such sequence. Then we must have \(\tau ^{n}_r \ge \tau _r\) for \(n\) sufficiently large. Hence \(Y^{x_n}_t\) must converge to \(Y_t\) uniformly on the closed interval \([0,\tau _r]\). Suppose \(\tau ^{n}_r \ge \tau _r\) and \(\tau ^{n}_r = \sup _{k \ge n} \tau ^{k}_r\). Then for all \(k \ge n\), we have

$$\begin{aligned} \begin{aligned} |Y^{x_n}_{\tau ^{k}_r} - y_0|&\ge |Y^{x_k}_{\tau ^{k}_r} - y_0| - |Y^{x_n}_{\tau ^{k}_r} -Y^{x_k}_{\tau ^{k}_r}| \\&= r - |Y^{x_n}_{\tau ^{k}_r} -Y^{x_k}_{\tau ^{k}_r}| \ge r - M(n). \end{aligned} \end{aligned}$$

Therefore, since \(Y^{x_n}_t\) is continuous on \([0,\tau ^{n}_r]\) and since \(\tau _r = \liminf _{k \rightarrow \infty } \tau ^{k}_r\), we have

$$\begin{aligned} |Y^{x_n}_{\tau _r} - y_0| \ge r - M(n). \end{aligned}$$

Since \(Y^{x_n}_{\tau _r} \rightarrow Y_{\tau _r}\) in this case and \(Y_t\) is continuous on \([0,\tau _r]\), then with probability one, this case also implies that (2.31) holds. Having established that \(0 < \bar{\tau }_{r/2} < \tau _r\) we conclude that \(Y^{x_n}_{t} \rightarrow Y_t\) uniformly on \([0,\bar{\tau }_{r/2}]\). Since each \(Y^{x_n}_t\) is \(\widehat{\mathcal {F}}_t\)-adapted, so is the limit \(Y_t\). In particular, \(\bar{\tau }_{r/2}\) is a stopping time. \(\square \)

Remark 2.5

Let us point out that if \(y_0 \in \partial A\) and \(T > 0\) is sufficiently small, the equation

$$\begin{aligned} \bar{Y}(t) = y_0 + \int _0^t K(\bar{Y}(s)) \,\mathrm{d}s, \quad t \in [0,T], \end{aligned}$$
(2.32)

has a unique solution satisfying \(\bar{Y}(t) \in \Theta \) for all \(t \in (0,T]\). Indeed, let \(z(t)\) solve the ODE

$$\begin{aligned} z'(t) = 2 a (z(t))\nabla q(z(t)) + q(z(t))b(z(t)) \end{aligned}$$

for \(t \in [0,T]\), with \(z(0) = y_0\). For sufficiently small \(T\), we have \(z(t) \in \Theta \) for all \(t \in (0,T]\). Hence \(q(z(t)) > 0\) for \(t \in (0,T]\), and the function \(F(t) = \int _0^t q(z(s))\,\mathrm{d}s\) is strictly increasing, hence invertible. Now, it is easy to check that the function \(\bar{Y}(t) = z(F^{-1}(t))\) is continuous on \([0,T]\) and satisfies (2.32). Moreover, \(\bar{Y}(t) \in \Theta \) for all \(t \in (0,T]\). In fact,

$$\begin{aligned} \bar{Y}(t) \sim y_0 + 2 \sqrt{t}\, \frac{a(y_0) \nabla q(y_0)}{\langle \nabla q(y_0),a(y_0) \nabla q(y_0) \rangle ^{1/2}} \end{aligned}$$

for small \(t\).
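To make the time change behind this construction concrete, here is a minimal one-dimensional sketch in Python. It assumes the toy data \(a \equiv 1\), \(b \equiv 0\), \(q(z) = z\) near \(y_0 = 0\) (illustrative assumptions, not taken from the paper), so that \(z' = 2\), \(F(t) = t^2\), and \(\bar{Y}(t) = z(F^{-1}(t)) = 2\sqrt{t}\) matches the stated square-root asymptotics exactly.

```python
import math

# Toy 1-d model (an assumption for illustration, not from the paper):
# a = 1, b = 0, q(z) = z near y0 = 0, so the ODE reduces to z'(t) = 2.
def z(t):
    return 2.0 * t            # explicit solution of z' = 2 a q' + q b = 2

def F(t, n=10_000):
    # F(t) = integral_0^t q(z(s)) ds, by the trapezoid rule (exact here,
    # since the integrand is linear)
    h = t / n
    total = 0.5 * (z(0.0) + z(t))
    for k in range(1, n):
        total += z(k * h)     # q(z(s)) = z(s) in this model
    return total * h

def F_inv(u, lo=0.0, hi=10.0):
    # invert the strictly increasing function F by bisection
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if F(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def Ybar(t):
    return z(F_inv(t))        # Ybar(t) = z(F^{-1}(t)) solves (2.32)

# Predicted small-t behaviour: Ybar(t) ~ y0 + 2 sqrt(t), exact in this model
print(Ybar(0.01), 2.0 * math.sqrt(0.01))   # both ~ 0.2
```

In this linear model the small-\(t\) asymptotics is exact; for general coefficients one would integrate the ODE for \(z\) numerically before applying the same time change.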

We state and prove two properties of the transition path process, which will be used later.

Proposition 2.6

Let \(F\) be a bounded and continuous functional on \(C([0,\infty ))\). Define

$$\begin{aligned} g(x) = \widehat{{\mathbb {E}}}\,[ F(Y) \mid Y_0 = x] \end{aligned}$$

where \(Y_t\) satisfies (1.14). Then \(g \in C(\bar{\Theta })\).

Proof

Suppose that \(\{x_n\}_{n=1}^\infty \subset \bar{\Theta }\) and that \(x_n \rightarrow x \in \bar{\Theta }\) as \(n \rightarrow \infty \). We claim that there must be a subsequence \(\{ x_{n_j} \}_{j=1}^\infty \) such that, \({\mathbb {Q}}\)-almost surely,

$$\begin{aligned} \lim _{j \rightarrow \infty } F(Y^j) = F(Y), \end{aligned}$$
(2.33)

where \(Y^j_t\) satisfies (1.14) with \(Y^j_0 = x_{n_j}\), and \(Y_t\) satisfies (1.14) with \(Y_0 = x\). Since \(F\) is bounded and continuous on \(C([0,\infty ))\), the dominated convergence theorem then implies that

$$\begin{aligned} \lim _{j \rightarrow \infty } g(x_{n_j}) = \lim _{j \rightarrow \infty } \widehat{{\mathbb {E}}}\,[ F(Y) \mid Y_0 = x_{n_j}] = \widehat{{\mathbb {E}}}\,[ F(Y) \mid Y_0 = x] = g(x). \end{aligned}$$

Since the limit is independent of the subsequence, this implies that \(g(x)\) is continuous.

To establish (2.33), we must show that \(Y^j_t \rightarrow Y_t\) uniformly on compact subsets of \([0,\infty )\). This follows from Corollary 2.2, as in the proof of Theorem 1.1. \(\square \)

Proposition 2.7

For any \(R > 0\), there is a function \(h_R:[0,+\infty ) \rightarrow [0,1]\) such that \(\int _0^\infty h_R(t) \,dt < +\infty \) and

$$\begin{aligned} \sup _{\begin{array}{c} x \in \bar{\Theta } \\ |x| \le R \end{array}} {\mathbb {Q}}( Y_t \in \Theta \mid Y_0 = x) \le h_R(t) \end{aligned}$$

holds for all \(t \ge 0\).

Proof

If \(x \in \Theta \), then by the Doob h-transform, we know that

$$\begin{aligned} {\mathbb {Q}}( Y_t \in \Theta \mid Y_0 = x)&= \frac{{\mathbb { P}}(X_s \in \Theta \;\forall s \in [0,t], \;\; \tau _B < \tau _A \mid X_0 = x)}{{\mathbb { P}}( \tau _B < \tau _A \mid X_0 = x)} \\&\le \frac{{\mathbb { P}}(X_s \in \Theta \;\forall s \in [0,t]\mid X_0 = x) \wedge {\mathbb { P}}( \tau _B < \tau _A \mid X_0 = x)}{{\mathbb { P}}( \tau _B < \tau _A \mid X_0 = x)} \\&= \frac{{\mathbb { P}}( \tau _{AB} > t \mid X_0 = x) \wedge q(x)}{q(x)}, \end{aligned}$$

where \(\tau _{AB}\) is the first hitting time of \(X\) to \(\bar{A} \cup \bar{B}\). Let \(\alpha > 1\) be as in assumption (1.4). Since \(\bar{A} \cup \bar{B}\) has non-empty interior and since \(\sigma \sigma ^T\) is uniformly positive definite, assumption (1.4) implies that for each \(R > 0\) there is \(C_R\) such that

$$\begin{aligned} \sup _{|x| \le R} {\mathbb {E}}[\tau _{AB}^\alpha \;|\; X_0 = x] < C_R. \end{aligned}$$

From this and Chebyshev’s inequality, it follows that

$$\begin{aligned} \sup _{|x| \le R} {\mathbb { P}}( \tau _{AB} > t \mid X_0 = x)&\le t^{-\alpha } \sup _{|x| \le R} {\mathbb {E}}[\tau _{AB}^\alpha \;|\; X_0 = x] \le C_R\,t^{-\alpha } \end{aligned}$$
(2.34)

holds for all \(t > 0\). So, for any \(\epsilon > 0\),

$$\begin{aligned} {\mathbb {Q}}( Y_t \in \Theta \mid Y_0 = x) \le \frac{C_R \,t^{-\alpha } \wedge \epsilon }{\epsilon } \end{aligned}$$
(2.35)

holds for all \(t > 0\) and \(x \in \{ x \in \Theta \mid \; |x| \le R,\, q(x) \ge \epsilon \}\).
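The moment-to-tail step in (2.34) is just a Markov-type inequality applied to \(\tau _{AB}^\alpha \). As a hedged sanity check, the following Python snippet verifies the inequality for an assumed toy law \(\tau \sim \mathrm{Exp}(1)\) in place of \(\tau _{AB}\), where the tail is known in closed form:

```python
import math

# Assumed toy law: tau ~ Exp(1), so P(tau > t) = exp(-t) and
# E[tau^alpha] = Gamma(alpha + 1).  The bound (2.34) reads
# P(tau > t) <= t^(-alpha) * E[tau^alpha].
alpha = 2.0
moment = math.gamma(alpha + 1.0)        # = 2 for alpha = 2

for t in [0.5, 1.0, 2.0, 5.0, 10.0]:
    tail = math.exp(-t)                 # exact tail of the toy law
    bound = moment * t ** (-alpha)
    assert tail <= bound                # the bound holds at every t
    print(f"t={t:5.1f}  tail={tail:.3e}  bound={bound:.3e}")
```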

The bound (2.35) does not include points near \(\partial A\), where \(q(x) < \epsilon \). Fix \(\epsilon \in (0,1)\) and define the set \(S = \{ x \in \Theta \mid q(x) < \epsilon \} \cup \bar{A}\). If \(\epsilon \) is small enough, this set is bounded and we may assume \(|x | < R\) for all \(x \in S\). Suppose \(Y_0 = x\) with \(x \in S \cap \bar{\Theta }\). Let \(q_t = q(Y_t)\), which satisfies

$$\begin{aligned} q_t = q_0 + \int _0^t \frac{|g(Y_s)|^2}{q_s} \,\mathrm{d}s + \int _0^t g(Y_s) \,\mathrm{d}\widehat{W}_s \end{aligned}$$

where \(g(y) = \sqrt{2}(\nabla q(y))^{\mathrm {T}} \sigma (y)\). By (1.9) we know that if \(\epsilon > 0\) is small enough, there is a constant \(C_\epsilon > 0\) such that \(|g(y)|^2 \ge C_\epsilon \) for all \(y \in \bar{S} \cap \bar{\Theta }\). Therefore, if \(Y_t \in \bar{S} \cap \bar{\Theta }\) for all \(t \in [0,T]\), we must have \(q_t \le \epsilon \) for all \(t \in [0,T]\) and

$$\begin{aligned} q_t \ge \int _0^t \frac{C_\epsilon }{q_s} \,\mathrm{d}s + \int _0^t g(Y_s) \,\mathrm{d}\widehat{W}_s \ge t\epsilon ^{-1} C_{\epsilon } + \int _0^t g(Y_s) \,\mathrm{d}\widehat{W}_s \end{aligned}$$

for all \(t \in [0,T]\). This happens only if the martingale \(M_t = \int _0^t g(Y_s) \,\mathrm{d}\widehat{W}_s\) satisfies

$$\begin{aligned} M_t \le \epsilon - t \epsilon ^{-1} C_\epsilon , \quad t \in [0,T]. \end{aligned}$$

To control the probability of this event, for any \(\gamma > 0\), \(\beta > 0\), \(T > 0\), Chebyshev’s inequality implies

$$\begin{aligned} {\mathbb {Q}}(M_T \le - \gamma T )&\le e^{-\beta \gamma T} \widehat{{\mathbb {E}}}[e^{-\beta M_T}] \le e^{-\beta \gamma T} \widehat{{\mathbb {E}}}\left[ \exp \left( \frac{\beta ^2}{2} \int _0^T |g|^2\, ds\right) \right] \\&\le e^{- \beta \gamma T + \frac{\beta ^2}{2} ||g ||_\infty ^2 T}. \end{aligned}$$

By choosing \(\beta = \gamma /||g ||_\infty ^2\) we have \({\mathbb {Q}}(M_T \le - \gamma T ) \le e^{- \gamma ^2 C_1 T}\). Hence there is a constant \(C_2 > 0\) such that

$$\begin{aligned} {\mathbb {Q}}\left( Y_t \in \bar{S} \cap \bar{\Theta }, \;\;\; \forall \;t \in [0,T] \mid Y_0 = x \right) \le e^{- \epsilon ^2 C_2 T} \end{aligned}$$
(2.36)

holds for all \(T > 1\) and \(x \in \bar{S} \cap \bar{\Theta }\).
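For completeness, the constant \(C_1\) comes from minimizing the exponent over \(\beta \):

$$\begin{aligned} - \beta \gamma T + \frac{\beta ^2}{2} ||g ||_\infty ^2 T = - \frac{\gamma ^2 T}{2 ||g ||_\infty ^2} \quad \text {at} \quad \beta = \frac{\gamma }{||g ||_\infty ^2}, \end{aligned}$$

so the stated bound holds with \(C_1 = (2 ||g ||_\infty ^2)^{-1}\).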

Now we combine (2.35) and (2.36). Let \(\tau _S = \inf \{ t > 0 \mid Y_t \in \partial S \}\). By (2.36), the bound \({\mathbb {Q}} \left( \tau _S > t/2\mid Y_0 = x \right) \le e^{- C_3 t}\) holds for all \(x \in \bar{S} \cap \bar{\Theta }\). Therefore, since \(\tau _S\) is a stopping time, we conclude that

$$\begin{aligned} {\mathbb {Q}} \left( Y_t \in \Theta \mid Y_0 = x \right)&\le {\mathbb {Q}} \left( Y_t \in \Theta , \tau _S < t/2 \mid Y_0 = x\right) + e^{- C_3 t} \\&\le \sup _{y \in \partial S} {\mathbb {Q}} \left( Y_{t/2} \in \Theta \mid Y_0 = y\right) + e^{- C_3 t} \\&\le \frac{C t^{-\alpha } \wedge \epsilon }{\epsilon } + e^{-C_3 t} \end{aligned}$$

for all \(x \in \bar{S} \cap \bar{\Theta }\). Since the last expression is an integrable function of \(t\), this completes the proof.\(\square \)

Proof of Theorem 1.2

Since \(\tau _{A,n}^+\) is a stopping time, the strong Markov property implies that it suffices to prove the result for \(n = 0\). Fix \(\epsilon > 0\) and let \(S \supset \bar{A}\) be the open set

$$\begin{aligned} S = \{ x \in \Theta \mid q(x) < \epsilon \} \cup \bar{A}. \end{aligned}$$

For \(\epsilon > 0\) small, this is a bounded set that separates \(A\) and \(B\). The boundary \(\partial S\) is an isosurface for \(q\): \(q(x) = \epsilon \) for \(x \in \partial S\). As \(\epsilon \rightarrow 0\), \(S\) shrinks to \(A\), and the Hausdorff distance \(d_{\mathcal {H}}(\partial S, \partial A)\) is \({\mathcal {O}}(\epsilon )\) (because of (1.9)).

Recalling that \(\tau _{A,0}^+ = \inf \{ t \ge 0 \mid X_t \in \bar{A} \}\), we define

$$\begin{aligned} r_{S,0} = \inf \{ t > \tau _{A,0}^+ \mid X_t \in \partial S\}, \end{aligned}$$

which is a stopping time with respect to \({\mathcal {F}}_t\). Then for \(k \ge 0\), we define inductively the stopping times (see Fig. 2)

$$\begin{aligned} r_{A,k}&= \inf \left\{ t > r_{S,k} \mid X_t \in \bar{A} \right\} , \\ r_{B,k}&= \inf \left\{ t > r_{S,k} \mid X_t \in \bar{B} \right\} , \\ r_{S,k+1}&= \inf \left\{ t > r_{A,k} \mid X_t \in \partial S\right\} . \end{aligned}$$

Observe that \(r_{S,k} < r_{A,k} < r_{S,k+1}\), although it is possible that \(r_{B,k} = r_{B,k+1}\). Let \(r_{AB,k} = r_{A,k} \wedge r_{B,k}\), which is finite with probability one. We also define the random time

$$\begin{aligned} \tau _{S,j} = \inf \left\{ t > \tau _{A,j}^- \mid X_t \in \partial S \right\} . \end{aligned}$$
Fig. 2
Left panel: the set \(S\) and random times \(\tau _{S, j}\). Right panel: zoom-in of the boxed region, together with the stopping times \(r_{S, k}\) and \(r_{A, k}\)

Although \(\tau _{S,j}\) is not a stopping time with respect to \({\mathcal {F}}_t\), the relation

$$\begin{aligned} \left\{ r_{S,k} \mid k \ge 0,\;\; r_{B,k} < r_{A,k}\right\} = \{ \tau _{S,j}\}_{j=0}^\infty \end{aligned}$$
(2.37)

holds \({\mathbb { P}}\)-almost surely.

Now, let

$$\begin{aligned} Y^0_t = X_{(t + \tau _{A,0}^-) \wedge \tau _{B,0}^+}, \quad \quad t \ge 0, \end{aligned}$$

and let \(h_0 = \tau _{S,0} - \tau _{A,0}^-\). Since \(F\) is bounded and continuous, and since \(h_0 \rightarrow 0\) (\({\mathbb { P}}\) almost surely) as \(\epsilon \rightarrow 0\), we have

$$\begin{aligned} {\mathbb {E}}[F(X_{\cdot \,+ \tau _{A,0}^-})]= {\mathbb {E}}[F(Y^0_{\cdot })] = \lim _{\epsilon \rightarrow 0} {\mathbb {E}}[F(Y^0_{\cdot \, + h_0})]. \end{aligned}$$
(2.38)

We will show that

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} {\mathbb {E}}[F(Y^0_{\cdot \, + h_0})] = {\mathbb {E}}[ g(X_{\tau _{A,0}^-})] \end{aligned}$$

where \(g(x) = \widehat{{\mathbb {E}}}[F(Y_\cdot )\mid Y_0 = x]\).

Let \(M\) be the unique (random) integer such that

$$\begin{aligned} \tau _{S,0} = r_{S,M}. \end{aligned}$$

Equivalently, \(M = \min \{ k \ge 0 \mid r_{B,k} < r_{A,k} \}\). Since \(r_{B,k} > r_{A,k}\) for all \(k < M\), we have

$$\begin{aligned} F(Y^0_{\cdot \, + h_0}) = \sum _{k=0}^M F(X_{\cdot \, + r_{S,k}}) {\mathbb {I}}_{r_{B,k} < r_{A,k}} = \sum _{k=0}^\infty F(X_{\cdot \, + r_{S,k}}) {\mathbb {I}}_{r_{B,k} < r_{A,k}} {\mathbb {I}}_{k \le M}. \end{aligned}$$
(2.39)

Observe that the event \(\{k \le M\}\) coincides with the event that \(r_{B,j} > r_{A,j}\) for all \(j < k\), so the event \(\{k \le M\}\) is measurable with respect to \({\mathcal {F}}_{r_{S,k}}\). Therefore, we have

$$\begin{aligned} {\mathbb {E}}[F(Y^0_{\cdot \, + h_0})]&= \sum _{k=0}^\infty {\mathbb {E}}\left[ F(X_{\cdot \, + r_{S,k}}) {\mathbb {I}}_{r_{B,k} < r_{A,k}} {\mathbb {I}}_{k \le M} \right] \\&= \sum _{k=0}^\infty {\mathbb {E}}\left[ {\mathbb {E}}[F(X_{\cdot \, + r_{S,k}}) {\mathbb {I}}_{r_{B,k} < r_{A,k}} {\mathbb {I}}_{k \le M} \mid {\mathcal {F}}_{r_{S,k}}] \right] \\&= \sum _{k=0}^\infty {\mathbb {E}}\left[ {\mathbb {I}}_{k \le M} \; {\mathbb {E}}[F(X_{\cdot \, + r_{S,k}}) {\mathbb {I}}_{r_{B,k} < r_{A,k}} \mid {\mathcal {F}}_{r_{S,k}}]\right] \\&= \sum _{k=0}^\infty {\mathbb {E}}\left[ {\mathbb {I}}_{k \le M} \; f(X_{r_{S,k}})\right] , \end{aligned}$$

where

$$\begin{aligned} f(x) = {\mathbb {E}}[F(X_{\cdot }) {\mathbb {I}}_{\tau _{B} < \tau _{A}} \mid X_0 = x] = q(x) \widehat{{\mathbb {E}}}[F(Y_{\cdot })\mid Y_0 = x]. \end{aligned}$$

The last equality follows from the Doob \(h\)-transform (since \(x \in \partial S \subset \Theta \) here). Since \(q(x) = \epsilon \) for all \(x \in \partial S\), this means

$$\begin{aligned} {\mathbb {E}}[F(Y^0_{\cdot \, + h_0})] = \epsilon \, {\mathbb {E}}\left[ \sum _{k=0}^M \; g(X_{r_{S,k}}) \; \right] \end{aligned}$$
(2.40)

where \(g(x) = \widehat{{\mathbb {E}}}[F(Y_{\cdot } )\mid Y_0 = x]\). Note that the random integer \(M\) depends on \(\epsilon \).

Let \(A_j\) denote the event \(\{j < M\}\), which occurs if and only if \(r_{A,k} < r_{B,k}\) for all \(k \in \{0,1,\dots ,j\}\). Since \(q(x) = \epsilon \) for all \(x \in \partial S\), the event \(A_j\) is independent of the location \(X_{r_{S,j}} \in \partial S\). Moreover, \({\mathbb { P}}(A_j) = (1 - \epsilon )^{j+1}\), since

$$\begin{aligned} \begin{aligned} {\mathbb { P}}(A_j)&= {\mathbb {E}}\left[ \prod _{k=0}^j {\mathbb {I}}_{r_{A,k} < r_{B,k}} \right] \\&= {\mathbb {E}}\left[ \; \prod _{k=0}^{j-1} {\mathbb {I}}_{r_{A,k} < r_{B,k}} \; {\mathbb {E}}[ {\mathbb {I}}_{r_{A,j} < r_{B,j}} \mid {\mathcal {F}}_{r_{S,j}}]\; \right] = (1 - \epsilon ) {\mathbb { P}}(A_{j-1}). \end{aligned} \end{aligned}$$

Similarly, \({\mathbb { P}}( M = j) = \epsilon (1 - \epsilon )^j\). Now we evaluate (2.40):

$$\begin{aligned} {\mathbb {E}}[F(Y^0_{\cdot \, + h_0})]&= \epsilon \,{\mathbb {E}}[g(X_{r_{S,0}})] + \epsilon \,{\mathbb {E}}\left[ \sum _{k=1}^{M} \; g(X_{r_{S,k}}) \right] \\&= \epsilon \,{\mathbb {E}}[g(X_{r_{S,0}})] + \epsilon \, {\mathbb {E}}\left[ \sum _{j=0}^\infty {\mathbb {I}}_{A_j} \; g(X_{r_{S,j+1}}) \right] \\&= \epsilon \,{\mathbb {E}}[g(X_{r_{S,0}})] + \epsilon \sum _{j=0}^\infty {\mathbb {E}}\left[ {\mathbb {I}}_{A_j} \; g(X_{r_{S,j+1}}) \right] \\&= \epsilon \,{\mathbb {E}}[g(X_{r_{S,0}})] + \epsilon \sum _{j=0}^\infty {\mathbb { P}}(A_j)\,{\mathbb {E}}\left[ \; g(X_{r_{S,j+1}}) \right] \\&= \epsilon \,{\mathbb {E}}[g(X_{r_{S,0}})] + \epsilon \sum _{j=0}^\infty (1 - \epsilon )^{j+1} {\mathbb {E}}\left[ \; g(X_{r_{S,j+1}}) \right] \\&= \sum _{j=0}^\infty \epsilon (1 - \epsilon )^{j} \, {\mathbb {E}}\left[ \; g(X_{r_{S,j}}) \right] \\&= \sum _{j=0}^\infty {\mathbb { P}}( M = j ) {\mathbb {E}}\left[ g(X_{r_{S,j}})\right] = {\mathbb {E}}\left[ g(X_{\tau _{S,0}})\right] . \end{aligned}$$
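The resummation in the last display is elementary but easy to mistrust. The following Python snippet checks it numerically, with an arbitrary bounded placeholder sequence \(G_j\) standing in for \({\mathbb {E}}[g(X_{r_{S,j}})]\) (the values are illustrative assumptions, not computed from the process):

```python
# Placeholder bounded sequence standing in for E[ g(X_{r_{S,j}}) ]; the
# particular values are an illustrative assumption, not from the process.
eps = 0.25
G = [0.3 + 0.5 * (0.9 ** j) for j in range(200)]

# left-hand side: eps*G[0] + eps * sum_j (1-eps)^(j+1) G[j+1]
lhs = eps * G[0] + eps * sum((1 - eps) ** (j + 1) * G[j + 1]
                             for j in range(len(G) - 1))
# right-hand side: sum_j eps*(1-eps)^j G[j] = sum_j P(M = j) G[j]
rhs = sum(eps * (1 - eps) ** j * G[j] for j in range(len(G)))
mass = sum(eps * (1 - eps) ** j for j in range(len(G)))  # sum of P(M = j)

print(lhs, rhs, mass)   # lhs == rhs up to rounding, mass ~ 1
```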

Now let \(\epsilon \rightarrow 0\). Since \(g(x)\) is bounded and is continuous up to \(\partial A\) by Proposition 2.6, we have (by the dominated convergence theorem)

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} {\mathbb {E}} \left[ g(X_{\tau _{S,0}})\right] = {\mathbb {E}} \left[ \lim _{\epsilon \rightarrow 0} g(X_{\tau _{S,0}})\right] = {\mathbb {E}} \left[ g(X_{\tau _{A,0}^-})\right] . \end{aligned}$$
(2.41)

Together with (2.38), this completes the proof. \(\square \)

3 Reactive exit and entrance distributions

Proof of Lemma 1.3

The equality (1.22) is equivalent to

$$\begin{aligned} \int _{\partial \Theta } \rho (x) \widehat{n}(x)\cdot a(x) \nabla q(x) \,\mathrm{d}\sigma _{\Theta }(x) = 0. \end{aligned}$$

Using (1.19), it is then equivalent to

$$\begin{aligned} \langle \rho , Lq \rangle = \langle L^{*} \rho , q \rangle = 0, \end{aligned}$$

which holds because \(L^{*} \rho = 0\) by (1.3).\(\square \)
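The pairing \(\langle \rho , Lq \rangle = \langle L^{*} \rho , q \rangle = 0\) can be sanity-checked numerically in an assumed one-dimensional toy model: for the Ornstein–Uhlenbeck generator \(Lu = u'' - x u'\) (not the general operator of the paper), the invariant density is Gaussian and \(\int \rho \, Lq \,\mathrm{d}x\) vanishes for any smooth bounded \(q\):

```python
import math

# Assumed 1-d toy model: OU generator L u = u'' - x u', invariant density
# rho(x) = exp(-x^2/2)/sqrt(2 pi), and test function q(x) = sin(x).
def rho(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Lq(x):
    # L q = q'' - x q' with q = sin:  q''(x) = -sin(x), q'(x) = cos(x)
    return -math.sin(x) - x * math.cos(x)

# trapezoid quadrature of <rho, L q> on [-10, 10] (Gaussian tails negligible)
n, a, b = 20_000, -10.0, 10.0
h = (b - a) / n
total = 0.5 * (rho(a) * Lq(a) + rho(b) * Lq(b))
for k in range(1, n):
    x = a + k * h
    total += rho(x) * Lq(x)
total *= h
print(total)    # ~ 0, consistent with <rho, L q> = <L* rho, q> = 0
```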

Before proving Proposition 1.5, we will need to establish some properties of the entrance and exit distributions and of the harmonic measure associated with the generator \(L\). These results will also be used later in the paper. First, using integration by parts, we have

Lemma 3.1

Let \(D \subset {\mathbb { R}}^d\) be open with smooth boundary, and let \(\phi , \psi \in C^2(D) \cap C^1(\bar{D})\) be bounded. Then

$$\begin{aligned}&\int _{D} \rho (x) \bigl (\phi (x) L \psi (x) - \psi (x) \widetilde{L}\phi (x) \bigr ) \,\mathrm{d}x = \int _{\partial D} \rho (x) b\cdot \widehat{n}(x) \phi (x) \psi (x) \,\mathrm{d}\sigma _{D}(x) \nonumber \\&\quad + \int _{\partial D} \rho (x) \phi (x) \widehat{n}(x) \cdot a \nabla \psi (x) - \psi (x) \widehat{n}(x) \cdot {{\mathrm{div}}}(a(x) \rho (x) \phi (x)) \,\mathrm{d}\sigma _{D}(x),\qquad \qquad \end{aligned}$$
(3.1)

where \(\widehat{n}(x)\) is the exterior normal vector at \(x \in \partial D\).

Let us recall some tools from potential theory (see for example the books [28, 32] and also [7, 8] where potential theory was applied to analyze diffusion processes with metastability). The harmonic measure \(H_D(x, \mathrm{d}y)\) is given by the Poisson kernel corresponding to the boundary value problem

$$\begin{aligned} {\left\{ \begin{array}{ll} L u(x) = 0, &{} x \in D, \\ u(x) = f(x), &{} x \in \partial D. \end{array}\right. } \end{aligned}$$
(3.2)

Therefore, for \(f \in C(\partial D)\),

$$\begin{aligned} u(x) = \int _{\partial D} H_D(x, \mathrm{d}y) f(y), \end{aligned}$$
(3.3)

is the unique solution to (3.2). Similarly, the harmonic measure \(\widetilde{H}_D(x, \mathrm{d}y)\) corresponds to the generator \(\widetilde{L}\) (recall (1.28)). For the boundary value problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \widetilde{L} \widetilde{u} (x) = 0, &{} x \in D, \\ \widetilde{u}(x) = f(x), &{} x \in \partial D, \end{array}\right. } \end{aligned}$$
(3.4)

the solution is given by

$$\begin{aligned} \widetilde{u}(x) = \int _{\partial D} \widetilde{H}_D(x, \mathrm{d}y) f(y). \end{aligned}$$
(3.5)

The harmonic measures have a probabilistic interpretation: \(H_D(x, \mathrm{d}y)\) (resp. \(\widetilde{H}_D(x, \mathrm{d}y)\)) gives the probability that the process associated with the generator \(L\) (resp. \(\widetilde{L}\)) first strikes the boundary \(\partial D\) at \(\mathrm{d}y\) after starting at \(x\). In particular,

$$\begin{aligned} q(x) = H_{\Theta }(x,\partial B) \quad \text {and} \quad \widetilde{q}(x) = \widetilde{H}_{\Theta }(x,\partial A). \end{aligned}$$

We also define the harmonic measures for the conditioned processes as

$$\begin{aligned} H_{\Theta }^q(x, \mathrm{d}y) = \frac{q(y)}{q(x)} H_{\Theta }(x, \mathrm{d}y). \end{aligned}$$
(3.6)

For \(x \in \Theta \) this is a measure on \(\partial B\). For \(x \in \partial A\) where \(q(x) = 0\), we may define \(H_{\Theta }^q(x,\mathrm{d}y)\) through a limit:

$$\begin{aligned} H_{\Theta }^q(x,\mathrm{d}y) = \lim _{\begin{array}{c} x' \in \Theta \\ x' \rightarrow x \end{array}} \frac{q(y)}{q(x')} H_{\Theta }(x', \mathrm{d}y) = \frac{ \widehat{n}(x) \cdot a(x) \nabla _x H_{\Theta }(x,\mathrm{d}y)}{\widehat{n}(x) \cdot a(x) \nabla _x q(x)} , \quad x \in \partial A. \end{aligned}$$
(3.7)

Recall that \(q(y) = 1\) for \(y \in \partial B\).
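As a concrete (assumed, one-dimensional) instance of the identification of \(q\) with harmonic measure: for Brownian motion on \(\Theta = (0,1)\), with \(A\) to the left of \(0\) and \(B\) to the right of \(1\), the committor solves \(q'' = 0\), \(q(0) = 0\), \(q(1) = 1\), so \(q(x) = x\). A finite-difference sketch in Python:

```python
import numpy as np

# Discretize L q = q'' = 0 on (0,1) with q = 0 on the A-side boundary and
# q = 1 on the B-side boundary; q is then the harmonic measure of partial B.
n = 101
Lmat = np.zeros((n, n))
rhs = np.zeros(n)
Lmat[0, 0] = 1.0;  rhs[0] = 0.0          # Dirichlet condition q = 0 at x = 0
Lmat[-1, -1] = 1.0; rhs[-1] = 1.0        # Dirichlet condition q = 1 at x = 1
for i in range(1, n - 1):
    Lmat[i, i - 1], Lmat[i, i], Lmat[i, i + 1] = 1.0, -2.0, 1.0  # centered q''
q = np.linalg.solve(Lmat, rhs)
print(q[50])   # Brownian committor on (0,1) is q(x) = x, so q(0.5) ~ 0.5
```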

Recall the reactive exit and entrance measures \(\eta _A^-\), \(\eta _A^+\), \(\eta _B^-\) and \(\eta _B^+\). They are connected by harmonic measures as follows:

Proposition 3.2

$$\begin{aligned}&\int _{\partial A} \eta _A^-(\mathrm{d}x) H_{\Theta }^q(x, \mathrm{d}y) = \eta _B^+(\mathrm{d}y). \end{aligned}$$
(3.8)
$$\begin{aligned}&\int _{\partial A} \eta _A^+(\mathrm{d}x) H_{\bar{B}^{c}}(x,\mathrm{d}y) = \eta _B^+(\mathrm{d}y). \end{aligned}$$
(3.9)
$$\begin{aligned}&\int _{\partial B} \eta _B^+(\mathrm{d}x) H_{\bar{A}^{c}}(x,\mathrm{d}y) = \eta _A^+(\mathrm{d}y). \end{aligned}$$
(3.10)

Proof

We prove (3.8) first. If \(f \in C(\partial B)\), let \(u_f(x)\) solve \(Lu = 0\) in \(\Theta \) with

$$\begin{aligned} u = {\left\{ \begin{array}{ll} f(x), &{} x \in \partial B, \\ 0, &{} x \in \partial A. \end{array}\right. } \end{aligned}$$
(3.11)

Hence \(u(x) \widetilde{q}(x) = 0\) on \(\partial \Theta \). By applying (3.1) with \(\phi (x) = \widetilde{q}(x)\) and \(\psi (x) = u_f(x)\), we obtain

$$\begin{aligned} \int _{\partial A} \rho (x) \widehat{n}(x) \cdot a(x) \nabla u_f(x) \,\mathrm{d}\sigma _{A}(x)&= \int _{\partial B} f(x) \widehat{n}(x) \cdot {{\mathrm{div}}}(a(x) \rho (x) \widetilde{q}(x)) \,\mathrm{d}\sigma _{B}(x) \nonumber \\&= \int _{\partial B} f(x) \rho (x) \widehat{n}(x) \cdot a(x) \nabla \widetilde{q}(x) \,\mathrm{d}\sigma _{B}(x) \nonumber \\&= - \int _{\partial B} f(x) \eta _B^+(dx). \end{aligned}$$
(3.12)

From (3.7) and (1.23), we see that

$$\begin{aligned} \int _{\partial A} \eta _A^-(\mathrm{d}x) H_{\Theta }^q(x,\mathrm{d}y) = - \int _{\partial A} \rho (x)\widehat{n}(x) \cdot a(x) \nabla _x H_{\Theta }(x,\mathrm{d}y) \,\mathrm{d}\sigma _{A}(x). \end{aligned}$$

Hence for any \(f \in C(\partial B)\), we have

$$\begin{aligned}&\int _{\partial B} \left( \,\,\int _{\partial A} \eta _A^-(\mathrm{d}x) H^q_{\Theta }(x,\mathrm{d}y)\right) f(y) \\&\quad = - \int _{\partial B} \int _{\partial A} \rho (x)\widehat{n}(x) \cdot a(x) \nabla _x \left( f(y) H_{\Theta }(x,\mathrm{d}y)\right) \,\mathrm{d}\sigma _{A}(x) \\&\quad = - \int _{\partial A} \rho (x)\widehat{n}(x) \cdot a(x) \nabla _x \left( \,\,\int _{\partial B} H_{\Theta }(x,\mathrm{d}y)f(y) \right) \mathrm{d}\sigma _{A}(x) \\&\quad = - \int _{\partial A} \rho (x)\widehat{n}(x) \cdot a(x) \nabla _x u_f(x) \,\mathrm{d}\sigma _{A}(x). \end{aligned}$$

Combining this with (3.12), we conclude that

$$\begin{aligned} \int _{\partial B} \left( \,\, \int _{\partial A} \eta _A^-(\mathrm{d}x) H^q_{\Theta }(x,\mathrm{d}y)\right) f(y) = \int _{\partial B} f(x) \eta _B^+(\mathrm{d}x), \quad \forall \; f \in C(\partial B), \end{aligned}$$

which proves (3.8).

To prove (3.9), let \(\psi \) solve \(L \psi = 0\) for \(x \in \bar{B}^{c}\) with \(\psi = f\) on \(\partial B\). Then by (3.1) with \(\phi = 1 - \widetilde{q}\), we have

$$\begin{aligned} \int _{\partial A} \eta _A^+(\mathrm{d}x) \psi (x)&= \int _{\partial A} \rho (x) \widehat{n}(x) \cdot a(x) \nabla \widetilde{q}(x) \psi (x) \,\mathrm{d}\sigma _A(x) \\&= - \int _{\partial A} \rho (x) \widehat{n}(x) \cdot a(x) \nabla (1 - \widetilde{q}(x)) \psi (x) \,\mathrm{d}\sigma _A(x) \\&= - \int _{\partial A} \psi (x) \widehat{n}(x) \cdot {{\mathrm{div}}}( a \rho (1 - \widetilde{q})) \,\mathrm{d}\sigma _A(x) \qquad \\&\bigl (\text {since } 1 - \widetilde{q} = 0 \text { on } \partial A \bigr )\\&= \int _{\partial B} f \widehat{n} \cdot {{\mathrm{div}}}( a \rho (1 - \widetilde{q})) \,\mathrm{d}\sigma _B(x) - \int _{\partial B} f \rho b \cdot \widehat{n} \,\mathrm{d}\sigma _B(x) \\&- \int _{\partial B} \rho \widehat{n} \cdot a \nabla \psi \,\mathrm{d}\sigma _B(x). \end{aligned}$$

Applying (3.1) with the function \(\phi \equiv 1\), we also find that

$$\begin{aligned} 0 = - \int _{\partial B} f \widehat{n} \cdot {{\mathrm{div}}}( a \rho ) \,\mathrm{d}\sigma _B(x) + \int _{\partial B} f \rho b \cdot \widehat{n} \,\mathrm{d}\sigma _B(x) + \int _{\partial B} \rho \widehat{n} \cdot a \nabla \psi \,\mathrm{d}\sigma _B(x). \end{aligned}$$

Therefore, since \(1 - \widetilde{q} = 1\) on \(\partial B\), we conclude that

$$\begin{aligned} \begin{aligned} \int _{\partial A} \eta _A^+(\mathrm{d}x) \psi (x)&= \int _{\partial B} f \widehat{n} \cdot {{\mathrm{div}}}( a \rho (1 - \widetilde{q})) \,\mathrm{d}\sigma _B(x) - \int _{\partial B} f \widehat{n}(x) \cdot {{\mathrm{div}}}( a \rho ) \,\mathrm{d}\sigma _B(x) \\&= \int _{\partial B} f \rho \widehat{n}\cdot a\nabla (1 - \widetilde{q})\,\mathrm{d}\sigma _B(x) \\&= - \int _{\partial B} f \rho \widehat{n}\cdot a\nabla \widetilde{q}\,\mathrm{d}\sigma _B(x) = \int _{\partial B} f \eta _B^+(\mathrm{d}x). \end{aligned} \end{aligned}$$

We arrive at (3.9) noting that

$$\begin{aligned} \psi (x) = \int _{\partial B} H_{\bar{B}^{c}}(x, \mathrm{d}y) f(y). \end{aligned}$$

We omit the proof of (3.10), which is analogous to that of (3.9) with the roles of \(A\) and \(B\) interchanged. \(\square \)

By combining (3.9) and (3.10) we immediately obtain the following:

Corollary 3.3

Let \(P_B(x,\mathrm{d}y)\) be the probability transition kernel

$$\begin{aligned} P_B(x,\mathrm{d}y) = \int _{\partial A} H_{\bar{A}^{c}}(x,\mathrm{d}z)H_{\bar{B}^{c}}(z,\mathrm{d}y), \quad x,y \in \partial B \end{aligned}$$

on \(\partial B\), and let \(P_A(x,\mathrm{d}y)\) be the probability transition kernel

$$\begin{aligned} P_A(x,\mathrm{d}y) = \int _{\partial B} H_{\bar{B}^{c}}(x,\mathrm{d}z) H_{\bar{A}^{c}}(z,\mathrm{d}y), \quad x,y \in \partial A \end{aligned}$$

on \(\partial A\). Then

$$\begin{aligned} \int _{x \in \partial B} \eta _B^+(\mathrm{d}x) P_B(x,\mathrm{d}y) = \eta _B^+(\mathrm{d}y). \end{aligned}$$

and

$$\begin{aligned} \int _{x \in \partial A} \eta _A^+(\mathrm{d}x) P_A(x,\mathrm{d}y) = \eta _A^+(\mathrm{d}y). \end{aligned}$$

That is, \(\eta _B^+\) and \(\eta _A^+\) are invariant under \(P_B\) and \(P_A\), respectively.

We are ready to return to the proof of Proposition 1.5.

Proof of Proposition 1.5

We first verify that \(\eta _B^+\) is a probability measure. Taking \(\psi = q\) and \(\phi = \widetilde{q}\) in (3.1), we obtain using the boundary conditions of \(q\) and \(\widetilde{q}\) on \(\partial A\) and \(\partial B\),

$$\begin{aligned} \begin{aligned} \eta _{A}^-(\partial A) = \frac{1}{\nu } \int _{\partial A} \rho \widehat{n} \cdot a \nabla q \,\mathrm{d}\sigma _A&= \frac{1}{\nu } \int _{\partial B} \widehat{n} \cdot {{\mathrm{div}}}(a\rho \widetilde{q}) \,\mathrm{d}\sigma _B \\&= \frac{1}{\nu } \int _{\partial B} \widehat{n} \cdot a \rho \nabla \widetilde{q} \,\mathrm{d}\sigma _B = \eta _{B}^+(\partial B). \end{aligned} \end{aligned}$$

This shows that \(\eta _B^+(\partial B) = 1\) and \(\nu \) is the correct normalization constant.

Let \(g\) be a positive continuous function on \(\partial B\). Define for \(x \not \in \bar{B}\),

$$\begin{aligned} u(x) = {\mathbb { E}}\left[ g(X_{\tau _B}) \mid X_0 = x \right] . \end{aligned}$$
(3.13)

Hence \(u\) satisfies the equation

$$\begin{aligned} {\left\{ \begin{array}{ll} L u(x) = 0, &{} x \in \bar{B}^c; \\ u(x) = g(x), &{} x \in \partial B. \end{array}\right. } \end{aligned}$$
(3.14)

Let \(H_{\bar{B}^c}(x, \mathrm{d}y)\) be the harmonic measure (the measure of the first hitting point on \(\bar{B}\) for the process starting at \(x\)). We have

$$\begin{aligned} u(x) = \int _{\partial B} H_{\bar{B}^c}(x, \mathrm{d}y) g(y). \end{aligned}$$
(3.15)

By the maximum principle, \(u > 0\) in \(\bar{B}^{c}\). By the Harnack inequality for non-divergence form elliptic operators [18, Corollary 9.25] and the compactness of \(\partial A\), we have

$$\begin{aligned} \sup _{x \in \partial A} u(x) \le C \inf _{x \in \partial A} u(x), \end{aligned}$$
(3.16)

where the constant \(C > 0\) depends only on the ellipticity constants of \(a(x)\) and on the maximum of \(|b|\) over some compact set \(A'\) satisfying \(A \subset A' \subset \bar{B}^c\). In particular, \(C\) is independent of \(g\). Therefore, we obtain for any \(x, x' \in \partial A\), \(y \in \partial B\)

$$\begin{aligned} 0 < C^{-1} \le \frac{H_{\bar{B}^c}(x, \mathrm{d}y)}{H_{\bar{B}^c}(x', \mathrm{d}y)} \le C < \infty . \end{aligned}$$
(3.17)

If we define

$$\begin{aligned} \nu _B(\mathrm{d}y) = \inf _{x \in \partial A} H_{\bar{B}^c}(x, \mathrm{d}y), \end{aligned}$$
(3.18)

then \(\nu _B\) is a positive measure on \(\partial B\), absolutely continuous with respect to \(\sigma _B(\mathrm{d}y)\), and

$$\begin{aligned} H_{\bar{B}^c}(x, \mathrm{d}y) \ge C^{-1} \nu _B(\mathrm{d}y) \end{aligned}$$
(3.19)

for any \(x \in \partial A\).

Consider the Markov chain given by \(\{X_{\tau _{B, k}^+}\}_{k=0}^\infty \) on \(\partial B\). Let \(P_B\) denote its transition kernel, given by

$$\begin{aligned} P_B(y, \mathrm{d}y') = \int _{\partial A} H_{\bar{A}^c}(y, \mathrm{d}x) H_{\bar{B}^c}(x, \mathrm{d}y'). \end{aligned}$$
(3.20)

By (3.19), \(P_B\) satisfies Doeblin’s minorization condition:

$$\begin{aligned} P_B(y, \mathrm{d}y') \ge C^{-1} \int _{\partial A} H_{\bar{A}^c}(y, \mathrm{d}x) \nu _B(\mathrm{d}y') = C^{-1} \nu _B(\mathrm{d}y'). \end{aligned}$$
(3.21)

Therefore, \(P_B\) has a unique invariant measure [3, Theorem 6.1]. By Corollary 3.3, this invariant measure is given by \(\eta _B^+\). Hence, as \(N \rightarrow \infty \), \(\int _{\partial B} f(x) \,\mathrm{d}\mu _{B, N}^+(x)\) converges exponentially fast to \(\int _{\partial B} f(x) \,\mathrm{d}\eta _B^+(x)\) (see e.g. [26, Theorem 17.1.7]). The rate of the convergence depends on the sets \(A\) and \(B\). \(\square \)
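A minimal illustration of why the Doeblin minorization (3.21) forces a unique invariant measure with geometric convergence: the Python sketch below uses an assumed 3-state toy kernel \(P\) (not the kernel \(P_B\) of the paper) satisfying \(P(x,\mathrm{d}y) \ge c\,\nu (\mathrm{d}y)\), and checks the resulting total-variation contraction at rate \(1-c\).

```python
import numpy as np

# Toy 3-state kernel (an assumed example, not the kernel P_B of the paper).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.5, 0.3],
              [0.3, 0.2, 0.5]])

# Doeblin minorization: every row of P dominates c * nu, where nu is the
# normalized vector of column minima and c is its total mass.
col_min = P.min(axis=0)
c = col_min.sum()                       # here c = 0.6

# invariant measure pi: the left eigenvector of P for eigenvalue 1
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi = pi / pi.sum()

mu = np.array([1.0, 0.0, 0.0])          # arbitrary initial distribution
prev_tv = None
for n in range(19):
    mu = mu @ P
    tv = 0.5 * np.abs(mu - pi).sum()    # total-variation distance to pi
    if prev_tv is not None and prev_tv > 1e-13:
        # Doeblin's condition gives contraction by the factor (1 - c)
        assert tv <= (1.0 - c) * prev_tv + 1e-12
    prev_tv = tv
print(c, prev_tv)                       # prev_tv tiny: geometric convergence
```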

Proof of Theorem 1.7

Consider the family of processes

$$\begin{aligned} X^{A,n}_t = X_{(t + \tau _{A,n}^+) \wedge \tau _{B,n}^+}. \end{aligned}$$

Observe that the \(n\)th reactive trajectory \(t \mapsto Y^n_t\) is a segment of the path \(t \mapsto X^{A,n}_t\); specifically, \(Y^n_t = X^{A,n}_{t + \tau _{A,n}^- - \tau _{A,n}^+}\) for all \(t \ge 0\). The random sequence of points

$$\begin{aligned} y_n = X^{A,n}_0 = X_{\tau _{A,n}^+} \in \partial A, \quad n = 0,1,2,\dots \end{aligned}$$

corresponds to a Markov chain on the state space \(\partial A\) with transition kernel

$$\begin{aligned} P_A(x,\mathrm{d}y) = {\mathbb { P}}(y_{n+1} \in \mathrm{d}y \mid y_n = x) = \int _{\partial B} H_{\bar{B}^{c}}(x,\mathrm{d}z) H_{\bar{A}^{c}}(z,\mathrm{d}y). \end{aligned}$$

As shown in the proof of Proposition 1.5 (reversing the role of \(B\) and \(A\)), this chain satisfies a Doeblin minorizing condition

$$\begin{aligned} P_A(x,\mathrm{d}y) \ge C^{-1} \nu _A(dy) = C^{-1} \inf _{x \in \partial B} H_{\bar{A}^c}(x,\mathrm{d}y) > 0, \end{aligned}$$
(3.22)

and the chain has a unique invariant probability distribution \(\eta _A^+\) supported on \(\partial A\):

$$\begin{aligned} \int _{\partial A} \eta _A^+(\mathrm{d}x) P_A(x,\mathrm{d}y) = \eta _A^+(\mathrm{d}y). \end{aligned}$$

The sequence of processes \(t \mapsto X^{A,n}_t\) corresponds to a homogeneous Markov chain on the metric space \({\mathcal {X}} = C([0,\infty ))\). The transition probability \(K\) for this chain may be expressed as follows. If \(X \in C([0,\infty ))\) is such that \(\tau _B^X = \inf \{ t \ge 0 \;|\; X_t \in \partial B\}\) is finite, then for any set \(E \in {\mathcal {B}}\),

$$\begin{aligned} K(X,E) = {\mathbb { P}}(X^{A,n+1} \in E \;|\; X^{A,n} = X) = \int _{\partial A} H_{\bar{A}^c}(X_{\tau _B^X},dy) {\mathcal {P}}_y(E), \end{aligned}$$
(3.23)

where \({\mathcal {P}}_x\) denotes the law on \(({\mathcal {X}},{\mathcal {B}})\) of the process \(t \mapsto Z_{t \wedge \tau _{B}}\) where

$$\begin{aligned} \,\mathrm{d}Z_t = b(Z_t) \,\mathrm{d}t + \sqrt{2}\, \sigma (Z_t)\,\mathrm{d}W_t, \quad Z_0 = x \end{aligned}$$

and \(\tau _{B}\) is the first hitting time of \(Z_t\) to \(\bar{B}\). If \(X \in C([0,\infty ))\) never hits the set \(\bar{B}\), then we define

$$\begin{aligned} K(X,E) = \int _{\partial A} \eta _A^+(dy) {\mathcal {P}}_y(E), \quad E \in {\mathcal {B}}. \end{aligned}$$
(3.24)

This chain on \({\mathcal {X}}\) has a unique invariant distribution

$$\begin{aligned} \bar{\mathcal {P}}(U) = \int _{\partial A} \eta _A^+(\mathrm{d}y) {\mathcal {P}}_y(U), \quad \forall \;U \in {\mathcal {B}}, \end{aligned}$$

supported on the set of paths which originate in \(\partial A\) and are constant after hitting \(\partial B\). The uniqueness of \(\bar{\mathcal {P}}\) follows from the uniqueness of \(\eta _A^+\) as an invariant distribution for the chain defined by transition kernel \(P_A\) on \(\partial A\). Since \(P_A(x,dy)\) satisfies the Doeblin condition (3.22), so does the chain on \({\mathcal {X}}\):

$$\begin{aligned} \inf _{X \in {\mathcal {X}}} K(X,E) \ge C^{-1} \int _{\partial A} \nu _A(\mathrm{d}y) {\mathcal {P}}_y(E). \end{aligned}$$

In particular, it is positive Harris recurrent and aperiodic, and by [26, Theorem 17.1.7], for any \(\Phi \in L^1({\mathcal {X}},{\mathcal {B}},\bar{\mathcal {P}})\) the limit

$$\begin{aligned} \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{k=1}^N \Phi (X^{A,k}) = {\mathbb {E}}[ \Phi ( Z_{\cdot \wedge \tau _{B}}) \mid Z_0 \sim \eta _A^+] \end{aligned}$$
(3.25)

holds \({\mathbb { P}}\)-almost surely.
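The ergodic average (3.25) can be sampled directly. The sketch below runs the excursion chain \(X^{A,n}\) for a one-dimensional double-well diffusion and averages the excursion duration; the potential \(V(x) = (x^2-1)^2/2\), the points \(\mp 0.8\) standing in for the relevant parts of \(\partial A\) and \(\partial B\), and all discretization parameters are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Monte Carlo sketch of the Cesaro average (3.25): run the excursion
# chain X^{A,n} and average a path functional (here the excursion
# duration).  Everything concrete below is an illustrative assumption.
rng = np.random.default_rng(0)

def drift(z):                          # b = -V' for V(z) = (z^2 - 1)^2 / 2
    return 2.0 * z - 2.0 * z**3

dt = 2e-3
sqdt = np.sqrt(2.0 * dt)               # sqrt(2 dt) scales the N(0,1) increments
a_bdry, b_bdry = -0.8, 0.8             # stand-ins for dA and dB

def run_until(z, stop):                # Euler-Maruyama until stop(z) holds
    t = 0.0
    while not stop(z):
        z = z + drift(z) * dt + sqdt * rng.standard_normal()
        t += dt
    return z, t

durations = []
z = a_bdry                             # start the chain on dA
for _ in range(30):                    # 30 excursions X^{A,n}
    z, tau = run_until(z, lambda y: y >= b_bdry)   # excursion: dA -> dB
    durations.append(tau)              # Phi(X) = duration of the excursion
    z, _t = run_until(z, lambda y: y <= a_bdry)    # relax back to dA

mean_duration = float(np.mean(durations))          # left-hand side of (3.25)
```

With more excursions, `mean_duration` converges to the right-hand side of (3.25); any integrable path functional can replace the duration.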

Using (3.25) we will establish the following relationship between \(\eta _A^-\) and \(\eta _A^+\):

Lemma 3.4

Let \(X_t\) satisfy the SDE (1.1) with initial distribution \(X_0 \sim \eta _A^+\) on \(\partial A\). Then for any Borel set \(U \subset \partial A\),

$$\begin{aligned} {\mathbb { P}}( X_{\tau _{A,0}^-} \in U \mid X_0 \sim \eta _A^+) = \eta _{A}^-(U) = - \frac{1}{\nu } \int _{U} \rho (x) \widehat{n}(x) \cdot a(x) \nabla q(x) \,\mathrm{d}\sigma _A(x). \end{aligned}$$

Proof of Lemma 3.4

Let \(f \in C({\mathbb { R}}^d)\) be bounded and non-negative. Let us recall the set \(S\) introduced in the proof of Theorem 1.2: given \(\epsilon >0\), we let \(S = \{x \in \Theta \mid q(x) < \epsilon \} \cup \bar{A}\). Then by applying (3.25) to the functional \(\Phi (X) = f(X_{\tau _{S,0}^-})\), we obtain

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{n=0}^{N-1} f(X_{\tau _{S,n}})&= \lim _{\epsilon \rightarrow 0} {\mathbb {E}}[ f(X_{\tau _{S,0}}) \mid X_0 \sim \eta _A^+ ] \\&= {\mathbb {E}}[ f(X_{\tau _{A,0}^-}) \mid X_0 \sim \eta _A^+ ]. \end{aligned}$$

We also have,

$$\begin{aligned} \begin{aligned} \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{n =0}^{N-1} f(X_{\tau _{S,n}})&= \left( \lim _{K \rightarrow \infty } \frac{K}{N_K}\right) \left( \lim _{K \rightarrow \infty } \frac{1}{K} \sum _{k = 0}^{K-1} f(X_{r_{S,k}}) {\mathbb {I}}_{r_{B,k} < r_{A,k}} \right) \\&= \int _{\partial S} f(x)\zeta _S(\mathrm{d}x), \end{aligned} \end{aligned}$$
(3.26)

holds \({\mathbb { P}}\)-almost surely, where \(N_K = |\{ k \in \{0,1,\dots , K-1\} \mid r_{B,k} < r_{A,k} \}|\). Here we have used \(\zeta _S\) to denote the unique invariant distribution (identified below) for the Markov chain defined by \(X_{r_{S,k}}\) on \(\partial S\). Therefore,

$$\begin{aligned} {\mathbb {E}}\,[ f(X_{\tau _{A,0}^-}) \mid X_0 \sim \eta _A^+ ] = \lim _{\epsilon \rightarrow 0} \int _{\partial S} f(x) \zeta _S(\mathrm{d}x). \end{aligned}$$

We claim that if \(f(x)\) is uniformly continuous in a neighborhood of \(\partial A\), then

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \int _{\partial S} \zeta _S(\mathrm{d}x) f(x) = \int _{\partial A} \eta _A^-(\mathrm{d}x) f(x). \end{aligned}$$
(3.27)

First, let us identify the invariant distribution \(\zeta _S\). Applying Corollary 3.3 (with \(B\) replaced by \(\bar{S}^{c}\)), we can identify \(\zeta _S\) as

$$\begin{aligned} \zeta _S(\mathrm{d}x)\; ( = \eta _S^+(\mathrm{d}x) ) = - \frac{\epsilon }{\nu } \rho (x) \widehat{n}(x) \cdot a(x) \nabla \widetilde{q}_S(x) \,\mathrm{d}\sigma _S(x), \end{aligned}$$

where \(\widehat{n}(x)\) is the exterior normal at \(x \in \partial S\), and \(\widetilde{q}_S\) satisfies \(\widetilde{L} \widetilde{q}_S = 0\) in \(S {\setminus } \bar{A}\) with

$$\begin{aligned} \widetilde{q}_S(x) = {\left\{ \begin{array}{ll} 1, &{} x \in \partial A \\ 0, &{} x \in \partial S. \end{array}\right. } \end{aligned}$$

Note that \(\nu \) is independent of \(\epsilon \). Let \(\delta > \epsilon \) be small, and suppose that \(f(x)\) is continuous on the closed set \(\{ x \in \bar{\Theta } \mid 0 \le q(x) \le \delta \}\). (This set contains both \(\partial A\) and \(\partial S\)). A computation similar to (3.12) (replacing \(B\) by \(S\)) shows that for any such function, we have

$$\begin{aligned} \int _{\partial S} \zeta _S(\mathrm{d}x) f(x) = - \frac{\epsilon }{\nu }\int _{\partial A} \rho (x) \widehat{n}(x) \cdot a(x) \nabla u_{f,S}(x) \,\mathrm{d}\sigma _A(x), \end{aligned}$$
(3.28)

where \(u_{f,S}\) satisfies \(L u_{f,S} = 0\) in \(S {\setminus } \bar{A}\), and

$$\begin{aligned} u_{f,S}(x) = {\left\{ \begin{array}{ll} f(x), &{} x \in \partial S \\ 0, &{} x \in \partial A. \end{array}\right. } \end{aligned}$$

Since \(f \ge 0\), the maximum principle gives \(u_{f,S} \ge 0\) in \(S {\setminus } \bar{A}\), with strict inequality when \(f \not \equiv 0\) on \(\partial S\). Now, let us define

$$\begin{aligned} z_{f,S}(x) = \epsilon \frac{u_{f,S}(x)}{q(x)}, \quad x \in \bar{S} {\setminus } A, \end{aligned}$$

which satisfies \(L^q z_{f,S} = 0\) in \(S {\setminus } \bar{A}\), with \(z_{f,S} = f\) on \(\partial S\) (recall that \(q(x) = \epsilon \) for all \(x \in \partial S\)). By the boundary Harnack inequality (see Theorem 2 and Corollary 1 of [2], as well as [5, Theorem 2.1] and [11, Theorem 11.6]), \(z_{f,S}(x)\) is bounded and Hölder continuous on \(\bar{S} {\setminus } A\) (including \(\partial A\)). We claim that for any \(x_0 \in \partial A\), we have

$$\begin{aligned} \lim _{x \rightarrow x_0} \nabla u_{f,S}(x) = \epsilon ^{-1} z_{f,S}(x_0) \nabla q(x_0). \end{aligned}$$
(3.29)
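To see where (3.29) comes from, note that \(u_{f,S} = \epsilon ^{-1} q \, z_{f,S}\) by the definition of \(z_{f,S}\), so the product rule gives

$$\begin{aligned} \nabla u_{f,S}(x) = \epsilon ^{-1} z_{f,S}(x) \nabla q(x) + \epsilon ^{-1} q(x) \nabla z_{f,S}(x). \end{aligned}$$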

Since \(\nabla u_{f,S}\), \(\nabla q\), and \(z_{f,S}\) are continuous up to \(\partial A\), this is true if and only if

$$\begin{aligned} \lim _{x \rightarrow x_0} q(x) \nabla z_{f,S}(x) = 0. \end{aligned}$$

Suppose \(q(x) \nabla z_{f,S}(x) \rightarrow v \ne 0\) as \(x \rightarrow x_0 \in \partial A\). Then we must have

$$\begin{aligned} \lim _{x \rightarrow x_0} \left( \epsilon \nabla u_{f,S}(x) - z_{f,S}(x) \nabla q(x) \right) = v \end{aligned}$$

so that \(v\) must be a multiple of \(\widehat{n}(x_0)\) (since \(u\) and \(q\) vanish on \(\partial A\)). Thus, we would have

$$\begin{aligned} \widehat{n}(x_0)\cdot \nabla z_{f,S}(x) \sim (\widehat{n}(x_0) \cdot v) q(x)^{-1} \end{aligned}$$
(3.30)

as \(x \rightarrow x_0 \in \partial A\). If \(v \ne 0\), then \((\widehat{n}(x_0) \cdot v) \ne 0\), so (3.30) and the fact that \(q = 0\) on \(\partial A\) would contradict the boundedness of \(z_{f,S}(x)\). Therefore, (3.29) must hold.

Combining (3.28) and (3.29) we obtain

$$\begin{aligned} \int _{\partial S} \zeta _S(\mathrm{d}x) f(x) \!=\! - \frac{1}{\nu }\int _{\partial A} \rho (x) \widehat{n}(x) \cdot a(x) \nabla q(x) z_{f, S}(x) \,\mathrm{d}\sigma _A(x) \!=\! \int _{\partial A} \eta _A^-(\mathrm{d}x) z_{f,S}(x). \end{aligned}$$

Therefore, as \(\epsilon \rightarrow 0\),

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \int _{\partial S} \zeta _S(\mathrm{d}x) f(x) = \lim _{\epsilon \rightarrow 0} \int _{\partial A} \eta _A^-(\mathrm{d}x) z_{f,S}(x) = \int _{\partial A} \eta _A^-(\mathrm{d}x) f(x). \end{aligned}$$
(3.31)

This establishes (3.27) and completes the proof of Lemma 3.4. \(\square \)

Now we continue with the proof of Theorem 1.7. We will apply Theorem 1.2. Suppose that \(F \in L^1({\mathcal {X}},{\mathcal {B}},{\mathcal {Q}}_{\eta _A^-})\), and define the functional

$$\begin{aligned} \Phi (X) = F(X_{(\cdot + \tau _{A,0}^-) \wedge \tau _{B,0}^+}). \end{aligned}$$

Combining Theorem 1.2 and Lemma 3.4 we see that \(\Phi \in L^1({\mathcal {X}},{\mathcal {B}},\bar{\mathcal {P}})\), since

$$\begin{aligned} \bar{\mathcal {P}}( \Phi (X) > \alpha )&= {\mathbb { P}}( \Phi (X) > \alpha \mid X_0 \sim \eta _A^+) \\&= {\mathbb { P}}( F(X_{(\cdot \, + \tau _{A,0}^-) \wedge \tau _{B,0}^+}) > \alpha \mid X_0 \sim \eta _A^+) \\&= {\mathbb {Q}}( F(Y) > \alpha \mid Y_0 \sim \eta _A^-) = {\mathcal {Q}}( F(Y) > \alpha ). \end{aligned}$$

Therefore,

$$\begin{aligned} \frac{1}{N} \sum _{k = 0}^{N-1} F(Y^k) = \frac{1}{N} \sum _{k = 0}^{N-1} F(X^{A,k}_{(\cdot + \tau _{A,k}^-) \wedge \tau _{B,k}^+}) = \frac{1}{N} \sum _{k = 0}^{N-1} \Phi (X^{A,k}_{\cdot }). \end{aligned}$$

By (3.25) and Theorem 1.2, we now conclude that the limit

$$\begin{aligned} \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{k = 0}^{N-1} F(Y^k) = {\mathbb {E}}[ \Phi (Z_{\cdot \wedge \tau _{B}}) \mid Z_0 \sim \eta _A^+] = \widehat{{\mathbb {E}}}[ F(Y) \mid Y_0 \sim \eta _A^-] \end{aligned}$$

holds \({\mathbb { P}}\)-almost surely. This completes the proof of Theorem 1.7. \(\square \)

4 Reaction rate, density and current of transition paths

4.1 Reaction rate

Proof of Proposition 1.8

Denote by \(\tau _B\) the first hitting time of \(X_t\) to \(\bar{B}\). Consider the mean first hitting time

$$\begin{aligned} u_B(x) = {\mathbb {E}}\left[ \tau _B \mid X_0 = x\right] \!, \end{aligned}$$

which satisfies the equation

$$\begin{aligned} {\left\{ \begin{array}{ll} L u_B(x) = - 1, &{} x \in \Theta \\ u_B(x) = 0, &{} x \in \partial B. \end{array}\right. } \end{aligned}$$
(4.1)
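In one dimension, the boundary-value problem (4.1) is straightforward to discretize. The sketch below is purely illustrative: it assumes \(a = 1\) and the gradient drift \(b = -V'\) with \(V(x) = (x^2-1)^2/2\) (none of which come from the text), truncates the unbounded domain at \(x = -3\) with an artificial reflecting condition (defensible only because the drift points strongly inward there), and lets the point \(0.8\) stand in for \(\partial B\).

```python
import numpy as np

# Finite-difference sketch of (4.1):  u'' + b u' = -1,  u = 0 on dB.
# All concrete choices (V, the truncation, the grid) are illustrative.
n = 200
x = np.linspace(-3.0, 0.8, n + 1)      # x[n] lies on the boundary of B
h = x[1] - x[0]
b = 2.0 * x - 2.0 * x**3               # b = -V' for V(x) = (x^2 - 1)^2 / 2

# Unknowns u[0..n-1]; the Dirichlet condition u = 0 on dB eliminates u[n].
Lh = np.zeros((n, n))                  # discrete generator L = d^2/dx^2 + b d/dx
rhs = -np.ones(n)
Lh[0, 0], Lh[0, 1] = -2.0 / h**2, 2.0 / h**2   # ghost node enforcing u'(-3) = 0
for i in range(1, n):
    Lh[i, i - 1] = 1.0 / h**2 - b[i] / (2.0 * h)
    Lh[i, i] = -2.0 / h**2
    if i + 1 < n:
        Lh[i, i + 1] = 1.0 / h**2 + b[i] / (2.0 * h)

u = np.linalg.solve(Lh, rhs)           # u[i] approximates E[tau_B | X_0 = x[i]]
```

The discrete operator is an (irreducibly diagonally dominant) M-matrix on this grid, so the computed mean hitting time is positive and largest far from \(\partial B\), as it should be.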

By definition of \(\eta _A^+\), we have

$$\begin{aligned} \begin{aligned} \int _{\partial A} \eta _A^+(\mathrm{d}x) u_B(x)&= \frac{1}{\nu } \int _{\partial A} \rho (x) u_B(x) \widehat{n}(x) \cdot a(x) \nabla \widetilde{q}(x) \,\mathrm{d}\sigma _A(x). \end{aligned} \end{aligned}$$
(4.2)

Observe that

$$\begin{aligned} \begin{aligned} \int _{{\mathbb {R}}^d} \rho (x) \widetilde{q}(x) \,\mathrm{d}x&= \int _{B^c} \rho (x) \widetilde{q}(x) \,\mathrm{d}x \\&\mathop {=}\limits ^{(4.1)} - \int _{B^c} \rho (x) \widetilde{q}(x) (L u_B)(x) \,\mathrm{d}x \\&= - \int _A \rho (x) (Lu_B)(x) \,\mathrm{d}x - \int _{\Theta } \rho (x) \widetilde{q}(x) (L u_B)(x) \,\mathrm{d}x. \end{aligned} \end{aligned}$$

Using (3.1) with \(D = A\), \(\phi (x) = 1\) and \(\psi (x) = u_B\), we obtain

$$\begin{aligned} \int _A \rho (L u_B) \,\mathrm{d}x&= - \int _{\partial A} \rho b \cdot \widehat{n} u_B \,\mathrm{d}\sigma _A(x)\\&- \int _{\partial A} \rho \widehat{n} \cdot a \nabla u_B \,\mathrm{d}\sigma _A(x) {+} \int _{\partial A} u_B \widehat{n} \cdot {{\mathrm{div}}}(a \rho ) \,\mathrm{d}\sigma _A(x), \end{aligned}$$

where \(\widehat{n}\) is the interior normal vector at \(\partial A\). Applying (3.1) again with \(D = \Theta \), \(\phi = \widetilde{q}\), and \(\psi = u_B\), we obtain

$$\begin{aligned} \int _{\Theta } \rho \widetilde{q} (L u_B) \,\mathrm{d}x&= \int _{\partial A} \rho b \cdot \widehat{n} u_B \,\mathrm{d}\sigma _A(x)\\&+ \int _{\partial A} \rho \widehat{n} \cdot a \nabla u_B \,\mathrm{d}\sigma _A(x) - \int _{\partial A} u_B \widehat{n} \cdot {{\mathrm{div}}}(a \rho \widetilde{q}) \,\mathrm{d}\sigma _A(x). \end{aligned}$$

Combining the two with (4.2), we get

$$\begin{aligned} \int _{\partial A} \eta _A^+(\mathrm{d}x) u_B(x) = \frac{1}{\nu } \int _{\partial A} \rho u_B \widehat{n} \cdot a \nabla \widetilde{q} \,\mathrm{d}\sigma _A(x) = \frac{1}{\nu } \int _{{\mathbb {R}}^d} \rho \widetilde{q} \,\mathrm{d}x. \end{aligned}$$

Similarly, defining \(u_A(x)\) to be the mean first hitting time of \(X_t\) to \(\bar{A}\) starting at \(x\), we have

$$\begin{aligned} \int _{\partial B} \eta _B^+(\mathrm{d}x) u_A(x) = \frac{1}{\nu } \int _{{\mathbb {R}}^d} \rho (1 - \widetilde{q}) \,\mathrm{d}x. \end{aligned}$$

Add the integrals together to obtain

$$\begin{aligned} \int _{\partial A} \eta _A^+(\mathrm{d}x) u_B(x) + \int _{\partial B} \eta _B^+(\mathrm{d}x) u_A(x) = \frac{1}{\nu }. \end{aligned}$$

On the other hand, observe that

$$\begin{aligned} \begin{aligned} \frac{1}{\nu _R}&= \lim _{T \rightarrow \infty } \frac{T}{N_T} \\&= \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{n=0}^{N-1} \left( \tau _{A, n+1}^+ - \tau _{A, n}^+\right) \\&= \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{n=0}^{N-1} \left( \tau _{B, n}^+ - \tau _{A, n}^+\right) + \lim _{N\rightarrow \infty } \frac{1}{N} \sum _{n=0}^{N-1} \left( \tau _{A, n+1}^+ - \tau _{B, n}^+\right) . \end{aligned} \end{aligned}$$

For the first limit, we have

$$\begin{aligned} T_{AB} = \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{n=0}^{N-1} \left( \tau _{B, n}^+ - \tau _{A, n}^+\right) = {\mathbb {E} }[ \tau _B \mid X_0 \sim \eta _A^+ ] = \int _{\partial A} \eta _A^+(\mathrm{d}x) u_B(x), \end{aligned}$$

and similarly

$$\begin{aligned} T_{BA} = \lim _{N\rightarrow \infty } \frac{1}{N} \sum _{n=0}^{N-1} \left( \tau _{A, n+1}^+ - \tau _{B, n}^+\right) = \int _{\partial B} \eta _B^+(\mathrm{d}x) u_A(x). \end{aligned}$$

Therefore

$$\begin{aligned} \frac{1}{\nu _R} = \int _{\partial A} \eta _A^+(\mathrm{d}x) u_B(x) + \int _{\partial B} \eta _B^+(\mathrm{d}x) u_A(x) = \frac{1}{\nu }, \end{aligned}$$

or equivalently \(\nu = \nu _R\).

From Theorem 1.7 it follows immediately that

$$\begin{aligned} C_{AB} = \int _{\partial A} \eta _A^-(\mathrm{d}x) v_B(x). \end{aligned}$$

Indeed, the functional \(F: Y \mapsto \tau ^Y_B\) is in \(L^1({\mathcal {X}},{\mathcal {B}},{\mathcal {Q}}_{\eta _A^-})\) by Proposition 2.7. The function \(v_B(x) = \widehat{{\mathbb {E}}}[ \tau _B^Y \mid Y_0 = x]\) satisfies

$$\begin{aligned} L^q v_B = -1, \quad x \in \Theta \end{aligned}$$

with \(v_B(x) = 0\) for \(x \in \partial B\). Hence, the function \(w(x) = q(x) v_B(x)\) satisfies \(L w = -q\) for \(x \in \Theta \) with boundary condition \(w(x) = 0\) for \(x \in \partial \Theta \). Moreover, for \(x_0 \in \partial A\), we have

$$\begin{aligned} v_B(x_0) = \lim _{x \rightarrow x_0} \frac{w(x)}{q(x)} = \frac{\widehat{n}(x_0) \cdot a(x_0) \nabla w(x_0)}{ \widehat{n}(x_0) \cdot a(x_0) \nabla q(x_0)}. \end{aligned}$$

Therefore,

$$\begin{aligned} \int _{\partial A} \eta _A^-(\mathrm{d}x) v_B(x) = - \frac{1}{\nu }\int _{\partial A} \rho (x) \widehat{n}(x) \cdot a(x) \nabla w(x) \,\mathrm{d}\sigma _A(x). \end{aligned}$$

Now applying (3.1) with \(D = \Theta \), \(\phi = \widetilde{q}\), and \(\psi = w\), we have

$$\begin{aligned} - \frac{1}{\nu }\int _{\partial A} \rho (x) \widehat{n}(x) \cdot a(x) \nabla w(x) \,\mathrm{d}\sigma _A(x) = \frac{1}{\nu } \int _{\Theta } \rho (x) \widetilde{q}(x) q(x) \,\mathrm{d}x. \end{aligned}$$

It remains to show that

$$\begin{aligned} \nu = \int _{{\mathbb {R}}^d} \rho \nabla q \cdot a \nabla q \,\mathrm{d}x. \end{aligned}$$

Using integration by parts, we have

$$\begin{aligned} \int _{{\mathbb {R}}^d} \rho \nabla q \cdot a \nabla q \,\mathrm{d}x&= \int _{\Theta } \rho \nabla \left( q - \frac{1}{2}\right) \cdot a \nabla q \,\mathrm{d}x \\&= - \int _{\Theta } \nabla \cdot (\rho a \nabla q) \left( q {-} \frac{1}{2}\right) \,\mathrm{d}x + \int _{\partial A} \rho \left( q {-} \frac{1}{2}\right) \widehat{n} \cdot a \nabla q \,\mathrm{d}\sigma _A(x) \\&\quad + \int _{\partial B} \rho \left( q - \frac{1}{2}\right) \widehat{n} \cdot a \nabla q \,\mathrm{d}\sigma _B(x). \end{aligned}$$

The first term on the right hand side vanishes as

$$\begin{aligned} \begin{aligned} \int _{\Theta } \nabla \cdot (\rho a \nabla q) \left( q - \frac{1}{2}\right) \,\mathrm{d}x&= \int _{\Theta } \left( \rho \,{{\mathrm{tr}}}(a \nabla ^2 q) + \rho b \cdot \nabla q\right) \left( q - \frac{1}{2}\right) \,\mathrm{d}x \\&\quad + \frac{1}{2} \int _{\Theta } \left( {{\mathrm{div}}}(\rho a) \cdot \nabla - \rho b \cdot \nabla \right) (q^2 - q) \,\mathrm{d}x \\&= \int _{\Theta } \rho (L q) \left( q - \frac{1}{2}\right) \,\mathrm{d}x - \frac{1}{2} \int _{\Theta } (L^{*} \rho )(q^2 - q) \,\mathrm{d}x = 0, \end{aligned} \end{aligned}$$

where we have used that \(q^2 - q = 0\) on \(\partial A \cup \partial B\). The conclusion then follows from Lemma 1.3, \(q = 0\) on \(\partial A\), and \(q = 1\) on \(\partial B\). \(\square \)
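As a concrete check of the rate formula just proved, consider the one-dimensional gradient case \(a = 1\), \(b = -V'\), where \(\rho = e^{-V}/Z\) and the committor between the points \(-0.8\) and \(0.8\) (standing in for \(\partial A\) and \(\partial B\)) has the closed form \(q(x) = \int _{-0.8}^{x} e^{V}\,\mathrm{d}s / C\) with \(C = \int _{-0.8}^{0.8} e^{V}\,\mathrm{d}s\), so that \(\nu = \int \rho \, (q')^2 \,\mathrm{d}x = 1/(ZC)\). The sketch below verifies this numerically; \(V\) and all concrete numbers are illustrative assumptions.

```python
import numpy as np

# Quadrature check of  nu = int rho (q')^2 dx = 1/(Z C)  in the 1-D
# gradient case; V and the boundary points are illustrative choices.
V = lambda s: 0.5 * (s**2 - 1.0)**2

def trapz(y, x):                       # trapezoid rule (version-independent)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

xs = np.linspace(-4.0, 4.0, 8001)      # truncation of R; e^{-V} is negligible beyond
Z = trapz(np.exp(-V(xs)), xs)          # normalization of the invariant density

xq = np.linspace(-0.8, 0.8, 4001)      # the component of Theta between A and B
eV = np.exp(V(xq))
C = trapz(eV, xq)
qprime = eV / C                        # closed-form committor derivative

nu_quad = trapz(np.exp(-V(xq)) / Z * qprime**2, xq)
nu_closed = 1.0 / (Z * C)
```

The two values agree to rounding error, since the integrand \(\rho (q')^2\) collapses to \(e^{V}/(ZC^2)\).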

4.2 Density of transition paths

We define the Green’s function \(G_{\Theta }\) of the operator \(L\) in \(\Theta \) with Dirichlet boundary condition on \(\partial \Theta \):

$$\begin{aligned} {\left\{ \begin{array}{ll} L G_{\Theta }(x, y) = - \delta _y(x), &{} x \in \Theta , \\ G_{\Theta }(x, y) = 0, &{} x \in \partial \Theta . \end{array}\right. } \end{aligned}$$
(4.3)

The existence of the Green’s function is guaranteed by the ergodicity of \(X_t\) in \({\mathbb { R}}^d\), which implies that \(X_t\) is transient in \(\Theta \) (see e.g. [28, Section 4.2]).
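For intuition, the defining property (4.3) can be checked on a grid. The sketch below builds the discrete generator on an illustrative interval standing in for a bounded piece of \(\Theta \), with \(a = 1\) and an assumed gradient drift (nothing here is from the text), and recovers all columns of the discrete Green's function from one linear solve; on the grid, \(-\delta _y\) becomes the vector with \(-1/h\) at the node \(y\).

```python
import numpy as np

# Discrete sketch of the Dirichlet Green's function (4.3) in one
# dimension: the columns of G solve  Lh G = -I/h  with homogeneous
# Dirichlet data at both ends.  All concrete choices are illustrative.
n = 200
x = np.linspace(0.8, 2.0, n + 2)       # x[0] and x[n+1] carry the zero data
h = x[1] - x[0]
b = 2.0 * x - 2.0 * x**3               # b = -V' for V(x) = (x^2 - 1)^2 / 2

Lh = np.zeros((n, n))                  # discrete generator at interior nodes
for i in range(n):
    xi = i + 1                         # index into the full grid
    if i > 0:
        Lh[i, i - 1] = 1.0 / h**2 - b[xi] / (2.0 * h)
    Lh[i, i] = -2.0 / h**2
    if i < n - 1:
        Lh[i, i + 1] = 1.0 / h**2 + b[xi] / (2.0 * h)

G = np.linalg.solve(Lh, -np.eye(n) / h)   # column j approximates G(., x[j+1])
```

Positivity of the computed \(G\) reflects the maximum principle: the discrete operator is an M-matrix on this grid.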

Lemma 4.1

Let \(G_{\Theta }\) be the Green’s function of \(L\) in \(\Theta \) with Dirichlet boundary condition on \(\partial \Theta \). We have

$$\begin{aligned} G_{\Theta }^{q}(x, y) \equiv \int _0^{\infty } Q_R(t, x, y) \,\mathrm{d}t = \frac{q(y) G_{\Theta }(x, y) }{q(x)}. \end{aligned}$$
(4.4)

In particular, for \(x \in \partial A\), \(y \in \Theta \)

$$\begin{aligned} G_{\Theta }^q(x, y) = \frac{ q(y) \widehat{n}(x) \cdot a(x) \nabla _x G_{\Theta }(x, y)}{ \widehat{n}(x) \cdot a(x) \nabla q(x)}. \end{aligned}$$
(4.5)

Proof

Fix \(y \in \Theta \). For \(x \in \Theta \), (4.4) follows from [28, Proposition 4.2.2]. Specifically, the function \(G_{\Theta }^{q}(x, y)\) defined by

$$\begin{aligned} G_{\Theta }^{q}(x, y) = \int _0^{\infty } Q_R(t, x, y) \,\mathrm{d}t \end{aligned}$$

is related to the Green’s function (4.3) by the formula

$$\begin{aligned} G_{\Theta }^{q}(x, y) = \frac{q(y) G_{\Theta }(x, y) }{q(x)}, \quad x,y \in \Theta . \end{aligned}$$

Because of the regularity of the coefficients \(a(x)\) and \(b(x)\), Schauder-type interior and boundary estimates imply that \(G_{\Theta }(\cdot ,y) \in C^{2,\alpha }(\bar{\Theta } {\setminus } \{y\})\). Since \(G_{\Theta }(x,y) = q(x) = 0\) for \(x \in \partial A\), the Hopf Lemma implies that for all \(x \in \partial A\), \(\nabla _x G_{\Theta }(x,y)\) is a nonzero multiple of \(\widehat{n}(x)\). That is, for all \(x \in \partial A\), \(\nabla _x G_{\Theta }(x,y) = r(x)\widehat{n}(x)\) for some continuous \(r(x) < 0\). The same is true for \(q\). Therefore, \(G_{\Theta }^{q}(x, y)\) is continuous in \(x\) up to the boundary \(\partial \Theta \), and for \(x_0 \in \partial A\),

$$\begin{aligned} \lim _{x \rightarrow x_0,\, x \in \Theta } G_{\Theta }^{q}(x, y) = \frac{ q(y) \widehat{n}(x_0) \cdot a(x_0) \nabla _x G_{\Theta }(x_0, y)}{ \widehat{n}(x_0) \cdot a(x_0) \nabla q(x_0)}. \end{aligned}$$

It remains to show that for \(x_0 \in \partial A\),

$$\begin{aligned} \frac{ q(y) \widehat{n}(x_0) \cdot a(x_0) \nabla _x G_{\Theta }(x_0, y)}{ \widehat{n}(x_0) \cdot a(x_0) \nabla q(x_0)} = \int _0^{\infty } Q_R(t, x_0, y) \,\mathrm{d}t. \end{aligned}$$
(4.6)

Let \(\varphi \ge 0\) be smooth and compactly supported in \(\Theta \). By Proposition 2.6, we have

$$\begin{aligned} \lim _{x \rightarrow x_0} \widehat{{\mathbb {E}}}[ \varphi (Y_t)\mid Y_0 = x] = \widehat{{\mathbb {E}}}[ \varphi (Y_t)\mid Y_0 = x_0]. \end{aligned}$$

Moreover,

$$\begin{aligned} \widehat{{\mathbb {E}}}[ \varphi (Y_t)\mid Y_0 = x] \le ||\varphi ||_\infty {\mathbb {Q}}( Y_t \in \Theta \mid Y_0 = x). \end{aligned}$$

By Proposition 2.7, for any \(R > 0\), there is a function \(h_R \in L^1(0,+\infty )\) such that \({\mathbb {Q}}( Y_t \in \Theta \mid Y_0 = x) \le h_R(t)\) for all \(x \in \Theta \), \(|x| < R\), \(t \ge 0\). Therefore, we have \(\widehat{{\mathbb {E}}}[ \varphi (Y_t)\mid Y_0 = x] \le ||\varphi ||_\infty h_R(t)\), so the dominated convergence theorem implies that

$$\begin{aligned} \lim _{x \rightarrow x_0} \int _{\Theta } G_{\Theta }^{q}(x, y) \varphi (y) \,\mathrm{d}y&= \lim _{x \rightarrow x_0} \int _0^\infty \widehat{{\mathbb {E}}}[ \varphi (Y_t)\mid Y_0 = x] \,\mathrm{d}t \nonumber \\&= \int _0^\infty \widehat{{\mathbb {E}}}[ \varphi (Y_t)\mid Y_0 = x_0] \,\mathrm{d}t \nonumber \\&= \int _0^\infty \left( \int _{\Theta } Q_R(t,x_0,y) \varphi (y) \,\mathrm{d}y \right) \,\mathrm{d}t. \end{aligned}$$
(4.7)

On the other hand, we also have

$$\begin{aligned} \lim _{x \rightarrow x_0} \int _{\Theta } G_{\Theta }^{q}(x, y) \varphi (y) \,\mathrm{d}y = \int _{\Theta } \frac{ q(y) \widehat{n}(x_0) \cdot a(x_0) \nabla _x G_{\Theta }(x_0, y)}{ \widehat{n}(x_0) \cdot a(x_0) \nabla q(x_0)} \varphi (y) \,\mathrm{d}y. \end{aligned}$$
(4.8)

Therefore, by combining (4.7) and (4.8) we conclude

$$\begin{aligned} \int _{\Theta } \frac{ q(y) \widehat{n}(x_0) \cdot a(x_0) \nabla _x G_{\Theta }(x_0, y)}{ \widehat{n}(x_0) \cdot a(x_0) \nabla q(x_0)} \varphi (y) \,\mathrm{d}y&= \int _0^\infty \int _{\Theta } Q_R(t,x_0,y) \varphi (y) \,\mathrm{d}y \,\mathrm{d}t \\&= \int _{\Theta } \left( \int _0^\infty Q_R(t,x_0,y) \,\mathrm{d}t \right) \varphi (y) \,\mathrm{d}y. \end{aligned}$$

Since \(\varphi \) is arbitrary, this implies (4.6).\(\square \)

Proof of Proposition 1.9

Using Lemma 4.1 and (1.38),

$$\begin{aligned} \rho _R(z) = \nu _R \int _{\partial A} \eta _A^-(\mathrm{d}x) G^q_{\Theta }(x, z). \end{aligned}$$
(4.9)

Recalling the explicit formula (1.23) for \(\eta _A^-\) in terms of \(q\), and using that \(\nu _R = \nu \) (proved above), we obtain, for \(z \in \Theta \),

$$\begin{aligned} \begin{aligned} \rho _R(z)&= -\int _{\partial A} \rho (x) \frac{ q(z) \widehat{n}(x) \cdot a \nabla _x G_{\Theta }(x, z)}{ \widehat{n}(x) \cdot a \nabla q(x)} \widehat{n}(x) \cdot a \nabla q(x) \,\mathrm{d}\sigma _A(x) \\&= -q(z) \int _{\partial A} \rho (x) \widehat{n}(x) \cdot a \nabla _x G_{\Theta }(x, z) \,\mathrm{d}\sigma _A(x). \end{aligned} \end{aligned}$$

Applying (3.1) with \(\psi (x) = G_{\Theta }(x, z)\) and \(\phi (x) = \widetilde{q}(x)\), we conclude that

$$\begin{aligned} \begin{aligned} \rho _R(z)&= - q(z) \int _{\partial \Theta } \rho (x) \phi (x) \widehat{n}(x) \cdot a \nabla \psi (x) \,\mathrm{d}\sigma _{\Theta }(x) \\&= - q(z) \int _{\Theta } \rho (x) \phi (x) L \psi (x) \,\mathrm{d}x \\&= \rho (z) q(z) \widetilde{q}(z). \end{aligned} \end{aligned}$$

Here, to obtain the second equality we have used that \(\widetilde{L}\widetilde{q} = 0\) in \(\Theta \) and \(\psi (x) = 0\) on \(\partial \Theta \); the final equality follows from \(L \psi = -\delta _z\) in \(\Theta \).\(\square \)
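The formula \(\rho _R = \rho \, q \, \widetilde{q}\) is easy to visualize in the one-dimensional gradient case of the earlier sketches: there the process is reversible, so \(\widetilde{q} = 1 - q\), and the transition-path density \(\rho \, q(1-q)\) vanishes on \(\partial A \cup \partial B\) and concentrates between the two sets. A sketch (all concrete choices are illustrative):

```python
import numpy as np

# rho_R = rho * q * (1 - q) in the reversible 1-D case, using the
# closed-form committor; V and the boundary points are illustrative.
V = lambda s: 0.5 * (s**2 - 1.0)**2

def trapz(y, x):                       # trapezoid rule (version-independent)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

xs = np.linspace(-4.0, 4.0, 8001)
Z = trapz(np.exp(-V(xs)), xs)          # rho = e^{-V} / Z

xq = np.linspace(-0.8, 0.8, 4001)      # between the stand-ins for dA and dB
eV = np.exp(V(xq))
cum = np.concatenate(([0.0], np.cumsum((eV[1:] + eV[:-1]) * np.diff(xq) / 2.0)))
q = cum / cum[-1]                      # closed-form committor: q(-0.8)=0, q(0.8)=1

rho_R = np.exp(-V(xq)) / Z * q * (1.0 - q)   # transition-path density on Theta
```

As expected, the density vanishes at both endpoints and peaks in the interior.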

4.3 Current of transition paths

Proof of Proposition 1.10

This follows by a direct calculation from the definition (1.41) of \(J_R\), noticing that \(q = 0\), \(\widetilde{q} = 1\) on \(\partial A\), and \(q = 1\), \(\widetilde{q} = 0\) on \(\partial B\).\(\square \)

Proof of Corollary 1.11

By Proposition 1.10, we have

$$\begin{aligned} \nu _R = - \int _{\partial A} \widehat{n}(x) \cdot J_R(x) \,\mathrm{d}\sigma _A(x). \end{aligned}$$

Hence, it suffices to show that

$$\begin{aligned} \int _{\partial A} \widehat{n}(x) \cdot J_R(x) \,\mathrm{d}\sigma _A(x) + \int _{\partial S} \widehat{n}(x) \cdot J_R(x) \,\mathrm{d}\sigma _S(x) = 0, \end{aligned}$$

which follows from the fact that \(J_R\) is divergence free in \(\Theta \) (see (1.40)).\(\square \)

Proof of Proposition 1.12

Using Proposition 1.10 for the left hand side of (1.45), we obtain

$$\begin{aligned} \int _{\partial B} f(x) \eta _B^+(\mathrm{d}x) - \int _{\partial A} f(x) \eta _A^-(\mathrm{d}x) = \frac{1}{\nu _R} \int _{\partial B} f \widehat{n} \cdot J_R \,\mathrm{d}\sigma _B + \frac{1}{\nu _R} \int _{\partial A} f \widehat{n} \cdot J_R \,\mathrm{d}\sigma _A, \end{aligned}$$

where \(\widehat{n}\) is the unit normal exterior to \(\Theta \). Equation (1.45) then follows from the divergence theorem.

Now fix any \(g \in C^1(\partial B)\). We extend \(g\) to \(\bar{\Theta }\) using the flow (1.43): for any \(x \in \bar{\Theta }\), we define

$$\begin{aligned} g(x) = g(Z_{t_B}^x), \quad \text {with } Z_0^x = x. \end{aligned}$$
(4.10)

In particular, for \(x \in \partial A\), we have \( g(x) = g(\Phi _{J_R}(x))\); in other words,

$$\begin{aligned} g \vert _{\partial A} = \Phi _{J_R}^{*} (g \vert _{\partial B}). \end{aligned}$$
(4.11)

By the construction (4.10), for any \(x \in \Theta \), \(J_R \cdot \nabla g = 0\). Combining with the first part of the Proposition and (4.11), we obtain

$$\begin{aligned} \int _{\partial B} g(x) \eta _B^{+}(\mathrm{d}x) = \int _{\partial A} \Phi _{J_R}^{*} g \, \eta _A^-(\mathrm{d}x). \end{aligned}$$

Therefore, \(\Phi _{J_R, *}(\eta _A^-) = \eta _B^+\). \(\square \)