1 Introduction

In this work, we consider a data assimilation problem for a stationary convection–diffusion equation

$$\begin{aligned} {\mathcal {L}} u := - \mu \Delta u + \beta \cdot \nabla u = f \quad \text {in } \Omega \subset {\mathbb {R}}^n, \end{aligned}$$
(1)

when convection dominates, that is \(0 < \mu \ll |\beta |\), and complement the diffusion-dominated case discussed in the first part [5]. We assume that \(\Omega \subset {\mathbb {R}}^n\) is open, bounded and connected, and there exists a solution \(u\in H^2(\Omega )\) to (1). The problem under study is to approximate the solution u given the source f in \(\Omega \) and the perturbed restriction \({\tilde{U}}_\omega = u\vert _{\omega } + \delta \) of the solution to an open subset \(\omega \subset \Omega \). The perturbation \(\delta \) is taken in \(L^2(\omega )\). Notice that we consider no boundary conditions on \(\partial \Omega \). This is a linear ill-posed problem also known as unique continuation.

To start with, let us briefly recall the main results obtained in the first part. Consider an open bounded set \(B\subset \Omega \) that contains the data region \(\omega \) such that \(B\setminus \omega \) does not touch the boundary of \(\Omega \). For \(u\in H^1(\Omega )\), the following conditional stability estimate was proven for \(\mu >0\) and \(\beta \in L^{\infty }(\Omega )^n\),

$$\begin{aligned} \left\| u \right\| _{L^2(B)} \le C_{st} \left( \left\| u \right\| _{L^2(\omega )} + \tfrac{1}{\mu } \left\| {\mathcal {L}} u \right\| _{H^{-1}(\Omega )} \right) ^\kappa \left\| u \right\| _{L^2(\Omega )}^{1-\kappa }, \end{aligned}$$
(2)

where the Hölder exponent \(\kappa \in (0,1)\) depends only on the geometric setting. In the case of simple geometric configurations, e.g. when \(\omega ,\, B,\, \Omega \) are three concentric balls, the exponent \(\kappa \in (0,1)\) can be given explicitly, see [5,  Remark 1]. The stability constant \(C_{st}\) is given explicitly in terms of the physical parameters

$$\begin{aligned} C_{st} = C_1 \exp \left( C_2 (1 + \tfrac{|\beta |}{\mu })^2\right) , \quad |\beta | := \left\| \beta \right\| _{L^\infty (\Omega )^n}, \end{aligned}$$
(3)

with constants \(C_1,\, C_2 > 0\) depending only on the geometry. Note that the continuum estimate (2) is valid in both the diffusion-dominated and convection-dominated regimes, and that the stability constant \(C_{st}\) is uniformly bounded when diffusion dominates. However, when convection dominates \(C_{st}\) grows exponentially, rendering the stability estimate ineffective in practice. We also recall that for global unique continuation from \(\omega \) to the entire \(\Omega \) the stability is no longer Hölder, but logarithmic, that is the modulus of continuity for the given data is not \(|\cdot |^\kappa \) any more, but \(|\log (\cdot )|^{-\kappa }\).

On the discrete level, the continuum estimate (2) was combined with a stabilized linear finite element method to obtain convergence orders for the approximate solution. More precisely, for a mesh size h, and defining the Péclet number

$$\begin{aligned} Pe(h) := \frac{|\beta | h}{\mu }, \end{aligned}$$

the following error bound [5,  Theorem 1] was proven for the approximation \(u_h\) in the diffusive regime \(Pe(h)<1\),

$$\begin{aligned} \Vert u-u_h\Vert _{L^2(B)} \le C_{st}\, h^{\kappa } (\Vert u\Vert _{H^2(\Omega )} + h^{-1} \Vert \delta \Vert _{L^2(\omega )}), \end{aligned}$$
(4)

where the convergence order \(\kappa \in (0,1)\) is the same as the Hölder exponent in (2) and the stability constant \(C_{st}\) is proportional to the one in (3). Under an additional assumption on the divergence of the convective field \(\beta \), similar error bounds were also proven in the \(H^1\)-norm, see [5,  Theorem 2].

The prototypical effect of dominating diffusion is shown in Fig. 1, where the problem is set in the unit square and contour error plots are shown for data assimilation from a centered disk of radius 0.1. One can notice oscillating errors that grow in size away from the data region towards the boundary. The exact solution in this example is \(u=2\sin (5\pi x)\sin (5\pi y)\) where the factor 2 is taken to make its \(L^2\)-norm unitary. For the computation we used an unstructured mesh with 512 elements on a side and mesh size \(h\approx 0.0025\).

Fig. 1
figure 1

Absolute error contour plot in the diffusion-dominated case, \(\mu = 1,\, \beta =(1,0)\). The domain is the unit square, data given in a centered disk of radius 0.1 for the exact solution \(u=2\sin (5\pi x)\sin (5\pi y)\)

1.1 Objective and main results

We consider a stabilized finite element method for data assimilation subject to the convection–diffusion equation in the convection-dominated regime. Since the behaviour of the physical system changes fundamentally when convection dominates and

$$\begin{aligned} Pe(h) \gg 1, \end{aligned}$$

the goal of this second part paper is to reconsider the numerical method proposed in the first part [5] and develop an error analysis that captures and exploits the governing transport phenomenon. This is illustrated in Fig. 2 where the transition to the convection-dominated regime through an intermediate regime is made by decreasing the diffusion coefficient \(\mu \). We aim to obtain sharper local error estimates along the characteristics of the convective field through the data region. Even though the error analysis that we perform herein is different in nature to the one in the first part, the numerical method itself is only slightly changed (see Remark 1 below). For the error localization technique we draw on ideas used for the streamline diffusion method in [15], continuous interior penalty in [3], and non-coercive hyperbolic problems in [8].

From the definition of the Péclet number we see that the regime will also depend on the resolution of the computation besides the physical parameters. We can therefore expect the method to change behaviour as the resolution increases and Pe(h) decreases. This phenomenon was already observed computationally in [7] and can now be explained theoretically.

Fig. 2
figure 2

Absolute error contour plot when convection becomes dominant, \(\beta =(1,0)\). The domain is the unit square, data given in a centered disk of radius 0.1 for the exact solution \(u=2\sin (5\pi x)\sin (5\pi y)\)

To make the presentation as simple as possible we consider a model case in the unit square \(\Omega \) with constant convection

$$\begin{aligned} \beta := (\beta _1,0), \quad \beta _1\in {\mathbb {R}}, \end{aligned}$$

and the solution given in the subset

$$\begin{aligned} \omega := (0,x)\times (y^{-}, y^{+}) \text { with } x> h \text { and } y^{+} - y^{-} > h. \end{aligned}$$

For the subset \({\omega _\beta } \subset \Omega \) covered by the characteristics of \(\beta \) that go through \(\omega \), we introduce the stability region \({\mathring{\omega }_\beta } \subset \omega _\beta \) by cutting off a crosswind layer of width \({\mathcal {O}}(h^\frac{1}{2} |\ln (h)|)\) (see Sect. 2.1 for more details). We separate the convection-dominated and diffusion-dominated regimes by introducing a constant \(Pe_{\lim } > 1\) such that

$$\begin{aligned} Pe(h)> Pe_{\lim } > 1. \end{aligned}$$

To reduce the number of constants appearing in the analysis, we will write this as \(Pe(h) \gtrsim 1\). As suggested by Fig. 2, we expect different results for data assimilation downstream vs upstream in an intermediate regime. We prove in Theorem 1 weighted error estimates that for \(\beta _1>0\) essentially take the following form

$$\begin{aligned} \Vert u - u_h\Vert _{L^2(\mathring{\omega }_\beta )} \le C (|\beta |^{\frac{1}{2}} h^{\frac{3}{2}} |u|_{H^2(\Omega )} + |\beta |^{\frac{1}{2}} h^{-\frac{1}{2}} \Vert \delta \Vert _{L^2(\omega )}),\text { when } Pe(h) \gtrsim 1. \end{aligned}$$

This is similar to the typical error estimates for piecewise linear stabilized FEMs for convection-dominated well-posed problems, such as local projection stabilization, dG methods, continuous interior penalty or Galerkin least squares. On general shape-regular meshes these methods have an \({\mathcal {O}}(h^\frac{1}{2})\) gap to the best approximation convergence order. Taking this into account, our result is thus quasi-optimal. For a recent overview of challenges and open problems in the well-posed case, see e.g. [14] and [18].

When going against the characteristics, i.e. \(\beta _1<0\), we prove in Theorem 2 first the pre-asymptotic bound

$$\begin{aligned} \Vert u - u_h\Vert _{L^2(\mathring{\omega }_\beta )} \le C ( |\beta |^{\frac{1}{2}} h |u|_{H^2(\Omega )} + h^{-1} \Vert \delta \Vert _{L^2(\omega )} ), \text { when } 1 \lesssim Pe(h) < h^{-1}, \end{aligned}$$

followed by

$$\begin{aligned} \Vert u - u_h\Vert _{L^2(\mathring{\omega }_\beta )} \le C (|\beta |^{\frac{1}{2}} h^{\frac{3}{2}} |u|_{H^2(\Omega )} + h^{-\frac{1}{2}} \Vert \delta \Vert _{L^2(\omega )}), \text { when } Pe(h) > h^{-1}. \end{aligned}$$

It follows that when solving the data assimilation problem against the flow, the diffusivity reduces the convergence order in an intermediate regime. Only for very small diffusion coefficients \(\mu < |\beta | h^2\) do we get quasi-optimal bounds. This asymmetry of the error distribution for moderate Péclet numbers is clearly visible in the left plot of Fig. 2.

Previous results on optimal control for stabilized convection–diffusion equations include [9, 10, 13, 20]. We refer to the first part [5] for a more detailed discussion.

In terms of notation, above and throughout the paper C denotes a generic positive constant, not necessarily the same at each occurrence, that is independent of the coefficients \(\mu ,\, \beta \) and the mesh size h.

2 Discrete setting

Let \(V_h\subset H^1(\Omega )\) be the conforming finite element space of piecewise affine \({\mathbb {P}}_1\) functions defined on a computational mesh \({\mathcal {T}}_h\) that consists of shape-regular triangular elements K with diameter \(h_K\). The mesh size h is the maximum over \(h_K\) and we will assume that \(h<1\). The interior faces of all the elements are collected in the set \({\mathcal {F}}_i\) and the jump of a quantity across such a face F is denoted by \(\llbracket \cdot \rrbracket _F\), omitting the subscript whenever the context is clear. We denote by n the unit normal.

First we introduce the standard inner products with the induced norms

$$\begin{aligned} (v_h,w_h)_\Xi := \int _{\Xi } v_h w_h ~\text{ d } x, \quad \left\langle v_h,w_h\right\rangle _{\partial \Xi } := \int _{\partial \Xi } v_h w_h ~\text{ d } s, \end{aligned}$$

and the bilinear form in the weak formulation of (1)

$$\begin{aligned} a_h(v_h,w_h) := (\beta \cdot \nabla v_h, w_h )_\Omega + ( \mu \nabla v_h , \nabla w_h )_{\Omega }- \left\langle \mu \nabla v_h \cdot n , w_h \right\rangle _{\partial \Omega }. \end{aligned}$$

We will make use of the stabilizing bilinear forms

$$\begin{aligned} s_{\Omega }(v_h,w_h):= \gamma \sum _{F \in {\mathcal {F}}_i} \int _{F} h (\mu + |\beta |h) \llbracket \nabla v_h \cdot n \rrbracket \cdot \llbracket \nabla w_h \cdot n \rrbracket ~\text{ d }s, \end{aligned}$$

which will act on the discrete solution penalizing the jumps of its gradient across interior faces, and

$$\begin{aligned} s_*(v_h,w_h) :=\gamma _* \left( \left\langle (|\beta |+ \mu h^{-1})v_h, w_h \right\rangle _{\partial \Omega }+ (\mu \nabla v_h, \nabla w_h )_\Omega + s_\Omega (v_h,w_h) \right) , \end{aligned}$$

where \(\gamma \) and \(\gamma _*\) are positive constants that can be heuristically chosen at implementation. They do not play a role in the convergence of the method and most of the time we will include them in the generic constant C. For the data assimilation term we consider the scaled inner product in the data set \(\omega \) given by

$$\begin{aligned} s_{\omega }(v_h,w_h):=((|\beta | h^{-1} + \mu h^{-\zeta }) v_h, w_h)_{ \omega }, \quad \zeta \in [0,2]. \end{aligned}$$

To this we add the stabilizing interior penalty term \(s_\Omega \) to define for conciseness

$$\begin{aligned} s(v_h,w_h) := s_\Omega (v_h,w_h) + s_{\omega }(v_h,w_h), \end{aligned}$$

The idea behind the computational method follows the discretize-then-optimize approach: we first discretize and then formulate the data assimilation problem as a PDE-constrained optimization problem with additional stabilizing terms. Apart from their stabilizing intake, these terms are also chosen such that they vanish at optimal rates. For an overview on this approach to ill-posed problems and more details on the desired properties of discrete stabilizers, we refer the reader to [7]. To be more precise, for an approximation \(u_h \in V_h\) and a discrete Lagrange multiplier \(z_h \in V_h\), we consider the functional

$$\begin{aligned} L_h(u_h,z_h) :={}&\tfrac{1}{2} s_\omega (u_h - {\tilde{U}}_\omega , u_h - {\tilde{U}}_\omega ) + a_h(u_h,z_h) - (f,z_h)_\Omega \\&+ \tfrac{1}{2} s_\Omega (u_h,u_h) - \tfrac{1}{2} s_*(z_h,z_h), \end{aligned}$$

where the first term is measuring the misfit between \(u_h\) and the known perturbed restriction \({\tilde{U}}_\omega = u\vert _{\omega } + \delta \), the next two terms are imposing the weak form of the PDE (1) as a constraint, and the last two terms have stabilizing role and act only on the discrete level.

We look for the saddle points of the Lagrangian \(L_h\) and analyse their convergence to the exact solution. Using the optimality conditions we obtain the finite element method for data assimilation subject to (1), which reads as follows: find \((u_h,z_h) \in [V_h]^2\) such that

$$\begin{aligned} \left\{ \begin{array}{rcl} a_h(u_h,w_h) - s_*(z_h,w_h) &{}=&{} (f,w_h)_\Omega \\ a_h(v_h,z_h) + s(u_h,v_h) &{}=&{}s_{\omega }({\tilde{U}}_\omega ,v_h) \end{array} \right. ,\quad \forall (v_h,w_h) \in [V_h]^2. \end{aligned}$$
(5)

Notice that the exact solution \(u\in H^2(\Omega )\) (with noise \(\delta \equiv 0\)) and the dual variable \(z \equiv 0\) satisfy (5) since the gradient of the exact solution has no jumps across interior faces. Hence the Lagrange multiplier \(z_h\) should converge to zero.

Remark 1

The same finite element method (5) has been proposed in the first part [5] for the diffusion-dominated case; \(s_\Omega \) and \(s_*\) are exactly the stabilizing terms introduced there. However, herein we have increased the penalty coefficient in the data term \(s_\omega \) from \(|\beta | h + \mu \) to \(|\beta | h^{-1} + \mu h^{-\zeta }\). We note that the bounds in [5] hold also for this stronger penalty term, but the sensitivity to perturbations in data increases by a factor of \(h^{-1}\).

Proposition 1

The finite element method (5) has a unique solution \((u_h,z_h) \in [V_h]^2\) and the Euclidean condition number \({\mathcal {K}}_2\) of the system matrix satisfies

$$\begin{aligned} {\mathcal {K}}_2 \le C h^{-4}. \end{aligned}$$

Proof

The proof given in [5,  Proposition 2] holds verbatim. \(\square \)

2.1 Stability region and weight functions

We will exploit the convective term of the PDE to obtain stability in the zone that connects through characteristics to the data region \(\omega \). The objective is to obtain weighted \(L^2\)-estimates in this region that are independent of \(\mu \) (but not of the regularity of the exact solution) with the underlying assumption that \(\mu \ll |\beta |\). To this end we first define the subdomain where we can obtain stability (see Fig. 3 for a sketch) and some weight functions that will be used to define weighted norms. These can be given in explicit form in the simple framework where \(\beta = (\beta _1,0)\) and

$$\begin{aligned} \omega := (0,x)\times (y^{-}, y^{+}) \text { with } x> h \text { and } y^{+} - y^{-} > h. \end{aligned}$$

Let \(\omega _\beta \) denote the closed set of all the points \(p \in {\bar{\Omega }}\) that can be reached through characteristics from \(\omega \), i.e. for which there exists \(s \in {\mathbb {R}}\) such that \(p+s\beta \in \partial \omega \). Similarly to the classical work [15], we define the stability region \({\mathring{\omega }_\beta }\) by cutting off a crosswind layer from \(\omega _\beta \), namely

$$\begin{aligned} {\mathring{\omega }_\beta } := \{p \in {\omega _\beta }:\, \text{ dist }(p,\Omega \setminus \omega _\beta ) \ge c_\lambda h^\frac{1}{2} \ln (1/h)\}, \end{aligned}$$
(6)

with the constant \(c_\lambda \) to be made precise in the following. In our setting, we simply have that \({\mathring{\omega }_\beta } = [0,1]\times [\mathring{y}^{-}, \mathring{y}^{+}]\) for some \(\mathring{y}^{+}> \mathring{y}^{-} > 0\). The crosswind layer and its width are motivated by the subsequent construction of weight functions with a specific decay outside \(\omega _\beta \).

Fig. 3
figure 3

Data set \(\omega \) (gray) and the stability region \({\mathring{\omega }_\beta }\) (hatched)

We will consider different weight functions depending on the direction of the convection field. In the downstream case we let

$$\begin{aligned} \psi _1(x,y):= e^{-x}, \text { when } \beta _1 >0, \end{aligned}$$

and in the upstream case

$$\begin{aligned} \psi _1(x,y):= -e^{-x}, \text { when } \beta _1 <0. \end{aligned}$$

In both cases we have that \(\nabla \psi _1 = (-\psi _1, 0)\). Let then \(\psi _2\in W^{1,\infty }(\Omega )\) be a function satisfying

$$\begin{aligned} \psi _2=\left\{ \begin{array}{ll} 1, &{} \text { in } \mathring{\omega }_\beta \\ {\mathcal {O}}(h^{3}), &{} \text { in } \Omega \setminus \omega _\beta \end{array} \right. ,\quad \beta \cdot \nabla \psi _2=0, \quad |\nabla \psi _2| \le C h^{-\frac{1}{2}}. \end{aligned}$$
(7)

Such a function can be obtained by taking a positive constant \(\lambda \) that will be specified later and letting

$$\begin{aligned} \psi _2(x,y) := \left\{ \begin{array}{ll} \exp ((\mathring{y}^{+} - y)/(\lambda h^\frac{1}{2})), &{} y > \mathring{y}^{+} \\ 1, &{} (x,y) \in {\mathring{\omega }_\beta } \\ \exp ((y - \mathring{y}^{-})/(\lambda h^\frac{1}{2})), &{} y < \mathring{y}^{-}. \end{array} \right. \end{aligned}$$

Note that \(\psi _2\) is only piecewise continuously differentiable. For \(\psi _2\) to decrease sufficiently rapidly to \({\mathcal {O}}(h^3)\) outside of \(\omega _\beta \), we can take

$$\begin{aligned} \text{ dist }({\mathring{\omega }_\beta },\Omega \setminus \omega _\beta ) = \min (y^{+} - \mathring{y}^{+}, \mathring{y}^{-} - y^{-}) \ge 3 \lambda h^\frac{1}{2} \ln (1/h), \end{aligned}$$

which corresponds to \(c_\lambda = 3\lambda \) in the definition of \(\mathring{\omega }_\beta \) given in (6). We thus have that

$$\begin{aligned} |\nabla \psi _2| \le \lambda ^{-1} h^{-\frac{1}{2}}, \end{aligned}$$

and in the subsequent proofs the constant \(\lambda \) will be taken large enough.

We now define the weight function \(\varphi \in W^{1,\infty }(\Omega )\) that will be used in the weighted norms. For the downstream case we take in Sect. 4.3

$$\begin{aligned} \varphi := \psi _1 \psi _2 \in (0,1), \text { when } \beta _1 >0, \end{aligned}$$
(8)

and for the upstream case in Sect. 4.4,

$$\begin{aligned} \varphi := \psi _1 \psi _2 \in (-1,0), \text { when } \beta _1 <0. \end{aligned}$$
(9)

Using the product rule and the fact that \(\beta \cdot \nabla \psi _2=0\), it follows that in both cases we have

$$\begin{aligned} \beta \cdot \nabla \varphi = -|\beta | |\varphi |, \end{aligned}$$
(10)

and

$$\begin{aligned} |\nabla \varphi | \le (1 + \lambda ^{-1} h^{-\frac{1}{2}}) |\varphi |. \end{aligned}$$
(11)

We will denote the inflow and outflow boundaries by \(\partial \Omega ^-\) and \(\partial \Omega ^+\), i.e. \(\beta \cdot n < 0\) on \(\partial \Omega ^-\) and \(\beta \cdot n > 0\) on \(\partial \Omega ^+\). We will also assume that \(\beta \cdot n=0\) can only hold on the boundary of \(\Omega \setminus \omega _\beta \), and that \(\mu \le {\text {Pe}}^{-1}_{\lim } |\beta \cdot n|h\) when \(\beta \cdot n \ne 0\). This is straightforward to verify in the model case of the unit square that we are considering.

3 Preliminaries and the discrete commutator property

We first collect several inequalities that will be used repeatedly. We recall the standard discrete inverse inequality

$$\begin{aligned} \Vert \nabla v_h\Vert _{L^2(K)} \le C h^{-1} \Vert v_h\Vert _{L^2(K)},\quad \forall v_h \in {\mathbb {P}}_1(K), \end{aligned}$$
(12)

see e.g. [11,  Lemma 1.138], the continuous trace inequality

$$\begin{aligned} \Vert v\Vert _{L^2(\partial K)} \le C( h^{-\frac{1}{2}} \Vert v\Vert _{L^2(K)} + h^{\frac{1}{2}} \Vert \nabla v\Vert _{L^2(K)}),\quad \forall v \in H^1(K), \end{aligned}$$
(13)

see e.g. [16], and the discrete trace inequality

$$\begin{aligned} \Vert \nabla v_h \cdot n\Vert _{L^2(\partial K)} \le C h^{-\frac{1}{2}} \Vert \nabla v_h\Vert _{L^2(K)},\quad \forall v_h \in {\mathbb {P}}_1(K). \end{aligned}$$
(14)

We will use standard estimates for the \(L^2\)-projection \(\pi _h:L^2(\Omega ) \mapsto V_h\), namely

$$\begin{aligned} \left\| \pi _h u \right\| _{H^m(\Omega )} \le C \left\| u \right\| _{H^m(\Omega )},\quad u\in H^m(\Omega ),\, m=0,1, \\ \left\| u-\pi _h u \right\| _{H^m(\Omega )} \le C h^{k-m} \left\| u \right\| _{H^k(\Omega )},\quad u\in H^k(\Omega ),\, k=1,2. \end{aligned}$$

Scaling the result in [4,  Lemma 2] we recall the Poincaré-type inequality

$$\begin{aligned} \Vert (\mu ^{\frac{1}{2}} h + |\beta |^{\frac{1}{2}} h^{\frac{3}{2}}) v_h\Vert _{H^1(\Omega )} \le C \gamma ^{-\frac{1}{2}} s(v_h,v_h)^{\frac{1}{2}}, \quad \forall v_h \in V_h. \end{aligned}$$
(15)

Using (13) and approximation estimates we also have the jump inequality

$$\begin{aligned} s_\Omega (\pi _h u,\pi _h u)^{\frac{1}{2}} \le C \gamma ^{\frac{1}{2}} (\mu ^{\frac{1}{2}} h + |\beta |^{\frac{1}{2}} h^{\frac{3}{2}})|u|_{H^2(\Omega )},\quad \forall u\in H^2(\Omega ). \end{aligned}$$
(16)

We also recall that for a Lipschitz domain K—and hence for any element \(K\in {\mathcal {T}}_h\)—a function \(\varphi \) is Lipschitz continuous if and only if \(\varphi \in W^{1,\infty }(K)\). This follows from the proof in [12,  Theorem 4, p. 294] where the extension operator in the third step of the proof is replaced by the extension operator in [19,  Theorem 5, p. 181]. This equivalence holds for more general domains satisfying the minimal smoothness property in [19,  p. 189]. The proof in [12,  Theorem 4, p. 294] also shows that the mean value theorem holds and for any \(x,y\in K\),

$$\begin{aligned} |\varphi (x) - \varphi (y)| \le C_{ex} h_K |\varphi |_{W^{1,\infty }(K)}, \end{aligned}$$
(17)

where \(|\varphi |_{W^{1,\infty }(K)} := \Vert \nabla \varphi \Vert _{\infty ,K}\) and the constant \(C_{ex}>0\) is the norm of the extension operator used.

Lemma 1

For all \(v_h \in V_h\) and \(K \in {\mathcal {T}}_h\), the following inequalities hold

$$\begin{aligned}&\Vert \varphi \Vert _{\infty ,K} \Vert v_h\Vert _K \le C \Vert v_h \varphi \Vert _K, \end{aligned}$$
(18)
$$\begin{aligned}&\Vert v_h \varphi \Vert _{\partial K} \le C h^{-\frac{1}{2}} \Vert v_h \varphi \Vert _K, \end{aligned}$$
(19)

assuming that \((h + \lambda ^{-1} h^{\frac{1}{2}})\) is small enough.

Proof

Let \(x^* \in K\) be such that \(|\varphi (x^*)| = \Vert \varphi \Vert _{\infty ,K}\). Using the triangle inequality we may write

$$\begin{aligned} \Vert \varphi \Vert _{\infty ,K} \Vert v_h\Vert _K \le \Vert \varphi v_h\Vert _K + \Vert (\varphi (x^*) - \varphi ) v_h\Vert _K. \end{aligned}$$

By the mean value theorem (17) we have that

$$\begin{aligned} |\varphi (x^*) - \varphi | \le C_{ex} h |\varphi |_{W^{1,\infty }(K)}, \end{aligned}$$

and by (11) together with the assumption that \(C_{ex}(h + \lambda ^{-1} h^{\frac{1}{2}}) < \frac{1}{2}\) we get

$$\begin{aligned} C_{ex} h |\varphi |_{W^{1,\infty }(K)} \le C_{ex} (h +\lambda ^{-1} h^{\frac{1}{2}}) \Vert \varphi \Vert _{\infty ,K} \le \frac{1}{2} \Vert \varphi \Vert _{\infty ,K}. \end{aligned}$$

It follows that

$$\begin{aligned} \Vert \varphi \Vert _{\infty ,K} \Vert v_h\Vert _K \le \Vert \varphi v_h\Vert _K + \frac{1}{2} \Vert \varphi \Vert _{\infty ,K}\Vert v_h\Vert _K, \end{aligned}$$

from which the claim (18) is immediate. Considering now (19), using the standard element-wise trace inequality (13) we have

$$\begin{aligned} \Vert h^{\frac{1}{2}} v_h \varphi \Vert _{\partial K} \le C (\Vert v_h \varphi \Vert _{ K} + h \Vert \nabla (v_h \varphi )\Vert _K). \end{aligned}$$

We bound the gradient term using (11) and the inverse inequality (12),

$$\begin{aligned} h\Vert \nabla (v_h \varphi )\Vert _K&\le h\Vert v_h \nabla \varphi \Vert _K + h\Vert \varphi \nabla v_h\Vert _K \\&\le (h + \lambda ^{-1} h^{\frac{1}{2}}) \Vert \varphi \Vert _{\infty ,K} \Vert v_h \Vert _K + C \Vert \varphi \Vert _{\infty ,K} \Vert v_h\Vert _K. \end{aligned}$$

We conclude by collecting the terms and using (18). \(\square \)

3.1 Discrete commutator property

We denote by \(i_h\) the Lagrange nodal interpolant. We herein consider a Lipschitz weight function and prove the following super approximation result, also known as the discrete commutator property. This result will be essential to derive local weighted estimates and is similar to the one proven in [2] for smooth compactly supported weight functions. For an introduction to interior estimates we refer the reader to [17].

Lemma 2

Let \(v_h \in V_h\) and \(K\in {\mathcal {T}}_h\). Then for any weight function \(\varphi \in W^{1,\infty }(K)\)

$$\begin{aligned} \Vert v_h \varphi - i_h (v_h \varphi )\Vert _K + h \Vert \nabla (v_h \varphi - i_h (v_h \varphi ))\Vert _K \le C h |\varphi |_{W^{1,\infty }(K)} \Vert v_h\Vert _K. \end{aligned}$$

Proof

We will first show the \(L^2\)-norm estimate

$$\begin{aligned} \Vert v_h \varphi - i_h (v_h \varphi )\Vert _{K} \le C h |\varphi |_{W^{1,\infty }(K)} \Vert v_h\Vert _{K}. \end{aligned}$$

Let \(x^* \in K\) be such that \(|\varphi (x^*)| = \Vert \varphi \Vert _{\infty ,K}\) and let \(R_\varphi = \varphi - \varphi (x^*)\). Note that

$$\begin{aligned} \Vert (1-i_h) (v_h \varphi )\Vert _{K} = \Vert (1-i_h) (v_h R_\varphi )\Vert _{K}. \end{aligned}$$

Observe that \(i_h (v_h \varphi ) = i_h (v_h i_h\varphi )\) and therefore

$$\begin{aligned} \Vert (1-i_h) (v_h R_\varphi )\Vert _{K} = \Vert v_h R_\varphi - i_h (v_h i_h R_\varphi )\Vert _{K}. \end{aligned}$$

By the triangle inequality

$$\begin{aligned} \Vert i_h (v_h i_h R_\varphi ) - v_h R_\varphi \Vert _{K} \le \Vert i_h (v_h i_h R_\varphi ) - v_h i_h R_\varphi \Vert _{K} + \Vert v_h ( i_h R_\varphi - R_\varphi )\Vert _{K}. \end{aligned}$$

For the first term, since \(v_h i_h R_\varphi \in H^1(K)\) we have by standard interpolation that

$$\begin{aligned} \Vert i_h (v_h i_h R_\varphi ) - v_h i_h R_\varphi \Vert _{K} \le C h \Vert \nabla (v_h i_h R_\varphi )\Vert _{K} \end{aligned}$$

and then

$$\begin{aligned} \Vert \nabla (v_h i_h R_\varphi )\Vert _{K} \le |i_h R_\varphi |_{W^{1,\infty }(K)} \Vert v_h\Vert _{K} + \Vert i_h R_\varphi \Vert _{\infty ,K} \Vert \nabla v_h\Vert _{K}. \end{aligned}$$

By inserting \(\nabla \varphi \) and \(\varphi \), respectively, and using interpolation estimates in \(W^{1,\infty }(K)\) [11,  Theorem 1.103] and the mean value theorem (17) we have the following approximation

$$\begin{aligned} h |i_h R_\varphi |_{W^{1,\infty }(K)} + \Vert i_h R_\varphi \Vert _{\infty ,K} \le C h |\varphi |_{W^{1,\infty }(K)}. \end{aligned}$$

Combined with the previous estimate and the inverse inequality (12) this gives that

$$\begin{aligned} \Vert \nabla (v_h i_h R_\varphi )\Vert _{K} \le C|\varphi |_{W^{1,\infty }(K)} \Vert v_h\Vert _{K}. \end{aligned}$$
(20)

For the second term, using again interpolation [11,  Theorem 1.103] we have

$$\begin{aligned} \Vert v_h ( i_h R_\varphi - R_\varphi )\Vert _{K} \le \Vert i_h R_\varphi - R_\varphi \Vert _{\infty ,K} \Vert v_h\Vert _{K} \le C h |\varphi |_{W^{1,\infty }(K)} \Vert v_h\Vert _{K}, \end{aligned}$$

and thus we have shown that

$$\begin{aligned} \Vert v_h \varphi - i_h (v_h \varphi )\Vert _{K} \le C h |\varphi |_{W^{1,\infty }(K)} \Vert v_h\Vert _{K}. \end{aligned}$$

The approximation estimate for the gradient follows by the same arguments. Indeed,

$$\begin{aligned} \Vert \nabla (1 - i_h) (v_h \varphi )\Vert _K&= \Vert \nabla (1 - i_h) (v_h R_\varphi )\Vert _K = \Vert \nabla (v_h R_\varphi ) - \nabla i_h(v_h i_h R_\varphi )\Vert _K \\&\le \Vert \nabla (v_h R_\varphi ) - \nabla (v_h i_h R_\varphi )\Vert _K + \Vert \nabla (v_h i_h R_\varphi ) - \nabla i_h(v_h i_h R_\varphi )\Vert _K. \end{aligned}$$

We first use interpolation and the inverse inequality (12) to get

$$\begin{aligned} \Vert \nabla (v_h ( R_\varphi - i_h R_\varphi ))\Vert _K&\le \Vert v_h \nabla (R_\varphi - i_h R_\varphi )\Vert _K + \Vert (R_\varphi - i_h R_\varphi ) \nabla v_h \Vert _K \\&\le C |\varphi |_{W^{1,\infty }(K)} \Vert v_h\Vert _{K}. \end{aligned}$$

Then we use an inverse inequality followed by interpolation and (20) to obtain

$$\begin{aligned} \Vert \nabla (v_h i_h R_\varphi ) - \nabla i_h(v_h i_h R_\varphi )\Vert _K \le C \Vert \nabla (v_h i_h R_\varphi )\Vert _{K} \le C|\varphi |_{W^{1,\infty }(K)} \Vert v_h\Vert _{K}. \end{aligned}$$

\(\square \)

4 A priori local error estimates

4.1 Consistency and continuity

The following consistency result holds exactly as in the diffusion-dominated case, see [5,  Lemma 4]. We give the proof for the sake of completeness.

Lemma 3

(Consistency) Let \(u \in H^2(\Omega )\) be the solution to (1) and \((u_h,z_h) \in [V_h]^2\) the solution to (5), then

$$\begin{aligned} a_h(\pi _h u - u_h,w_h) + s_*(z_h,w_h) = a_h(\pi _h u - u,w_h), \end{aligned}$$

and

$$\begin{aligned} -a_h(v_h,z_h) + s(\pi _h u - u_h,v_h) = s_\Omega (\pi _h u - u,v_h) + s_\omega (\pi _h u - {\tilde{U}}_\omega ,v_h), \end{aligned}$$

for all \((v_h,w_h)\in [V_h]^2\).

Proof

The first claim follows from the definition of \(a_h\), since

$$\begin{aligned} a_h(u_h,w_h) - s_*(z_h,w_h) = (f,w_h)_\Omega = (-\mu \Delta u + \beta \cdot \nabla u, w_h)_\Omega = a_h(u,w_h), \end{aligned}$$

where in the last equality we integrated by parts. The second claim follows similarly from

$$\begin{aligned} a_h(v_h,z_h) + s(u_h,v_h) = s_\omega ({\tilde{U}}_\omega ,v_h), \end{aligned}$$

which combined with the fact that \(s_\Omega (u,v_h)=0\) leads to

$$\begin{aligned} -a_h(v_h,z_h) + s(\pi _h u - u_h,v_h)&= s(\pi _h u,v_h) - s_\omega (\tilde{U}_\omega ,v_h) \\&= s_\Omega (\pi _h u - u,v_h) + s_\omega (\pi _h u - {\tilde{U}}_\omega ,v_h). \end{aligned}$$

\(\square \)

We now introduce the stabilization norm on \([V_h]^2\) by combining the primal and dual stabilizers

$$\begin{aligned} \Vert (v_h,w_h)\Vert ^2_s:= s(v_h,v_h) + s_*(w_h,w_h), \end{aligned}$$

and the “continuity norm” defined on \(H^{\frac{3}{2}+\varepsilon }(\Omega )\), for any \(\varepsilon >0\),

$$\begin{aligned} \Vert v \Vert _\sharp := \Vert |\beta |^\frac{1}{2} h^{-\frac{1}{2}} v \Vert _{\Omega } + \Vert |\beta |^{\frac{1}{2}} h^{\frac{1}{2}} \nabla v\Vert _\Omega + \Vert h^{\frac{1}{2}} \mu ^{\frac{1}{2}} \nabla v \cdot n\Vert _{\partial \Omega }. \end{aligned}$$

From the jump inequality (16), standard approximation bounds for \(\pi _h\) and the trace inequality (13), it follows that for \(u \in H^2(\Omega )\)

$$\begin{aligned} \Vert (u - \pi _h u,0)\Vert _s + \Vert u - \pi _h u \Vert _\sharp \le C (\mu ^{\frac{1}{2}} h +|\beta |^{\frac{1}{2}} h^{\frac{3}{2}}) |u|_{H^2(\Omega )}. \end{aligned}$$
(21)

We also define the orthogonal space

$$\begin{aligned} V_h^\perp := \{ v \in H^2(\Omega ): (v,w_h)_\Omega = 0,\quad \forall w_h \in V_h\}. \end{aligned}$$

Lemma 4

(Continuity) Let \(v \in V_h^{\perp }\) and \(w_h \in V_h\), then

$$\begin{aligned} a_h(v,w_h) \le C \Vert v \Vert _\sharp \Vert (0,w_h)\Vert _s. \end{aligned}$$

Proof

Integrating by parts in the convective term of \(a_h\) and using \(\nabla \cdot \beta = 0\) leads to

$$\begin{aligned} a_h(v,w_h) = -(v,\beta \cdot \nabla w_h)_\Omega + \left\langle v \beta \cdot n,w_h \right\rangle _{\partial \Omega } + ( \mu \nabla v , \nabla w_h )_{\Omega } -\left\langle \mu \nabla v \cdot n , w_h \right\rangle _{\partial \Omega }. \end{aligned}$$

For the first term we recall the discrete approximation estimate that holds for any piecewise linear \(\beta \), see e.g. [6,  Theorem 2.2],

$$\begin{aligned} \inf _{x_h \in V_h} \Vert h^{\frac{1}{2}} (\beta \cdot \nabla w_h - x_h)\Vert _\Omega&\le C \left( \sum _{F \in \mathcal {F}_i} \Vert h \llbracket \beta \cdot \nabla w_h \rrbracket \Vert _{F}^2\right) ^{\frac{1}{2}} \nonumber \\&\le C |\beta |^\frac{1}{2} \gamma ^{-\frac{1}{2}} s_\Omega (w_h,w_h)^{\frac{1}{2}} \end{aligned}$$
(22)

and use orthogonality to obtain

$$\begin{aligned} -(v,\beta \cdot \nabla w_h)_\Omega \le \Vert h^{-\frac{1}{2}} v\Vert _\Omega \inf _{x_h \in V_h} \Vert h^{\frac{1}{2}} (\beta \cdot \nabla w_h -x_h)\Vert _\Omega \le C \Vert v \Vert _\sharp \Vert (0,w_h)\Vert _s. \end{aligned}$$

For the remaining terms, applying the Cauchy–Schwarz inequality we see that

$$\begin{aligned} \left\langle v \beta \cdot n,w_h \right\rangle _{\partial \Omega }+ ( \mu \nabla v , \nabla w_h )_{\Omega }- \left\langle \mu \nabla v \cdot n , w_h \right\rangle _{\partial \Omega } \le C \Vert v \Vert _\sharp \Vert (0,w_h)\Vert _s. \end{aligned}$$

\(\square \)

Note that the proof of the above continuity estimate holds for any divergence-free piecewise linear velocity field \(\beta \). To address the case of a general velocity field \(\beta \in W^{1,\infty }(\Omega )\) one can use a similar argument by considering its piecewise linear approximation as in [5,  Lemma 5]. Assuming that \(\beta \) is divergence-free, the constant would be proportional to \(h Pe(h)^\frac{1}{2} |\beta |_{1,\infty } / \Vert \beta \Vert _{\infty }\), otherwise it would be proportional to \(Pe(h)^\frac{1}{2} |\beta |_{1,\infty } / \Vert \beta \Vert _{\infty }\).

4.2 Convergence of regularization

We now prove optimal convergence for the stabilizing and data assimilation terms.

Proposition 2

(Convergence of regularization) Let \(u \in H^2(\Omega )\) be the solution of (1) and \((u_h,z_h) \in [V_h]^2\) the solution to (5), then there holds

$$\begin{aligned} \Vert (\pi _h u - u_h,z_h)\Vert _s \le C(\mu ^{\frac{1}{2}} h + |\beta |^{\frac{1}{2}} h^{\frac{3}{2}}) (|u|_{H^2(\Omega )} + h^{-2}\Vert \delta \Vert _\omega ). \end{aligned}$$

Proof

Denoting \(e_h = \pi _h u - u_h\) we have that

$$\begin{aligned} \Vert (e_h,z_h)\Vert _s^2 = a_h(e_h,z_h) + s_*(z_h,z_h) - a_h(e_h,z_h) + s(e_h,e_h). \end{aligned}$$

Using both claims in Lemma 3 we may write

$$\begin{aligned} \Vert (e_h,z_h)\Vert _s^2 = a_h(\pi _h u - u,z_h) +s_\Omega (\pi _h u - u,e_h)+s_\omega (\pi _h u - {\tilde{U}}_\omega ,e_h). \end{aligned}$$

Since \(\pi _h u - u \in V_h^\perp \) we have by Lemma 4 that

$$\begin{aligned} a_h(\pi _h u - u,z_h) \le C \Vert \pi _h u - u\Vert _\sharp \Vert (0,z_h)\Vert _s. \end{aligned}$$

We bound the other terms using the Cauchy–Schwarz inequality

$$\begin{aligned} s_\Omega (\pi _h u - u,e_h)+s_\omega (\pi _h u - {\tilde{U}}_\omega ,e_h) \le (\Vert (\pi _h u - u,0)\Vert _s + (|\beta | h^{-1} + \mu h^{-\zeta })^{\frac{1}{2}} \Vert \delta \Vert _\omega ) \Vert (e_h,0)\Vert _s. \end{aligned}$$

Collecting the above bounds we have

$$\begin{aligned} \Vert (e_h,z_h)\Vert _s^2 \le C(\Vert \pi _h u - u\Vert _\sharp +\Vert (\pi _h u - u,0)\Vert _s + (|\beta | h^{-1} + \mu h^{-\zeta })^{\frac{1}{2}} \Vert \delta \Vert _\omega ) \Vert (e_h,z_h)\Vert _s \end{aligned}$$

and the claim follows by applying the approximation inequality (21). \(\square \)

Remark 2

Compared to the result in the diffusion-dominated case [5,  Proposition 3], the sensitivity to data perturbations has increased by a factor of \(h^{-1}\). This is due to the stronger penalty in the data term \(s_\omega \) (c.f. Remark 1).

4.3 Downstream estimates

In this case we consider \(\beta = (\beta _1,0)\) with \(\beta _1 > 0\) and the data set

$$\begin{aligned} \omega = (0,x)\times (y^{-}, y^{+}) \end{aligned}$$

touching part of the inflow boundary \(\partial \Omega ^-\). We aim to obtain control of the following weighted triple norm defined on \(V_h\),

$$\begin{aligned} |\!|\!| v_h|\!|\!|_\varphi ^2 := \Vert |\beta |^\frac{1}{2} v_h \varphi ^{\frac{1}{2}}\Vert _{\Omega }^2 + \Vert \mu ^{\frac{1}{2}} \nabla v_h \varphi ^{\frac{1}{2}}\Vert _{\Omega }^2 + \Vert |\beta \cdot n|^{\frac{1}{2}} v_h \varphi ^{\frac{1}{2}}\Vert _{ \partial \Omega ^+}^2, \end{aligned}$$
(23)

where \(\varphi \) is given by (8). Since \(\varphi \in (0,1)\), we will often use that \(\Vert \cdot \varphi \Vert _\Omega \le \Vert \cdot \varphi ^\frac{1}{2} \Vert _\Omega \). We first consider \(v_h \varphi \) as a test function in the weak bilinear form \(a_h\) and obtain the following bound.

Lemma 5

There exist \(\alpha >0\) and \(h_0>0\) such that for all \(h<h_0\) and \(v_h \in V_h\) we have

$$\begin{aligned} \alpha |\!|\!| v_h|\!|\!|_\varphi ^2 \le a_h(v_h, v_h \varphi ) + C \Vert (v_h,0)\Vert ^2_s. \end{aligned}$$

Proof

We start with the convective term. Since \(\nabla \cdot \beta = 0\), the divergence theorem leads to

$$\begin{aligned} 2 (\beta \cdot \nabla v_h, v_h \varphi )_\Omega = \left\langle v_h \beta \cdot n, v_h \varphi \right\rangle _{\partial \Omega } - (v_h, v_h \beta \cdot \nabla \varphi )_\Omega . \end{aligned}$$

Then combining with (10) we have

$$\begin{aligned} (\beta \cdot \nabla v_h, v_h \varphi )_\Omega = \frac{1}{2} \left( \left\langle v_h \beta \cdot n, v_h \varphi \right\rangle _{\partial \Omega } + |\beta | \Vert v_h \varphi ^{\frac{1}{2}}\Vert ^2_\Omega \right) . \end{aligned}$$

We split the boundary term into inflow and outflow

$$\begin{aligned} \left\langle v_h \beta \cdot n, v_h \varphi \right\rangle _{\partial \Omega } = -\Vert |\beta \cdot n|^{\frac{1}{2}} v_h \varphi ^{\frac{1}{2}} \Vert _{\partial \Omega ^-}^2 + \Vert |\beta \cdot n|^{\frac{1}{2}} v_h \varphi ^{\frac{1}{2}} \Vert _{\partial \Omega ^+}^2, \end{aligned}$$

and write

$$\begin{aligned} \frac{1}{2} \left( \Vert |\beta \cdot n|^{\frac{1}{2}} v_h \varphi ^{\frac{1}{2}} \Vert _{\partial \Omega ^+}^2 + |\beta | \Vert v_h \varphi ^{\frac{1}{2}}\Vert ^2_\Omega \right) =(\beta \cdot \nabla v_h, v_h \varphi )_\Omega + \frac{1}{2} \Vert |\beta \cdot n|^{\frac{1}{2}} v_h \varphi ^{\frac{1}{2}} \Vert _{\partial \Omega ^-}^2. \end{aligned}$$

Splitting now the inflow boundary with respect to the closed set \(\omega _\beta \) and using the discrete trace inequality (13) in \(\omega \), and the weight decay (7) together with a standard global trace inequality for \(H^1\) functions outside, we have that

$$\begin{aligned} \begin{aligned} \Vert |\beta \cdot n|^{\frac{1}{2}} v_h \varphi ^{\frac{1}{2}}\Vert _{\partial \Omega ^-}&\le C |\beta |^{\frac{1}{2}} (\Vert v_h \varphi ^{\frac{1}{2}}\Vert _{\partial \Omega ^- \cap \omega _\beta } + \Vert v_h \varphi ^{\frac{1}{2}}\Vert _{\partial \Omega ^- \setminus \omega _\beta })\\&\le C |\beta |^{\frac{1}{2}} h^{-\frac{1}{2}} \Vert v_h \Vert _{\omega } + C |\beta |^{\frac{1}{2}} h^{\frac{3}{2}} \Vert v_h\Vert _{H^1(\Omega )} \\&\le C \gamma ^{-\frac{1}{2}} \Vert (v_h,0)\Vert _s, \end{aligned} \end{aligned}$$
(24)

where in the last step we used the Poincaré-type inequality (15). Hence we obtain control over the convective terms in the triple weighted norm

$$\begin{aligned} \frac{1}{2} \left( \Vert |\beta \cdot n|^{\frac{1}{2}} v_h \varphi ^{\frac{1}{2}} \Vert _{\partial \Omega ^+}^2 \!+\! |\beta | \Vert v_h \varphi ^{\frac{1}{2}}\Vert ^2_\Omega \right) \!\!\le \!\! (\beta \cdot \nabla v_h, v_h \varphi )_\Omega \!+\! C \gamma ^{-1} \Vert (v_h,0)\Vert ^2_s. \end{aligned}$$
(25)

Let us consider the terms in \(a_h(v_h, v_h \varphi )\) corresponding to the diffusion operator, starting with

$$\begin{aligned} (\mu \nabla v_h,\nabla (v_h \varphi ))_{\Omega } = \Vert \mu ^{\frac{1}{2}} \nabla v_h \varphi ^{\frac{1}{2}}\Vert _{\Omega }^2 + (\mu \nabla v_h, v_h \nabla \varphi )_\Omega , \end{aligned}$$

which we rearrange as

$$\begin{aligned} \Vert \mu ^{\frac{1}{2}} \nabla v_h \varphi ^{\frac{1}{2}}\Vert _{\Omega }^2 = (\mu \nabla v_h,\nabla (v_h \varphi ))_{\Omega } - (\mu \nabla v_h, v_h \nabla \varphi )_\Omega . \end{aligned}$$

Bounding \(\nabla \varphi \) by (11) and using Cauchy–Schwarz together with \(\mu \le |\beta | h\) we have that

$$\begin{aligned} | (\mu \nabla v_h, v_h \nabla \varphi )_\Omega |&\le \mu (|\nabla v_h \cdot \nabla \varphi |, v_h)_\Omega \\&\le \mu (1 + \lambda ^{-1} h^{-\frac{1}{2}}) (|\nabla v_h| \varphi ^{\frac{1}{2}}, v_h \varphi ^{\frac{1}{2}})_{\Omega }\\&\le \frac{1}{3} \Vert \mu ^{\frac{1}{2}} \nabla v_h \varphi ^{\frac{1}{2}}\Vert _{\Omega }^2 + C(h + \lambda ^{-2}) |\beta | \Vert v_h \varphi ^{\frac{1}{2}}\Vert ^2_{\Omega }. \end{aligned}$$

We split the boundary term \(\left\langle \mu \nabla v_h \cdot n , v_h \varphi \right\rangle _{\partial \Omega }\) into inflow and outflow again. Similarly to (24) we have that

$$\begin{aligned} \left\langle \mu \nabla v_h \cdot n, v_h \varphi \right\rangle _{ \partial \Omega ^-} \le C h \gamma ^{-1} \Vert (v_h,0)\Vert ^2_s. \end{aligned}$$

For the outflow boundary term we use Cauchy–Schwarz and a trace inequality to obtain

$$\begin{aligned} \left\langle \mu \nabla v_h \cdot n, v_h \varphi \right\rangle _{ \partial \Omega ^+}&\le \Vert \mu ^{\frac{1}{2}} \nabla v_h \cdot n \varphi ^{\frac{1}{2}}\Vert _{ \partial \Omega ^+} \Vert \mu ^{\frac{1}{2}} v_h \varphi ^{\frac{1}{2}}\Vert _{ \partial \Omega ^+} \\&\le C h^{-\frac{1}{2}} \Vert \mu ^{\frac{1}{2}} \nabla v_h \varphi ^{\frac{1}{2}}\Vert _{\Omega } Pe_{\lim }^{-\frac{1}{2}} h^{\frac{1}{2}} \Vert |\beta \cdot n|^{\frac{1}{2}} v_h \varphi ^{\frac{1}{2}}\Vert _{ \partial \Omega ^+} \\&\le {\textstyle \frac{1}{3}} \big \Vert \mu ^{\frac{1}{2}} \nabla v_h \varphi ^{\frac{1}{2}}\big \Vert _{\Omega }^2 + Pe^{-1}_{\lim } \big \Vert |\beta \cdot n|^{\frac{1}{2}} v_h \varphi ^{\frac{1}{2}}\big \Vert _{ \partial \Omega ^+}^2. \end{aligned}$$

We denote the part of the boundary where \(\beta \cdot n = 0\) by \(\partial \Omega ^0\) and use the assumption that \(\partial \Omega ^0\) is away from \(\omega _\beta \), meaning that the weight function \(\varphi \) is \({\mathcal {O}}(h^3)\) there. We use trace inequalities and (15) to bound

$$\begin{aligned} \left\langle \mu \nabla v_h \cdot n , v_h \varphi \right\rangle _{\partial \Omega ^0} \le C \gamma ^{-1} \Vert (v_h,0)\Vert ^2_s. \end{aligned}$$

Collecting the above bounds we obtain that

$$\begin{aligned} \tfrac{1}{3} \Vert \mu ^{\frac{1}{2}} \nabla v_h \varphi ^{\frac{1}{2}}\Vert _{\Omega }^2 \le {}&(\mu \nabla v_h,\nabla (v_h \varphi ))_{\Omega } - \left\langle \mu \nabla v_h \cdot n , v_h \varphi \right\rangle _{\partial \Omega }\\&+ C (h + \lambda ^{-2} + Pe^{-1}_{\lim }) ( \Vert |\beta |^{\frac{1}{2}} v_h \varphi ^{\frac{1}{2}}\Vert _{\Omega }^2 + \Vert |\beta \cdot n|^{\frac{1}{2}} v_h \varphi ^{\frac{1}{2}}\Vert ^2_{ \partial \Omega ^+})\\&+ C \gamma ^{-1} \Vert (v_h,0)\Vert ^2_s. \end{aligned}$$

We conclude by combining this with (25) and assuming that h is small enough and \(Pe_{\lim }\) are large enough (thus absorbing the convective terms from the rhs into the lhs). \(\square \)

Now we refine the control over the triple norm \(|\!|\!| v_h|\!|\!|_\varphi \) by taking the projection \(\pi _h (v_h \varphi ) \in V_h\) as a test function.

Corollary 1

There exist \(\alpha >0\) and \(h_0>0\) such that for all \(h<h_0\) and \(v_h \in V_h\) we have

$$\begin{aligned} \alpha |\!|\!| v_h|\!|\!|_\varphi ^2 \le a_h(v_h, \pi _h (v_h\varphi ) ) + C \Vert (v_h,0)\Vert ^2_s. \end{aligned}$$

Proof

Since

$$\begin{aligned} a_h(v_h, \pi _h (v_h \varphi ) ) = a_h(v_h, (\pi _h-1) (v_h \varphi ) ) + a_h(v_h, v_h \varphi ), \end{aligned}$$

we must bound \(a_h(v_h, (\pi _h-1) (v_h \varphi ) )\) in a suitable way. Writing out the terms we have

$$\begin{aligned} a_h(v_h, (\pi _h-1) (v_h \varphi ) )&= (\beta \cdot \nabla v_h, (\pi _h-1) (v_h \varphi ) )_{\Omega } + (\mu \nabla v_h, \nabla (\pi _h-1) (v_h \varphi ) )_{\Omega } \nonumber \\&\quad \quad - \left<\mu \nabla v_h \cdot n, (\pi _h-1) (v_h \varphi ) \right>_{\partial \Omega } = I + II + III. \end{aligned}$$

Let us consider the convection term first, and use orthogonality combined with (22)

$$\begin{aligned} I = (\beta \cdot \nabla v_h, (\pi _h-1) (v_h\varphi ) )_{\Omega }&\le C |\beta |^\frac{1}{2} \gamma ^{-\frac{1}{2}} \Vert (v_h,0)\Vert _s h^{-\frac{1}{2}} \Vert (\pi _h-1) (v_h\varphi )\Vert _{\Omega } \\&\le C |\beta |^\frac{1}{2} \gamma ^{-\frac{1}{2}} \Vert (v_h,0)\Vert _s h^{-\frac{1}{2}} \Vert (i_h-1) (v_h\varphi )\Vert _{\Omega }. \end{aligned}$$

Integrating by parts and using that \(\Delta v_h = 0\) on every element K we obtain by the trace inequality (13) and the assumption \(Pe(h)>1\) that

$$\begin{aligned}II + III&= \sum _{F \in {\mathcal {F}}_i} \int _{F} \mu \llbracket \nabla v_h \cdot n \rrbracket (\pi _h-1) (v_h \varphi ) ~\text{ d }s \\&\le C |\beta |^{\frac{1}{2}} \gamma ^{-\frac{1}{2}} s_\Omega (v_h,v_h)^{\frac{1}{2}} (h^{-\frac{1}{2}} \Vert (\pi _h-1)(v_h\varphi )\Vert _{\Omega }\\&\quad + h^{\frac{1}{2}} \Vert \nabla (\pi _h-1) (v_h\varphi )\Vert _{\Omega }). \end{aligned}$$

Notice that \(i_h (v_h\varphi ) = \pi _h(i_h (v_h\varphi ))\) and the stability of the projection gives

$$\begin{aligned} \Vert \nabla (\pi _h-i_h) (v_h\varphi )\Vert _{\Omega } = \Vert \nabla \pi _h(1-i_h) (v_h\varphi )\Vert _{\Omega } \le C \Vert \nabla (1-i_h) (v_h\varphi )\Vert _{\Omega }, \end{aligned}$$
(26)

and hence

$$\begin{aligned} \begin{aligned} h^{\frac{1}{2}} \Vert \nabla (\pi _h-1) (v_h\varphi )\Vert _{\Omega }&\le h^{\frac{1}{2}} (\Vert \nabla (\pi _h-i_h) (v_h\varphi )\Vert _{\Omega }+\Vert \nabla (i_h-1) (v_h\varphi )\Vert _{\Omega })\\&\le C h^{\frac{1}{2}} \Vert \nabla (i_h-1) (v_h \varphi )\Vert _{\Omega }. \end{aligned} \end{aligned}$$
(27)

Since

$$\begin{aligned} h^{-\frac{1}{2}} \Vert (\pi _h-1) (v_h \varphi )\Vert _{\Omega } \le C h^{-\frac{1}{2}} \Vert (i_h-1) (v_h\varphi )\Vert _{\Omega }, \end{aligned}$$

collecting the contributions above we see that

$$\begin{aligned} I+II+III \le C |\beta |^{\frac{1}{2}} \gamma ^{-\frac{1}{2}} \Vert (v_h,0)\Vert _s \underbrace{(h^{-\frac{1}{2}} \Vert (i_h-1) (v_h\varphi )\Vert _{\Omega } + h^{\frac{1}{2}} \Vert \nabla (i_h-1) (v_h\varphi )\Vert _{\Omega })}_{IV}, \end{aligned}$$

and hence

$$\begin{aligned} a_h(v_h, (\pi _h-1) (v_h \varphi )) = I+II+III \le C \gamma ^{-1} \Vert (v_h,0)\Vert ^2_s + |\beta | IV^2. \end{aligned}$$

The discrete commutator property Lemma 2 together with the \(\varphi \)-bounds (11) and (18) give that

$$\begin{aligned} IV \le C h^{\frac{1}{2}} \Vert \nabla \varphi \Vert _{\infty ,\Omega } \Vert v_h\Vert _\Omega \le C (h^{\frac{1}{2}}+\lambda ^{-1}) \Vert v_h \varphi \Vert _\Omega . \end{aligned}$$
(28)

Since \(\varphi \in (0,1)\) and \(\varphi < \varphi ^\frac{1}{2}\), it follows that for h small enough and \(\lambda \) large enough, given some \(\alpha >0\) we have

$$\begin{aligned} |\beta | IV^2 \le \frac{\alpha }{2} |\!|\!| v_h|\!|\!|_\varphi ^2. \end{aligned}$$
(29)

Collecting the estimates for \(a_h(v_h, (\pi _h-1) (v_h \varphi ) )\) and using Lemma 5 we see that

$$\begin{aligned} a_h(v_h, \pi _h (v_h \varphi ) )&= a_h(v_h, (\pi _h-1) (v_h \varphi ) ) + a_h(v_h, v_h \varphi )\\&\ge \frac{\alpha }{2} |\!|\!| v_h|\!|\!|_\varphi ^2 - C \gamma ^{-1} \Vert (v_h,0)\Vert ^2_s, \end{aligned}$$

from which we conclude by renaming \(\alpha /2\) as \(\alpha \). \(\square \)

Lemma 6

For all \(v_h \in V_h\) there holds

$$\begin{aligned} \Vert (0,\pi _h (v_h \varphi ))\Vert ^2_s \le C (|\!|\!| v_h|\!|\!|^2_\varphi + \Vert (v_h,0)\Vert ^2_s). \end{aligned}$$

Proof

First note that by triangle inequalities we have that up to a constant

$$\begin{aligned} \Vert (0,\pi _h (v_h \varphi ))\Vert _s&\le \Vert \mu ^{\frac{1}{2}} \nabla (\pi _h - 1) (v_h \varphi ))\Vert _\Omega + \Vert \mu ^{\frac{1}{2}} \nabla (v_h \varphi ))\Vert _\Omega \\&\quad + (|\beta |+ \mu h^{-1})^{\frac{1}{2}} (\Vert (\pi _h-1) (v_h \varphi )\Vert _{\partial \Omega } + \Vert v_h\varphi \Vert _{\partial \Omega } )\\&\quad + s_\Omega (\pi _h (v_h \varphi ), \pi _h (v_h \varphi ))^{\frac{1}{2}}. \end{aligned}$$

We bound these terms line by line. Using (27), (28), (11) and (18) we bound the first line by

$$\begin{aligned} |\beta |^{\frac{1}{2}} h^{\frac{1}{2}} \Vert \nabla (i_h-1) (v_h \varphi )\Vert _{\Omega } +\Vert \mu ^{\frac{1}{2}} \nabla v_h \varphi \Vert _\Omega + 2 |\beta |^{\frac{1}{2}} (h^\frac{1}{2} + \lambda ^{-1}) \Vert v_h \varphi \Vert _\Omega \le C |\!|\!| v_h|\!|\!|_\varphi . \end{aligned}$$

For the second line, using a global trace inequality and the stability of the projection we have

$$\begin{aligned} (|\beta |+ \mu h^{-1})^{\frac{1}{2}} \Vert (\pi _h - 1) (v_h \varphi )\Vert _{\partial \Omega } \le C (|\beta |+ \mu h^{-1})^{\frac{1}{2}} \Vert v_h \varphi \Vert _{\Omega } \le C |\beta |^{\frac{1}{2}} \Vert v_h \varphi ^{\frac{1}{2}}\Vert _{\Omega }. \end{aligned}$$

Splitting the boundary into inflow and outflow and using (24), we have

$$\begin{aligned} (|\beta |+ \mu h^{-1})^{\frac{1}{2}} \Vert v_h \varphi \Vert _{\partial \Omega } \le C |\beta |^{\frac{1}{2}} \Vert v_h \varphi \Vert _{\partial \Omega } \le C \Vert (v_h,0)\Vert _s + C|\!|\!| v_h|\!|\!|_\varphi . \end{aligned}$$

For the contribution of the jump term, we insert \(i_h\) and bound

$$\begin{aligned} \begin{aligned} s_\Omega (\pi _h (v_h \varphi ), \pi _h (v_h \varphi ))^{\frac{1}{2}}&\le s_\Omega ((\pi _h - i_h) (v_h \varphi ), (\pi _h - i_h) (v_h\varphi ))^{\frac{1}{2}}\\&\quad + s_\Omega ((i_h-1) (v_h \varphi ), (i_h - 1) (v_h \varphi ))^{\frac{1}{2}}\\&\quad + s_\Omega (v_h \varphi , v_h \varphi )^{\frac{1}{2}}. \end{aligned} \end{aligned}$$
(30)

We first observe that using (14) and (26), we can bound the first term by

$$\begin{aligned} s_\Omega ((\pi _h - i_h) (v_h \varphi ), (\pi _h - i_h) (v_h\varphi ))^{\frac{1}{2}}&\le |\beta |^{\frac{1}{2}} h^{\frac{1}{2}} \Vert \nabla (\pi _h - i_h) (v_h\varphi )\Vert _{\Omega } \\&\le |\beta |^{\frac{1}{2}} h^{\frac{1}{2}} \Vert \nabla (i_h - 1) (v_h\varphi )\Vert _{\Omega }\\&\le C |\beta |^{\frac{1}{2}} h^{\frac{1}{2}} \Vert \nabla \varphi \Vert _{\infty ,\Omega } \Vert v_h\Vert _\Omega \\&\le C |\beta |^{\frac{1}{2}} (h^{\frac{1}{2}}+\lambda ^{-1}) \Vert v_h \varphi \Vert _\Omega , \end{aligned}$$

where for the last two inequalities we used the discrete commutator property Lemma 2 together with the \(\varphi \)-bounds (11) and (18). Since \(\varphi \) is Lipschitz continuous on K, \(\varphi |_F\) is also Lipschitz continuous, and so \(\varphi |_F \in W^{1,\infty }(F)\). The restriction of the nodal interpolant on K onto F gives the nodal interpolant on F, hence applying Lemma 2 to F instead of K we have the discrete commutator estimate

$$\begin{aligned} h \Vert n \cdot \nabla (i_h-1) (v_h \varphi )\Vert _F \le C h |\varphi |_{W^{1,\infty }(K)} \Vert v_h\Vert _F \le C (h^{\frac{1}{2}}+\lambda ^{-1}) \Vert v_h \varphi \Vert _K, \end{aligned}$$

where in the last step we used (11) and (18) together with a discrete trace inequality. After summation we have that

$$\begin{aligned} s_\Omega ((i_h-1) (v_h \varphi ), (i_h - 1) (v_h \varphi ))^{\frac{1}{2}} \le C |\!|\!| v_h|\!|\!|_\varphi . \end{aligned}$$

Finally we use the trivial bound (since \(|\varphi | < 1\))

$$\begin{aligned} s_\Omega (v_h \varphi , v_h \varphi )^{\frac{1}{2}} \le s_\Omega (v_h, v_h)^{\frac{1}{2}}. \end{aligned}$$

We conclude the proof by summing up the above contributions. \(\square \)

We can now prove the following error estimate showing that in the zone \({\mathring{\omega }_\beta }\) where we have stability, the convergence in the \(L^2\)-norm is of order \({\mathcal {O}}(h^{\frac{3}{2}})\) on unstructured meshes, which is known to be optimal.

Theorem 1

Let \(u \in H^2(\Omega )\) be the solution of (1) and \((u_h,z_h) \in [V_h]^2\) the solution to (5). Then there exists \(h_0>0\) such that for all \(h<h_0\) with \(Pe(h) \gtrsim 1\) there holds

$$\begin{aligned} |\!|\!| u - u_h|\!|\!|_\varphi \le C (|\beta |^{\frac{1}{2}} h^{\frac{3}{2}} |u|_{H^2(\Omega )} + |\beta |^{\frac{1}{2}} h^{-\frac{1}{2}} \Vert \delta \Vert _\omega ). \end{aligned}$$

Proof

Let \(e_h = \pi _h u - u_h \in V_h\), then \(u - u_h = u- \pi _h u + e_h\). By Corollary 1 there exists \(\alpha >0\) such that

$$\begin{aligned} \alpha |\!|\!| e_h|\!|\!|_\varphi ^2 \le a_h(e_h, \pi _h (e_h\varphi ) ) + C \gamma ^{-1} \Vert (e_h,0)\Vert ^2_s. \end{aligned}$$

By Cauchy–Schwarz combined with Lemma 6 and Young’s inequality

$$\begin{aligned} -s_*(z_h, \pi _h (e_h\varphi )) \le C \varepsilon _1^{-1} \Vert (0,z_h)\Vert ^2_s + \varepsilon _1(|\!|\!| e_h|\!|\!|_\varphi ^2 + \Vert (e_h,0)\Vert ^2_s), \end{aligned}$$

for some \(0< \varepsilon _1 < \alpha /2\), hence

$$\begin{aligned} \frac{\alpha }{2} |\!|\!| e_h|\!|\!|_\varphi ^2 \le a_h(e_h, \pi _h (e_h\varphi )) + s_*(z_h, \pi _h (e_h\varphi )) + C \varepsilon _1^{-1} \Vert (0,z_h)\Vert ^2_s + C \Vert (e_h,0)\Vert ^2_s. \end{aligned}$$

Applying the first equality of the consistency Lemma 3 we obtain

$$\begin{aligned} \frac{\alpha }{2} |\!|\!| e_h|\!|\!|_\varphi ^2 \le a_h(\pi _h u - u, \pi _h (e_h\varphi ) ) + C \varepsilon _1^{-1} \Vert (0,z_h)\Vert ^2_s + C\Vert (e_h,0)\Vert ^2_s. \end{aligned}$$
(31)

Since \(\pi _h u - u \in V_h^\perp \) we may apply Lemma 4 to bound

$$\begin{aligned} a_h(\pi _h u - u, \pi _h (e_h\varphi ) ) \le C \Vert \pi _h u - u\Vert _\sharp \Vert (0,\pi _h (e_h\varphi ))\Vert _s. \end{aligned}$$

From Lemma 6 and Young’s inequality we thus have that for some \(\varepsilon _2>0\),

$$\begin{aligned} a_h(\pi _h u - u, \pi _h (e_h \varphi )) \le C ((1 + \varepsilon _2^{-1}) \Vert \pi _h u - u\Vert _\sharp ^2 + \Vert (e_h,0)\Vert _s^2 + \varepsilon _2 |\!|\!| e_h|\!|\!|_\varphi ^2). \end{aligned}$$

Taking \(\varepsilon _2 < \alpha /4\) and combining the above bound with (31) we see that

$$\begin{aligned} \frac{\alpha }{4} |\!|\!| e_h|\!|\!|_\varphi ^2 \le C((1+ \varepsilon _2^{-1}) \Vert \pi _h u - u\Vert _\sharp ^2 + (1+\varepsilon _1^{-1})\Vert (e_h,z_h)\Vert _s^2). \end{aligned}$$

Since \(\varepsilon _{1,2}\) are independent of h we can absorb them in the generic constant C and using the approximation inequality (21) together with Proposition 2, we conclude that

$$\begin{aligned} |\!|\!| e_h|\!|\!|_\varphi&\le C (\mu ^{\frac{1}{2}} h + |\beta |^{\frac{1}{2}} h^{\frac{3}{2}}) (|u|_{H^2(\Omega )} + h^{-2} \Vert \delta \Vert _\omega ) \\&\le C (|\beta |^{\frac{1}{2}} h^{\frac{3}{2}} |u|_{H^2(\Omega )} + |\beta |^{\frac{1}{2}} h^{-\frac{1}{2}} \Vert \delta \Vert _\omega ), \end{aligned}$$

where we used that \(Pe(h)>1\). \(\square \)

4.4 Upstream estimates

In this case we consider \(\beta = (\beta _1,0)\) with \(\beta _1 < 0\) and the data set

$$\begin{aligned} \omega = (0,x)\times (y^{-}, y^{+}) \end{aligned}$$

touching part of the outflow boundary \(\partial \Omega ^+\). We must choose the weight function differently and this time we take a negative \(\varphi \) given by (9)

$$\begin{aligned} \varphi := \psi _1 \psi _2 \in (-1,0). \end{aligned}$$

It seems that in this case we can not simultaneously get control of the \(L^2\)-norm and the weighted \(H^1\)-norm and we have to sacrifice the latter since it is not uniform in \(\mu \). We now take the weighted triple norm to be

$$\begin{aligned} |\!|\!| v_h|\!|\!|_\varphi ^2 := \Vert |\beta |^\frac{1}{2} v_h |\varphi |^{\frac{1}{2}} \Vert _{\Omega }^2 + \Vert |\beta \cdot n|^{\frac{1}{2}} v_h |\varphi |^{\frac{1}{2}}\Vert _{ \partial \Omega ^-}^2, \end{aligned}$$
(32)

and rederive the results obtained in Sect. 4.3, aiming for a local error estimate. Since \(\varphi \in (-1,0)\), we will use that \(\Vert \cdot \varphi \Vert _\Omega \le \Vert \cdot |\varphi |^\frac{1}{2} \Vert _\Omega \).

We start with an analogue of Lemma 5 by taking \(v_h \varphi \) as a test function in the weak bilinear form \(a_h\) and notice that since \(\varphi <0\) we now have that

$$\begin{aligned}&\frac{1}{2} \left( \Vert |\beta \cdot n|^{\frac{1}{2}} v_h |\varphi |^{\frac{1}{2}} \Vert _{\partial \Omega ^-}^2 + |\beta | \Vert v_h |\varphi |^{\frac{1}{2}}\Vert ^2_\Omega \right) \\&\quad =(\beta \cdot \nabla v_h, v_h \varphi )_\Omega + \frac{1}{2} \Vert |\beta \cdot n|^{\frac{1}{2}} v_h |\varphi |^{\frac{1}{2}} \Vert _{\partial \Omega ^+}^2. \end{aligned}$$

Arguing as previously in (24) but now for the outflow boundary, we obtain the bound

$$\begin{aligned} \begin{aligned} \Vert |\beta \cdot n|^{\frac{1}{2}} v_h |\varphi |^{\frac{1}{2}}\Vert _{\partial \Omega ^+}&\le C |\beta |^{\frac{1}{2}} (\Vert v_h |\varphi |^{\frac{1}{2}}\Vert _{\partial \Omega ^+ \cap \omega _\beta } + \Vert v_h |\varphi |^{\frac{1}{2}}\Vert _{\partial \Omega ^+ \setminus \omega _\beta })\\&\le C |\beta |^{\frac{1}{2}} h^{-\frac{1}{2}} \Vert v_h \Vert _{\omega } + C |\beta |^{\frac{1}{2}} h^{\frac{3}{2}} \Vert v_h\Vert _{H^1(\Omega )}\\&\le C \gamma ^{-\frac{1}{2}} \Vert (v_h,0)\Vert _s, \end{aligned} \end{aligned}$$
(33)

and thus

$$\begin{aligned} \frac{1}{2} \left( |\beta | \Vert v_h|\varphi |^{\frac{1}{2}}\Vert ^2_\Omega \!+\! \Vert |\beta \!\cdot \! n|^{\frac{1}{2}} v_h |\varphi |^{\frac{1}{2}} \Vert ^2_{\partial \Omega ^-} \right) \!\!\le \!\! (\beta \!\cdot \!\nabla v_h, v_h \varphi )_\Omega \!+\! C \gamma ^{-1} \Vert (v_h,0)\Vert ^2_s. \end{aligned}$$
(34)

For the diffusion term we no longer have any positive contribution due to the change in sign of the weight function \(\varphi \), since now

$$\begin{aligned} (\mu \nabla v_h,\nabla (v_h \varphi ))_{\Omega } = - \Vert \mu ^{\frac{1}{2}} \nabla v_h |\varphi |^{\frac{1}{2}}\Vert _{\Omega }^2 + (\mu \nabla v_h, v_h \nabla \varphi )_\Omega . \end{aligned}$$

We must therefore control this entirely using the stabilization. Integrating by parts and using the weighted trace inequality (19)

$$\begin{aligned} (\mu \nabla v_h,\nabla (v_h \varphi ))_{\Omega } - \left\langle \mu \nabla v_h \cdot n , v_h \varphi \right\rangle _{\partial \Omega }&= \sum _{F \in {\mathcal {F}}_i} \int _{F} \mu \llbracket \nabla v_h \cdot n \rrbracket v_h \varphi ~\text{ d }s \\&\le C \gamma ^{-\frac{1}{2}} s_\Omega (v_h,v_h)^{\frac{1}{2}} \mu ^\frac{1}{2} h^{-1} \Vert v_h \varphi \Vert _\Omega \\&\le C \gamma ^{-\frac{1}{2}} s_\Omega (v_h,v_h)^{\frac{1}{2}} \mu ^\frac{1}{2} h^{-1} \Vert v_h |\varphi |^{\frac{1}{2}}\Vert _\Omega . \end{aligned}$$

To bound this by the triple norm we can simply use that \(|\varphi | < 1\) and \(\mu \le |\beta | h\), giving that \(\mu ^\frac{1}{2} h^{-1} |\varphi |^{\frac{1}{2}} \le |\beta |^\frac{1}{2} h^{-\frac{1}{2}}.\) Hence we have that for some \(\varepsilon >0\),

$$\begin{aligned} |(\mu \nabla v_h,\nabla (v_h \varphi ))_{\Omega } - \left\langle \mu \nabla v_h \cdot n , v_h \varphi \right\rangle _{\partial \Omega }| \le C \varepsilon ^{-1} \gamma ^{-1} h^{-1} s_\Omega (v_h,v_h) + C \varepsilon |\!|\!| v_h|\!|\!|_\varphi ^2. \end{aligned}$$

However, when \(Pe(h) h > 1\) one can obtain a better estimate due to \(\mu ^\frac{1}{2} h^{-1} |\varphi |^{\frac{1}{2}} \le |\beta |^\frac{1}{2},\) which gives that

$$\begin{aligned} |(\mu \nabla v_h,\nabla (v_h \varphi ))_{\Omega } - \left\langle \mu \nabla v_h \cdot n , v_h \varphi \right\rangle _{\partial \Omega }| \le C \varepsilon ^{-1} \gamma ^{-1} s_\Omega (v_h,v_h) + C \varepsilon |\!|\!| v_h|\!|\!|_\varphi ^2. \end{aligned}$$

Summing these contributions we obtain the following result corresponding to Lemma 5.

Lemma 7

There exists \(\alpha >0\) such that for all \(v_h \in V_h\) we have

$$\begin{aligned} \alpha |\!|\!| v_h|\!|\!|_\varphi ^2 \le a_h(v_h, v_h \varphi ) + C h^{-1} \Vert (v_h,0)\Vert ^2_s,\text { when } 1 \lesssim Pe(h) < h^{-1}, \end{aligned}$$

and

$$\begin{aligned} \alpha |\!|\!| v_h|\!|\!|_\varphi ^2 \le a_h(v_h, v_h \varphi ) + C \Vert (v_h,0)\Vert ^2_s,\text { when } Pe(h) > h^{-1}. \end{aligned}$$

Again, we can refine the control over the triple norm \(|\!|\!| v_h|\!|\!|_\varphi \) by taking the projection \(\pi _h (v_h \varphi ) \in V_h\) as a test function and we obtain corresponding results.

Corollary 2

There exists \(\alpha >0\) such that for all \(v_h \in V_h\) we have

$$\begin{aligned} \alpha |\!|\!| v_h|\!|\!|_\varphi ^2 \le a_h(v_h, \pi _h(v_h \varphi )) + C h^{-1} \Vert (v_h,0)\Vert ^2_s,\text { when } 1 \lesssim Pe(h) < h^{-1}, \end{aligned}$$

and

$$\begin{aligned} \alpha |\!|\!| v_h|\!|\!|_\varphi ^2 \le a_h(v_h, \pi _h(v_h \varphi )) + C \Vert (v_h,0)\Vert ^2_s,\text { when } Pe(h) > h^{-1}. \end{aligned}$$

Proof

The argument in the proof of Corollary 1 remains valid with the remark that we now use the inequality \(|\varphi | < |\varphi |^\frac{1}{2}\). \(\square \)

Lemma 8

For all \(v_h \in V_h\) there holds

$$\begin{aligned} \Vert (0,\pi _h (v_h \varphi ))\Vert ^2_s \le C (h^{-1} |\!|\!| v_h|\!|\!|^2_\varphi + \Vert (v_h,0)\Vert ^2_s),\text { when } 1 \lesssim Pe(h) < h^{-1}, \end{aligned}$$

and

$$\begin{aligned} \Vert (0,\pi _h (v_h \varphi ))\Vert ^2_s \le C (|\!|\!| v_h|\!|\!|^2_\varphi + \Vert (v_h,0)\Vert ^2_s),\text { when } Pe(h) > h^{-1}. \end{aligned}$$

Proof

We follow the proof of Lemma 6 and we focus on the bounds that are now different. As before, by the triangle inequality we have that up to a constant

$$\begin{aligned} \Vert (0,\pi _h (v_h \varphi ))\Vert _s&\le \Vert \mu ^{\frac{1}{2}} \nabla (\pi _h - 1) (v_h \varphi ))\Vert _\Omega + \Vert \mu ^{\frac{1}{2}} v_h \nabla \varphi \Vert _\Omega + \Vert \mu ^{\frac{1}{2}} \nabla v_h \varphi \Vert _\Omega \\&\quad + (|\beta |+ \mu h^{-1})^{\frac{1}{2}} (\Vert (\pi _h-1) (v_h \varphi )\Vert _{\partial \Omega } + \Vert v_h\varphi \Vert _{\partial \Omega } )\\&\quad + s_\Omega (\pi _h (v_h \varphi ), \pi _h (v_h \varphi ))^{\frac{1}{2}}. \end{aligned}$$

The first two terms can be bounded by \(C |\!|\!| v_h|\!|\!|_\varphi \) as previously. For the third one, we can use the inverse inequality (12) and (18) to obtain

$$\begin{aligned} \Vert \mu ^{\frac{1}{2}} \nabla v_h \varphi \Vert _\Omega \le C \mu ^{\frac{1}{2}} h^{-1} \Vert \varphi \Vert _{\infty ,\Omega } \Vert v_h \Vert _\Omega \le C \mu ^{\frac{1}{2}} h^{-1} \Vert v_h \varphi \Vert _\Omega \le C \mu ^{\frac{1}{2}} h^{-1} \Vert v_h |\varphi |^\frac{1}{2} \Vert _\Omega . \end{aligned}$$

Hence we have that

$$\begin{aligned} \Vert \mu ^{\frac{1}{2}} \nabla v_h \varphi \Vert _\Omega \le C h^{-\frac{1}{2}} |\!|\!| v_h|\!|\!|_\varphi , \text { when } 1 \lesssim Pe(h) < h^{-1}, \end{aligned}$$

and

$$\begin{aligned} \Vert \mu ^{\frac{1}{2}} \nabla v_h \varphi \Vert _\Omega \le C |\!|\!| v_h|\!|\!|_\varphi , \text { when } Pe(h)>h^{-1}. \end{aligned}$$

Arguing as previously, we can bound the second line by \(C |\!|\!| v_h|\!|\!|_\varphi \) using (33) instead of (24). We conclude the proof by recalling the estimate (30) for the jump term and the subsequent bounds. \(\square \)

We now prove the weighted error estimate in the upstream case showing that in the stability region \(\mathring{\omega }_\beta \) we have quasi-optimal convergence for high Péclet numbers and a reduction of the convergence order by \({\mathcal {O}}(h^\frac{1}{2})\) in an intermediate regime.

Theorem 2

Let \(u \in H^2(\Omega )\) be the solution of (1) and \((u_h,z_h) \in [V_h]^2\) the solution to (5), then there holds

$$\begin{aligned} |\!|\!| u - u_h|\!|\!|_\varphi \le C (|\beta |^{\frac{1}{2}} h |u|_{H^2(\Omega )} + |\beta |^{\frac{1}{2}} h^{-1} \Vert \delta \Vert _\omega ), \text { when } 1 \lesssim Pe(h) < h^{-1}, \end{aligned}$$

and

$$\begin{aligned} |\!|\!| u - u_h|\!|\!|_\varphi \le C (|\beta |^{\frac{1}{2}} h^{\frac{3}{2}} |u|_{H^2(\Omega )} + |\beta |^{\frac{1}{2}} h^{-\frac{1}{2}} \Vert \delta \Vert _\omega ), \text { when } Pe(h) > h^{-1}. \end{aligned}$$

Proof

We combine Lemma 7, Corollary 2 and Lemma 8 as in the proof of Theorem 1 and note that the argument holds verbatim when \(Pe(h) > h^{-1}\). Observe that when \(1 \lesssim Pe(h) < h^{-1}\) we similarly obtain for some \(\alpha >0\) and \(0< \varepsilon _1 < \alpha /2\),

$$\begin{aligned} \frac{\alpha }{2} |\!|\!| e_h|\!|\!|_\varphi ^2 \le a_h(\pi _h u - u, \pi _h (e_h\varphi ) ) + C \varepsilon _1^{-1} \Vert (0,z_h)\Vert ^2_s + C h^{-1} \Vert (e_h,0)\Vert ^2_s. \end{aligned}$$
(35)

Since \(\pi _h u - u \in V_h^\perp \) we may apply Lemma 4 to bound

$$\begin{aligned} a_h(\pi _h u - u, \pi _h (e_h\varphi ) ) \le C \Vert \pi _h u - u\Vert _\sharp \Vert (0,\pi _h (e_h\varphi ))\Vert _s. \end{aligned}$$

From Lemma 8 and Young’s inequality we thus have that for some \(\varepsilon _2>0\),

$$\begin{aligned} a_h(\pi _h u - u, \pi _h (e_h \varphi )) \le C ((1 + \varepsilon _2^{-1} h^{-1}) \Vert \pi _h u - u\Vert _\sharp ^2 + \Vert (e_h,0)\Vert _s^2 + \varepsilon _2 |\!|\!| e_h|\!|\!|_\varphi ^2). \end{aligned}$$

Taking \(\varepsilon _2 < \alpha /4\) and combining the above bound with (35) we see that

$$\begin{aligned} \frac{\alpha }{4} |\!|\!| e_h|\!|\!|_\varphi ^2 \le C((1+ \varepsilon _2^{-1} h^{-1} ) \Vert \pi _h u - u\Vert _\sharp ^2 + \varepsilon _1^{-1} h^{-1} \Vert (e_h,z_h)\Vert _s^2). \end{aligned}$$

Since \(\varepsilon _{1,2}\) are independent of h we can absorb them in the constant C and conclude the proof by using the approximation inequality (21) and Proposition 2 to obtain that

$$\begin{aligned} |\!|\!| e_h|\!|\!|_\varphi&\le C (\mu ^{\frac{1}{2}} h^{\frac{1}{2}} + |\beta |^{\frac{1}{2}} h) (|u|_{H^2(\Omega )} + h^{-2} \Vert \delta \Vert _\omega ) \\&\le C (|\beta |^{\frac{1}{2}} h |u|_{H^2(\Omega )} + |\beta |^{\frac{1}{2}} h^{-1} \Vert \delta \Vert _\omega ). \end{aligned}$$

\(\square \)

The error bounds in this section contain a global Sobolev norm. This may be large in the presence of layers and it would be optimal to replace it with a local regularity measure. However, it is not clear how to reconcile this goal with the ill-posed character of the problem. Inserting everywhere the weight function as in the well-posed case [3], would perturb the stability of the optimality system and, due to the lack of physical coercivity, the residual terms would then be hard to control. One could also consider a reduced transport equation (with zero diffusivity) and define the far field, where the solution is unknown, by a smooth extension, but is not obvious how this could be constructed in our context, and without requiring regularity for the right-hand side. Nonetheless, in numerical experiments we observe a good regularity behaviour, i.e. layers do not pollute the solution in the stability zone, as shown in Figs. 15 and 16.

5 Numerical experiments

We let \(\Omega \) be the unit square and illustrate the performance of the numerical method (5) for different locations of the data domain \(\omega \) and different regions of interest where we measure the approximation error. The computational domains are given in Fig. 4 and the implementation is done using FEniCS [1]. In all the examples below we have used uniform triangulations with alternating left/right diagonals. In the definition of \(s_\Omega \) and \(s_*\) we have taken the stabilization parameters \(\gamma = 10^{-5}\) and \(\gamma _* = 1\), and \(\zeta = 2\) for \(s_\omega \). The effect of different combinations of \(\gamma \) and \(\gamma _*\) on the \(L^2\)-errors is shown in Figs. 5 and 6 when data is given in a centered disk. Similar results are obtained when the data set is near the inflow/outflow boundary. Notice that our choice is empirically close to being optimal both locally and globally.

Fig. 4
figure 4

Data set \(\omega \) and error measurement regions (hatched)

Fig. 5
figure 5

Varying the stabilization parameters \(\gamma \) and \(\gamma _*\). Absolute \(L^2\)-errors downstream, computational domains in Fig. 4a. \(\beta = (1,0)\), exact solution \(u = 2\sin (5\pi x)\sin (5\pi y)\). Similar results in the upstream case

Fig. 6
figure 6

Varying the stabilization parameters \(\gamma \) and \(\gamma _*\). Absolute \(L^2\)-errors globally, computational domains in Fig. 4a. \(\beta = (1,0)\), exact solution \(u = 2\sin (5\pi x)\sin (5\pi y)\)

We first show convergence plots both downstream and upstream from the data set when varying the diffusion coefficient \(\mu \) and keeping the convection field \(\beta \) fixed. As in the case of well-posed convection-dominated problems, the observed \(L^2\)-convergence order is typically \({\mathcal {O}}(h^2)\), surpassing by \({\mathcal {O}}(h^\frac{1}{2})\) the weighted error estimates proven for general meshes ((see Fig. 7, for example).

Fig. 7
figure 7

Absolute \(L^2\)-errors against mesh size h when varying the diffusion coefficient \(\mu \) for fixed \(\beta = (1,0)\), exact solution \(u = 2\sin (5\pi x)\sin (5\pi y)\)

Fig. 8
figure 8

Absolute \(L^2\)-errors against mesh size h, downstream vs upstream for \(\mu = 10^{-2}\), \(\beta = (1,0)\), exact solution \(u = 2\sin (5\pi x)\sin (5\pi y)\)

Fig. 9
figure 9

Absolute \(H^1\)-errors against mesh size h when varying the diffusion coefficient \(\mu \) for fixed \(\beta = (1,0)\), exact solution \(u = 2\sin (5\pi x)\sin (5\pi y)\)

5.1 Data set near the inflow/outflow boundary

We consider the data set \(\omega \) near the inflow and outflow boundaries of \(\Omega \), as assumed in Sect. 2.1. We observe in Fig. 7 that as diffusion is reduced the convergence order for the \(L^2\)-errors increases, culminating with quadratic convergence when convection dominates. Confirming the theoretical analysis in Sect. 4.4, we note the presence of an intermediate regime for Péclet numbers in which the upstream convergence orders are reduced and the upstream errors are typically larger. This can also be seen in Fig. 8 where we consider the diffusion coefficient \(\mu = 10^{-2}\) and an interior data set. The errors in the \(H^1\)-seminorm are given in Fig. 9 which shows almost linear convergence. This is probably due to the small contribution of the gradient term in the triple norm (23).

5.2 Interior data set

Next we consider the setting of the example discussed in the Introduction (Fig. 2), where data is given in the centre of the domain. We give the convergence of the \(L^2\)-errors in Fig. 10 with the caveat that this location of the data set \(\omega \) is not rigorously covered by the theoretical analysis of the previous sections. Nonetheless, the experiments are in agreement with the proven results. Notice that the \(L^2\)-convergence is faster as \(\mu \) decreases and for high Péclet numbers (above 10) one has optimal quadratic convergence both downstream and upstream, with the distinction that in the upstream case the convergence order is reduced in an intermediate regime, in agreement with the theoretical results in Sect. 4. Also, as expected from the error estimates proven in the first part [5], when diffusion is moderately small one can see the transition towards the diffusion-dominated regime as the mesh gets refined – the convergence changes from almost quadratic to sublinear as the Péclet number decreases below 1. Figure 11 shows almost linear convergence in the \(H^1\)-seminorm. We think this is observed due to the small contribution of the gradient term in the triple norm (23). We also remark almost no distinction between upstream and downstream for this example, probably because the gradient term is controlled by the \(L^2\)-norm for small enough \(\mu \).

Fig. 10
figure 10

Absolute \(L^2\)-errors against mesh size h, computational domains in Fig. 4a. Varying the diffusion coefficient \(\mu \) for fixed \(\beta = (1,0)\), exact solution \(u = 2\sin (5\pi x)\sin (5\pi y)\)

Fig. 11
figure 11

\(H^1\)-errors against mesh size h, computational domains in Fig. 4a. Varying the diffusion coefficient \(\mu \) for fixed \(\beta = (1,0)\), exact solution \(u = 2\sin (5\pi x)\sin (5\pi y)\)

Fig. 12
figure 12

\(L^2\)-errors against mesh size h for perturbations in data, computational domains in Fig. 4a. Varying the diffusion coefficient \(\mu \) for fixed \(\beta = (1,0)\), exact solution \(u = 2\sin (5\pi x)\sin (5\pi y)\)

Fig. 13
figure 13

\(L^2\)-errors against mesh size h for perturbations in data, computational domains in Fig. 4a. Varying the diffusion coefficient \(\mu \) for fixed \(\beta = (1,0)\), exact solution \(u = 2\sin (5\pi x)\sin (5\pi y)\)

Fig. 14
figure 14

\(L^2\)-errors against mesh size h for perturbations in data, computational domains in Fig. 4a. Varying the diffusion coefficient \(\mu \) for fixed \(\beta = (1,0)\), exact solution \(u = 2\sin (5\pi x)\sin (5\pi y)\)

Fig. 15
figure 15

Absolute error in the diffusion-dominated regime, \(\mu = 1,\, \beta =(1,0)\). Data given in the four outlined boxes for the solution \(u=\sin (3\pi x)+ \tanh (100(y-1/2))\)

Fig. 16
figure 16

Absolute error in transition to the convection-dominated regime, \(\beta =(1,0)\). Data given in the four outlined boxes for the solution \(u=\sin (3\pi x)+ \tanh (100(y-1/2))\)

5.3 Data perturbations

We demonstrate the effect of data perturbations \({\tilde{U}}_\omega = u\vert _{\omega } + \delta \) in a downstream vs upstream setting by polluting the restriction of u to each node of the mesh in \(\omega \) with uniformly distributed values in \([-h^2,h^2], [-h, h]\) and \([-h^\frac{1}{2}, h^\frac{1}{2}]\), respectively. Comparing first Figs. 10, 11 and 12 we see that perturbations of amplitude \({\mathcal {O}}(h^2)\) have no effect on the \(L^2\)-errors, as expected.

An \({\mathcal {O}}(h)\) noise amplitude exhibits in Fig. 13 the difference—proven in Theorem 1 and Theorem 2 – between the downstream and upstream scenarios. In the upstream case the noise has a strong effect for moderate Péclet numbers and the errors stagnate. Only for high Péclet numbers one has convergence of order \({\mathcal {O}}(h^\frac{1}{2})\). In the downstream case one observes lower errors, faster convergence and almost no noise effect for high Péclet numbers. The difference is also very clear for perturbations of amplitude \({\mathcal {O}}(h^\frac{1}{2})\) shown in Fig. 14. In the upstream case the errors stagnate and there seems to be no convergence, while in the downstream case the errors still convergence for high Péclet numbers.

5.4 Internal layer example

We now consider an exact solution \(u=\sin (3\pi x)+ \tanh (100(y-1/2))\) having an internal layer at \(y=1/2\) and study qualitatively the transition from dominant diffusion to dominant convection. Data is given on both sides of the layer. The distribution of the absolute error is presented in Fig. 15 for the diffusion-dominated regime and in Fig. 16 for the intermediate and convection-dominated regimes. Note that the width of the internal layer does not depend on the physical parameters. Initially, the errors oscillate away from the data sets and concentrate around the boundary of the domain. When convection dominates, the approximation around the layer strongly deteriorates due to the crosswind position relative to the data sets. In this example the mesh is unstructured with 512 elements on a side and \(h \approx 0.0025\).