1 Introduction

A family of k-dimensional surfaces \(M_t\subset {\mathbb {R}}^n\) parameterized by time t is a mean curvature flow (abbreviated as MCF) if the normal velocity is equal to the mean curvature vector of \(M_t\). Given a smooth k-dimensional submanifold \(M_0\), there exists a unique smooth MCF with initial datum \(M_0\) until singularities such as vanishing or neck-pinching occur. To extend the flow beyond the time of singularity, numerous notions of generalized solution to MCF have been proposed since the 1970s: we mention, among others, the viscosity solutions produced by the level set method [3, 5], BV solutions [14], and varifold solutions [2, 21].
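As a standard illustration of a smooth MCF ending in a "vanishing" singularity (a textbook example, not taken from the references above), consider a round k-dimensional sphere of radius \(r_0\) centered at the origin of a \((k+1)\)-dimensional linear subspace of \({\mathbb {R}}^n\). The symbols \(r(t)\), \(r_0\), \(\nu \), H and \(t_{*}\) are notation introduced only for this sketch.

```latex
% Shrinking sphere: M_t is the round k-sphere of radius r(t), with outward unit normal \nu.
% The mean curvature vector of a round k-sphere of radius r is H = -(k/r) \nu.
$$\begin{aligned}
\text{normal velocity} = r'(t)\,\nu , \qquad H = -\frac{k}{r(t)}\,\nu
\quad &\Longrightarrow \quad r'(t) = -\frac{k}{r(t)}, \qquad r(0)=r_0, \\
&\Longrightarrow \quad r(t) = \sqrt{r_0^2 - 2kt}, \qquad 0\le t< t_{*}:=\frac{r_0^2}{2k}.
\end{aligned}$$
```

The flow is smooth on \([0,t_{*})\), and the sphere shrinks to a point as \(t\rightarrow t_{*}\). This is the simplest instance of the vanishing singularity mentioned above.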

In the present paper, we focus on the varifold solutions known as Brakke flows, proposed and studied in Brakke’s pioneering work [2]. One of the main results of [2] is the partial regularity theorem of Brakke flows [2, 6.12], which states that any unit density Brakke flow is, for a.e. time, a smooth MCF almost everywhere. Since a time-independent Brakke flow is a stationary varifold, and since in that case the unit density hypothesis means that the multiplicity function is equal to 1, the result may be seen as the natural parabolic counterpart of the well-known result established by Allard in [1] in the context of stationary varifolds. For Brakke’s partial regularity theorem, as in many similar problems, the key ingredient is the proof of a “flatness implies regularity” type result, that is, an \(\varepsilon \)-regularity theorem. This is referred to as Brakke’s local regularity theorem [2, 6.11] in this context. It states, roughly speaking, that if \(\{M_t\}_{t\in (-\Lambda ,\Lambda )}\) is a Brakke flow in a cylinder

$$\begin{aligned} \textrm{C}_2:=\textrm{C}({\mathbb {R}}^k\times \{0\},2):=\{(x,y)\in {\mathbb {R}}^k\times {\mathbb {R}}^{n-k}:|x|<2\} \end{aligned}$$

which is close to

$$\begin{aligned} B_2^k:=\{(x,0)\in {\mathbb {R}}^k\times {\mathbb {R}}^{n-k}: |x|<2\} \end{aligned}$$

in the sense of measure over \(t\in (-\Lambda ,\Lambda )\), then, in the smaller cylinder \(\textrm{C}_1\), \(M_t\) coincides with a smooth graph over \(B_{1}^k\) evolving by MCF for \(t\in (-\Lambda /2,\Lambda /2)\), with estimates on all the derivatives of such graph in terms of the overall height of \(M_t\). The constant \(\Lambda \) depends on how close \(M_t\) is to \(B_2^k\) in measure. While the original proof of Brakke’s local regularity theorem contained various gaps and errors, a rigorous proof was provided in [11, 20] with a different approach than Brakke’s, and for more general flows, allowing for an additive perturbation in the form of a forcing term in the right-hand side of the underlying PDE.

Though this local regularity theorem is useful to prove the partial regularity of Brakke flows, there is a drawback in that it does not provide the regularity of the flow up until the “end-time”. Since the problem is parabolic in nature, one would expect the validity of interior estimates away from the “parabolic boundary” of \(B_2^k \times (-\Lambda ,\Lambda )\), and thus that the graphical representation over \(B_1^k\) together with the corresponding estimates on the derivatives hold for \(t \in (-\Lambda /2,\Lambda )\) instead of \((-\Lambda /2,\Lambda /2)\).

The present paper addresses precisely this problem, and proves that such estimates are possible for Brakke flows, even when the aforementioned forcing term is present. There are many more-or-less equivalent ways of stating the main regularity theorem proved here: an illustrative form is the following, where, for convenience, we discuss the simple case of Brakke flows with no forcing term and we change the time interval from \((-\Lambda ,\Lambda )\) to \([-2,0]\). For the sake of accuracy, the statement uses the varifold notation \(V_t\) (see [1, 11]), but the reader may think of the support of the weight measure \(\textrm{spt}\Vert V_t\Vert \) as \(M_t\).

Theorem 1.1

Corresponding to \(E_0 \in \left( 0,\infty \right) \), there exists \(\varepsilon _0= \varepsilon _0(n,k,E_0)\in (0,1)\) with the following property. Suppose \(\{V_t\}_{t\in (-2,0]}\) is a k-dimensional unit density Brakke flow in the cylinder \(\textrm{C}_3 = \textrm{C}(\mathbb {R}^k \times \{0\},3) \subset {\mathbb {R}}^n\) satisfying:

  1. (1)

    \(\sup _{t\in (-2,0]}\Vert V_t\Vert (\textrm{C}_3)\le E_0\);

  2. (2)

    \(\Vert V_{-4/5}\Vert (\textrm{C}_1)\le \frac{5}{4}\,\omega _k,\) (\(\omega _k= \text{ volume } \text{ of } B_1^k\));

  3. (3)

    \(0\in \textrm{spt}\Vert V_0\Vert \);

  4. (4)

    \(\cup _{t\in [-1,0]}\,\textrm{spt}\Vert V_t\Vert \subset \{(x,y)\in {\mathbb {R}}^k\times {\mathbb {R}}^{n-k}:|y|\le \varepsilon \}\) for some \(\varepsilon \in (0, \varepsilon _0]\).

Then, for every \(t \in \left[ -1/4,0\right) \), \(\textrm{C}_{1/2}\cap \textrm{spt}\Vert V_t\Vert \) is a \(C^\infty \) graph over \(B_{1/2}^k\) evolving by MCF, and the space-time \(C^\ell \)-norm of the graph on \(B_{1/2}^k\times [-1/4,0)\) is bounded by \(c(\ell ,n,k,E_0)\varepsilon \) for any \(\ell \ge 1\).

Any Brakke flow locally satisfies the assumption (1) for some \(E_0>0\). The assumption (2) excludes the case of two parallel k-dimensional planes, whose union is not a single-valued graph, while (3) excludes the sudden vanishing of the Brakke flow before the end-time \(t=0\). Since the definition of Brakke flow allows such irregularity, (3) (or some variant of similar nature) is necessary. The last assumption, (4), requires that the height remain small for \(t\in [-1,0]\). The conclusion is that the Brakke flow is a smooth graph away from the parabolic boundary, and all derivatives can be controlled in terms of the height. Note that \(\textrm{spt}\Vert V_0\Vert \) may not be a smooth surface due to a possible (partial) sudden vanishing at \(t=0\), but we can smoothly extend \(\textrm{spt}\Vert V_t\Vert \) as \(t\rightarrow 0-\) in \(\textrm{C}_{1/2}\) due to the estimates.

As anticipated, the main result of the present paper is in fact more general. Precisely, the assumptions on the flow can be relaxed in various ways. First, the unit density assumption can be entirely dropped, and the theorem can be stated requiring \(\{V_t\}_{t \in (-2,0]}\) to be a k-dimensional integral Brakke flow in \(\textrm{C}_3\), instead. The reason is that assumption (2) prevents the presence of higher multiplicity points in a slightly smaller parabolic region, as one can see using Huisken’s monotonicity formula and a compactness argument, so that any k-dimensional integral Brakke flow satisfying (1)–(4) for sufficiently small \(\varepsilon _0\) is necessarily unit density in a smaller parabolic region. Second, assumption (4) on the smallness of the height can be phrased in a weaker measure-theoretic sense: for the result to be valid, it is in fact sufficient that the (space-time) \(L^2\)-distance of the flow from the plane \(\mathbb {R}^k\times \{0\}\) in \(\textrm{C}_1 \times [-1,0]\) (a quantity typically referred to as (\(L^2\)-)excess) is sufficiently small.
Furthermore, the regularity result proved here is in fact valid for the larger class of Brakke flows with forcing term; more precisely, in this case we obtain \(C^{1,\zeta }\) (\(\zeta =1-k/p-2/q\)) or \(C^{2,\alpha }\) regularity estimates depending on whether the forcing is in the \(L^{p,q}\)-integrability class or in the \(\alpha \)-Hölder class, respectively. There are several reasons, stemming both from theoretical considerations and from the applications, leading one to consider Brakke-like flows with additional forcing term. A major one is the study of Brakke flows on a Riemannian manifold M: once M is (isometrically) embedded into some Euclidean space \({\mathbb {R}}^N\), the extrinsic curvatures of the immersion act as a forcing term in the corresponding definition of Brakke flow in M; see Sect. 2 for further details on this, and Theorems 2.2 and 2.3 for the precise statements of the main results.

We next discuss some related works. When the Brakke flow in Theorem 1.1 is a smooth MCF or is obtained as a weak limit of smooth MCF, the result has been known as a part of White’s local regularity theorem from [23], and it has been used widely in the literature of MCF to analyze the nature of singularities. White’s theorem applies, for instance, to Brakke flows obtained by the elliptic regularization method of Ilmanen [9], and, since the class of such MCF is weakly compact (see [23, Section 7]), to their tangent flows. The present paper shows that the same conclusions of White’s theorem in various forms hold true even without the proviso of approximability by smooth MCF, and can be derived solely from the definition of Brakke flow. As an illustration, using the main regularity theorem, we can prove the following.

Theorem 1.2

There exists \(\varepsilon _{1}=\varepsilon _{1}(n,k)\in (0,1)\) with the following property. Let \(\{V_t\}_{t\in (a,b]}\) be a k-dimensional Brakke flow in a domain \(U\subset {\mathbb {R}}^n\) (or an n-dimensional Riemannian manifold). For any point \((x,t)\in U\times (a,b]\) with the Gaussian density \(\Theta (x,t)\in [1,1+\varepsilon _{1})\) (see Sect. 2.6), there exists \(r>0\) such that \(B_r(x)\cap \textrm{spt}\Vert V_s\Vert \) is a smooth MCF in \(B_r(x)\) for \(s\in (t-r^2,t)\) and can be extended smoothly to t in the limit as \(s\rightarrow t-\).

We remark that there are, in the literature, existence theorems of Brakke flows for which one cannot tell a priori whether they arise as weak limits of smooth MCF or not. The examples include the limits of solutions to the Allen–Cahn equation [8, 18, 19] as well as the flows obtained by means of time-discrete approximate schemes [2, 12, 16, 17]. In the case of Brakke flows with no forcing term, Lahiri [13] showed an analogous end-time \(C^{1,\zeta }\) regularity result using some height growth estimates, a suitable constancy theorem for integral varifolds, and higher order derivative estimates. The proof is very different from that of the present paper, and it appears difficult to generalize it to flows with forcing term. More recently, Gasparetto [7] showed the validity of a similar end-time \(C^{1,\zeta }\) regularity result for Brakke flows with boundary, with a proof based on viscosity techniques. About six months after the present manuscript was made available as a preprint, De Philippis, Gasparetto, and Schulze provided in [4] an alternative proof—again based on viscosity techniques—of the end-time regularity result in the interior for Brakke flows possibly with forcing term in \(L^\infty \).

Next, we describe the idea of the proof. The proof of Theorem 1.1 is achieved by modifying suitable portions of the proof of the local regularity theorem in [11], so as to extend the graphicality and the relevant estimates up to the end-time. Just as in many similar problems of this type, a fundamental step towards regularity is the proof of a Caccioppoli-type estimate stating that a certain “Dirichlet-type energy” can be controlled in terms of the \(L^2\)-height of the solution. In the context of Brakke flows, such Dirichlet-type energy corresponds, roughly speaking, to the difference (excess) between the surface measure of \(\Vert V_t\Vert \) within the cylinder \(\textrm{C}_1\) and the measure \(\omega _k\) of the unit disk. Such difference is shown to be less than a constant times the \(L^2\)-height of the flow by means of an ODE argument, see [11, Section 5]: indeed, one proves, by appropriately testing Brakke’s inequality, that the excess of measure, as a function of time, satisfies an ordinary differential inequality. The ODE argument implemented in [11], though, requires some “waiting time” both near the beginning and the end of the time interval: this is the main reason for the lack of estimates up to the end-time in [11]. A key point of the present paper is the observation that such waiting time becomes shorter when the height of the Brakke flow is smaller. The proof of the regularity then proceeds just as in Allard’s regularity theorem: the Brakke flow is approximated by a (parabolic) Lipschitz function, and one initiates a blow-up argument. The approximating Lipschitz functions are rescaled by the height of the Brakke flow, but, thanks to the above-mentioned key observation, in the process of passing to the limit as the height goes to 0, the waiting time also becomes infinitesimal.
One can then show that the rescaled Lipschitz functions converge strongly in \(L^2\) to a solution of the heat equation as long as small neighborhoods of \(t=-1\) and \(t=0\) are removed. The contribution to the \(L^2\)-norms of the rescaled functions coming from the neighborhood of \(t=0\) can be made small, so that, in combination with the linear regularity theory of the heat equation, one obtains decay estimates for the linearized problem. By iterating, one concludes \(C^{1,\zeta }\) regularity and graphical representation of \(\Vert V_t\Vert \) on a parabolic region of space-time which touches the origin. In particular, any point on the boundary of this parabolic region is in the support of \(\Vert V_t\Vert \), so that one can repeat the same argument regarding these points as the origin. This implies that the domain of graphicality with estimates can be extended so that it covers the whole support of \(\Vert V_t\Vert \) in \(\textrm{C}_{1/2}\times [-1/4,0)\), proving the \(C^{1,\zeta }\) estimate up to the end-time. Once this is done, \(C^{2,\alpha }\) regularity up to the end-time can be obtained by repeating—with essentially no changes—the proof in [20]. Once the \(C^{2,\alpha }\) end-time regularity is available, the classical parabolic regularity theory gives all the higher derivative estimates for the Brakke flow with no forcing term, while the regularity theory for inhomogeneous linear heat equation implies the result when the forcing term is present.

The paper is organized as follows. In Sect. 2 we set up the notation in use throughout the paper, and we provide the formal statements of the main results in their full generality (see Theorems 2.2 and 2.3) as well as the proofs of Theorems 1.1 and 1.2 as a consequence of the general main results. Section 3 contains the enhanced ODE argument which gives energy estimates with short waiting time at the end of the time interval. In Sect. 4 we produce a parabolic Lipschitz approximation of the flow with good estimates up to the end-time, by suitably modifying the corresponding construction in [11, Section 7]. In Sect. 5, the main modification of the blow-up argument is described and the main \(C^{1,\zeta }\) regularity on a parabolic domain touching the origin (a subdomain of \(\{(x,t): |x|^2<|t|\}\)) is obtained. In Sect. 6, we complete the proof of Theorems 2.2 and 2.3.

2 Assumptions and main results

2.1 Notation

Since the proof follows [11] very closely, we mostly adopt the same notation (see [11, Section 2]), except for a few symbols for norms. Throughout, the integers \(1\le k<n\) are fixed, and the dependence of constants on n and k is often not specified for simplicity. We set \({\mathbb {R}}^+:=\{x \in \mathbb {R}:\,x\ge 0\}\). For \(r\in (0,\infty )\) and \(a\in {\mathbb {R}}^n\) (or \(a\in {\mathbb {R}}^k\)) we set

$$\begin{aligned} B_r(a):=\{x\in {\mathbb {R}}^n: |x-a|<r\},\,\, B_r^k(a):=\{x\in {\mathbb {R}}^k: |x-a|<r\}, \end{aligned}$$

and we often identify \({\mathbb {R}}^k\) with \({\mathbb {R}}^k\times \{0\}\subset {\mathbb {R}}^n\). When \(a=0\), we may write \(B_r\) and \(B_r^k\). For \(a\in {\mathbb {R}}^n\), \(s\in {\mathbb {R}}\) and \(r>0\) we define two types of parabolic cylinders

$$\begin{aligned} \begin{aligned} P_r(a,s)&:=\{(x,t)\in {\mathbb {R}}^n\times {\mathbb {R}}: |x-a|<r,\, |t-s|<r^2\}, \\ {\tilde{P}}_r(a,s)&:=\{(x,t)\in {\mathbb {R}}^n\times {\mathbb {R}}:|x-a|<r,\, s-r^2<t<s\}; \end{aligned} \end{aligned}$$
(2.1)

the first one was used in [11], whereas in the present paper we will prefer to work with the second one. We denote by \({\mathcal {L}}^n\) the Lebesgue measure on \({\mathbb {R}}^n\) and by \({\mathcal {H}}^k\) the k-dimensional Hausdorff measure on \({\mathbb {R}}^n\). The restriction of a measure to a (measurable) set A is expressed by \(\lfloor _A\). For an open set \(U\subset {\mathbb {R}}^n\), \(C_c(U)\) is the set of continuous and compactly supported functions defined on U, and \(C_c^k(U)\) is the set of k-times continuously differentiable functions with compact support in U. The symbols \(\nabla f\) and \(\nabla ^2 f\) always denote the spatial gradient and Hessian of f, respectively, and \(f_t=\partial _t f\) is the time derivative of f. For a function f defined on a domain in space-time \(D\subset {\mathbb {R}}^n\times {\mathbb {R}}\) and \(\alpha \in (0,1)\), define the following (semi-)norms to ease the notation in [11, 20]:

$$\begin{aligned} \Vert f\Vert _0:= & {} \Vert f\Vert _{L^\infty (D)}, \\ {}[f]_{\alpha }:= & {} \sup \\{} & {} \left\{ \frac{|f(y_1,s_1)-f(y_2,s_2)|}{\max \{|y_1-y_2|,|s_1-s_2|^{\frac{1}{2}}\}^\alpha } \, :\, (y_1,s_1),(y_2,s_2)\in D, \; (y_1,s_1)\ne (y_2,s_2) \right\} , \\ {}[f]_{1+\alpha }:= & {} [\nabla f]_{\alpha }+\sup \left\{ \frac{|f(y,s_1)-f(y,s_2)|}{|s_1-s_2|^{\frac{1+\alpha }{2}}} \, :\, (y,s_1),(y,s_2)\in D,\; s_1 \ne s_2 \right\} . \end{aligned}$$

Let \(\textbf{G}(n,k)\) be the space of k-dimensional linear subspaces of \({\mathbb {R}}^n\) and let \(\textbf{A}(n,k)\) be the space of k-dimensional affine planes in \({\mathbb {R}}^n\). For \(S\in \textbf{G}(n,k)\), we identify S with the corresponding orthogonal projection matrix of \({\mathbb {R}}^n\) onto S. Let \(S^{\perp }\in \textbf{G}(n,n-k)\) be the orthogonal complement of S. For \(A\in \textrm{Hom}({\mathbb {R}}^n;{\mathbb {R}}^n)\), we define the operator norm

$$\begin{aligned} \Vert A\Vert :=\sup \{|A(x)|:x\in {\mathbb {R}}^n, \, |x|=1\}, \end{aligned}$$

and we often use this as a metric on \(\textbf{G}(n,k)\). For \(T\in \textbf{G}(n,k)\), \(a \in T\), and \(r\in (0,\infty )\) we define the cylinder

$$\begin{aligned} \textrm{C}(T,a,r):=\{x\in {\mathbb {R}}^n:|T(x-a)|<r\}. \end{aligned}$$

A general k-varifold on \(U\subset {\mathbb {R}}^n\) is a Radon measure defined on \(G_k(U):=U\times \textbf{G}(n,k)\) (see [1, 15] for a more comprehensive introduction), and the set of all general k-varifolds in U is denoted by \(\textbf{V}_k(U)\). For \(V\in \textbf{V}_k(U)\), let \(\Vert V\Vert \) be the weight measure of V (with no fear of confusion with the operator norm), that is the measure defined on U by

$$\begin{aligned} \Vert V\Vert (\phi ):=\int _{G_k(U)}\phi (x)\,dV(x,S) \qquad \text{ for } \text{ every } \phi \in C_c(U). \end{aligned}$$

For a proper map \(f\in C^1({\mathbb {R}}^n;{\mathbb {R}}^n)\), the symbol \(f_{\sharp } V\) denotes the push-forward of the varifold V through f. We say that \(V\in \textbf{V}_k(U)\) is a rectifiable varifold if there are some \({\mathcal {H}}^k\)-measurable and countably k-rectifiable set \(M\subset {\mathbb {R}}^n\) as well as a non-negative function \(\theta \in L^1_{\textrm{loc}}({\mathcal {H}}^k\lfloor _M)\) such that

$$\begin{aligned} V(\phi )=\int _{M}\phi (x,\textrm{Tan}_x M)\,\theta (x)\,d{\mathcal {H}}^k(x) \qquad \text{ for } \text{ all } \phi \in C_c(G_k(U)), \end{aligned}$$

and in such case we write \(V = \textbf{var}(M,\theta )\). Here, \(\textrm{Tan}_x M\) is the approximate tangent space to M at x, which exists for \({\mathcal {H}}^k\)-a.e. \(x\in M\). When \(\theta (x)\) is integer-valued for \({\mathcal {H}}^k\)-a.e. \(x\in M\), V is said to be an integral varifold. The set of all integral varifolds is denoted by \(\textbf{IV}_k(U)\). When \(\theta =1\) additionally, we say that V is of unit density. For \(V\in \textbf{V}_k(U)\), \(\delta V\) denotes the first variation of V and \(\Vert \delta V\Vert \) denotes the total variation of \(\delta V\). When \(\delta V\) is bounded and absolutely continuous with respect to \(\Vert V\Vert \), the Radon–Nikodym derivative (times \(-1\)), \(-\delta V/\Vert V\Vert \), is denoted by \(h(V,\cdot )\) and is called the generalized mean curvature vector of V. A fundamental geometric property of integral varifolds, of great importance for the analysis of Brakke flows, is Brakke’s perpendicularity theorem [2, Chapter 5]: if \(V\in \textbf{IV}_k(U)\) and \(h(V,\cdot )\) exists, then \(S(h(V,x))=0\) for V-a.e. \((x,S)\in G_k(U)\).

For a one-parameter family of varifolds \(\{V_t\}_{t\in [a,b]}\), we often use \(\Vert V_t\Vert \times dt\) to represent the natural product measure on \(U\times [a,b]\); the latter is also expressed as \(d\Vert V_t\Vert dt\) within integration.

Fix \(\phi \in C^\infty ([0,\infty ))\) such that \(0\le \phi \le 1\),

$$\begin{aligned} \phi (x)\left\{ \begin{array}{ll} =1 &{}\quad \text{ for } 0\le x\le (2/3)^{1/k}, \\ >0 &{}\quad \text{ for } 0\le x<(5/6)^{1/k}, \\ =0 &{} \quad \text{ for } x\ge (5/6)^{1/k}. \end{array}\right. \end{aligned}$$
(2.2)

For \(R\in (0,\infty )\), \(x\in {\mathbb {R}}^n\) and \(T\in \textbf{G}(n,k)\), define

$$\begin{aligned} \phi _{T,R}(x):=\phi (R^{-1}|T(x)|),\,\, \phi _T(x):=\phi _{T,1}(x)=\phi (|T(x)|) \end{aligned}$$
(2.3)

and set

$$\begin{aligned} \textbf{c}:=\int _T \phi _T^2(x)\,d{\mathcal {H}}^k(x). \end{aligned}$$
(2.4)

The functions \(\phi _{T,R}\) and \(\phi _T\) will be used as smooth test functions to gauge the measure deviation of \(\Vert V\Vert \) away from T with multiplicity one. Notice that \(\textbf{c}\) is independent of T.
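The following elementary bounds may help to calibrate the constant \(\textbf{c}\). They are a sketch derived directly from (2.2)–(2.4), not quoted from [11]. Here \(\omega _k\) denotes the volume of the unit ball \(B_1^k\), and the planes mentioned afterwards are hypothetical examples.

```latex
% phi = 1 on the k-ball of radius (2/3)^{1/k}, whose volume is (2/3) \omega_k;
% 0 <= phi <= 1, and phi vanishes outside the k-ball of radius (5/6)^{1/k}, of volume (5/6) \omega_k.
$$\begin{aligned}
\tfrac{2}{3}\,\omega_k \;\le\; \textbf{c}=\int_T \phi_T^2\,d{\mathcal H}^k \;\le\; \tfrac{5}{6}\,\omega_k .
\end{aligned}$$
% Change of variables x = R z on T:
$$\begin{aligned}
\int_T \phi_{T,R}^2(x)\,d{\mathcal H}^k(x) = R^k \int_T \phi_T^2(z)\,d{\mathcal H}^k(z) = \textbf{c}\,R^k .
\end{aligned}$$
```

Thus a single unit density plane parallel to T has \(\Vert V\Vert (\phi _{T,R}^2)=\textbf{c}\,R^k\), whereas two such planes give \(2\,\textbf{c}\,R^k\). Moreover, since \(\phi _T^2\le 1\) vanishes outside \(\textrm{C}(T,1)\), a mass bound of the form \(\Vert V\Vert (\textrm{C}(T,1))\le \frac{5}{4}\omega _k\) yields \(\Vert V\Vert (\phi _T^2)\le \frac{5}{4}\omega _k\le \frac{15}{8}\,\textbf{c}=(2-\frac{1}{8})\,\textbf{c}\).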

2.2 Definition of Brakke flow

Since in this paper we are mostly interested in the end-time regularity, we consider time intervals of the form \([-\Lambda ,0]\) with \(\Lambda >0\) in the following.

Definition 2.1

Suppose that \(U\subset {\mathbb {R}}^n\) is a domain and \(1\le k<n\). A family of varifolds \(\{V_t\}_{t\in [-\Lambda ,0]}\subset \textbf{V}_k(U)\) is a (k-dimensional) Brakke flow if the following conditions are satisfied.

  1. (1)

    For a.e. \(t\in [-\Lambda ,0]\), \(V_t\in \textbf{IV}_k(U)\).

  2. (2)

    For all \({\tilde{U}}\subset \subset U\), we have

    $$\begin{aligned} \sup _{t\in [-\Lambda , 0]}\Vert V_t\Vert ({\tilde{U}})<\infty . \end{aligned}$$
    (2.5)
  3. (3)

    For a.e. \(t\in [-\Lambda , 0]\), \(\delta V_t\) is locally bounded and absolutely continuous with respect to \(\Vert V_t\Vert \), and thus \(h(V_t,\cdot )\) exists. Furthermore, for all \({\tilde{U}}\subset \subset U\),

    $$\begin{aligned} \int _{-\Lambda }^0\int _{{\tilde{U}}} |h(V_t,x)|^2\, d\Vert V_t\Vert dt<\infty . \end{aligned}$$
    (2.6)
  4. (4)

    For all \(\varphi \in C^1(U\times [-\Lambda ,0];{\mathbb {R}}^+)\) with \(\varphi (\cdot , t)\in C^1_c(U)\) for all \(t\in [-\Lambda ,0]\), and for all \(-\Lambda \le t_1<t_2\le 0\), we have

    $$\begin{aligned} \begin{aligned}&\int _U\varphi (x,t_2)\,d\Vert V_{t_2}\Vert (x)-\int _U \varphi (x,t_1)\,d\Vert V_{t_1}\Vert (x) \\&\quad \le \int _{t_1}^{t_2}dt\int _{U} \{(\nabla \varphi (x,t)-h(V_t,x)\varphi (x,t))\cdot h(V_t,x)+\varphi _t(x,t)\}\,d\Vert V_t\Vert (x). \end{aligned}\nonumber \\ \end{aligned}$$
    (2.7)

The condition (4) is a weak formulation of MCF due to Brakke [2]. While Brakke’s original formulation of (2.7) is in the form of a differential inequality, nothing is lost if one works in this integral formulation. In fact, the latter is advantageous, in that it may easily accommodate the setting with additional unbounded forcing term as described in the next subsection.

One may naturally consider an MCF and the corresponding notion of Brakke flow in a general n-dimensional Riemannian manifold M. By Nash’s isometric embedding theorem, we may always consider M to be a submanifold in a domain \(U\subset {\mathbb {R}}^N\) for some sufficiently large N. A Brakke flow in M can then be defined by requiring that \(\textrm{spt}\Vert V_t\Vert \subset M\) for all t, that (1)–(3) hold, and that the inequality (2.7) be replaced by

$$\begin{aligned} \begin{aligned}&\int _U\varphi (x,t_2)\,d\Vert V_{t_2}\Vert (x)-\int _U \varphi (x,t_1)\,d\Vert V_{t_1}\Vert (x) \\&\quad \le \int _{t_1}^{t_2}dt\int _{G_k(U)} \{(\nabla \varphi (x,t)-h(V_t,x)\varphi (x,t))\cdot (h(V_t,x)\\&\qquad -H_M(x,S))+\varphi _t(x,t)\}\,dV_t(x,S). \end{aligned} \end{aligned}$$
(2.8)

Here, \(H_M(x,S)=\sum _{i=1}^k\textbf{B}_x(v_i,v_i)\in (\textrm{Tan}_x M)^\perp \), where \(\textbf{B}_x(\cdot ,\cdot )\) is the second fundamental form of \(M\subset {\mathbb {R}}^N\) at \(x\in M\) and the set \(\{v_1,\cdots ,v_k\}\) is an orthonormal basis of \(S\in \textbf{G}(n,k)\). See [20, Section 7] for a further explanation. The term \(H_M\) is already perpendicular to M and, for all analytical purposes, can be regarded as a locally bounded forcing term u as described in the next subsection.
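For a concrete instance of the term \(H_M\) in (2.8), consider the following sketch, in which M is taken to be the unit sphere \(S^{N-1}\subset {\mathbb {R}}^N\). This choice of M, and the great sphere \(\Sigma \) and subspace W below, are assumptions made here for illustration only.

```latex
% M = S^{N-1} = {x : |x| = 1}. For v, w tangent to M at x:
$$\begin{aligned}
\textbf{B}_x(v,w) = -(v\cdot w)\,x , \qquad
H_M(x,S) = \sum_{i=1}^k \textbf{B}_x(v_i,v_i) = -k\,x , \qquad |H_M(x,S)| = k .
\end{aligned}$$
% Sanity check: a great k-sphere \Sigma = M \cap W, with W a (k+1)-dimensional linear subspace,
% has mean curvature vector h(x) = -k x as a submanifold of R^N, hence
$$\begin{aligned}
h(x) - H_M(x,\textrm{Tan}_x\Sigma) = -k\,x + k\,x = 0 \qquad \text{for } x\in \Sigma .
\end{aligned}$$
```

With this choice, the time-independent family \(V_t=\textbf{var}(\Sigma ,1)\) satisfies (2.8) with equality. This is consistent with great spheres being minimal submanifolds of the round sphere, and \(H_M\) is bounded by k, as expected of a locally bounded forcing term.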

2.3 Assumptions

The following assumptions are the same as in [11], and we list them for the reader’s convenience.

For an open set \(U\subset {\mathbb {R}}^n\), suppose that we have a family of k-varifolds \(\{V_t\}_{t\in [-\Lambda ,0]}\subset \textbf{V}_k(U)\) and a family of \((\Vert V_t\Vert \times dt)\)-measurable vector fields \(\{u(\cdot ,t)\}_{t\in [-\Lambda ,0]}\) defined on U and satisfying the following.

  1. (A1)

    For a.e. \(t\in [-\Lambda ,0]\), \(V_t\) is a unit density k-varifold.

  2. (A2)

    There exists \(E_1\in [1,\infty )\) such that

    $$\begin{aligned} \Vert V_t\Vert (B_r(x))\le \omega _k r^k E_1\,\, \text{ for } \text{ all } B_r(x)\subset U \text{ and } t\in [-\Lambda ,0]. \end{aligned}$$
    (2.9)
  3. (A3)

    The numbers \(p\in [2,\infty )\) and \(q\in (2,\infty )\) satisfy

    $$\begin{aligned} \zeta :=1-\frac{k}{p}-\frac{2}{q}>0, \end{aligned}$$
    (2.10)

    and u satisfies

    $$\begin{aligned} \Vert u\Vert _{L^{p,q}(U \times \left[ -\Lambda ,0\right] )}:= \left( \int _{-\Lambda }^0\left( \int _U |u(x,t)|^p\,d\Vert V_t\Vert (x) \right) ^{\frac{q}{p}}\,dt\right) ^{\frac{1}{q}} <\infty .\nonumber \\ \end{aligned}$$
    (2.11)
  4. (A4)

    For all \(\varphi \in C^1(U\times [-\Lambda ,0];{\mathbb {R}}^+)\) with \(\varphi (\cdot ,t)\in C^1_c(U)\) for all \(t\in [-\Lambda ,0]\), and for all \(-\Lambda \le t_1<t_2\le 0\), we have

    $$\begin{aligned} \begin{aligned}&\int _U \varphi (x,t_2)\,d\Vert V_{t_2}\Vert (x)-\int _U \varphi (x,t_1)\,d\Vert V_{t_1}\Vert (x)\\&\quad \le \int _{t_1}^{t_2}dt\int _U \{ (\nabla \varphi (x,t)-h(V_t,x)\varphi (x,t))\cdot (h(V_t,x)\\&\qquad + u^{\perp }(x,t))+\varphi _t (x,t)\}\, d\Vert V_t\Vert (x). \end{aligned} \end{aligned}$$
    (2.12)

    Implicitly in the formulation of (A4), it is assumed that the first variation \(\delta V_t\) of \(V_t\) is locally bounded and absolutely continuous with respect to \(\Vert V_t\Vert \) (so that \(h(V_t,x)\) exists) for a.e. \(t\in [-\Lambda ,0]\), and that \(h(V_t,x)\in L^2_{\textrm{loc}} (U\times [-\Lambda , 0])\). For a.e. \(t\in [-\Lambda ,0]\), \(u^\perp (x,t)\) is the projection of u onto the orthogonal complement of the approximate tangent space to \(V_t\) at x, which exists for \(\Vert V_t\Vert \)-a.e. x due to the integrality of \(V_t\). The inequality (2.12) formally expresses that the normal velocity of the flow is equal to the mean curvature vector h plus \(u^\perp \). When \(u \equiv 0\), (2.12) simply becomes (2.7), and thus \(\{V_t\}_{t \in \left[ -\Lambda ,0\right] }\) is a Brakke flow. More generally, (2.12) includes the case when \(\{V_t\}_{t \in \left[ -\Lambda ,0\right] }\) is a Brakke flow in a Riemannian manifold M, which corresponds to \(u(x,t):= -H_M(x,\textrm{Tan}_x \Vert V_t\Vert )\): indeed, as already explained, in this case \(u(x,t) \in (\textrm{Tan}_x M)^\perp \), and thus in particular \(u(x,t) \in (\textrm{Tan}_x \Vert V_t\Vert )^\perp \) given that \(\textrm{spt}\Vert V_t\Vert \subset M\) for all t. One technical point to add is that (A1) may be replaced, for all purposes of the present paper, by

\(({\mathrm{A1'}})\):

for a.e. \(t\in [-\Lambda ,0]\), \(V_t\in \textbf{IV}_k(U)\).

The reason for this is that the assumptions of the main theorems essentially allow only unit density varifolds. We will nonetheless adopt (A1) as our working hypothesis, in order to be consistent with [11]. As already mentioned, there are in the literature various results guaranteeing the existence of (generalized) MCF (possibly with forcing term u) satisfying (A1)–(A4).
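To make the integrability condition (2.10)–(2.11) concrete, the following arithmetic sketch records two special cases. The pointwise bound K on u and the mass bound E on U are hypothetical quantities introduced only for this illustration.

```latex
% Equal exponents p = q:
$$\begin{aligned}
\zeta = 1-\frac{k}{p}-\frac{2}{p} = 1-\frac{k+2}{p} > 0 \iff p > k+2 ;
\qquad \text{e.g. } k=2,\ p=q=8 \ \Longrightarrow\ \zeta = \tfrac12 .
\end{aligned}$$
% Bounded forcing: if |u| \le K and \sup_t \|V_t\|(U) \le E, then for all finite p, q
$$\begin{aligned}
\Vert u\Vert_{L^{p,q}(U\times[-\Lambda,0])}
\le \left(\int_{-\Lambda}^0 \big(K^p E\big)^{\frac{q}{p}}\,dt\right)^{\frac1q}
= K\,E^{\frac1p}\,\Lambda^{\frac1q} .
\end{aligned}$$
```

In particular, a bounded forcing term on a region of finite mass satisfies (2.11) for every admissible pair (p, q), so that \(\zeta \) may be taken arbitrarily close to 1 in that case.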

2.4 Main results

The first theorem is the basic \(\varepsilon \)-regularity theorem, and it corresponds to a parabolic version of Allard’s regularity theorem; the second theorem gives a \(C^{2,\alpha }\) estimate. They are the end-time regularity counterpart of [11] and [20], respectively.

Theorem 2.2

Corresponding to \(\nu \in (0,1)\), \(E_1\in [1,\infty )\), p and q satisfying (2.10), there exist \(\varepsilon _{2}\in (0,1)\) and \(c_{1}\in (1,\infty )\) depending only on \(n,k,p,q,\nu ,E_1\) with the following property. For \(R\in (0,\infty )\), \(T\in \textbf{G}(n,k)\), and \(U=\textrm{C}(T,2R)\), suppose \(\{V_t\}_{t\in [-R^2,0]}\) and \(\{u(\cdot ,t)\}_{t\in [-R^2,0]}\) satisfy (A1)–(A4). Suppose furthermore that we have

$$\begin{aligned}{} & {} \Vert V_{-4R^2/5}\Vert (\phi _{T,R}^2)\le (2-\nu )\,\textbf{c} \,R^k, \end{aligned}$$
(2.13)
$$\begin{aligned}{} & {} (\textrm{C}(T,\nu R)\times \{0\})\cap \textrm{spt}(\Vert V_t\Vert \times dt)\ne \emptyset , \end{aligned}$$
(2.14)
$$\begin{aligned}{} & {} \mu :=\left( R^{-(k+4)}\int _{-R^2}^0\int _{\textrm{C}(T,2R)} |T^{\perp }(x)|^2\,d\Vert V_t\Vert dt\right) ^{\frac{1}{2}}<\varepsilon _{2}, \end{aligned}$$
(2.15)
$$\begin{aligned}{} & {} \Vert u\Vert _{p,q}:=R^\zeta \Vert u\Vert _{L^{p,q}(\textrm{C}(T,2R)\times [-R^2,0])}<\varepsilon _{2}. \end{aligned}$$
(2.16)

Let \({\tilde{D}}:=\left( B_{R/2} \cap T\right) \times [-R^2/4,0)\). Then there are \(C^{1,\zeta }\) functions \(f:{\tilde{D}}\rightarrow T^\perp \) and \(F:{\tilde{D}}\rightarrow {\mathbb {R}}^n\) such that \(T(F(y,t))=y\) and \(T^\perp (F(y,t))=f(y,t)\) for all \((y,t)\in {\tilde{D}}\),

$$\begin{aligned}{} & {} \textrm{spt}\Vert V_t\Vert \cap \textrm{C}(T,R/2)=\textrm{image}\,F(\cdot ,t) \text{ for } \text{ all } t\in [-R^2/4,0), \end{aligned}$$
(2.17)
$$\begin{aligned}{} & {} R^{-1}\Vert f\Vert _0+\Vert \nabla f\Vert _0 +R^\zeta [f]_{1+\zeta } \le c_{1}\max \{\mu ,\Vert u\Vert _{p,q}\}. \end{aligned}$$
(2.18)

As discussed in the Introduction, (2.13) excludes the possibility that \(V_t\) consists of multiple sheets in \(\textrm{C}(T,R)\), and it can replace the assumption that \(V_t\) be unit density. Notice that (2.13) is stated as a property valid at time \(-4R^2/5\); nonetheless, the validity of (2.12) implies that in fact \(\Vert V_t\Vert (\phi _{T,R}^2)\) is an almost-decreasing function of t, even when the forcing term u is present. As a consequence, the mass estimate in (2.13) remains valid when \(\Vert V_{-4R^2/5}\Vert \) is replaced by \(\Vert V_t\Vert \) for \(t > -4R^2/5\), modulo replacing \(\nu \) with \(\nu ' \in (0,\nu )\), provided \(\varepsilon _{2}\) is sufficiently small depending on \(\nu '\). The assumption (2.14) prevents sudden vanishing of the flow prior to the end-time. Finally, (2.15) is a smallness requirement on the (space-time) \(L^2\)-height of the flow, namely of the space-time \(L^2\)-distance of the flow from the given k-dimensional plane T. We notice explicitly that, as a consequence of (2.17)–(2.18), one can naturally extend f and F to \(t=0\) as \(C^{1,\zeta }\) functions. Nonetheless, \(\textrm{C}(T,R/2)\cap \textrm{spt}\Vert V_0\Vert \subset \textrm{image}\, F\), but equality may not hold in general.
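The powers of R appearing in (2.15), (2.16), and (2.18) are those dictated by parabolic scaling. The following is a sketch of the verification. The rescaled objects \(V^R_s\), \(u_R\), \(f_R\) and the dilation \(\eta _R(x):=x/R\) are notation introduced here, and the (semi-)norms of \(f_R\) are taken on the correspondingly rescaled domain.

```latex
% Parabolic rescaling to unit size (s \in [-1,0], t = R^2 s):
$$\begin{aligned}
V^R_s := (\eta_R)_\sharp V_{R^2 s}, \qquad u_R(x,s) := R\,u(Rx,R^2 s), \qquad f_R(y,s) := R^{-1} f(Ry,R^2 s).
\end{aligned}$$
% Since \|V^R_s\|(A) = R^{-k}\|V_{R^2 s}\|(RA) and dt = R^2 ds:
$$\begin{aligned}
\int_{-1}^0\int_{\textrm{C}(T,2)} |T^\perp(x)|^2\,d\Vert V^R_s\Vert ds
&= R^{-(k+4)}\int_{-R^2}^0\int_{\textrm{C}(T,2R)} |T^\perp(x)|^2\,d\Vert V_t\Vert dt = \mu^2, \\
\Vert u_R\Vert_{L^{p,q}(\textrm{C}(T,2)\times[-1,0])}
&= R^{1-\frac{k}{p}-\frac{2}{q}}\,\Vert u\Vert_{L^{p,q}(\textrm{C}(T,2R)\times[-R^2,0])} = \Vert u\Vert_{p,q}, \\
\Vert f_R\Vert_0 + \Vert \nabla f_R\Vert_0 + [f_R]_{1+\zeta}
&= R^{-1}\Vert f\Vert_0 + \Vert \nabla f\Vert_0 + R^{\zeta}[f]_{1+\zeta}.
\end{aligned}$$
```

All quantities entering the hypotheses and the conclusion are therefore invariant under this rescaling. Assuming, as is standard for MCF, that the rescaled pair again satisfies (A1)–(A4) with the same \(E_1\), it suffices to consider the case \(R=1\).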

When u is \(\alpha \)-Hölder continuous, we have the \(C^{2,\alpha }\)-regularity estimate as follows.

Theorem 2.3

Corresponding to \(\nu \in (0,1)\), \(E_1\in [1,\infty )\) and \(\alpha \in (0,1)\), there exist \(\varepsilon _{3}\in (0,\varepsilon _{2})\) and \({c_{2}}\in (1,\infty )\) depending only on \(n,k,\alpha , \nu ,E_1\) with the following property. For \(R\in (0,\infty )\), \(T\in \textbf{G}(n,k)\), and \(U=\textrm{C}(T,2R)\), suppose \(\{V_t\}_{t\in [-R^2,0]}\) and \(\{u(\cdot ,t)\}_{t\in [-R^2,0]}\) satisfy (A1), (A2), (A4) and in place of (A3), assume \(u\in C^{0,\alpha }(\textrm{C}(T,2R)\times [-R^2,0])\). Furthermore, assume (2.13), (2.14), (2.15) with \(\varepsilon _{3}\), and in place of (2.16),

$$\begin{aligned} \Vert u\Vert _{\alpha }:=R \Vert u\Vert _0+R^{1+\alpha }[u]_{\alpha } <\varepsilon _{3}. \end{aligned}$$

Then the conclusion of Theorem 2.2 holds in the \(C^{2,\alpha }\) class, that is (2.18) can be replaced by

$$\begin{aligned}{} & {} R^{-1} \Vert f\Vert _0+\Vert \nabla f\Vert _0 +R(\Vert \nabla ^2 f\Vert _0+\Vert f_t\Vert _0)\nonumber \\{} & {} \quad +R^{1+\alpha } ([\nabla ^2 f]_\alpha +[f_t]_\alpha ) \le c_{2}\max \{\mu ,\Vert u\Vert _\alpha \}. \end{aligned}$$
(2.19)

Moreover, \(\textrm{image}\,F\) satisfies in the classical (pointwise) sense the motion law that normal velocity \(=h+u^\perp \).

Here one can extend f and F as \(C^{2,\alpha }\) functions to \(t=0\) on \(B_{R/2} \cap T\). Once the regularity goes up to \(C^{2,\alpha }\) and the surfaces satisfy the PDE pointwise, then the parabolic Schauder estimates can be applied in the case that u is more regular. In particular, we will deduce \(C^{k+2,\alpha }\) estimates if \(u \in C^{k,\alpha }\). In the case of Brakke flow, when \(u=0\), we have all the derivative estimates in terms of \(\mu \).

In the next sections, we show how the results stated in the Introduction, namely Theorems 1.1 and 1.2, can be deduced from Theorems 2.2 and 2.3.

2.5 Proof of Theorem 1.1

Let \(E_0 \in \left( 0, \infty \right) \), and suppose \(\{V_t\}_{t \in \left( -2,0\right] }\) is a k-dimensional Brakke flow satisfying (1)–(4) in Theorem 1.1 with \(\varepsilon \in \left( 0, \varepsilon _0\right] \). We prove that, if \(\varepsilon _0\) is chosen sufficiently small, then \(\{V_t\}_{t \in \left[ -1,0\right] }\) satisfies the hypotheses of Theorem 2.3. We set \(R=1\), \(T=\mathbb {R}^k\times \{0\}\), and \(U=\textrm{C}(T,2)=:\textrm{C}_2\), and we notice that (A1), (A3), and (A4) are satisfied by assumption. To check (A2), let \(t \in \left[ -1,0\right] \) and \(B_r(x) \subset U\): it is then a classical consequence (see e.g. [21, Proposition 3.5]) of Huisken’s monotonicity formula that

$$\begin{aligned} r^{-k} \Vert V_t\Vert (B_r(x)) \le c\, \sup _{s\in \left[ -2,t\right] } \Vert V_s\Vert (\textrm{C}_3) \le c E_0, \end{aligned}$$

where c is a universal constant. This proves (A2). Next, using that

$$\begin{aligned} \phi _T^2 \le \textrm{1}_{\textrm{C}_1} \qquad \text{ and } \qquad \textbf{c} \ge \frac{2}{3} \omega _k, \end{aligned}$$

we see that (2) implies

$$\begin{aligned} \Vert V_{-4/5}\Vert (\phi _T^2) \le \Vert V_{-4/5}\Vert (\textrm{C}_1) \le \frac{5}{4} \omega _k \le \frac{15}{8} \, \textbf{c}, \end{aligned}$$

that is (2.13) holds with \(\nu = 1/8\). Also, (2.15) with \(\varepsilon _{3}\) is an immediate consequence of (4) as soon as \(\varepsilon _0 \le \varepsilon _{3}\), whereas (2.14) follows from (3) and Huisken’s monotonicity formula (see, for instance, [21, Proposition 3.6]). Hence, Theorem 2.3 applies, and Theorem 1.1 follows from the fact that the forcing field \(u \equiv 0\) is smooth. \(\square \)

2.6 Proof of Theorem 1.2

In order to simplify the presentation, we will work under the assumption that \(U = \mathbb {R}^n\), and that \(\textrm{spt}\Vert V_t\Vert \subset B_R\) for every \(t \in \left( a, b \right] \), for some \(R >0\). The general case can be obtained with simple modifications, but the underlying idea is the same; see Remark 2.4.

Before proceeding with the proof, let us recall the classical definition of Gaussian density in the context of Brakke flows; see for instance [22] for a thorough presentation. Under the above assumptions, and setting \({\mathscr {V}} = \{V_t\}_{t\in \left( a, b \right] }\), for any point \((x_0,t_0) \in \mathbb {R}^n \times \left( a, b \right] \) we define

$$\begin{aligned} \Theta ({\mathscr {V}},(x_0,t_0)):= \lim _{\tau \rightarrow 0^+} \frac{1}{(4\pi \tau )^{\frac{k}{2}}} \int _{\mathbb {R}^n} \exp \left( - \frac{|y-x_0|^2}{4\tau } \right) \, d\Vert V_{t_0-\tau }\Vert (y). \end{aligned}$$
(2.20)

The existence of the above limit is guaranteed by the fact that the function

$$\begin{aligned} \tau \in \left( 0, t_0-a \right) \mapsto \frac{1}{(4\pi \tau )^{\frac{k}{2}}} \int _{\mathbb {R}^n} \exp \left( - \frac{|y-x_0|^2}{4\tau } \right) \, d\Vert V_{t_0-\tau }\Vert (y) \end{aligned}$$

is monotone increasing as a consequence of Huisken’s monotonicity formula.
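As a sanity check of the normalization in (2.20), one may consider the simplest possible example, which is our own illustration and is not taken from the references: the static flow \(V_t = \textbf{var}(T,1)\) with \(T \in \textbf{G}(n,k)\) and \(x_0 \in T\). Writing \({\mathcal {H}}^k\) for the k-dimensional Hausdorff measure (a notation assumed here) and \(z_1,\ldots ,z_k\) for orthonormal coordinates on T centered at \(x_0\), the Gaussian integral factorizes and is independent of \(\tau \):

$$\begin{aligned} \frac{1}{(4\pi \tau )^{\frac{k}{2}}} \int _{T} \exp \left( - \frac{|y-x_0|^2}{4\tau } \right) \, d{\mathcal {H}}^k(y) = \prod _{i=1}^k \frac{1}{(4\pi \tau )^{\frac{1}{2}}} \int _{\mathbb {R}} \exp \left( - \frac{z_i^2}{4\tau } \right) \, dz_i = 1. \end{aligned}$$

Hence the Gaussian density of a static unit density plane equals 1 at every point of the plane, which is consistent with the role played by such planes in the two steps of the proof.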

Step one. Assume first that \(\Theta ({\mathscr {V}},(x_0,t_0)) = 1\), and let \({\mathscr {V}}' = \{V_t'\}_{t \in \left( -\infty ,0\right) }\) be any tangent flow to \({\mathscr {V}}\) at \((x_0,t_0)\). We then have

$$\begin{aligned} 1 = \Theta ({\mathscr {V}},(x_0,t_0)) = \Theta ({\mathscr {V}}',(0,0)), \end{aligned}$$

so that, in particular, \(\Theta ({\mathscr {V}}', (y,s)) \le 1\) for every \((y,s) \in \mathbb {R}^n \times \left( -\infty , 0 \right) \). Since it is a general fact that \(\Theta ({\mathscr {V}},(y,s)) \ge 1\) for an integral Brakke flow \({\mathscr {V}}\) (see Appendix A), for every \((y,s) \in \textrm{spt}(\Vert V_t'\Vert \times dt)\), we have

$$\begin{aligned} \Theta ({\mathscr {V}}', (y,s)) = 1 = \Theta ({\mathscr {V}}', (0,0)) \qquad \text{ for } \text{ all } (y,s) \in \textrm{spt}(\Vert V_t'\Vert \times dt). \end{aligned}$$

This immediately implies (see e.g. [22, Theorem 8.1]) that there exist \(a \in \left[ 0,\infty \right] \) and \(T \in \textbf{G}(n,k)\) such that

$$\begin{aligned} V_t' = \textbf{var}(T,1)\qquad \text{ for } \text{ every } t \in \left( -\infty , a\right) , \end{aligned}$$

namely that \({\mathscr {V}}'\) is a static k-dimensional plane with unit density. Therefore, there exists \(\rho > 0\) such that the hypotheses of Theorem 2.3 are satisfied with \(R=\rho \) by the flow \(\{(\tau _{x_0})_\sharp V_{t_0+s}\}_{s \in \left[ -\rho ^2,0\right] }\), where \(\tau _{x_0}\) is the translation \(\tau _{x_0}(x):= x-x_0\). Thus, by Theorem 2.3, for all \(t \in \left[ t_0-\rho ^2/4,t_0\right) \), \(\textrm{spt}\Vert V_t\Vert \cap \left( x_0 + \textrm{C}(T,\rho /2)\right) \) coincides with the graph of a \(C^\infty \) function

$$\begin{aligned} f :B_{\rho /2}(x_0) \cap (x_0 + T) \times \left[ t_0 - \rho ^2/4,t_0\right) \rightarrow T^\perp \end{aligned}$$

which satisfies the mean curvature flow in the classical sense and which can be extended smoothly on \(B_{\rho /2}(x_0) \cap (x_0+T)\) up to \(t=t_0\). This completes the proof in case \(\Theta ({\mathscr {V}},(x_0,t_0))=1\).

Step two. The proof that the same result holds when \(\Theta ({\mathscr {V}},(x_0,t_0)) \le 1+\varepsilon _1\) for some sufficiently small \(\varepsilon _1\) is by a standard blow-up argument. First, notice that it is sufficient to prove that there exists \(\varepsilon _1>0\) such that if \({\mathscr {V}}\) is a tangent flow and \(\Theta ({\mathscr {V}},(0,0)) \le 1 + \varepsilon _1\) then \({\mathscr {V}}\) is a static k-dimensional plane with unit density.

To see this, let \(\{{\mathscr {V}}_j\}_{j \in {\mathbb {N}}}\) be a sequence of tangent flows such that \(\Theta ({\mathscr {V}}_j, (0,0)) \le 1 + {1/j}\), and notice that, for each j, the function

$$\begin{aligned} \tau \in \left( 0, \infty \right) \mapsto \frac{1}{(4\pi \tau )^{\frac{k}{2}}} \int _{\mathbb {R}^n} \exp \left( -\frac{|y|^2}{4\tau } \right) \, d\Vert (V_j)_{-\tau }\Vert (y) \end{aligned}$$

is constant, so that, in particular

$$\begin{aligned} \frac{1}{(4\pi )^{\frac{k}{2}}} \int _{\mathbb {R}^n} \exp \left( -\frac{|y|^2}{4} \right) \, d\Vert (V_j)_{-1}\Vert (y) = \Theta ({\mathscr {V}}_j,(0,0)) \le 1 + \frac{1}{j}. \end{aligned}$$

Apply next the compactness theorem for Brakke flows, and let \({\mathscr {V}}\) be the limit Brakke flow of a (not relabeled) subsequence of \(\{{\mathscr {V}}_j\}_j\). We have then

$$\begin{aligned} \begin{aligned} 1 \le \Theta ({\mathscr {V}}, (0,0))&\le \frac{1}{(4\pi )^{\frac{k}{2}}} \int _{\mathbb {R}^n} \exp \left( -\frac{|y|^2}{4} \right) \, d\Vert V_{-1}\Vert (y) \\&\le \liminf _{j \rightarrow \infty }\frac{1}{(4\pi )^{\frac{k}{2}}} \int _{\mathbb {R}^n} \exp \left( -\frac{|y|^2}{4} \right) \, d\Vert (V_j)_{-1}\Vert (y)\le 1, \end{aligned} \end{aligned}$$

and thus \(\Theta ({\mathscr {V}},(0,0))=1\). By Step one, \(\textrm{spt}\Vert V_t\Vert \) is a smooth graph evolving by mean curvature in some \(B_\rho (0)\) for all \(t \in \left[ -\rho ^2,0\right) \), and the flow can be extended smoothly in \(B_\rho (0)\) up to \(t=0\). Since \({\mathscr {V}}\) is the limit Brakke flow of \({\mathscr {V}}_j\), then for all sufficiently large j also the flow \({\mathscr {V}}_j\) satisfies all assumptions of Theorem 2.3 (with \(u \equiv 0\)) in a parabolic domain \(\textrm{C}(T,\rho /2) \times [-\rho ^2/16,0]\), and thus \({\mathscr {V}}_j\) is a smooth mean curvature flow in a neighborhood of \(x=0\) until the end-time \(t=0\). Since \({\mathscr {V}}_j\) is a tangent flow, it must then be a static k-dimensional plane with unit density, and the proof is complete. \(\square \)

Remark 2.4

In case \(\{V_t\}_{t \in (a,b]}\) is a k-dimensional Brakke flow in a domain \(U \subset \mathbb {R}^n\), the same proof goes through, except that we need a suitably modified monotonicity formula to make sense of the Gaussian density. More precisely, if \((x_0,t_0) \in U \times (a,b]\) and \(B_{2r}(x_0) \subset U\) then for any function \(\psi :B_{2r}(x_0) \rightarrow [0,1]\) that is smooth, compactly supported, equal to 1 on \(B_r(x_0)\) and satisfying a bound of the form \(r |\nabla \psi | + r^2 \Vert D^2\psi \Vert \le b\), the limit

$$\begin{aligned} \lim _{\tau \rightarrow 0^+} \frac{1}{(4\pi \tau )^{\frac{k}{2}}} \int _{\mathbb {R}^n} \exp \left( - \frac{|y-x_0|^2}{4\tau } \right) \, \psi (y) \, d\Vert V_{t_0-\tau }\Vert (y) \end{aligned}$$

exists and it is independent of \(\psi \). This limit is the Gaussian density \(\Theta ({\mathscr {V}}, (x_0,t_0))\). The same limit also exists in the case when \({\mathscr {V}}=\{V_t\}_{t \in (a,b]}\) is a k-dimensional Brakke flow in a domain U of an n-dimensional Riemannian manifold M, or, more generally, when \({\mathscr {V}}=\{V_t\}_{t\in (a,b]}\) is a flow with a locally bounded forcing term u, with the only caveat that the proof of existence of the limit involves a more complicated monotonicity formula. Once the existence of the density has been established, tangent flows to such a \({\mathscr {V}}\) at \((x_0,t_0)\) are Brakke flows in \(\mathbb {R}^n\) (in the manifold case, we are identifying \(\mathbb {R}^n\) with \(\textrm{Tan}_{x_0}M\)), and the proof proceeds verbatim. For the proof of the monotonicity formulas needed in these cases, the interested reader can consult [22, Sections 10 and 11].

3 Energy estimates

The main result of this section is the following theorem, which establishes that the deviation of the k-dimensional area of surfaces that are \(L^2\)-close to a plane T and move by (forced) unit-density Brakke flow from the area of a single k-dimensional disk can be estimated in terms of the \(L^2\)-height with respect to T. An analogous result was proved in [11, Theorem 5.7], but the version we are going to present here has an important advantage, which is ultimately the key to unlock the end-time regularity. More precisely, while [11, Theorem 5.7] concludes the validity of the estimate up to some waiting time both at the beginning and at the end of the time interval where the \(L^2\)-height is assumed to be small, here we extend the estimate arbitrarily near the end-time as long as we know that the area of the moving surfaces is a sufficiently large portion of the area of the disk (namely, as long as we know that the flow is not vanishing). The price to pay is that the estimate comes with a constant which deteriorates while approaching the end-time. The end-time regularity will result from appropriately balancing the size of this constant with the vanishing of the \(L^2\)-height along a blow-up sequence.

Theorem 3.1

Corresponding to \(E_1\in [1,\infty )\) and \(\tau \in \left( 0,\frac{1}{2}\right) \), there exist \(\varepsilon _{4}=\varepsilon _{4}(E_1,\tau ) \in (0,1)\) and \(K = K(E_1) \in \left( 1,\infty \right) \) independent of \(\tau \) with the following property. Given \(T \in \textbf{G}(n,k)\), suppose that \(\{V_t\}_{t \in \left[ -1,0\right] }\) and \(\{u(\cdot ,t)\}_{t \in \left[ -1,0\right] }\) satisfy (A1)–(A4) with \(U = \textrm{C}(T,1)\). Assume also that

$$\begin{aligned}{} & {} \exists \, C > 0 \, :\, \textrm{spt}\Vert V_t\Vert \subset \textrm{C}(T,1) \cap \{|T^\perp (x)| < C\} \qquad \forall \,t\in \left( -1,0\right] ; \end{aligned}$$
(3.1)
$$\begin{aligned}{} & {} \mu _*^2:= \displaystyle \sup _{t \in \left[ -1,0\right] } \int _{\textrm{C}(T,1)} |T^\perp (x)|^2\, d\Vert V_t\Vert (x) \le \varepsilon _{4}^2; \end{aligned}$$
(3.2)
$$\begin{aligned}{} & {} \Vert V_{-1}\Vert (\phi _T^2) - \textbf{c}\le \varepsilon _{4}^2; \end{aligned}$$
(3.3)
$$\begin{aligned}{} & {} C(u):= \displaystyle \int _{-1}^0\int _{\textrm{C}(T,1)} 2\, |u|^2\,\phi _T^2\, d\Vert V_t\Vert dt \le \varepsilon _{4}^2. \end{aligned}$$
(3.4)

Then,

$$\begin{aligned} \sup _{t\in \left[ -\frac{1}{2},0\right] } \Vert V_t\Vert (\phi _T^2) \le \textbf{c}+ K (\mu _*^2 + C(u)). \end{aligned}$$
(3.5)

Furthermore, if

$$\begin{aligned} \sup _{t\in \left[ -\tau ,0\right] } \Vert V_t\Vert (\phi _T^2) - \textbf{c}\ge - \varepsilon _{4}^2, \end{aligned}$$
(3.6)

then

$$\begin{aligned} \sup _{t\in \left[ -\frac{1}{2},-2\tau \right] } \left| \Vert V_t\Vert (\phi _T^2) - \textbf{c}\right| \le \frac{K}{\tau ^3} (\mu _*^2 + C(u)). \end{aligned}$$
(3.7)

Before coming to the proof of Theorem 3.1, we record here the following result, which is [11, Proposition 5.2].

Proposition 3.2

Corresponding to \(E_1\in [1,\infty )\) and \(\nu \in (0,1)\) there exist \(\alpha _2\in (0,1)\), \(\mu _1\in (0,1)\), and \(P_2\in [1,\infty )\) with the following property. For \(T \in \textbf{G}(n,k)\) and a unit density varifold \(V \in \textbf{IV}_k (\textrm{C}(T,1))\) with finite mass, define

$$\begin{aligned} \alpha ^2&:= \int _{\textrm{C}(T,1)} |h(V,x)|^2\,\phi _T^2(x) \, d\Vert V\Vert (x)\,, \end{aligned}$$
(3.8)
$$\begin{aligned} \mu ^2&:= \int _{\textrm{C}(T,1)} |T^\perp (x)|^2\,d\Vert V\Vert (x)\,. \end{aligned}$$
(3.9)

Suppose \(\textrm{spt}\Vert V\Vert \) is bounded and

$$\begin{aligned} \Vert V\Vert (B_r(x)) \le \omega _k r^k E_1 \quad \text{ for } \text{ all } B_r(x) \subset \textrm{C}(T,1). \end{aligned}$$
(3.10)
  1. (A)

    If

    $$\begin{aligned} \left| \Vert V\Vert (\phi _T^2) - \textbf{c}\right| \le \frac{\textbf{c}}{8}, \quad \alpha \le \alpha _2, \quad \text{ and } \mu \le \mu _1, \end{aligned}$$
    (3.11)

    then we have

    $$\begin{aligned} \left| \Vert V\Vert (\phi _T^2) - \textbf{c}\right| \le {\left\{ \begin{array}{ll} P_2(\alpha ^{\frac{2k}{k-2}} + \alpha ^{\frac{3}{2}} \mu ^{\frac{1}{2}} + \mu ^2) &{}\quad \text{ if } k > 2, \\ P_2 (\alpha ^{\frac{3}{2}} \mu ^{\frac{1}{2}} + \mu ^2) &{}\quad \text{ if } k \le 2. \end{array}\right. } \end{aligned}$$
    (3.12)
  2. (B)

    If, instead

    $$\begin{aligned} \frac{\textbf{c}}{8} < \left| \Vert V\Vert (\phi _T^2) - \textbf{c}\right| \le (1-\nu ) \textbf{c}\quad \text{ and } \mu \le \mu _1 \end{aligned}$$
    (3.13)

    then \(\alpha \ge \alpha _2\).

The following is an immediate corollary of Proposition 3.2, and it is [11, Corollary 5.3].

Corollary 3.3

Let \(\alpha _2,\mu _1\), and \(P_2\) be as in Proposition 3.2. Set \(\mu _2:= \min \{\mu _1, \left( \frac{\textbf{c}}{32P_2} \right) ^{1/2}\}\). For V and T as in Proposition 3.2, define \(\alpha \) and \(\mu \) as in (3.8) and (3.9). Also define

$$\begin{aligned} {\hat{E}}:= \Vert V\Vert (\phi _T^2) - \textbf{c}. \end{aligned}$$
(3.14)

Assume (3.10) as well as

$$\begin{aligned} \mu \le \mu _2, \quad \text{ and } \quad 2P_2\mu ^2 \le |{\hat{E}}| \le (1-\nu )\textbf{c}. \end{aligned}$$
(3.15)

Then, we have

$$\begin{aligned} \alpha ^2 \ge {\left\{ \begin{array}{ll} \min \left\{ \alpha _2^2, (4P_2)^{-\frac{k-2}{k}}|{\hat{E}}|^{\frac{k-2}{k}}, (4P_2)^{-\frac{4}{3}}\mu ^{-\frac{2}{3}} |{\hat{E}}|^{\frac{4}{3}} \right\} &{}\quad \text{ if } k > 2, \\ \min \left\{ \alpha _2^2, (2P_2)^{-\frac{4}{3}} \mu ^{-\frac{2}{3}} |{\hat{E}}|^{\frac{4}{3}} \right\} &{}\quad \text{ if } k \le 2 . \end{array}\right. } \end{aligned}$$
(3.16)
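For orientation, we sketch how (3.16) can be read off from Proposition 3.2 when \(k>2\); this is only our reading of the deduction, and the complete argument is in [11]. If \(\alpha \ge \alpha _2\) there is nothing to prove. Otherwise alternative (B) is excluded, so that (3.11) holds and (3.12) applies. Since \(P_2\mu ^2 \le |{\hat{E}}|/2\) by (3.15), the term \(P_2\mu ^2\) can be absorbed:

$$\begin{aligned} \frac{|{\hat{E}}|}{2} \le P_2\left( \alpha ^{\frac{2k}{k-2}} + \alpha ^{\frac{3}{2}} \mu ^{\frac{1}{2}} \right) . \end{aligned}$$

At least one of the two summands on the right-hand side must then be no smaller than \(|{\hat{E}}|/(4P_2)\), that is,

$$\begin{aligned} \alpha ^2 \ge (4P_2)^{-\frac{k-2}{k}} |{\hat{E}}|^{\frac{k-2}{k}} \qquad \text{ or } \qquad \alpha ^2 \ge (4P_2)^{-\frac{4}{3}} \mu ^{-\frac{2}{3}} |{\hat{E}}|^{\frac{4}{3}}. \end{aligned}$$

When \(k \le 2\) only the summand \(\alpha ^{\frac{3}{2}}\mu ^{\frac{1}{2}}\) is present, so that it must be no smaller than \(|{\hat{E}}|/(2P_2)\), which accounts for the constant \((2P_2)^{-\frac{4}{3}}\) in the second line of (3.16).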

Proof of Theorem 3.1

The general scheme follows the proof of [11, Theorem 5.7]. We define the function

$$\begin{aligned}{} & {} t \in \left[ -1,0\right] \mapsto E(t) \nonumber \\{} & {} := \Vert V_t\Vert (\phi _T^2) - \textbf{c}- \int _{-1}^t \int _{\textrm{C}(T,1)} 2|u|^2\phi _T^2\, d\Vert V_s\Vert ds - K_2\mu _*^2 (1+t), \end{aligned}$$
(3.17)

where

$$\begin{aligned} K_2:= 80\,\sup \{5|\nabla \phi _T|^4 \phi _T^{-2} + |\nabla |\nabla \phi _T||^2\}. \end{aligned}$$

Arguing precisely as in the proof of [11, (5.53)], namely by testing Brakke’s inequality (2.12) with \(\varphi = \phi _T^2\), we conclude that

$$\begin{aligned}{} & {} E(t_2) - E(t_1) \le - \frac{1}{4} \int _{t_1}^{t_2} \int _{\textrm{C}(T,1)} |h(V_t,\cdot )|^2 \, \phi _T^2 \,d\Vert V_t\Vert dt \nonumber \\{} & {} \qquad \text{ for } \text{ every } -1 \le t_1 < t_2 \le 0. \end{aligned}$$
(3.18)

We first prove (3.5). Towards a contradiction, suppose that there exists \(t_* \in \left[ -\frac{1}{2},0\right] \) such that

$$\begin{aligned} \Vert V_{t_*}\Vert (\phi _T^2) - \textbf{c}> K (\mu _*^2 + C(u)), \end{aligned}$$
(3.19)

where \(1< K < \infty \) will be chosen later. In particular, from the definition of E(t) we have for every \(t \in \left[ -1,t_*\right] \) that

$$\begin{aligned} \Vert V_t\Vert (\phi _T^2) - \textbf{c}\ge E(t) \overset{(3.18)}{\ge } E(t_*) > K(\mu _*^2 + C(u)) - C(u) - K_2\mu _*^2 \ge \frac{K}{2}\mu _*^2\nonumber \\ \end{aligned}$$
(3.20)

if we choose \(K \ge \max \{1, 2 K_2\}\). On the other hand, we also have, due to (3.18), (3.2), (3.3), and (3.4),

$$\begin{aligned} \Vert V_t\Vert (\phi _T^2) - \textbf{c}\le E(t) + C(u) + K_2\mu _*^2 \le E(-1) + C(u) + K_2\mu _*^2 \le (K_2+2)\,\varepsilon _{4}^2 \le \varepsilon _{4} \textbf{c}\nonumber \\ \end{aligned}$$

for \(\varepsilon _{4}\) suitably small. In particular, if \(P_2\) is the constant from Proposition 3.2 corresponding to \(E_1\) and, for instance, \(\nu =1/2\), then choosing also \(K \ge 4P_2\) we have that

$$\begin{aligned} 2P_2 \mu _*^2 \le \Vert V_t\Vert (\phi _T^2) -\textbf{c}\le \varepsilon _{4} \textbf{c}\qquad \text{ for } \text{ every } t \in \left[ -1,t_*\right] . \end{aligned}$$
(3.21)

Hence, we can apply Corollary 3.3 with \(V=V_t\) for all \(t \in \left[ -1,t_*\right] \), and conclude that for a.e. \(t\in \left[ -1,t_*\right] \) it holds

$$\begin{aligned} \frac{1}{4} \,\int _{\textrm{C}(T,1)} |h(V_t,\cdot )|^2\phi _T^2\,d\Vert V_t\Vert \ge {\left\{ \begin{array}{ll} P \min \{1,E(t)^{\frac{k-2}{k}}, \mu _*^{-\frac{2}{3}}E(t)^{\frac{4}{3}}\} &{}\quad \text{ if } k>2, \\ P \min \{1,\mu _*^{-\frac{2}{3}}E(t)^{\frac{4}{3}}\} &{}\quad \text{ if } k\le 2, \end{array}\right. } \end{aligned}$$
(3.22)

where

$$\begin{aligned} P:= \frac{1}{4 \cdot 2^{4/3}} \min \{\alpha _2^2, (4P_2)^{-\frac{k-2}{k}}, (4P_2)^{-\frac{4}{3}}\}, \end{aligned}$$

and \(\alpha _2 \in \left( 0,1\right) \) is the same constant as in Proposition 3.2 corresponding to \(E_1\) and \(\nu = 1/2\). Let us consider the case \(k > 2\), as the case \(k \le 2\) is easier and can be treated similarly. Note that, since \(\varepsilon _{4} < 1\),

$$\begin{aligned} P\min \{1,E(t)^{\frac{k-2}{k}}, \mu _*^{-\frac{2}{3}}E(t)^{\frac{4}{3}}\} = {\left\{ \begin{array}{ll} P &{}\quad \text{ if } E(t) \ge 1,\\ P E(t)^{\frac{k-2}{k}} &{}\quad \text{ if } \mu _*^{\frac{2k}{k+6}} \le E(t) \le 1, \\ P \mu _*^{-\frac{2}{3}} E(t)^{\frac{4}{3}} &{}\quad \text{ if } E(t) \le \mu _*^{\frac{2k}{k+6}}. \end{array}\right. } \end{aligned}$$
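The three regimes come from comparing the last two entries of the minimum. We record the elementary computation as a sketch, writing \(E=E(t)>0\) and assuming \(0<\mu _*<1\). Dividing the two entries gives

$$\begin{aligned} \frac{\mu _*^{-\frac{2}{3}}E^{\frac{4}{3}}}{E^{\frac{k-2}{k}}} = \mu _*^{-\frac{2}{3}} E^{\frac{k+6}{3k}} \ge 1 \quad \Longleftrightarrow \quad E \ge \mu _*^{\frac{2k}{k+6}}, \end{aligned}$$

so that \(E^{\frac{k-2}{k}}\) is the smaller of the two exactly when \(E \ge \mu _*^{\frac{2k}{k+6}}\), and it does not exceed 1 as long as \(E \le 1\). In the remaining regime \(E \le \mu _*^{\frac{2k}{k+6}}\) one also has

$$\begin{aligned} \mu _*^{-\frac{2}{3}}E^{\frac{4}{3}} \le \mu _*^{-\frac{2}{3}+\frac{8k}{3(k+6)}} = \mu _*^{\frac{2(k-2)}{k+6}} \le 1, \end{aligned}$$

so that the minimum is attained by the third entry.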

On the other hand, for \(t \in \left[ -1,t_*\right] \) we have

$$\begin{aligned} E(t) \le E(-1) = \Vert V_{-1}\Vert (\phi _T^2) - \textbf{c}\le \varepsilon _{4}^2 < 1, \end{aligned}$$

so that the first alternative does not occur. Let \({\bar{t}}\) be the supremum of \(s \in [-1,t_*]\) such that \(\mu _*^{\frac{2k}{k+6}} \le E(t) \le 1\) for \(t \in \left[ -1, s \right] \). Then, (3.18) and (3.22) imply that the differential inequality \(E'(t) \le - P E(t)^{\frac{k-2}{k}}\) is satisfied a.e. on \(\left[ -1, {\bar{t}} \right] \). Integrating and using (3.3), we find then that

$$\begin{aligned} {\bar{t}} \le -1 + \frac{k \varepsilon _{4}^{\frac{4}{k}}}{2P}. \end{aligned}$$
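The integration can be sketched as follows. This is a formal computation in which E is treated as a differentiable function; since E is monotone non-increasing, the same inequalities hold for the a.e. derivative. On \(\left[ -1,{\bar{t}}\right] \) we have \(E>0\) and

$$\begin{aligned} \frac{d}{dt} E(t)^{\frac{2}{k}} = \frac{2}{k}\, E(t)^{\frac{2}{k}-1} E'(t) \le -\frac{2P}{k}, \end{aligned}$$

hence, using \(E(-1) \le \varepsilon _{4}^2\),

$$\begin{aligned} 0 \le E({\bar{t}})^{\frac{2}{k}} \le E(-1)^{\frac{2}{k}} - \frac{2P}{k} ({\bar{t}}+1) \le \varepsilon _{4}^{\frac{4}{k}} - \frac{2P}{k} ({\bar{t}}+1), \end{aligned}$$

which rearranges to the bound on \({\bar{t}}\) just stated.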

In particular, for \(\varepsilon _{4}\) suitably small it is \({\bar{t}} < -\frac{3}{4}\). By the monotonicity of E(t), we then have that the differential inequality \(E'(t) \le - P \mu _*^{-\frac{2}{3}} E(t)^{\frac{4}{3}}\) is satisfied a.e. on \(\left[ {\bar{t}}, t_* \right] \), so that, integrating, we find

$$\begin{aligned} E(t_*) \le \left( \frac{3}{P (t_* - {\bar{t}})} \right) ^3 \mu _*^2. \end{aligned}$$
(3.23)
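For the reader's convenience, here is a sketch of the integration behind (3.23), again as a formal computation and assuming \(\mu _*>0\). Since \(E>0\) on \(\left[ {\bar{t}},t_*\right] \) by (3.20), the differential inequality gives

$$\begin{aligned} \frac{d}{dt} E(t)^{-\frac{1}{3}} = -\frac{1}{3}\, E(t)^{-\frac{4}{3}} E'(t) \ge \frac{P}{3}\, \mu _*^{-\frac{2}{3}}, \end{aligned}$$

so that

$$\begin{aligned} E(t_*)^{-\frac{1}{3}} \ge E({\bar{t}})^{-\frac{1}{3}} + \frac{P}{3}\, \mu _*^{-\frac{2}{3}} (t_*-{\bar{t}}) \ge \frac{P}{3}\, \mu _*^{-\frac{2}{3}} (t_*-{\bar{t}}). \end{aligned}$$

Raising both sides to the power \(-3\) yields (3.23). Note that the initial value \(E({\bar{t}})\) has been discarded, which is why the resulting bound depends only on \(\mu _*\) and on the elapsed time \(t_*-{\bar{t}}\).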

Since \(t_*-{\bar{t}} \ge 1/4\), (3.23) is in contradiction with (3.20) as soon as we choose \(K \ge \frac{4}{P^3} 12^3\). This completes the proof of (3.5). Assume now that (3.6) holds, and let \({\bar{t}} \in \left[ -\tau ,0\right] \) be such that

$$\begin{aligned} \Vert V_{{\bar{t}}}\Vert (\phi _T^2) - \textbf{c}\ge -\frac{3}{2} \varepsilon _{4}^2. \end{aligned}$$
(3.24)

Towards a contradiction, assume that (3.7) is violated: due to (3.5), this means that there exists \(t_* \in \left[ -\frac{1}{2},-2\tau \right] \) such that

$$\begin{aligned} E(t_*) \le \Vert V_{t_*}\Vert (\phi _T^2) - \textbf{c}< - \frac{K}{\tau ^3}(\mu _*^2+C(u)). \end{aligned}$$
(3.25)

We then have

$$\begin{aligned} E(t) \le - \frac{K}{\tau ^3}(\mu _*^2+C(u)) \qquad \text{ for } \text{ every } t \in \left[ t_*,{\bar{t}} \right] \end{aligned}$$
(3.26)

by monotonicity, and thus

$$\begin{aligned} \Vert V_t\Vert (\phi _T^2)-\textbf{c}\le E(t) + C(u) + K_2\mu _*^2 \le - \frac{K}{2} \mu _*^2 \qquad \text{ for } \text{ every } t \in \left[ t_*,{\bar{t}} \right] . \end{aligned}$$

On the other hand, again for \(t \in \left[ t_*,{\bar{t}} \right] \) we have

$$\begin{aligned} \Vert V_t\Vert (\phi _T^2) - \textbf{c}\ge E(t) \ge E({\bar{t}}) \ge - \left( \frac{5}{2} + K_2\right) \varepsilon _{4}^2 \ge -\varepsilon _{4} \textbf{c}, \end{aligned}$$

for \(\varepsilon _{4}\) sufficiently small, where we have used (3.24) together with (3.2) and (3.4). We can then apply again Corollary 3.3 with \(V=V_t\), \(t \in \left[ t_*,{\bar{t}}\right] \), and conclude that for a.e. \(t \in \left[ t_*,{\bar{t}} \right] \)

$$\begin{aligned} \begin{aligned}&\frac{1}{4} \,\int _{\textrm{C}(T,1)} |h(V_t,\cdot )|^2\phi _T^2\,d\Vert V_t\Vert \\ {}&\qquad \ge {\left\{ \begin{array}{ll} 2^{\frac{4}{3}} P \min \{1,\left( \textbf{c}- \Vert V_t\Vert (\phi _T^2)\right) ^{\frac{k-2}{k}}, \mu _*^{-\frac{2}{3}}\left( \textbf{c}- \Vert V_t\Vert (\phi _T^2)\right) ^{\frac{4}{3}}\} &{}\text{ if } k>2, \\ 2^{\frac{4}{3}} P \min \{1,\mu _*^{-\frac{2}{3}}\left( \textbf{c}- \Vert V_t\Vert (\phi _T^2)\right) ^{\frac{4}{3}}\} &{}\text{ if } k\le 2. \end{array}\right. } \end{aligned} \end{aligned}$$
(3.27)

On the other hand, as a consequence of (3.26) we have that for every \(t \in \left[ t_*,{\bar{t}} \right] \)

$$\begin{aligned}{} & {} \textbf{c}- \Vert V_t\Vert (\phi _T^2) \ge -E(t) - C(u) - K_2 \mu _*^2 \ge -E(t) - K (C(u) + \mu _*^2) \\{} & {} \quad \ge (-1+\tau ^3) E(t) \ge \frac{1}{2} (-E(t)), \end{aligned}$$

and thus

$$\begin{aligned} \frac{1}{4} \,\int _{\textrm{C}(T,1)} |h(V_t,\cdot )|^2\phi _T^2\,d\Vert V_t\Vert \ge {\left\{ \begin{array}{ll} P \min \{1,\left( -E(t)\right) ^{\frac{k-2}{k}}, \mu _*^{-\frac{2}{3}}\left( -E(t)\right) ^{\frac{4}{3}}\} &{}\text{ if } k>2, \\ P \min \{1,\mu _*^{-\frac{2}{3}}\left( -E(t)\right) ^{\frac{4}{3}}\} &{}\text{ if } k\le 2. \end{array}\right. }\nonumber \\ \end{aligned}$$
(3.28)

Arguing as above, we only treat the case \(k>2\), and we notice that \(-E(t)=|E(t)| < 1\). Assume that \({\hat{t}}\) is the infimum of \(s \in \left[ t_*, {\bar{t}} \right] \) such that \(|E(t)| \ge \mu _*^{\frac{2k}{k+6}}\) for all \(t \in \left[ s, {\bar{t}}\right] \). Then, (3.18) and (3.28) imply that the differential inequality \(E'(t) \le - P \left( - E(t) \right) ^{\frac{k-2}{k}}\) is satisfied a.e. on \(\left[ {\hat{t}}, {\bar{t}} \right] \). Integrating we find that

$$\begin{aligned} \frac{2P}{k} ({\bar{t}}-{\hat{t}}) \le \left( -E({\bar{t}})\right) ^{\frac{2}{k}} - \left( -E({\hat{t}})\right) ^{\frac{2}{k}} \le (\varepsilon _{4} \textbf{c})^{\frac{2}{k}}. \end{aligned}$$

In particular, for \(\varepsilon _{4}\) sufficiently small (depending on \(\tau \)) we have \({\hat{t}} \in \left[ -\frac{3}{2}\tau ,{\bar{t}}\right] \). Now, by monotonicity of E(t), it holds \(|E(t)| \le \mu _*^{\frac{2k}{k+6}}\) on \(\left[ t_*,{\hat{t}} \right] \), and thus the differential inequality \(E'(t) \le - P \mu _*^{-\frac{2}{3}} \left( -E(t)\right) ^{\frac{4}{3}}\) holds a.e. on \(\left[ t_*,{\hat{t}} \right] \). We integrate to find that

$$\begin{aligned} E(t_*) \ge - \left( \frac{3}{P ({\hat{t}} - t_*)} \right) ^{3} \mu _*^2 \ge - \left( \frac{6}{P \tau } \right) ^3 \mu _*^2, \end{aligned}$$

which contradicts (3.25) if \(K \ge 2 (6/P)^3\) and completes the proof of (3.7). \(\square \)

4 Lipschitz approximation

The following proposition states the existence of a Lipschitz approximation of the flow in space-time, with good estimates up to the end-time. The result is similar to [11, Theorem 7.5], the only difference being that the Lipschitz approximation is obtained up to the end-time. In Sect. 5, the time \(t=0\) of Proposition 4.1 will correspond to a time slightly before the end-time, up to which we have a good excess estimate.

Proposition 4.1

Corresponding to \(E_1\in [1,\infty )\), p and q, there exist \(\varepsilon _{5}\in (0,1)\), \(r_1\in (0,1)\) and \({c_{3}}\in [1,\infty )\) with the following property. For \(U=\textrm{C}(T,1)\), suppose that \(\{V_t\}_{t\in [-3/5,0]}\) and \(\{u(\cdot ,t)\}_{t\in [-3/5,0]}\) satisfy (A1)–(A4). Write \(V_t=\textbf{var}(M_t,1)\) for a.e. t and identify T with \({\mathbb {R}}^k\times \{0\}\). Suppose that we have

$$\begin{aligned}{} & {} \int _{\textrm{C}(T,1)\times [-3/5,0]} |h(V_t,\cdot )|^2\phi _T^2\,d\Vert V_t\Vert dt \le \varepsilon _{5} r_1^2/4, \end{aligned}$$
(4.1)
$$\begin{aligned}{} & {} \big |\Vert V_t\Vert (\phi _T^2)-\textbf{c}\big |\le \varepsilon _{5} \,\,\,\, \text{ for } \text{ all } t\in [-3/5,0], \end{aligned}$$
(4.2)
$$\begin{aligned}{} & {} \textrm{spt} \,\Vert V_t\Vert \cap \textrm{C}(T,1)\subset \{|T^{\perp }(x)|\le \varepsilon _{5}\}\,\,\, \text{ for } \text{ all } t\in [-3/5,0], \end{aligned}$$
(4.3)
$$\begin{aligned}{} & {} \Vert u\Vert _{L^{p,q}(\textrm{C}(T,1)\times [-3/5,0])}\le 1. \end{aligned}$$
(4.4)

Set

$$\begin{aligned} \beta ^2:=\int _{G_k(\textrm{C}(T,1))\times [-3/5,0]} \Vert S-T\Vert ^2\phi _T^2\, dV_t(\cdot ,S)dt \end{aligned}$$
(4.5)

and

$$\begin{aligned} \kappa ^2:=\left| \int _{-3/5}^0\left( \Vert V_t\Vert (\phi _{T,1/2}^2)- \frac{{\textbf {c}}}{2^k}\right) \,dt \right| . \end{aligned}$$
(4.6)

Then there exist maps \(f\,:\, B_{1/3}^k\times [-1/2,0]\rightarrow {\mathbb {R}}^{n-k}\) and \(F\,:\,B_{1/3}^k\times [-1/2,0]\rightarrow {\mathbb {R}}^n\times [-1/2,0]\) such that for all \((x,s),\,(y,t)\in B_{1/3}^k \times [-1/2,0]\),

$$\begin{aligned} \begin{aligned}&F(x,s)=(x,f(x,s),s), \\&\quad |f(x,s)-f(y,t)|\le c(n,k)\max \{|x-y|,|s-t|^{1/2}\}, \\&\quad |f(x,s)|\le \varepsilon _{5}, \end{aligned} \end{aligned}$$
(4.7)

and with the following property. Define

$$\begin{aligned} \begin{aligned} X:&=\left( \cup _{t\in [-1/2,0]} (M_t\cap \textrm{C}(T,1/3))\times \{t\}\right) \cap \textrm{image}\,F, \\ Y:&=(T\times \textrm{Id}_{{\mathbb {R}}} )(X). \end{aligned} \end{aligned}$$
(4.8)

Then

$$\begin{aligned}{} & {} (\Vert V_t\Vert \times dt)((\textrm{C}(T,1/3)\times [-1/2,0])\setminus X)\nonumber \\{} & {} \quad +{\mathcal {L}}^{k+1}((B_{1/3}^k\times [-1/2,0])\setminus Y) \le \kappa ^2+c_{3}\beta ^2. \end{aligned}$$
(4.9)

Proof

To be consistent with the notation in [11, Section 7], in the following we change the time intervals \([-3/5,0]\) and \([-1/2,0]\) in the statement above to [0, 1] and [1/4, 1] respectively, which does not change the proof in any essential way. We only indicate the exact places in [11, Section 7] where changes are needed; for the rest of the proof, equation numbers are those of [11]. For [11, Proposition 7.1], one replaces the parabolic cylinder \(P_r(a,s)\) in (7.3) and (7.4) by \({\tilde{P}}_r(a,s)\) defined in Sect. 2 and the same conclusion (7.6) follows by the same proof. Next, no change is required in [11, Lemma 7.3], where one obtains a small constant \(r_1\in (0,1)\) depending only on \(E_1,\,p\) and q. In the proof of [11, Theorem 7.5], one replaces (1/4, 3/4) by (1/4, 1) and P by \({\tilde{P}}\) in (7.58), (7.59), (7.62), (7.65) and (7.66). The only essential modification is the part following (7.66) on the covering argument. The modified statement (7.66) is the following: For each \((x,s)\in B\), there exists some \(r(x,s)\in (0,r_1)\) such that

$$\begin{aligned} \int _{\overline{{\tilde{P}}_{r(x,s)}(x,s)}}\Vert S-T\Vert ^2\,dV_t(\cdot ,S)dt\ge \gamma (r(x,s))^{k+2}. \end{aligned}$$

This follows from the definition of A, (7.58). Thus \(\{\overline{{\tilde{P}}_{r(x,s)}(x,s)}\}_{(x,s)\in B}\) is a covering of B. Here, unlike \(P_r(x,s)\), since \({\tilde{P}}_{r}(x,s)\) is not a metric ball with respect to the metric \(d((x_1,s_1),(x_2,s_2)):=\max \{|x_1-x_2|,|s_1-s_2|^{1/2}\}\), we cannot invoke the standard Vitali covering lemma as given. On the other hand, by following the proof of the Vitali lemma applied to \(\{\overline{{\tilde{P}}_{r(x,s)}(x,s)}\}_{(x,s)\in B}\) (see for example [15, Theorem 3.3]), one can prove that there exists a countable subfamily \(\{\overline{{\tilde{P}}_{r(x_j,s_j)}(x_j,s_j)}\}\subset \{\overline{{\tilde{P}}_{r(x,s)}(x,s)}\}_{(x,s)\in B}\) which is pairwise disjoint and such that

$$\begin{aligned} B\subset \cup _{(x,s)\in B} \overline{{\tilde{P}}_{r(x,s)}(x,s)} \subset \cup _{j} ({\mathbb {R}}^n\times (0,1])\cap \overline{P_{5r(x_j,s_j)}(x_j,s_j)}. \end{aligned}$$

Note that the sets on the right-hand side are closed metric balls with respect to the parabolic distance. Then, using the above inequality and the properties of the covering,

$$\begin{aligned} \begin{aligned} (\Vert V_t\Vert \times dt)(B)&\le \sum _j (\Vert V_t\Vert \times dt) \big (({\mathbb {R}}^n\times (0,1])\cap \overline{P_{5r(x_j,s_j)}(x_j,s_j)}\,\big ) \\&\le \sum _j 5^{k+2}2 E_1 r(x_j,s_j)^{k+2} \\&\le \sum _j 5^{k+2}2E_1\gamma ^{-1}\int _{\overline{{{\tilde{P}}}_{r(x_j,s_j)}(x_j,s_j)}} \Vert S-T\Vert ^2\, dV_t(\cdot ,S)dt \\&\le 5^{k+2}2E_1\gamma ^{-1}\int _{\textrm{C}(T,13/24)\times (0,1)} \Vert S-T\Vert ^2\,dV_t(\cdot ,S)dt\le 5^{k+2}2E_1\gamma ^{-1}\beta ^2. \end{aligned} \end{aligned}$$

The rest of the proof is the same. \(\square \)

Remark 4.2

In [11], the generalized Besicovitch covering theorem in [6, 2.8.14] was invoked for parabolic cylinders at the bottom of page 40. After the publication of [11], Ulrich Menne communicated to the second-named author that the parabolic cylinders do not satisfy the assumption in [6, 2.8.14] (called directionally \(\xi ,\,\eta ,\,\zeta \) limited), so that the theorem is not applicable. However, one can fix the proof in [11] by using the Vitali covering lemma, which holds true for any metric balls, instead of the Besicovitch covering theorem. Later it was proved that, even though the precise assumption in [6] is not satisfied, the Besicovitch covering theorem still holds true for parabolic cylinders of type P (not \({\tilde{P}}\)); see [10] for the proof.

5 Blow-up argument

We first state the regularity result for a domain which is at positive distance away from the end-time \(t=0\). This is a direct consequence of [11, Theorem 8.7] with modifications to shorten the waiting time near the end-time.

Proposition 5.1

Corresponding to \(E_1\in [1,\infty )\), \(\nu \in (0,1)\), p, q and \(\iota \in (0,1/4)\), there exist \(\varepsilon _{6}\in (0,1)\), \(c_{4}\in (1,\infty )\) with the following property. For \(T\in \textbf{G}(n,k)\), \(R\in (0,\infty )\), \(U=\textrm{C}(T,2R)\), suppose \(\{V_t\}_{t\in [-R^2,0]}\) and \(\{u(\cdot ,t)\}_{t\in [-R^2,0]}\) satisfy (A1)–(A4) and (2.13)–(2.16) with \(\varepsilon _{6}\) in place of \(\varepsilon _{2}\). Write \({\tilde{D}}:=(B_R \cap T)\times [-R^2/2,-\iota R^2]\). Then there are \(f\,:\,{\tilde{D}}\rightarrow T^\perp \) and \(F:\,{\tilde{D}}\rightarrow {\mathbb {R}}^n\) such that \(T(F(y,t))=y\) and \(T^{\perp }(F(y,t))=f(y,t)\) for all \((y,t)\in {\tilde{D}}\) and

$$\begin{aligned}{} & {} \textrm{spt}\,\Vert V_t\Vert \cap \textrm{C}(T,R)=\textrm{image}\,F(\cdot ,t) \,\, \text{ for } \text{ all } t\in [-R^2/2,-\iota R^2], \end{aligned}$$
(5.1)
$$\begin{aligned}{} & {} R^{-1} \Vert f\Vert _0+\Vert \nabla f\Vert _0+R^\zeta [ f]_{1+\zeta }\le c_{4}(\mu +\Vert u\Vert ), \end{aligned}$$
(5.2)

where the norms are measured on \((B_R \cap T)\times [-R^2/2,-\iota R^2]\).

Proof

We may assume that \(R=1\) by the parabolic change of variables. We first use the \(L^2-L^\infty \) height estimate [11, Proposition 6.4] with \(R=1\), \(\Lambda =1\), \(U=B_1(a)\) with \(a\in T\cap B_1\) (and the time-interval [0, 1] translated to \([-1,0]\)), so that there exist \(c_{5}=c_{5}(k,p,q)\) and \(c_{6}=c_{6}(n,k)\) such that, for all \(t\in [-4/5,0]\), we have

$$\begin{aligned} \textrm{spt}\,\Vert V_t\Vert \cap B_{4/5}(a) \subset \{x:\,|T^{\perp }(x)|\le {\tilde{\mu }}\}, \end{aligned}$$
(5.3)

where

$$\begin{aligned} {\tilde{\mu }}^2:=c_{6}\mu ^2+3 c_{5} \Vert u\Vert ^2 E_1^{1-{2/p}}. \end{aligned}$$
(5.4)

In particular, by moving a within \(T\cap B_1\), (5.3) shows

$$\begin{aligned} \textrm{spt}\,\Vert V_t\Vert \cap \textrm{C}(T,3/2)\cap \{x:\,|T^{\perp }(x)|\le 1/2\}\subset \{x:\,|T^\perp (x)|\le {\tilde{\mu }}\} \end{aligned}$$
(5.5)

for all \(t\in [-4/5,0]\). Using the lower density ratio bound (see [11, Corollary 6.3]), for all sufficiently small \(\varepsilon _{6}\) depending only on \(E_1\), p and q, one can show that

$$\begin{aligned} \textrm{spt}\,\Vert V_t\Vert \cap \textrm{C}(T,3/2)\cap \{x:\, |T^\perp (x)|>1/2\}=\emptyset \end{aligned}$$
(5.6)

for all \(t\in [-4/5,0]\). Thus, (5.5) and (5.6) show

$$\begin{aligned} \textrm{spt}\,\Vert V_t\Vert \cap \textrm{C}(T,3/2) \subset \{x:\,|T^\perp (x)|\le {\tilde{\mu }}\} \end{aligned}$$
(5.7)

for all \(t\in [-4/5,0]\). Next, we use [11, Theorem 8.7]. Corresponding to \(E_1\), p and q with \(\nu =1/2\), there exist \(\varepsilon _{7}\in (0,1)\) (\(\varepsilon _6\) in [11]), \(\sigma \in (0,1/2)\), \(\Lambda _{1}\in (2,\infty )\) (\(\Lambda _3\) in [11]) and \(c_{7}\in (1,\infty )\) (\(c_{16}\) in [11]) with the properties stated there. We identify T with \({\mathbb {R}}^k\times \{0\}\) in the following. We fix a small \(0<{\tilde{R}}\le 1/6\) depending only on \(\iota \) and \(\Lambda _{1}\) (for example, \({\tilde{R}}=\sqrt{\iota /(4\Lambda _{1})}\)) so that, for any \((x,t)\in B^k_1\times [-1/2,-\iota ]\), we have

$$\begin{aligned} B^k_{3{\tilde{R}}}(x)\times (t -\Lambda _1 \tilde{R}^2,t+\Lambda _{1}{\tilde{R}}^2)\subset B^k_{3/2}\times (-3/5,-\iota /2). \end{aligned}$$
(5.8)
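As a quick sanity check of (5.8), the following worked computation uses the hypothetical explicit choice \({\tilde{R}}:=\min \{1/6,\sqrt{\iota /(4\Lambda _{1})}\}\); taking the minimum with 1/6 is an assumption made here only to keep the bound \({\tilde{R}}\le 1/6\) explicit. With this choice, \(3{\tilde{R}}\le 1/2\) and \(\Lambda _{1}{\tilde{R}}^2\le \iota /4<1/16\), so that for \((x,t)\in B^k_1\times [-1/2,-\iota ]\) and \(y\in B^k_{3{\tilde{R}}}(x)\) one finds

$$\begin{aligned} |y|<1+3{\tilde{R}}\le \frac{3}{2},\qquad t+\Lambda _{1}{\tilde{R}}^2\le -\iota +\frac{\iota }{4}=-\frac{3\iota }{4}<-\frac{\iota }{2},\qquad t-\Lambda _{1}{\tilde{R}}^2\ge -\frac{1}{2}-\frac{\iota }{4}>-\frac{9}{16}>-\frac{3}{5}. \end{aligned}$$

These three inequalities together give the inclusion (5.8) for this particular \({\tilde{R}}\).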

The choice of such \({\tilde{R}}\) depends ultimately only on \(\iota \), \(E_1\), p and q. We use [11, Theorem 8.7] with \(R={\tilde{R}}\) and \((x,t)\in B_1^k\times [-1/2,-\iota ]\) as the origin. There are four assumptions in [11, Theorem 8.7], the smallness of height [11, (8.83)] and \(\Vert u\Vert \) [11, (8.84)], and the existence of \(t_1\) and \(t_2\) in [11, (8.85)] and [11, (8.86)] with respect to \(B^k_{3{\tilde{R}}}(x)\times (t-\Lambda _{1}{\tilde{R}}^2,t+\Lambda _{1}{\tilde{R}}^2)\) and \(\nu =1/2\). The first two conditions are fulfilled if we restrict \(\varepsilon _{6}\) so that \(\varepsilon _{6} {\tilde{R}}^{-(k+4)/2}<\varepsilon _{7}\). In the following, we prove that the latter two are satisfied by using a compactness argument. Let \(\phi _{T,{\tilde{R}},x}\) be defined by \(\phi _{T,{\tilde{R}},x}(y):=\phi _{T,{\tilde{R}}}(y-x)\). We claim that, given any \(\delta >0\), for all sufficiently small \(\varepsilon _{6}>0\) depending only on \(\iota ,\,E_1,\,\nu ,\,p,\,q\) and \(\delta \), we have

$$\begin{aligned} {\tilde{R}}^{-k} \Vert V_t\Vert (\phi _{T,{\tilde{R}},x}^2)\le \textbf{c}+\delta \end{aligned}$$
(5.9)

for all \((x,t)\in B^k_1\times [-3/5,0]\). Note that, by using the monotone decreasing property of E(t) corresponding to \(\phi _{T,{\tilde{R}},x}\) in place of \(\phi _T\) in (3.18), the increase of \(\Vert V_t\Vert (\phi _{T,{\tilde{R}},x}^2)\) in time can be made small by restricting \(\mu \) and \(\Vert u\Vert \) appropriately depending on \(\delta \) and \({\tilde{R}}\) (in the following, we may refer to this fact as the “almost monotone property”), so we only need to prove \({\tilde{R}}^{-k}\Vert V_{-3/5}\Vert (\phi _{T,{\tilde{R}},x}^2)\le \textbf{c}+\delta \) for all \(x\in B^k_1\). Assume for a contradiction that there exist \(\{V_t^{(m)}\}_{t\in [-1,0]}\) and \(\{u^{(m)}(\cdot ,t)\}_{t\in [-1,0]}\) satisfying the assumptions of the present proposition with \(\varepsilon _{6}=1/m\), and \(x_m\in B_1^k\) such that \({\tilde{R}}^{-k}\Vert V_{-3/5}^{(m)}\Vert (\phi _{T,{\tilde{R}},x_m}^2)>\textbf{c}+\delta \). Again by the almost monotone property, we have

$$\begin{aligned} \inf _{t\in [-4/5,-3/5]}{\tilde{R}}^{-k}\Vert V_t^{(m)}\Vert (\phi _{T,{\tilde{R}},x_m}^2)\ge \textbf{c}+\delta /2 \end{aligned}$$
(5.10)

for all large m. Since

$$\begin{aligned} \int _{-4/5}^{-3/5}\int _{\textrm{C}(T,3/2)}|h(V^{(m)}_t,\cdot )|^2\,d\Vert V^{(m)}_t\Vert dt \end{aligned}$$

is uniformly bounded by (3.18) and (A2), using Fatou’s lemma and (A1) we conclude that for almost all \(t_0\in [-4/5,-3/5]\), there exists a subsequence \(V_{t_0}^{(m_j)}\in \textbf{IV}_k(\textrm{C}(T,2))\) such that the \(L^2(\Vert V_{t_0}^{(m_j)}\Vert )\)-norms of \(\{h(V_{t_0}^{(m_j)})\}_j\) are bounded uniformly in \(\textrm{C}(T,3/2)\). Then, by Allard’s compactness theorem of integral varifolds, a further subsequence converges to \(V\in \textbf{IV}_k(\textrm{C}(T,3/2))\), and due to (5.7), it is supported on T. Since the \(L^2\)-norm of the generalized mean curvature is lower-semicontinuous under varifold convergence, V has \(h(V,\cdot )\in L^2(\Vert V\Vert )\) in \(\textrm{C}(T,3/2)\) and the multiplicity of V on T has to be a constant function with integer value, and by (5.10), the integer has to be \(\ge 2\). But this implies that \(\liminf _{j\rightarrow \infty }\Vert V_{t_0}^{(m_j)}\Vert (\phi _T^2)\ge \Vert V\Vert (\phi _T^2)\ge 2\textbf{c}\). Since \(t_0\ge -4/5\) and by the almost monotone property, one can obtain a contradiction to (2.13) for all large \(m_j\). This proves (5.9). Similarly, we claim that, given \(\delta >0\), for small \(\varepsilon _{6}>0\),

$$\begin{aligned} {\tilde{R}}^{-k}\Vert V_t\Vert (\phi _{T,{\tilde{R}}, x}^2)\ge \textbf{c}-\delta \end{aligned}$$
(5.11)

for all \((x,t)\in B_1^k\times [-3/5,-\iota /2]\). Again by the almost monotone property, we need to prove the claim at \(t=-\iota /2\). The similar contradiction argument applied to the time interval \([-\iota /2,-\iota /4]\) in place of \([-4/5,-3/5]\) (with the same notation) shows that, for almost all \(t_0\in [-\iota /2,-\iota /4]\), there exists a subsequence such that \(\lim _{j\rightarrow \infty } \Vert V_{t_0}^{(m_j)}\Vert =0\) on \(\textrm{C}(T,3/2)\). But then, with the clearing-out lemma (see [11, Corollary 6.3]), one can show that \((\Vert V_t^{(m_j)}\Vert \times dt)(\textrm{C}(T,1)\times (-\iota /8,0))=0\) for all large j (where \(\iota \) needs to be smaller than a constant depending only on k, n, p, q and \(E_1\) for the clearing-out lemma). This is a contradiction to (2.14). This proves (5.11). Now we are ready to apply [11, Theorem 8.7]: we choose a small \(\delta >0\) so that \(\textbf{c}-\delta >\textbf{c}/2\) and \(\textbf{c}+\delta <3\textbf{c}/2\) and let \(\varepsilon _{6}\) be restricted so that we have (5.9) and (5.11). Then for each \(T^{-1}(B^k_{3{\tilde{R}}}(x))\times (t-\Lambda _{1}{\tilde{R}}^2,t+\Lambda _{1}{\tilde{R}}^2)\) with \((x,t)\in B_1^k\times [-1/2,-\iota ]\), all the assumptions for [11, Theorem 8.7] are satisfied. Thus the support of \(\Vert V_t\Vert \) can be represented as the graph of a \(C^{1,\zeta }\) function in \(T^{-1}(B^k_{\sigma {\tilde{R}}}(x)) \times (t-{\tilde{R}}^2/4,t+{\tilde{R}}^2/4)\) with estimate in terms of \(\mu \) and \(\Vert u\Vert \). Since \(\textrm{C}(T,1)\times [-1/2,-\iota ]\) can be covered by a finite number of such domains, the support of the flow is represented as a \(C^{1,\zeta }\) graph over \(B^k_1\times [-1/2,-\iota ]\) with estimates in terms of \(\mu \) and \(\Vert u\Vert \). The resulting constant \(c_{4}\) depends only on \(E_1,\,\nu ,\,p,\,q,\,\iota \). This concludes the proof. \(\square \)

The constants in the claim of Proposition 5.1 deteriorate as \(\iota \) approaches 0, and we will use it with a fixed \(\iota \) depending only on \(E_1\), \(\nu \) and \(\zeta \) in Proposition 5.3. We next prove the main decay estimate under the parabolic dilation centered at the end-time, which will be iterated to obtain the desired \(C^{1,\zeta }\) estimate.

Proposition 5.2

Corresponding to \(E_1\in [1,\infty )\), \(\nu \in (0,1)\), p and q there exist \(\varepsilon _{8}\in (0,1)\), \(\theta \in (0,1/4)\) and \(c_{8}\in (1,\infty )\) with the following property. For \(W\in \textbf{G}(n,k)\), \(0<R<\infty \) and \(U=\textrm{C}(W,2R)\), suppose that \(\{V_t\}_{t\in [-R^2,0]}\) and \(\{u(\cdot ,t)\}_{t\in [-R^2,0]}\) satisfy (A1)–(A4). Suppose

$$\begin{aligned}{} & {} T\in \textbf{G}(n,k)\,\,\text{ satisfies }\,\,\Vert T-W\Vert <\varepsilon _{8}, \end{aligned}$$
(5.12)
$$\begin{aligned}{} & {} A\in \textbf{A}(n,k)\,\,\text{ is } \text{ parallel } \text{ to }\,\,T, \end{aligned}$$
(5.13)
$$\begin{aligned}{} & {} \mu :=\left( R^{-k-4}\int _{-R^2}^0\int _{\textrm{C}(W,2R)} \textrm{dist} \,(x,A)^2\,d\Vert V_t\Vert dt\right) ^{1/2}<\varepsilon _{8}, \end{aligned}$$
(5.14)
$$\begin{aligned}{} & {} \Vert u\Vert :=R^\zeta \Vert u\Vert _{L^{p,q}(\textrm{C}(W,2R)\times (-R^2,0))}<\infty , \end{aligned}$$
(5.15)
$$\begin{aligned}{} & {} (\textrm{C}(W,\nu R)\times \{0\})\cap \textrm{spt}\,(\Vert V_t\Vert \times dt)\ne \emptyset , \end{aligned}$$
(5.16)
$$\begin{aligned}{} & {} R^{-k}\Vert V_{-4R^2/5}\Vert (\phi _{W,R}^2)\le (2-\nu )\textbf{c}. \end{aligned}$$
(5.17)

Then there are \({\tilde{T}}\in \textbf{G}(n,k)\) and \({\tilde{A}}\in \textbf{A}(n,k)\) such that

$$\begin{aligned}{} & {} {\tilde{A}} \text{ is } \text{ parallel } \text{ to } {\tilde{T}}, \end{aligned}$$
(5.18)
$$\begin{aligned}{} & {} \Vert T-{\tilde{T}}\Vert \le c_{8}\mu , \end{aligned}$$
(5.19)
$$\begin{aligned}{} & {} \left( (\theta R)^{-(k+4)}\int _{-(\theta R)^2}^0 \int _{\textrm{C}(W,2\theta R)} \textrm{dist}\,(x,{\tilde{A}})^2\,d\Vert V_t\Vert dt\right) ^{1/2} \le \theta ^{\zeta }\max \{\mu ,c_{8}\Vert u\Vert \}.\nonumber \\ \end{aligned}$$
(5.20)

Moreover, if \(\Vert u\Vert <\varepsilon _{8}\), we have

$$\begin{aligned} (\theta R)^{-k}\Vert V_{-4(\theta R)^2/5}\Vert (\phi ^2_{W,\theta R})\le (2-\nu )\textbf{c}. \end{aligned}$$
(5.21)

Proof

We may assume that \(R=1\) after a parabolic change of variables. The outline of the proof is similar to [11, Proposition 8.1], with the crucial difference that we work with (5.16) and that the result is for a domain centered at the end-time point \((x,t)=(0,0)\). We describe the points at which the proof differs from that of [11, Proposition 8.1]. The proof proceeds by contradiction. We will fix \(\theta \in (0,1/4)\) later depending only on \(E_1\) and \(\zeta \). If the claim were false, then, for each \(m\in {\mathbb {N}}\) there exist \(\{V_t^{(m)}\}_{t\in [-1,0]}\), \(\{u^{(m)}(\cdot ,t)\}_{t\in [-1,0]}\) satisfying (A1)–(A4) on \(\textrm{C}(W^{(m)},2)\times [-1,0]\) for \(W^{(m)}\in \textbf{G}(n,k)\) such that, by assuming \(T={\mathbb {R}}^k\times \{0\}\) after suitable rotation,

$$\begin{aligned}{} & {} \Vert T-W^{(m)}\Vert \le 1/m, \end{aligned}$$
(5.22)
$$\begin{aligned}{} & {} \mu ^{(m)}:=\left( \int _{-1}^0\int _{\textrm{C}(W^{(m)},2)} |T^{\perp }(x)|^2\,d\Vert V_t^{(m)}\Vert dt\right) ^{1/2}\le 1/m, \end{aligned}$$
(5.23)

(5.16) and (5.17), but for any \({\tilde{T}}\in \textbf{G}(n,k)\) with \(\Vert T-{\tilde{T}}\Vert \le m\mu ^{(m)}\) and \({\tilde{A}}\in \textbf{A}(n,k)\) which is parallel to \({\tilde{T}}\), we have

$$\begin{aligned} \left( \theta ^{-(k+4)}\int _{-\theta ^2}^0\int _{\textrm{C}(W^{(m)},2\theta )} \textrm{dist}\,(x,{\tilde{A}})^2\,d\Vert V_t^{(m)}\Vert dt\right) ^{1/2}>\theta ^\zeta \max \{\mu ^{(m)},m\Vert u^{(m)}\Vert \}.\nonumber \\ \end{aligned}$$
(5.24)

By taking \({\tilde{A}}={\tilde{T}}=T\) in (5.24), we obtain

$$\begin{aligned} \theta ^\zeta \Vert u^{(m)}\Vert <\theta ^{-(k+4)/2}m^{-1} \mu ^{(m)}, \end{aligned}$$

which shows in particular that

$$\begin{aligned} \lim _{m\rightarrow \infty } (\mu ^{(m)})^{-1} \Vert u^{(m)}\Vert =0. \end{aligned}$$
(5.25)
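For the reader's convenience, here is a sketch of the computation behind the last two displays; it only uses (5.23), (5.24) and the inclusion \(\textrm{C}(W^{(m)},2\theta )\times (-\theta ^2,0)\subset \textrm{C}(W^{(m)},2)\times (-1,0)\). The choice \({\tilde{A}}={\tilde{T}}=T\) is admissible since \(\Vert T-T\Vert =0\le m\mu ^{(m)}\), and \(\textrm{dist}\,(x,T)=|T^{\perp }(x)|\), so the left-hand side of (5.24) is at most \(\theta ^{-(k+4)/2}\mu ^{(m)}\). Hence

$$\begin{aligned} \theta ^\zeta m\Vert u^{(m)}\Vert \le \theta ^\zeta \max \{\mu ^{(m)},m\Vert u^{(m)}\Vert \}<\theta ^{-(k+4)/2}\mu ^{(m)}, \qquad \text{ so } \text{ that } \qquad (\mu ^{(m)})^{-1}\Vert u^{(m)}\Vert <\theta ^{-\zeta -(k+4)/2}\,m^{-1}. \end{aligned}$$

The strict inequality also shows \(\mu ^{(m)}>0\), so the quotient is well defined, and the right-hand side tends to 0 because \(\theta \) does not depend on m.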

By (5.25), (5.4) and (5.7), we have

$$\begin{aligned} \limsup _{m\rightarrow \infty } \sup \left\{ \frac{|T^{\perp }(x)|}{\mu ^{(m)}}:\, x\in \textrm{spt}\Vert V_t^{(m)}\Vert \cap \textrm{C}(T,1)\right\} \le \sqrt{c_{6}} \end{aligned}$$
(5.26)

for all \(t\in [-4/5,0]\), where \(\sqrt{c_{6}}=c(n,k)\). The same argument used to prove (5.9) combined with (5.17) shows

$$\begin{aligned} \limsup _{m\rightarrow \infty }\Vert V_{-7/10}^{(m)}\Vert (\phi _T^2) \le \textbf{c}. \end{aligned}$$
(5.27)

Using (5.16) and an argument similar to the one leading to (5.11), one can prove that

$$\begin{aligned} \liminf _{m\rightarrow \infty } \Vert V_{-\theta ^6/2}^{(m)}\Vert (\phi _T^2)\ge \textbf{c}. \end{aligned}$$
(5.28)

Then, with (5.26)–(5.28), for all sufficiently large m, we may apply Theorem 3.1 with \(\tau =\theta ^6/2\). Thus there exists a constant \(c_{9}=c_{9}(\theta ,\nu ,p,q,E_1)\) such that

$$\begin{aligned} \limsup _{m\rightarrow \infty } \left( \sup _{t\in [-3/5-\theta ^6,-\theta ^6]} (\mu ^{(m)})^{-2}\big |\Vert V_t^{(m)}\Vert (\phi _T^2)-\textbf{c}\big |\right) \le c_{9}. \end{aligned}$$
(5.29)

We now apply Proposition 4.1 with the time interval shifted from \([-3/5,0]\) to \([-3/5-\theta ^6,-\theta ^6]\). For all sufficiently large m, note that (4.2)–(4.4) are all satisfied due to (5.29), (5.26) and (5.25). The smallness condition of (4.1) can be proved by (A4) and (5.29) as it was done for (3.18). Thus we have Lipschitz functions \(f^{(m)}\) and \(F^{(m)}\) defined on \(B^k_{1/3}\times [-1/2-\theta ^6,-\theta ^6]\) with quantities (4.5) and (4.6) defined in terms of \(V^{(m)}\) and where \(f^{(m)}\) and \(F^{(m)}\) satisfy (4.7)–(4.9). Once we achieve this, arguing exactly as in [11, p.45], one can prove that the right-hand side of (4.9) corresponding to \(V^{(m)}\) can be bounded by \(c(\mu ^{(m)})^2\) with c depending only on \(\theta ,\,\nu ,\,E_1,\, p,\,q\). We define the blowup sequence by

$$\begin{aligned} {{\tilde{f}}}^{(m)}:=f^{(m)}/\mu ^{(m)} \end{aligned}$$
(5.30)

for all sufficiently large m on \(B^k_{1/3}\times [-1/2-\theta ^6,-\theta ^6]\). Writing \(\Omega ':=B^k_{1/3}\times (-1/2-\theta ^6,-\theta ^6]\), the verbatim proof for [11, Lemma 8.3, 8.4] gives the existence of a subsequence \(\{{\tilde{f}}^{(m_j)}\}\) and \({\tilde{f}}\in C^{\infty }(\Omega ')\) such that

$$\begin{aligned} \lim _{j\rightarrow \infty }\Vert {\tilde{f}}^{(m_j)}-{\tilde{f}}\Vert _{L^2(\Omega ')}=0 \,\,\, \text{ and } \,\,\, \frac{\partial {\tilde{f}}}{\partial t}-\Delta {\tilde{f}}=0\,\, \text{ on } \Omega '. \end{aligned}$$
(5.31)

At this point, it is important to note that (5.26) gives

$$\begin{aligned} \Vert {\tilde{f}}\Vert _{L^\infty (\Omega ')}\le \sqrt{c_{6}}, \end{aligned}$$
(5.32)

where \(c_{6}=c(n,k)\). We then define \(T^{(m)}\in \textbf{G}(n,k)\) as the graph of the map

$$\begin{aligned} x \in {\mathbb {R}}^k \mapsto \mu ^{(m)}\nabla {\tilde{f}}(0,-\theta ^6) \cdot x \in {\mathbb {R}}^{n-k}, \end{aligned}$$

which is the tangent space to the graph \(\{(x,\mu ^{(m)}{\tilde{f}}(x,-\theta ^6)) \, :\, x \in B^k_{1/3}\}\) at \(x=0\), and also define the affine plane \(A^{(m)}\in \textbf{A}(n,k)\) by \(A^{(m)}=T^{(m)}+(0,\mu ^{(m)}{\tilde{f}}(0,-\theta ^6))\). By the standard estimates for parabolic PDE, all the partial derivatives of \({\tilde{f}}\) on \(B^k_{2\theta }\times [-\theta ^2,-\theta ^6]\) are bounded in terms of constant multiple of \(\sqrt{c_{6}}\). In particular, there exists a constant \(c_{10}=c(n,k)\) such that

$$\begin{aligned} \int _{B_{2\theta }\times [-\theta ^2,-\theta ^6]} |{\tilde{f}}(x,t)-{\tilde{f}}(0,-\theta ^6)-\nabla {\tilde{f}} (0,-\theta ^6)\cdot x|^2\,d{\mathcal {H}}^k\le c_{10}\theta ^{k+6}.\nonumber \\ \end{aligned}$$
(5.33)
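A minimal sketch of how an estimate of this form can be obtained, assuming that C denotes a bound for \(|\partial _t{\tilde{f}}|\) and \(|\nabla ^2{\tilde{f}}|\) on \(B^k_{2\theta }\times [-\theta ^2,-\theta ^6]\) (the symbol C is introduced here only for illustration, and \(\omega _k\) denotes the volume of the unit ball in \({\mathbb {R}}^k\)): for \((x,t)\) in this set, splitting the increment into a time part and a space part and using Taylor's theorem gives

$$\begin{aligned} \begin{aligned} |{\tilde{f}}(x,t)-{\tilde{f}}(0,-\theta ^6)-\nabla {\tilde{f}}(0,-\theta ^6)\cdot x|&\le |{\tilde{f}}(x,t)-{\tilde{f}}(x,-\theta ^6)|+|{\tilde{f}}(x,-\theta ^6)-{\tilde{f}}(0,-\theta ^6)-\nabla {\tilde{f}}(0,-\theta ^6)\cdot x| \\&\le C\theta ^2+\frac{C}{2}(2\theta )^2=3C\theta ^2. \end{aligned} \end{aligned}$$

Squaring and integrating over a set of space-time measure at most \(\omega _k(2\theta )^k\theta ^2\) yields the bound \(9\cdot 2^k\omega _kC^2\,\theta ^{k+6}\), which has the form stated in (5.33).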

Following the verbatim proof in [11], this leads to

$$\begin{aligned} \begin{aligned}&\limsup _{m\rightarrow \infty } (\mu ^{(m)})^{-1}\Vert T-T^{(m)}\Vert \le c_{10}, \\&\quad \limsup _{m\rightarrow \infty } (\mu ^{(m)})^{-2} \int _{\textrm{C}(T,2\theta )\times (-\theta ^2,-\theta ^6)} \textrm{dist}\,(x,A^{(m)})^2\,d\Vert V_t^{(m)}\Vert dt\le c_{10} \theta ^{k+6}. \end{aligned}\nonumber \\ \end{aligned}$$
(5.34)

Thus, for all large m, we have

$$\begin{aligned} \theta ^{-(k+4)} \int _{\textrm{C}(T,2\theta )\times (-\theta ^2,-\theta ^6)} \textrm{dist}\,(x,A^{(m)})^2\,d\Vert V_t^{(m)}\Vert dt\le c_{10} \theta ^{2} (\mu ^{(m)})^{2}. \end{aligned}$$
(5.35)

On the integral over the time interval \((-\theta ^6,0)\), since \(\textrm{dist}\,(x,A^{(m)})\le c(c_{10})\mu ^{(m)}\) on the support of \(\Vert V_t^{(m)}\Vert \), combined with (A2), we have

$$\begin{aligned} \theta ^{-(k+4)}\int _{\textrm{C}(T,2\theta )\times (-\theta ^6,0)} \textrm{dist}\,(x,A^{(m)})^2\,d\Vert V_t^{(m)}\Vert dt\le c_{11} \theta ^{2} (\mu ^{(m)})^{2} \end{aligned}$$
(5.36)

where \(c_{11}\) depends only on \(c_{10}\) and \(E_1\). Then (5.35) and (5.36) show

$$\begin{aligned} \theta ^{-(k+4)} \int _{\textrm{C}(T,2\theta )\times (-\theta ^2,0)} \textrm{dist}\,(x,A^{(m)})^2\,d\Vert V_t^{(m)}\Vert dt\le ( c_{10}+ c_{11}) \theta ^{2} (\mu ^{(m)})^{2}.\nonumber \\ \end{aligned}$$
(5.37)
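The following sketch indicates where the factor \(\theta ^2\) in (5.36) comes from. It assumes that (A2) provides a mass bound of the form \(\Vert V_t\Vert (B_r(x))\le \omega _kr^kE_1\), that \(\textrm{spt}\,\Vert V_t^{(m)}\Vert \cap \textrm{C}(T,2\theta )\subset B_{3\theta }\) for large m (as suggested by (5.26)), and it writes \(c'\) for the constant in the bound \(\textrm{dist}\,(x,A^{(m)})\le c'\mu ^{(m)}\); the symbol \(c'\) is introduced here only for illustration. Under these assumptions,

$$\begin{aligned} \theta ^{-(k+4)}\int _{-\theta ^6}^0\int _{\textrm{C}(T,2\theta )} \textrm{dist}\,(x,A^{(m)})^2\,d\Vert V_t^{(m)}\Vert dt\le \theta ^{-(k+4)}\cdot \theta ^6\cdot (c'\mu ^{(m)})^2\cdot \omega _k(3\theta )^kE_1=3^k\omega _kE_1(c')^2\,\theta ^2(\mu ^{(m)})^2. \end{aligned}$$

In particular, the length \(\theta ^6\) of the terminal time interval is what compensates the normalization \(\theta ^{-(k+4)}\) and leaves the same power \(\theta ^2\) as in (5.35).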

Now, choosing \(\theta \) small enough depending only on \(n,\,k,\,E_1,\zeta \), we may assume that \((c_{10}+ c_{11})\theta ^2 <\theta ^{2\zeta }/2\). Since T can be replaced by \(W^{(m)}\) for the limit (see [11]) in (5.37), we have a contradiction to (5.24). This completes the proof of claims (5.18)–(5.20). For (5.21), since \(\theta \) is fixed, we may argue as for the proof of (5.9) and restrict \(\varepsilon _{8}\) to make sure that (5.21) holds. This completes the proof. \(\square \)
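To make the choice of \(\theta \) in the proof above concrete, here is a short computation, assuming \(\zeta \in (0,1)\) and \(\mu ^{(m)}>0\). The requirement on \(\theta \) can be rewritten as

$$\begin{aligned} (c_{10}+c_{11})\theta ^2<\frac{\theta ^{2\zeta }}{2}\quad \Longleftrightarrow \quad \theta ^{2(1-\zeta )}<\frac{1}{2(c_{10}+c_{11})}\quad \Longleftrightarrow \quad \theta <\big (2(c_{10}+c_{11})\big )^{-\frac{1}{2(1-\zeta )}}, \end{aligned}$$

so any \(\theta \in (0,1/4)\) below this threshold is admissible. With such a \(\theta \), taking square roots in (5.37) gives

$$\begin{aligned} \left( \theta ^{-(k+4)}\int _{\textrm{C}(T,2\theta )\times (-\theta ^2,0)} \textrm{dist}\,(x,A^{(m)})^2\,d\Vert V_t^{(m)}\Vert dt\right) ^{1/2}<\frac{\theta ^\zeta \mu ^{(m)}}{\sqrt{2}}<\theta ^\zeta \max \{\mu ^{(m)},m\Vert u^{(m)}\Vert \}, \end{aligned}$$

which is the opposite of the inequality (5.24) once T is replaced by \(W^{(m)}\) in the domain of integration.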

It is possible to apply Proposition 5.2 iteratively; in combination with Proposition 5.1, we have then the following.

Proposition 5.3

Corresponding to \(E_1\in [1,\infty )\), \(\nu \in (0,1)\), p and q, there exist \(\varepsilon _{9}\in (0,1)\) and \(c_{12} \in (1,\infty )\) with the following property. For \(T\in \textbf{G}(n,k)\), \(R\in (0,\infty )\) and \(U=\textrm{C}(T,2R)\), suppose that \(\{V_t\}_{t\in [-R^2,0]}\) and \(\{u(\cdot ,t)\}_{t\in [-R^2,0]}\) satisfy (A1)–(A4). Suppose

$$\begin{aligned}{} & {} \mu :=\left( R^{-k-4}\int _{-R^2}^0 \int _{\textrm{C}(T,2R)}|T^\perp (x)|^2\,d\Vert V_t\Vert dt\right) ^{1/2}<\varepsilon _{9}, \end{aligned}$$
(5.38)
$$\begin{aligned}{} & {} \Vert u\Vert :=R^\zeta \Vert u\Vert _{L^{p,q}(\textrm{C}(T,2R)\times (-R^2,0))}<\varepsilon _{9}, \end{aligned}$$
(5.39)
$$\begin{aligned}{} & {} (T^{-1}(0)\times \{0\})\cap \textrm{spt}(\Vert V_t\Vert \times dt)\ne \emptyset , \end{aligned}$$
(5.40)
$$\begin{aligned}{} & {} R^{-k}\Vert V_{-4R^2/5}\Vert (\phi _{T,R}^2)\le (2-\nu )\textbf{c}. \end{aligned}$$
(5.41)

Identifying T as \({\mathbb {R}}^k\cong {\mathbb {R}}^k\times \{0\}\subset {\mathbb {R}}^n\), let \({\tilde{D}}:=\{(x,t)\in {\mathbb {R}}^k\times [-R^2/2,0)\,:\, |x|^2< |t|\}\). Then there exist \(f\,:\,{\tilde{D}} \rightarrow T^{\perp }\) and \(F\,:\,{\tilde{D}}\rightarrow {\mathbb {R}}^n\) such that \(F(x,t)=(x,f(x,t))\) for \((x,t)\in {\tilde{D}}\) and

  1. (1)

    \(\textrm{spt}\Vert V_t\Vert \cap \textrm{C}(T,\sqrt{|t|})=\textrm{Image}\, F(\cdot ,t)\,\,\text{ for } \text{ all } \, t\in [-R^2/2,0)\),

  2. (2)

    \(R^{-1}\Vert f\Vert _0+\Vert \nabla f\Vert _0+R^\zeta [ f]_{1+\zeta }\le c_{4}c_{12}\max \{\mu ,c_{8}\Vert u\Vert \}\).

Proof

We may set \(R=1\) without loss of generality. With \(E_1\), \(\nu \), p and q given, we use Proposition 5.2 to obtain \(\varepsilon _{8}\), \(\theta \) and \(c_{8}\). Setting \(\iota =\theta ^2/2\), we use Proposition 5.1 to obtain \(\varepsilon _{6}\) and \(c_{4}\). We choose \(\varepsilon _{9}\) so that

$$\begin{aligned}{} & {} \varepsilon _{9} \le \min \{\varepsilon _{6}, \varepsilon _{8}\}, \end{aligned}$$
(5.42)
$$\begin{aligned}{} & {} c_{8}\varepsilon _{9}<\varepsilon _{8}, \end{aligned}$$
(5.43)
$$\begin{aligned}{} & {} (c_{8})^2(1-\theta ^\zeta )^{-1}\varepsilon _{9}<\varepsilon _{8}. \end{aligned}$$
(5.44)

We first use Proposition 5.2 with \(W=A=T\), and note that (5.12)–(5.17) are satisfied due to (5.38)–(5.41) and (5.42). Thus there exist \(T_1\in \textbf{G}(n,k)\) and \(A_1\in \textbf{A}(n,k)\) such that (5.18)–(5.20) are satisfied with \(R=1\), \(W=T\), \({\tilde{A}}=A_1\) and \({\tilde{T}}=T_1\). Similarly, we may use Proposition 5.1 since (2.13)–(2.16) are satisfied with \(R=1\) and \(\varepsilon _{6}\), so that we have \(f_1\) and \(F_1\) defined on \(B_1^k\times [-1/2,-\theta ^2/2]\) satisfying (5.1) and (5.2). We next claim that Proposition 5.2 can be inductively used for \(R=\theta ^j\), \(j\in {\mathbb {N}}\), where we obtain \(T_j\in \textbf{G}(n,k)\) and \(A_j\in \textbf{A}(n,k)\) satisfying

$$\begin{aligned} \Vert T_j-T_{j-1}\Vert \le c_{8} \theta ^{(j-1)\zeta }\max \{\mu ,c_{8}\Vert u\Vert \}, \end{aligned}$$
(5.45)

where \(T_0:=T\), and writing \(\mu _j\) as \(\mu \) in (5.14) corresponding to \(A_j\) and \(R=\theta ^j\),

$$\begin{aligned} \mu _j\le \theta ^{j\zeta }\max \{\mu ,c_{8}\Vert u\Vert \}. \end{aligned}$$
(5.46)

The case \(j=1\) follows from Proposition 5.2. Assume that it is true until \(j\ge 1\). Then we check that (5.12)–(5.17) are true for \(W=T\), \(T=T_j\), \(A=A_j\) and \(R=\theta ^j\). We have

$$\begin{aligned} \Vert T_j-T\Vert\le & {} \sum _{l=1}^j\Vert T_{l}-T_{l-1}\Vert \le c_{8}\sum _{l=1}^j \theta ^{(l-1)\zeta }\max \{\mu ,c_{8}\Vert u\Vert \} \nonumber \\\le & {} (c_{8})^2(1-\theta ^\zeta )^{-1}\varepsilon _{9}< \varepsilon _{8} \end{aligned}$$
(5.47)

where we used (5.45), (5.38), (5.39) and (5.44). Thus (5.12) is satisfied. Since \(A_j\) and \(T_j\) are parallel, (5.13) is fine. By (5.42), (5.43) and (5.46), we have \(\mu _j<\varepsilon _{8}\), so that (5.14) is satisfied. The condition (5.16) follows from (5.40), and (5.42), (5.39) and (5.21) give (5.17) for j. Thus, we may apply Proposition 5.2 with \(R=\theta ^j\), and obtain \(T_{j+1}\) and \(A_{j+1}\) which are parallel and

$$\begin{aligned} \Vert T_{j+1}-T_j\Vert \le c_{8}\mu _j\le c_{8}\theta ^{j\zeta }\max \{\mu ,c_{8}\Vert u\Vert \}, \end{aligned}$$
(5.48)

where we used (5.46), and

$$\begin{aligned} \mu _{j+1}\le \theta ^\zeta \max \{\mu _j,\theta ^{j\zeta }c_{8}\Vert u\Vert \}\le \theta ^{(j+1)\zeta }\max \{\mu ,c_{8}\Vert u\Vert \} \end{aligned}$$
(5.49)

by (5.20) and (5.46). This closes the inductive step and proves (5.45) and (5.46) for all j. We next prove that we can apply Proposition 5.1 on each domain \(\textrm{C}(T_j,2\theta ^j)\times [-\theta ^{2j},-\theta ^{2(j+1)}/2]\) for all \(j\ge 1\). Note that for each \(j\ge 0\), by the same argument leading up to (5.7), we have

$$\begin{aligned} \begin{aligned} \textrm{spt}\Vert V_t\Vert \cap \textrm{C}(T,3\theta ^j/2)&\subset \{x:\, \theta ^{-2j}\textrm{dist}(x,A_j)^2\le c_{6} \mu _j^2+3 c_{5}\theta ^{2j\zeta }\Vert u\Vert ^2 E_1^{1-2/p}\} \\&\subset \{x:\,\textrm{dist}(x,A_j)\le \theta ^{j(1+\zeta )}c_{13} \varepsilon _{9}\} \,\,\, (c_{13}=c_{13}(n,k,E_1)) \end{aligned}\nonumber \\ \end{aligned}$$
(5.50)

for all \(t\in [-4\theta ^{2j}/5,0)\). To apply Proposition 5.1, we need to have T there replaced by \(A_j\), so we need to tilt the plane whose tilt is estimated by (5.47). For this reason, we may actually need to use a slightly smaller cylinder than \(\textrm{C}(T_j,2\theta ^j)\) so that the smallness of the corresponding \(\mu _j\) (with respect to the distance function to \(A_j\)) can be assured from (5.46). Inductively, we know that the support of \(\Vert V_t\Vert \) in \(\textrm{C}(T_{j-1},\theta ^{j-1})\times [-\theta ^{2(j-1)}/2,-\theta ^{2j}/2]\) is a \(C^{1,\zeta }\) graph, so that the condition (2.13) is satisfied. Condition (2.14) follows from (5.40), and (2.15) follows from (5.46), (5.42) and (5.43). Thus we may apply Proposition 5.1 and obtain a graph representation \({\tilde{f}}_j\) over \(A_j\) with the \(C^{1,\zeta }\) estimate of the form \(c_{4}\theta ^{j\zeta } \max \{\mu ,c_{8}\Vert u\Vert \}\). Note that, by the implicit function theorem, one can equally represent the same set as a graph \(f_j\) over T. The norm \(\Vert \nabla f_j\Vert _0\) over \(B_{\theta ^j}^k\times [-\theta ^{2j}/2,-\theta ^{2(j+1)}/2]\) can differ by a constant multiple of \(\Vert T_j-T\Vert \), which is bounded as in (5.47). The Hölder semi-norm \([f]_{1+\zeta }\) has two terms, \([\nabla f]_\zeta \) and the \((1+\zeta )/2\)-Hölder semi-norm in time. The first is seen as the variation of the tangent space, and one can see that it changes by at most a constant factor (which is close to 1) under the small rotation. The estimate for the latter is obtained by applying [11, Proposition 6.4] with the gradient Hölder norm, and the small rotation has little effect. Hence we can obtain the desired \(C^{1,\zeta }\) estimate for \(f_j\) representing \(\textrm{spt}\Vert V_t\Vert \) over the domain \(B_{\theta ^j}^k\times [-\theta ^{2j}/2,-\theta ^{2(j+1)}/2]\), by \(2c_{4}\theta ^{j\zeta }\max \{\mu ,c_{8}\Vert u\Vert \}\). We next observe that

$$\begin{aligned} {\tilde{D}}=\{(x,t)\in {\mathbb {R}}^k\times [-1/2,0):\,|x|^2<|t|\}\subset \cup _{j=0}^\infty B_{\theta ^j}^k\times [-\theta ^{2j}/2, -\theta ^{2(j+1)}/2],\nonumber \\ \end{aligned}$$
(5.51)

so that we have a representation of \(\textrm{spt}\Vert V_t\Vert \) as the graph of a single function f over \({\tilde{D}}\). The estimate \(\Vert f\Vert _0+\Vert \nabla f\Vert _0 \le 2 c_{4} \max \{\mu , c_{8} \Vert u\Vert \}\) is immediate. For the Hölder semi-norm \([f]_{1+\zeta }\), we proceed as follows. Let \((y_1,s_1)\), \((y_2,s_2)\) be points in \({\tilde{D}}\) with \((y_1,s_1) \ne (y_2,s_2)\), assume without loss of generality that \(s_1 \le s_2\), and let \(h,l \ge 0\) be such that \((y_1,s_1) \in B^k_{\theta ^h} \times [ -\theta ^{2h}/2,-\theta ^{2(h+1)}/2]\) and \((y_2,s_2) \in B^k_{\theta ^{h+l}} \times [ -\theta ^{2(h+l)}/2,-\theta ^{2(h+l+1)}/2]\). By the triangle inequality, we estimate

$$\begin{aligned} \begin{aligned} \left| \nabla f(y_1,s_1) - \nabla f(y_2,s_2) \right|&\le 2 c_{4} \max \{\mu ,c_{8}\Vert u\Vert \} \left( |y_1-y_2|^\zeta + \frac{1}{2} \sum _{j=h}^{h+l} (\theta ^{2j})^{\zeta /2} \right) \\&\le c_{4} c_{12} \max \{\mu ,c_{8}\Vert u\Vert \} \theta ^{h\zeta } \\&\le c_{4} c_{12} \max \{\mu ,c_{8}\Vert u\Vert \} |s_1-s_2|^{\zeta /2}, \end{aligned} \end{aligned}$$

where \(c_{12}=c_{12}(k,p,q)\). The estimate for the second summand in \([f]_{1+\zeta }\) is analogous. The proof is now complete. \(\square \)
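The geometric sums used in the proof above can be spelled out as follows; this is only a sketch recording the elementary bounds behind (5.47) and the Hölder estimate, and it uses \(c_{8}>1\) together with (5.38) and (5.39). Since \(\theta ^\zeta \in (0,1)\),

$$\begin{aligned} \sum _{l=1}^j\theta ^{(l-1)\zeta }=\frac{1-\theta ^{j\zeta }}{1-\theta ^\zeta }\le \frac{1}{1-\theta ^\zeta },\qquad \max \{\mu ,c_{8}\Vert u\Vert \}\le \max \{\varepsilon _{9},c_{8}\varepsilon _{9}\}=c_{8}\varepsilon _{9}, \end{aligned}$$

and multiplying the two bounds by \(c_{8}\) gives the quantity \((c_{8})^2(1-\theta ^\zeta )^{-1}\varepsilon _{9}\) appearing in (5.44) and (5.47). In the same way,

$$\begin{aligned} \sum _{j=h}^{h+l}(\theta ^{2j})^{\zeta /2}=\sum _{j=h}^{h+l}\theta ^{j\zeta }\le \frac{\theta ^{h\zeta }}{1-\theta ^\zeta }, \end{aligned}$$

which accounts for the factor \(\theta ^{h\zeta }\) in the Hölder estimate.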

6 Proof of the main results

We are now ready to prove Theorems 2.2 and 2.3.

Proof of Theorem 2.2

By scaling, we may assume \(R=1\). Given \(\nu \in \left( 0,1\right) \), \(E_1 \in \left[ 1, \infty \right) \), p and q, let \(\varepsilon _{9}\), \(c_{4}\), \(c_{12}\) and \(c_{8}\) be as in Proposition 5.3. Let now \(\varepsilon _{2} \in \left( 0,1\right) \) and \(c_1 \in \left( 1, \infty \right) \) be such that the following conditions are satisfied:

$$\begin{aligned} \varepsilon _{2} \le \frac{\varepsilon _{9}}{2^{k+4}}\,, \qquad c_1 \ge 4\,\max \{2^{k+4}{c_{4}}c_{12},c_{4}c_{12}c_{8}\}\,. \end{aligned}$$
(6.1)

For \(T \in \textbf{G}(n,k)\), and \(U = \textrm{C}(T,2)\), suppose that \(\{V_t\}_{t \in \left[ -1,0\right] }\) and \(\{u(\cdot ,t)\}_{t \in \left[ -1,0\right] }\) satisfy (A1)–(A4) as well as (2.13)–(2.16). We identify, as usual, T with \(\mathbb {R}^k \cong \mathbb {R}^k \times \{0\} \subset \mathbb {R}^n\), and we claim the following: for every \(j \ge 1\), setting

$$\begin{aligned}&\sigma _j := \sum _{i=1}^j \frac{1}{i}\,, \qquad \tau _1 := \frac{1}{2} \,, \qquad \tau _{j+1} := \frac{1}{4\sigma _j} \,, \end{aligned}$$
(6.2)
$$\begin{aligned}&D_j := \left\{ (x,t) \in \mathbb {R}^k \times \left[ -\tau _j,0\right) \, :\, |x|^2 < \sigma _j |t| \right\} \,, \end{aligned}$$
(6.3)

there exist \(f_j :D_j \rightarrow T^\perp \) and \(F_j :D_j \rightarrow \mathbb {R}^n\) such that \(F_j(x,t)=(x,f_j(x,t))\), and

  1. (1)

    \(\textrm{spt}\Vert V_t\Vert \cap \textrm{C}(T,\sqrt{\sigma _j|t|}) = \textrm{Image}\, F_j(\cdot ,t) \text{ for } \text{ all } t \in \left[ -\tau _j,0\right) \),

  2. (2)

    \(\Vert f_j\Vert _0 + \Vert \nabla f_j\Vert _0 \le c_1 \max \{\mu ,\Vert u\Vert _{p,q}\}\).

Assume the claim for the moment. It is then an immediate consequence of (6.2) that

$$\begin{aligned} \sqrt{\sigma _j |t|} \ge 1/2 \text{ for } \text{ all } t \in \left[ -\tau _j, -\tau _{j+1} \right) , \end{aligned}$$

which implies that

$$\begin{aligned} B^k_{\frac{1}{2}} \times \left[ -\tau _j, -\tau _{j+1} \right) \subset D_j. \end{aligned}$$
(6.4)
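To illustrate the definitions in (6.2), the first few values can be computed directly; the following display is only a worked example:

$$\begin{aligned} \sigma _1=1,\quad \sigma _2=\frac{3}{2},\quad \sigma _3=\frac{11}{6},\qquad \tau _1=\frac{1}{2},\quad \tau _2=\frac{1}{4\sigma _1}=\frac{1}{4},\quad \tau _3=\frac{1}{4\sigma _2}=\frac{1}{6},\quad \tau _4=\frac{1}{4\sigma _3}=\frac{3}{22}. \end{aligned}$$

Moreover, for \(t\in \left[ -\tau _j,-\tau _{j+1}\right) \) one has \(|t|>\tau _{j+1}\), hence

$$\begin{aligned} \sigma _j|t|>\sigma _j\tau _{j+1}=\frac{\sigma _j}{4\sigma _j}=\frac{1}{4}, \end{aligned}$$

which is the inequality \(\sqrt{\sigma _j|t|}\ge 1/2\) used to obtain (6.4).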

Since \(\lim _{j \rightarrow \infty } \tau _j = 0\), (6.4) and (1)–(2) imply that one can define a function \(f :B^k_{\frac{1}{2}} \times \left[ -\frac{1}{4},0\right) \rightarrow T^\perp \) such that, setting \(F(x,t)=(x,f(x,t))\) for \((x,t) \in B^k_{\frac{1}{2}} \times \left[ -\frac{1}{4},0\right) \) one has

$$\begin{aligned}&\textrm{spt}\Vert V_t\Vert \cap \textrm{C}(T,1/2) = \textrm{image}\,F(\cdot , t) \text{ for } \text{ all } t\in \left[ -1/4,0\right) \,, \\&\quad \Vert f\Vert _0 + \Vert \nabla f \Vert _0 \le c_1 \max \{\mu ,\Vert u\Vert _{p,q}\}\,, \end{aligned}$$

that is, (2.17) and part of the estimate in (2.18). In what follows, we will first prove the claim; then, we will show that the resulting function f also satisfies \([f]_{1+\zeta } \le c_1 \max \{\mu ,\Vert u\Vert _{p,q}\}\).

The proof of the claim is by induction on \(j\ge 1\). The induction base, \(j=1\), is Proposition 5.3. We then assume that the claim is true for j, and prove it for \(j+1\). Fix any point \((x_0,t_0) \in \partial D_j\), and translate in space-time so as to consider the flow \(\{{\tilde{V}}_s\}_{s\in \left[ -1-t_0,0\right] }\), with \({\tilde{V}}_s:= (\tau _{x_0})_\sharp V_{s+t_0}\) where \(\tau _{x_0}(y):= y-x_0\). Set \({\tilde{R}}^2 = {\tilde{R}}_{t_0}^2:= \frac{1}{4}+\frac{t_0}{4}\), and notice that \(\textrm{C}(T,x_0,2 {\tilde{R}}) \subset \textrm{C}(T,0,2)\). In particular, \(\{{\tilde{V}}_s\}\) satisfies (A1)–(A4) in \(U=\textrm{C}(T,2{\tilde{R}})\) corresponding to the forcing term \({\tilde{u}}(y,s) = {\tilde{u}}_{(x_0,t_0)}(y,s):= u(y+x_0,s+t_0)\). We next claim that (5.38)–(5.41) are satisfied. We clearly have

$$\begin{aligned} \mu _{(x_0,t_0)}^2:= {\tilde{R}}^{-k-4} \int _{-{\tilde{R}}^2}^0 \int _{\textrm{C}(T,2{\tilde{R}})} |T^\perp (y)|^2\,d\Vert {\tilde{V}}_s\Vert (y)\,ds \le {\tilde{R}}^{-k-4} \mu ^2 \le 4^{k+4}\mu ^2 < \varepsilon _{9}^2 \end{aligned}$$

by (2.15) and (6.1). Moreover, \((T^{-1}(0)\times \{0\}) \cap \textrm{spt}(\Vert {\tilde{V}}_s\Vert \times ds) = (T^{-1}(x_0)\times \{t_0\}) \cap \textrm{spt}(\Vert V_t\Vert \times dt) \ne \emptyset \), because for any sequence \((x_h,t_0) \in D_j\) such that \(x_h \rightarrow x_0\) we have \((x_h,f_j(x_h,t_0)) \in T^{-1}(x_h) \cap \textrm{spt}\Vert V_{t_0}\Vert \) by (1), and thus \((T^{-1}(x_0) \times \{t_0\}) \cap \textrm{spt}(\Vert V_t\Vert \times dt)\) contains all subsequential limits of \((x_h, f_j (x_h,t_0),t_0)\). We also readily estimate

$$\begin{aligned} \Vert {\tilde{u}}\Vert _{L^{p,q}(\textrm{C}(T,2{\tilde{R}}) \times (-{\tilde{R}}^2,0))} \le \Vert u\Vert _{L^{p,q}(\textrm{C}(T,2) \times (-1,0))}, \end{aligned}$$

so that (2.16) implies (5.39). Finally, we have

$$\begin{aligned} {\tilde{R}}^{-k} \Vert {\tilde{V}}_{-4{\tilde{R}}^2/5}\Vert (\phi _{T,{\tilde{R}}}^2) = {\tilde{R}}^{-k} \Vert V_{-1/5+4t_0/5}\Vert (\phi ^2_{T,{\tilde{R}},x_0}) \le \textbf{c}+\delta , \end{aligned}$$

using the same argument leading to (5.9). We can then apply Proposition 5.3 and conclude after translating back the origin to \((x_0,t_0)\) that, setting

$$\begin{aligned} {\tilde{D}}^{(x_0,t_0)}:= \left\{ (x,t) \in \mathbb {R}^k \times \left[ t_0 - \frac{{\tilde{R}}_{t_0}^2}{2}, t_0 \right) \, :\, |x-x_0|^2 < |t-t_0| \right\} , \end{aligned}$$

there exist functions \(f^{(x_0,t_0)} :{\tilde{D}}^{(x_0,t_0)} \rightarrow T^\perp \) and \(F^{(x_0,t_0)} :{\tilde{D}}^{(x_0,t_0)} \rightarrow \mathbb {R}^n\) such that \(F^{(x_0,t_0)}(x,t)=(x,f^{(x_0,t_0)}(x,t))\) for all \((x,t) \in {\tilde{D}}^{(x_0,t_0)}\) and

\((1)_\star \):

\(\textrm{spt}\Vert V_t\Vert \cap \textrm{C}(T,x_0,\sqrt{|t-t_0|}) = \textrm{Image}\,F^{(x_0,t_0)}(\cdot ,t)\) for all \(t \in \left[ t_0 - \frac{{\tilde{R}}_{t_0}^2}{2}, t_0 \right) \),

\((2)_\star \):

\({\tilde{R}}_{t_0}^{-1} \Vert f^{(x_0,t_0)}\Vert _0 + \Vert \nabla f^{(x_0,t_0)}\Vert _0 + {\tilde{R}}_{t_0}^\zeta [f^{(x_0,t_0)}]_{1+\zeta } \le c_{4}c_{12} \max \{\mu _{(x_0,t_0)}, c_8 \Vert {\tilde{u}}_{(x_0,t_0)}\Vert \}\) .

In particular, there is a well posed extension of the functions \(f_j\) and \(F_{j}\) to the region

$$\begin{aligned} D_j \cup \bigcup _{(x_0,t_0) \in \partial D_j} {\tilde{D}}^{(x_0,t_0)}. \end{aligned}$$

We let \(f_{j+1}\) and \(F_{j+1}\) denote such extensions, and we proceed with the proof that conditions (1)–(2) hold true with \(j+1\) in place of j. To this aim, it is sufficient to show the following: for \(t \in \left[ -\tau _{j+1},0\right) \) and \(\sigma _j |t| \le |x|^2 < \sigma _{j+1} |t|\), there exists \((x_0,t_0) \in \partial D_j\) such that \((x,t) \in {\tilde{D}}^{(x_0,t_0)}\). Once this is established, indeed, one immediately gains that

$$\begin{aligned} D_{j+1} \subset D_j \cup \bigcup _{(x_0,t_0) \in \partial D_j} {\tilde{D}}^{(x_0,t_0)}, \end{aligned}$$
(6.5)

see Fig. 1, and (1) at step \(j+1\) follows immediately from (1) at step j and \((1)_\star \), while (2) at step \(j+1\) follows from (2) at step j and \((2)_\star \) thanks to (6.1).

Fig. 1 An illustration of the first two parabolic regions \(D_j\): the region \(D_2\) is a subset of the union of \(D_1\) with suitable parabolic regions \({\tilde{D}}^{(x_0,t_0)}\) having vertices at points \((x_0,t_0) \in \partial D_1\) (black dots in the graph). The region \(D_3\) will be a subset of the union of \(D_2\) with parabolic regions \({\tilde{D}}^{(x_0,t_0)}\) having vertices at points \((x_0,t_0) \in \partial D_2\). As j grows, the opening of the regions \(D_j\) increases, as it is defined by the parameter \(\sigma _j \uparrow \infty \). The union of the regions \(D_j\) contains the cylinder \(B_{1/2}^k \times \left[ -1/4,0\right) \), over which we can conclude graphical parametrization and corresponding estimates for the flow.

To prove the above claim, let then \((x,t) \in \mathbb {R}^{k} \times \left[ - \tau _{j+1},0\right) \) be such that \(\sigma _j |t| \le |x|^2 < \sigma _{j+1}|t|\), and set

$$\begin{aligned} t_0:= \frac{t}{\alpha }, \qquad x_0:= \sqrt{\frac{\sigma _j |t|}{\alpha }} \frac{x}{|x|} \end{aligned}$$
(6.6)

for some number \(\alpha =\alpha _j > 1\) to be determined. Notice that \((x_0,t_0) \in \partial D_j\) by construction. We then only need to prove that there exists \(\alpha > 1\) such that \((x,t) \in {\tilde{D}}^{(x_0,t_0)}\). By the definitions of \(t_0\) and \(x_0\), and since \(x_0\) is a positive multiple of x with \(|x_0| \le |x|\), it holds that

$$\begin{aligned} |x-x_0|&= |x| - \sqrt{\frac{\sigma _j |t|}{\alpha }} < \sqrt{|t|} \left( \sqrt{\sigma _{j+1}} - \sqrt{\frac{\sigma _j}{\alpha }} \right) \\ \sqrt{|t-t_0|}&= \sqrt{|t|} \sqrt{1-\frac{1}{\alpha }}\,, \end{aligned}$$

so that, recalling the definition of \(\sigma _j\), \((x,t) \in {\tilde{D}}^{(x_0,t_0)}\) provided \(\alpha > 1\) is chosen so that

$$\begin{aligned} \sqrt{\sigma _j + \frac{1}{j+1}} - \sqrt{\frac{\sigma _j}{\alpha }} \le \sqrt{1-\frac{1}{\alpha }}. \end{aligned}$$
(6.7)
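To spell out why (6.7) suffices, one can chain the two relations above. The step below uses the recursion \(\sigma _{j+1} = \sigma _j + \frac{1}{j+1}\), which is inferred here from the form of the left-hand side of (6.7), since the definition of \(\sigma _j\) is given earlier in the paper:

$$\begin{aligned} |x-x_0| < \sqrt{|t|} \left( \sqrt{\sigma _j + \frac{1}{j+1}} - \sqrt{\frac{\sigma _j}{\alpha }} \right) \le \sqrt{|t|} \sqrt{1-\frac{1}{\alpha }} = \sqrt{|t-t_0|}. \end{aligned}$$

Squaring gives \(|x-x_0|^2 < |t-t_0|\), which is the spatial constraint in the definition of \({\tilde{D}}^{(x_0,t_0)}\).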

We now show that (6.7) has a solution \(\alpha =\alpha _j>1\) for every j. Direct calculation shows that \(\alpha =2\) is a solution to (6.7) when \(j=1\) and \(j=2\). For general j, we observe that

$$\begin{aligned} \sqrt{\sigma _j + \frac{1}{j+1}} - \sqrt{\frac{\sigma _j}{\alpha }} = \frac{\sigma _j \left( 1-\frac{1}{\alpha }\right) + \frac{1}{j+1}}{\sqrt{\sigma _j + \frac{1}{j+1}} + \sqrt{\frac{\sigma _j}{\alpha }}} \le \frac{\sigma _j \left( 1-\frac{1}{\alpha }\right) + \frac{1}{j+1}}{\sqrt{\sigma _j}}, \end{aligned}$$

so that solutions to

$$\begin{aligned} \sigma _j \left( 1-\frac{1}{\alpha }\right) + \frac{1}{j+1} \le \sqrt{\sigma _j \left( 1-\frac{1}{\alpha }\right) } \end{aligned}$$
(6.8)

also solve (6.7). Changing variable

$$\begin{aligned} \xi := \sqrt{\sigma _j \left( 1-\frac{1}{\alpha }\right) }, \end{aligned}$$

(6.8) reduces to

$$\begin{aligned} \xi ^2 - \xi + \frac{1}{j+1} \le 0, \end{aligned}$$

which admits \(\xi = \frac{1}{2}\) as a solution for every \(j \ge 3\). Going back to the original variables, we have that the number \(\alpha =\alpha _j > 1\) such that \(\frac{1}{\alpha } = 1-\frac{1}{4\sigma _j}\) is a solution to (6.7) for \(j \ge 3\). This concludes the proof of (6.5).
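As a sanity check of the small cases, assume \(\sigma _1 = 1\). This is an assumption made only for illustration, consistent with the base-case region of Proposition 5.3 having the form \(|x|^2 < |t|\). Then \(\sigma _2 = \frac{3}{2}\) and \(\sigma _3 = \frac{11}{6}\) by the recursion \(\sigma _{j+1} = \sigma _j + \frac{1}{j+1}\). With \(\alpha =2\), (6.7) reads

$$\begin{aligned} j=1:&\quad \sqrt{\tfrac{3}{2}} - \sqrt{\tfrac{1}{2}} = \frac{\sqrt{3}-1}{\sqrt{2}} \le \frac{1}{\sqrt{2}}, \\ j=2:&\quad \sqrt{\tfrac{11}{6}} - \sqrt{\tfrac{3}{4}} \approx 0.488 \le \frac{1}{\sqrt{2}} \approx 0.707. \end{aligned}$$

These two cases have to be treated by hand because the quadratic \(\xi ^2 - \xi + \frac{1}{j+1}\) has discriminant \(1 - \frac{4}{j+1}\), which is negative for \(j < 3\), so that (6.8) has no solution there. At \(\xi = \frac{1}{2}\) the quadratic equals \(\frac{1}{j+1} - \frac{1}{4}\), which is nonpositive exactly when \(j \ge 3\). Under the same assumption on \(\sigma _1\), the first case covered by the general formula is \(j=3\), where \(\frac{1}{\alpha _3} = 1 - \frac{3}{22} = \frac{19}{22}\) and (6.7) becomes

$$\begin{aligned} \sqrt{\tfrac{25}{12}} - \sqrt{\tfrac{19}{12}} \approx 0.185 \le \sqrt{\tfrac{3}{22}} \approx 0.369. \end{aligned}$$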

We are only left with the proof of the estimate on the Hölder semi-norm \([f]_{1+\zeta }\). Given that \(\textrm{spt}\Vert V_t\Vert \cap \textrm{C}(T,1/2)\) is the graph of a function defined on \(B^k_{1/2}\) for all \(t \in [-1/4,0)\), we now know that for every \((x_0,t_0) \in B^k_{1/2} \times [-1/4,0)\) the flow \(\{{\tilde{V}}_s\}_{s \in [-1-t_0,0]}\) with \({\tilde{V}}_s = (\tau _{x_0})_\sharp V_{s+t_0}\) as above satisfies the assumptions of Proposition 5.3 with, say, \(R=3/4\). In particular, we have \(C^{1,\zeta }\) estimates for f, with bound \(c_{4} c_{12} \max \{\mu ,c_{8}\Vert u\Vert _{p,q}\}\), in the parabolic region \({\tilde{D}}^{(x_0,t_0)}=\{(x,t) \in \mathbb {R}^k \times [t_0 - 1/4,t_0) \, :\, |x-x_0|^2 < |t-t_0|\}\). To prove the desired Hölder estimate, let now \((y_1,s_1)\) and \((y_2,s_2)\) be points in \(B^k_{1/2} \times [-1/4,0)\) with \((y_1,s_1) \ne (y_2,s_2)\), and assume without loss of generality that \(s_1 \le s_2\). Consider the parabolic region \({\tilde{D}}^{(y_2,s_2)}\) with vertex at \((y_2,s_2)\). If \(|y_1-y_2|^2 < |s_1-s_2|\), then \((y_1,s_1) \in {\tilde{D}}^{(y_2,s_2)}\), and the estimate is a consequence of Proposition 5.3 and (6.1). Otherwise, if \(|y_1-y_2|^2 \ge |s_1-s_2| = s_2-s_1\), we use the triangle inequality to estimate

$$\begin{aligned} \begin{aligned} |\nabla f (y_1,s_1) - \nabla f (y_2,s_2)|&\le |\nabla f (y_2,s_2) - \nabla f (y_2,s_2-|y_1-y_2|^2)| \\&\qquad + |\nabla f (y_2,s_2-|y_1-y_2|^2) - \nabla f (y_1,s_2-|y_1-y_2|^2)| \\&\qquad + |\nabla f (y_1,s_2-|y_1-y_2|^2) - \nabla f(y_1,s_1)| \\&\le c_{4}c_{12} \max \{\mu ,c_{8}\Vert u\Vert _{p,q}\} \left( |y_1-y_2|^\zeta \right. \\&\qquad \left. + |y_1-y_2|^\zeta + 2^{\zeta /2} |y_1-y_2|^\zeta \right) , \end{aligned} \end{aligned}$$

which yields the estimate for \([\nabla f]_\zeta \) thanks to (6.1). The estimate for the second summand in \([f]_{1+\zeta }\) is analogous, and we omit it. The proof is complete. \(\square \)
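For the reader's convenience, the increments along the three legs of the triangle inequality above can be tabulated as follows. Here \(d:= |y_1-y_2|\) is shorthand introduced only for this remark, and we are in the case \(0 \le s_2 - s_1 \le d^2\):

$$\begin{aligned} (y_2,s_2) \rightarrow (y_2,s_2-d^2):&\quad |\Delta x| = 0, \quad |\Delta t| = d^2, \\ (y_2,s_2-d^2) \rightarrow (y_1,s_2-d^2):&\quad |\Delta x| = d, \quad |\Delta t| = 0, \\ (y_1,s_2-d^2) \rightarrow (y_1,s_1):&\quad |\Delta x| = 0, \quad |\Delta t| = d^2-(s_2-s_1) \le d^2. \end{aligned}$$

Assuming, as is standard but not restated here, that the Hölder semi-norm is measured with respect to a parabolic distance comparable to \(|\Delta x| + |\Delta t|^{1/2}\), each leg has parabolic length at most d, so each contributes a multiple of \(|y_1-y_2|^\zeta \). The factor \(2^{\zeta /2}\) on the third term is then a safe, not necessarily sharp, bound.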

Proof of Theorem 2.3

Here we briefly outline the proof of the \(C^{2,\alpha }\) regularity from [20] and point out the key estimates. The idea is to consider a graphical distance function from a solution g of the heat equation, denoted by \(Q_g\) ([20, Definition 4.1]), and to show a decay estimate for the \(L^2\)-norm of \(Q_g\) by a blow-up argument. The key ingredients are Lemma 4.2, which establishes a certain “sub-caloric” property of \(Q_g\), and the resulting \(L^\infty \) estimate of Proposition 4.3, both from [20]. Note that the latter is an estimate up to the end-time. Since this is the basis of the blow-up argument, once a \(C^{1,\zeta }\) graph representation up to the end-time is available, the rest of the argument in [20] works verbatim, with the obvious modification of centering the domains of integration at the end-time point rather than at the center of the space-time domain. The second-order Taylor expansion of the blow-up should likewise be taken at the end-time point. The end result is the estimate away from the parabolic boundary, as stated in the claim. \(\square \)