Open and closed random walks with fixed edgelengths in $ \newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^d$

Jason Cantarella; Kyle Chapman; Philipp Reiter; Clayton Shonkwiler

doi:10.1088/1751-8121/aade0a

1. Introduction

Random walks in space with fixed edgelengths have been of interest to statistical physicists and chemists since Lord Rayleigh's day. These walks model polymers in solution (at least under θ-solvent conditions) [9, 14, 20] and are similarly interesting in computational geometry and mathematics as a space of 'linkages' [3, 16]. While 2- and 3-dimensional walks are the most relevant to this case, high-dimensional random walks often shed light on the lower dimensional situation [21].

In this paper, we will consider the relationship between open and closed random walks of fixed edgelengths (in polymeric language, the shapes of linear and ring polymers). We will provide an explicit algorithm for finding the nearest closed polygon with given edgelengths to almost any collection of edge directions, and use our construction to provide tail bounds on the fraction of polygon space within a fixed distance of the closed polygons in any dimension. Our results will be strongest for equilateral polygons, but provide explicit bounds for any collection of edgelengths.

To establish notation, we describe random walks in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^d$ with (fixed) positive edgelengths w_i by their edge clouds $(w_1, \hat{x}_1), \ldots , (w_n, \hat{x}_n)$ where $\hat{x}_i \in S^{d-1}$ is the direction of the ith edge. The space of polygonal arms $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ is topologically equivalent to $(S^{d-1}){}^n$ . If we let $\omega_i = \frac{w_i}{\sum w_i}$ be the relative edgelengths, then we can define the submanifold $\{\pmb{x} : \sum \omega_i \hat{x}_i = \vec {0} \}$ of closed polygons $\newcommand{\Pol}{{\rm Pol}} \Pol(n, d, w)$ .

Using Bernstein's inequality (e.g. [8]), there is an easy concentration inequality which suggests that the endpoints of random arms are close together. For equilateral polygons in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^3$ , this takes the simple form

Theorem 1. If $\pmb{x}$ is chosen randomly in $\newcommand{\Arm}{{\rm Arm}} \Arm(n, 3, 1)$ with edges $\hat{x}_1, \ldots , \hat{x}_n$ ,

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left( \frac{1}{n} \left\| \sum \hat{x}_i \right\| < t \right) \geqslant 1 - 3 \, {\rm e}^{-n t^2 \cdot \frac{3}{6 + 2\sqrt{3} t}}. \nonumber \end{align*}$

That is, the center of mass of a random edge cloud is very likely to be close to the origin. We can clearly close a random polygon in $\newcommand{\Arm}{{\rm Arm}} \Arm(n, 3, 1)$ by subtracting the (small) $\frac{1}{n} \sum \hat{x}_i$ from each edge. That closed polygon is clearly near the original arm, but it is no longer equilateral. This raises the question of whether we can generally close a polygon in $\newcommand{\Arm}{{\rm Arm}} \Arm(n, 3, 1)$ (or $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ ) while preserving edgelengths and changing the polygon only a small amount. This question is the focus of this paper.

Given $\pmb{x}$ and $\pmb{y}$ in $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ , we view both as vectors in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^{dn}$ and measure the distance between them accordingly. We call this the chordal distance because it does not measure the arc on the spheres of radius w_i for each pair of edges, but rather measures the straight line distance between edge vectors.

Our first main result is proposition 10, which shows that the chordal distance between a random $\newcommand{\Arm}{{\rm Arm}} \pmb{x} \in \Arm(n, d, 1)$ and the nearest $\newcommand{\Pol}{{\rm Pol}} \pmb{y} \in \Pol(n, d, 1)$ converges in distribution to a Nakagami- $(\frac{d}{2}, \frac{d}{d-1})$ random variable with PDF proportional to $x^{d-1} {\rm e}^{-\frac{d-1}{2} x^2}$ as $n \rightarrow \infty$ .

Our second main result is a general probabilistic bound on the chordal distance to closure for random polygons in any $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ . For equilateral polygons in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^3$ , our main theorem (corollary 19) takes the very simple form

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Pol}{{\rm Pol}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left(d_{\rm chordal}(\pmb{x},\Pol(n,3,1)) < t \right) \geqslant 1 - 6 \exp\left( \frac{-t^2}{4} \right) \nonumber \end{align*}$

for $t < \frac{\sqrt{n}}{200 \sqrt{2}}$ .

Here is a broad overview of our arguments. Given a polygon $\pmb{x}$ in $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ , we will provide an explicit construction for a nearby closed polygon in $\newcommand{\Pol}{{\rm Pol}} \Pol(n, d, w)$ , which we call the geometric median closure of $\pmb{x}$ (denoted $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ ). It will be clear how to construct the geodesic in $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ from $\pmb{x}$ to $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ . For equilateral polygons, we show $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ is the closest closed polygon to $\pmb{x}$ in chordal distance (theorem 8).

The distance between $\pmb{x}$ and $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ depends on the norm $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \| \gm \|$ of the geometric median (or Fermat–Weber point) of the edge cloud (proposition 12). For equilateral polygons, we will be able to leverage existing results of Niemiro [19] to find the asymptotic distribution of the geometric median of a random point cloud (proposition 9). Combining this with the matrix Chernoff inequalities proves our first main result (proposition 10).

The second main result follows from a concentration inequality for a random polygon in any $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ , which bounds the probability of a large $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \| \gm \|$ in terms of n, d, and w. This concentration result (theorem 18) follows from parallel uses of the scalar and matrix Bernstein inequalities to control the expected properties of a random edge cloud, together with the definition of the geometric median as the minimum of a convex function.

Last, we will observe that the pushforward measure on closed polygons obtained by closing random open polygons appears to converge rapidly to the uniform distribution on closed polygons (conjecture 22). Since these closures involve only very small motions of any part of the polygon, local features (such as small knots) should be preserved—it would follow (conjecture 23) that the rate of production of local knots in open and closed arcs should be almost the same.

2. Constructing a nearby closed polygon

As mentioned above, we view n-edge polygons (up to translation) in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^d$ as collections of edge vectors⁴ $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \vec {x}_i \in \R^d$ . The vertices are obtained by summing the $\vec {x}_i$ from an arbitrary basepoint. In this section of the paper, we will assume only that the lengths of the edges of the polygon are fixed to some arbitrary $w_i = \|\vec {x}_i\|$ . We will think of these fixed edgelength polygons in two ways:

as a weighted point cloud on the unit sphere $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} S^{d-1} \subset \R^d$ where the points are denoted $\hat{x}_i = \vec {x}_i/\|\vec {x}_i\|$ and the weights are the w_i. We will call $(w_i, \hat{x}_i)$ the edge cloud of the polygon.
as a point $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \pmb{x} \in \prod S^{d-1}(w_i) \subset (\R^d){}^n = \R^{dn}$ (where S^d−1(r) is the sphere of radius r). We will call $\pmb{x}$ the vector of edges of the polygon.

The space of these polygons will be denoted $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w) = \prod S^{d-1}(w_i)$ . Within this space, there is a submanifold $\newcommand{\Pol}{{\rm Pol}} \Pol(n, d, w)$ of closed polygons defined by the condition $\sum w_i \hat{x}_i = \vec {0}$ . (Equivalently, $\pmb{x}$ is closed if it lies in the codimension d subspace of $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^{dn}$ normal to the $\pmb{n}^{\,j} = (\hat{e}_j, \ldots , \hat{e}_j)$ , where $\hat{e}_1, \ldots , \hat{e}_d$ is the standard basis in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^d$ .) Both $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ and $\newcommand{\Pol}{{\rm Pol}} \Pol(n, d, w)$ are Riemannian manifolds with standard metrics, but it will be useful to use two additional metrics as well:

Definition 2. The chordal metric on $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ is given by

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\R}{\mathbb{R}} \newcommand{\m}{\mathcal} \displaystyle d_{\rm chordal}(\pmb{x},\pmb{y}) = \|\pmb{x} - \pmb{y}\|_{\R^{dn}} = \left( \sum \|w_i \hat{x}_i - w_i \hat{y}_i\|_{\R^d}^2 \right)^{\frac{1}{2}}. \nonumber \end{align*}$

The max-angular metric on $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ is given by

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\m}{\mathcal} \displaystyle d_{{{\rm max\mbox{-}angular}}}(\pmb{x},\pmb{y}) = \max_i \angle(\vec {x}_i,\vec {y}_i). \nonumber \end{align*}$

We now make an important definition:

Definition 3. A geometric median (also known as a Fermat–Weber point) of an edge cloud $(w_i, \hat{x}_i)$ is any $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ which minimizes the weighted average distance function $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Adxy}{\Adx(\vec {y})} \Adxy$ given by

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Adxy}{\Adx(\vec {y})} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Ad}{{\rm Ad}} \displaystyle \Adxy = \sum_i \omega_i \| \hat{x}_i - \vec {y} \| \nonumber \end{align*}$

where $\omega_i = \frac{w_i}{\sum w_i}$ . To clarify notation, we will only use $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ for points which are a geometric median of a weighted point cloud; the point cloud will be clear from the context.

This is a very old construction with a beautiful theory around it; see the nice review in [7]. We note that the geometric median differs from the center of mass (or geometric mean) of the points, which minimizes the weighted average of the squared distances between $\vec {y}$ and the $\hat{x}_i$ and that the geometric median is unique unless the points are all colinear and the geometric median is not one of the points.

This section is devoted to analyzing the following construction:

Definition 4. Suppose $\pmb{x}$ is a polygon and $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ is a geometric median of its edge cloud $(w_i, \hat{x}_i)$ which is not one of the $\hat{x}_i$ . The geometric median closure $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ of $\pmb{x}$ is the polygon whose edge cloud has the same weights and edge directions obtained by recentering the $\hat{x}_i$ on $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ and renormalizing: $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ has edge cloud $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \left( w_i, \frac{\hat{x}_i - \gm}{\left\| \hat{x}_i - \gm \right\|} \right)$ .

If every geometric median of $(w_i, \hat{x}_i)$ is one of the $\hat{x}_i$ , $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ is not defined. If $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ is defined, we say that $\pmb{x}$ is median-closeable.

Of course, we need to justify our choice of name by proving that $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ is closed. The key observation is the following lemma, which follows by direct computation:

Lemma 5. The function $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Adxy}{\Adx(\vec {y})} \Adxy$ is a convex function of $\vec {y}$ . The gradient is given by

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\gradAdxy}{\nabla\!\Adxy} \newcommand{\gradAdx}{\nabla\!\Adx} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\Adxy}{\Adx(\vec {y})} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Ad}{{\rm Ad}} \newcommand{\grad}{\nabla} \displaystyle \gradAdxy = \sum \omega_i \frac{\vec {y} - \hat{x}_i}{\|\vec {y} - \hat{x}_i\|}. \nonumber \end{align*}$

The Hessian of $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Adxy}{\Adx(\vec {y})} \Adxy$ is given by

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\HAdxy}{\mathcal{H}\Adxy} \newcommand{\HAdx}{\mathcal{H}\Adx} \newcommand{\HAd}{\mathcal{H}\Ad} \newcommand{\Adxy}{\Adx(\vec {y})} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Ad}{{\rm Ad}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle \HAdxy = \left(\sum_i \frac{\omega_i}{\norm{\hat{x}_i - \vec {y}}}\right) I_d - \left(\sum_i \frac{\omega_i}{\norm{\hat{x}_i - \vec {y}}^3} (\vec {y} - \hat{x}_i)(\vec {y} - \hat{x}_i)^T \right). \nonumber \end{align*}$

Proposition 6. If $\pmb{x}$ is median-closeable, $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ is a unique closed polygon with edgelengths w_i.

Proof. The proof follows from assembling several standard facts about the geometric median. These are in [18], but are easily checked by hand.

As it is a sum of convex functions, the average distance function $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx$ is convex. Away from the $\hat{x}_i$ , it is differentiable. If the points $\hat{x}_i$ are not colinear, $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx$ is strictly convex and $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ is unique. If the points $\hat{x}_i$ are colinear, either the geometric median is one of the $\hat{x}_i$ or the set of geometric medians consists of the interval between two $\hat{x}_i$ .

Any geometric median which is not one of the $\hat{x}_i$ must be a critical point of the average distance function. For any such $\newcommand{\m}{\mathcal} \vec {\mu}$ , using lemma 5,

$\begin{align} \newcommand{\e}{{\rm e}} \newcommand{\gm}{\vec {\mu}} \newcommand{\gradAdx}{\nabla\!\Adx} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Ad}{{\rm Ad}} \newcommand{\grad}{\nabla} \newcommand{\m}{\mathcal} \displaystyle \sum_i w_i \frac{\hat{x}_i - \gm}{\left\|\hat{x}_i - \gm\right\|} = -\left(\sum w_i\right) \gradAdx(\gm) = 0. \label{eq:grad total distance} \nonumber \end{align} \tag{ 1 }$

This implies that $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ is closed.

If $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ is unique, then $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ is obviously unique. If $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ is not unique, the $\hat{x}_i$ are colinear, and $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ is on the line segment between two of the $\hat{x}_i$ . In this case, it is not hard to see that (1) implies that the edges of $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ are two antipodal groups of points on S^d−1, each containing n/2 points, regardless of where we take $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ on the segment. □

Our next goal is to prove an optimality property for the geometric median closure. We will start by proving a more general fact about recentering and renormalizing:

Proposition 7. Let $\hat{x}_i$ be a point cloud in $(S^{d-1}){}^n$ , and take any $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \vec {p}$ not an $\hat{x}_i$ . Given any set of weights w_i we let $r(\pmb{x};\vec {p}, w)$ denote the renormalized and recentered point cloud with weights w_i and $\vec {s}$ denote its weighted sum:

$\begin{align*} \newcommand{\e}{{\rm e}} \displaystyle r(\pmb{x};\vec {p},w) := \left( w_i, \frac{\hat{x}_i - \vec {p}}{\left\| \hat{x}_i - \vec {p}\right\|} \right) \quad {\rm and} \quad \vec {s} := \sum_i w_i \frac{\hat{x}_i - \vec {p}}{\left\| \hat{x}_i - \vec {p}\right\|}. \nonumber \end{align*}$

If $\hat{x}=(\hat{x}_1, \ldots , \hat{x}_n)$ and $\pmb{r}$ is the vector of edges corresponding to the edge cloud $(w_i, \hat{r}_i)$ , then

$\begin{align*} \newcommand{\e}{{\rm e}} \displaystyle \pmb{r} = {\rm argmin}_{\sum w_i \hat{y}_i = \vec {s}} \|\pmb{y} - \hat{x}\|, \nonumber \end{align*}$

that is, $\pmb{r}$ is the closest vector of edges to $\hat{x}$ (in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^{dn}$ ) with edge weights w_i and vector sum $\vec {s}$ .

Proof. Suppose that $(w_i, \hat{y}_i)$ is a point cloud with the same weights which also has $\sum_i w_i \hat{y}_i = \vec {s}$ and $\pmb{y}$ is the corresponding vector of edges in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^{dn}$ . Let $\pmb{v} = \pmb{y} - \pmb{r}$ . Since $\sum_i \vec {y}_i = \sum_i \vec {r}_i$ , we know $\sum \vec {v}_i = \vec {0}$ .

Remembering that $\|\vec {y}_i\| = w_i = \|\vec {r}_i\|$ , we compute

$\begin{align*} \newcommand{\e}{{\rm e}} \displaystyle \left\langle \vec {v}_i, \vec {r}_i \right\rangle = \left\langle \vec {y}_i - \vec {r}_i, \vec {r}_i \right\rangle = \left\langle \vec {y}_i, \vec {r}_i \right\rangle - w_i^2 \leqslant 0. \nonumber \end{align*}$

Since $\vec {r}_i$ is a positive scalar multiple of $\hat{x}_i - \vec {p}$ , this implies that $\left\langle \vec {v}_i, \hat{x}_i - \vec {p} \right\rangle \leqslant 0$ , and so we have $\left\langle\vec {v}_i, \hat{x}_i\right\rangle \leqslant \left\langle \vec {v}_i, \vec {p} \right\rangle$ . Since $\sum \vec {v}_i = \vec {0}$ , we see

$\begin{align*} \newcommand{\e}{{\rm e}} \displaystyle \left\langle \pmb{v}, \hat{x} \right\rangle = \sum_i \left\langle \vec {v}_i, \hat{x}_i \right\rangle \leqslant \sum_i \left\langle \vec {v}_i, \vec {p} \right\rangle = \Big\langle \sum_i \vec {v}_i, \vec {p} \Big\rangle = 0, \nonumber \end{align*}$

or that $-\left\langle \hat{x}, \pmb{v} \right\rangle \geqslant 0$ . Using the facts $\left\langle \pmb{y}, \pmb{y} \right\rangle = \sum w_i^2 = \left\langle \pmb{r}, \pmb{r} \right\rangle$ and $\pmb{y} = \pmb{r} + \pmb{v}$ ,

$\begin{align*} \newcommand{\e}{{\rm e}} \displaystyle \left\langle \hat{x} - \pmb{y},\hat{x} - \pmb{y} \right\rangle &= \left\langle \hat{x}, \hat{x} \right\rangle + \left\langle \pmb{y}, \pmb{y} \right\rangle - 2 \left\langle \hat{x}, \pmb{y} \right\rangle \nonumber \\ &= \left\langle \hat{x}, \hat{x} \right\rangle + \left\langle \pmb{r},\pmb{r} \right\rangle - 2 \left\langle \hat{x}, \pmb{r} \right\rangle - 2 \left\langle\hat{x},\pmb{v} \right\rangle \nonumber \\ &= \left\langle\hat{x} - \pmb{r}, \hat{x} -\pmb{r} \right\rangle - 2 \left\langle \hat{x}, \pmb{v} \right\rangle \nonumber \end{align*}$

so $\left\| \hat{x} - \pmb{y} \right\| \geqslant \left\| \hat{x} - \pmb{r} \right\|$ , as claimed. □

Note that the statement of proposition 7 is a little awkward for polygons which are not equilateral. We originally conjectured that $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \pmb{r} \in (\R^d){}^n$ was closest to $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \pmb{x} \in (\R^d){}^n$ , as the components of both are vectors in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^d$ with the same lengths w_i. But our proof shows that this conjecture was not true; instead, $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \pmb{r} \in (\R^d){}^n$ is closest to $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \hat{x} \in (\R^d){}^n$ , whose components are all unit vectors.

This subtlety vanishes for equilateral polygons (where the w_i = 1), which is the case of primary interest. In that case, combining propositions 6 and 7 with definition 4, we have

Theorem 8. If $\pmb{x}$ is a median-closeable equilateral polygon, $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ is the closed equilateral polygon closest to $\pmb{x}$ in the chordal metric.

Remarks. This construction may seem unexpected, but it has deep roots. In [15], Kapovich and Millson provide an analogous closure construction which associates a unique closed equilateral polygon to any equilateral polygon where no more than half the edge vectors coincide by viewing the unit ball as the Poincaré ball model of hyperbolic space and (essentially) recentering and renormalizing in hyperbolic geometry around a point called the 'conformal median' (see [6]) which is in many ways parallel to the geometric median. This is an example of a 'Geometric Invariant Theory' (or GIT) quotient: see [13]. These ideas inspired our work above: we did not adopt them entirely only because working in hyperbolic geometry makes the whole endeavor seem much more abstract and because we have not managed to prove an optimality property for their construction analogous to theorem 8.

3. Asymptotics of the geometric median and the distance to closure

Now that we have established the connection between the geometric median and closure, we will establish some facts about the large-n behavior of the geometric median. Since the geometric median is a symmetric estimator of a large number of i.i.d. random variables, it seems natural to expect that the distribution of μ should converge to a multivariate normal, even though the classical central limit theorem does not apply. In fact, this is true:

Proposition 9. Let n points $\hat{x}_i$ be sampled independently and uniformly on S^d−1, with geometric median $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ . The random variable $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \sqrt{n}\, \gm$ converges in distribution to $\newcommand{\m}{\mathcal} \mathcal{N}\left(\vec {0}, \frac{d}{(d-1){}^2}I_d\right)$ as $n \rightarrow \infty$ . This implies that $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\sqrt{n}\, \gm}$ converges in distribution to a Nakagami $\left(\frac{d}{2}, \left(\frac{d}{d-1}\right){}^2\right)$ random variable.

Proof. We start by defining $\newcommand{\Ed}{{\rm Ed}} \Ed(\vec {y})$ to be the expected distance from $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \vec {y} \in \R^d$ to the unit sphere; formally,

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Vol}{{\rm Vol}} \newcommand{\dVol}{\thinspace{\rm dVol}} \newcommand{\Ed}{{\rm Ed}} \displaystyle \Ed(\vec {y}) := \frac{1}{\Vol~S^{d-1}} \int_{\hat{x} \in S^{d-1}} \|\hat{x} - \vec {y}\| \dVol_{S^{d-1}}. \nonumber \end{align*}$

Observe that, by symmetry, the minimizer of $\newcommand{\Ed}{{\rm Ed}} \Ed(\vec {y})$ is the origin. Now the geometric median of a finite collection of points $\hat{x}_i$ uniformly sampled from the sphere is the minimizer of the average distance $\newcommand{\Ad}{{\rm Ad}} \Ad$ to the $\hat{x}_i$ (definition 3). For a large number of points, we expect $\newcommand{\Ad}{{\rm Ad}} \Ad$ to be close to $\newcommand{\Ed}{{\rm Ed}} \Ed$ as a function, and hence that the minimizers of the functions should be nearby as well.

In fact, Niemiro studied exactly this situation, showing⁵ ([19, p 1517], see Haberman [11]) that

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\gm}{\vec {\mu}} \newcommand{\m}{\mathcal} \displaystyle \sqrt{n}\, \gm \overset{d}{\rightarrow} \mathcal{N}(\vec {0},\mathcal{H}^{-1} V \mathcal{H}^{-1}) \nonumber \end{align*}$

where $V$ is the covariance matrix of a random point $\hat{x}$ on S^d−1 and $\newcommand{\m}{\mathcal} \mathcal{H}$ is the Hessian of $\newcommand{\Ed}{{\rm Ed}} \Ed$ , evaluated at the origin.

The off-diagonal elements of $V$ are zero by symmetry. Using cylindrical coordinates on S^d−1 with axis $\hat{e}_i$ , the ith diagonal entry in the covariance matrix is computed by the integral

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\dx}{\,\mathrm{d}x} \newcommand{\m}{\mathcal} \displaystyle \sigma_i^2 = \frac{1}{\sqrt{\pi}}\frac{\Gamma(\frac{d}{2})}{\Gamma(\frac{d - 1}{2})} \int_{-1}^1 x^2 (1 - x^2)^{\frac{d - 3}{2}} \,\dx = \frac{1}{d}. \nonumber \end{align*}$

We prove in the appendix (proposition A.1) that the expected distance function $\newcommand{\Ed}{{\rm Ed}} \Ed(\vec {y})$ is given as a function of $r = \|\vec {y}\|$ by

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Ed}{{\rm Ed}} \displaystyle \Ed(r) = \, _2F_1 \left( -\frac{1}{2}, \frac{1-d}{2}; \frac{d}{2}; r^2 \right). \nonumber \end{align*}$

When d is odd, the standard Taylor series representation of the hypergeometric function truncates, and $\newcommand{\Ed}{{\rm Ed}} \Ed(r)$ is a polynomial in r. For example, when d = 3 we have $\newcommand{\Ed}{{\rm Ed}} \Ed(r) = 1+\frac{r^2}{3}$ . In turn, a straightforward computation shows that the Hessian of $\newcommand{\Ed}{{\rm Ed}} \Ed$ evaluated at the origin is simply

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Ed}{{\rm Ed}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{H} = \mathcal{H} \Ed(\vec {0}) = \frac{d-1}{d} I_d, \nonumber \end{align*}$

where I_d is the $d \times d$ identity matrix. This completes the proof of the first statement. To get the second, we note that the norm of a Gaussian $\newcommand{\m}{\mathcal} \mathcal{N}(\vec {0}, \sigma^2I_d)$ random variate is Nakagami $\left(\frac{d}{2}, {d}\sigma^2\right)$ -distributed. □

We now see that the geometric median is becoming asymptotically normal, and concentrating around the origin. In fact, numerical experiments show that the rate of convergence is rather fast (see figure 1). We can use this to prove an asymptotic result for the distance to closure for equilateral polygons.

**Figure 1.** For various n, we generated 250 000 random elements of $\newcommand{\Arm}{{\rm Arm}} \Arm(n, 3, 1)$ and computed $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \sqrt{n}\, \|\gm\|$ , where $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ is the geometric median of the edge cloud. The random variable $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \sqrt{n}\, \|\gm\|$ converges to a Nakagami $\left(\frac{3}{2}, \frac{9}{4}\right)$ distribution by proposition 9. Its pdf is the solid curve above. Though we do not show it, the behavior in other dimensions is quite similar: by n = 50 the density of the limiting Nakagami $\left(\frac{d}{2}, \left(\frac{d}{d-1}\right)^2\right)$ distribution matches the histogram rather well.
Download figure:
Standard image High-resolution image

$ \newcommand{\Arm}{{\rm Arm}} \Arm(n, 3, 1)$ — **Figure 1.** For various n, we generated 250 000 random elements of $\newcommand{\Arm}{{\rm Arm}} \Arm(n, 3, 1)$ and computed $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \sqrt{n}\, \|\gm\|$ , where $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ is the geometric median of the edge cloud. The random variable $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \sqrt{n}\, \|\gm\|$ converges to a Nakagami $\left(\frac{3}{2}, \frac{9}{4}\right)$ distribution by proposition 9. Its pdf is the solid curve above. Though we do not show it, the behavior in other dimensions is quite similar: by n = 50 the density of the limiting Nakagami $\left(\frac{d}{2}, \left(\frac{d}{d-1}\right)^2\right)$ distribution matches the histogram rather well.
Download figure:
Standard image High-resolution image

Proposition 10. For a random equilateral n-gon $\pmb{x}$ with edges $\hat{x}_i$ sampled independently and uniformly from S^d−1, the random variable $\newcommand{\Pol}{{\rm Pol}} d_{\rm chordal}(\pmb{x}, \Pol(n, d, 1))$ converges in distribution to a Nakagami $\left(\frac{d}{2}, \frac{d}{d-1}\right)$ as $n \rightarrow \infty$ .

Proof. We know from theorem 8 that $\newcommand{\Pol}{{\rm Pol}} d_{\rm chordal}(\pmb{x}, \Pol(n, d, 1))$ is actually the chordal distance from $\pmb{x}$ to $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ . To estimate this distance, we will make use of the recentering and renormalizing map $r(\pmb{x}, \vec {p}, 1)$ from proposition 7.

When $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \norm{\mu}$ is small, we can estimate

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\gmc}{{\rm gmc}} \newcommand{\gm}{\vec {\mu}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle \norm{\gmc(\pmb{x}) - \pmb{x}} = \norm{r(\pmb{x};\gm,1) - r(\pmb{x};\vec {0},1)} \sim \norm{\gm} \norm{D_{\hat{\mu}}r(\pmb{x};\vec {0},1)} \nonumber \end{align*}$

where $\newcommand{\m}{\mathcal} D_{\hat{\mu}} r(\pmb{x};\vec {0}, 1)$ is the derivative of $r(\pmb{x};\vec {v}, 1)$ with respect to the vector $\vec {v}$ in the direction of the unit vector $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \hat{\mu} = \frac{\gm}{\norm{\gm}}$ (while leaving the $\pmb{x}$ variables constant).

Using the definition of $r(\pmb{x};\vec {p}, 1)$ , a direct computation reveals that

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle \norm{D_{\hat{\mu}}r(\pmb{x};\vec {0},1)} = \left( n - \sum_i \left\langle \hat{x}_i, \hat{\mu} \right\rangle^2 \right)^{\frac{1}{2}} = \sqrt{n} \left(1 - \frac{1}{n} \sum_i \left\langle \hat{x}_i, \hat{\mu} \right\rangle^2 \right)^{\frac{1}{2}}. \nonumber \end{align*}$

Since $\newcommand{\m}{\mathcal} \hat{\mu}$ is a unit vector, the sum is the Rayleigh quotient for the matrix $X = \frac{1}{n} \sum_i \hat{x}_i \hat{x}_i^T$ , and so obeys the estimates

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\lmax}{\lambda_{{\rm max}}} \newcommand{\lmin}{\lambda_{{\rm min}}} \newcommand{\m}{\mathcal} \displaystyle \lmin (X) \leqslant \frac{1}{n} \sum_i \left\langle \hat{x}_i, \hat{\mu} \right\rangle^2 \leqslant \lmax (X). \nonumber \end{align*}$

Now $\newcommand{\lmin}{\lambda_{{\rm min}}} \lmin (X)$ and $\newcommand{\lmax}{\lambda_{{\rm max}}} \lmax (X)$ are also random variables depending on the $\hat{x}_i$ , but we can use the matrix Chernoff inequalities [22, remark 5.3] to bound the probability that they are far from $\frac{1}{d}$ .

It's quite standard to prove that $\newcommand{\m}{\mathcal} \mathcal{E}(\hat{x}_i \hat{x}_i^T) = \frac{1}{d} I_d$ , so $\newcommand{\m}{\mathcal} \mathcal{E}(X) = \frac{1}{d} I_d$ . The matrix Chernoff inequalities then reduce to

$\begin{align} \newcommand{\e}{{\rm e}} \newcommand{\lmin}{\lambda_{{\rm min}}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left\{\lmin (X) < (1 - \delta) \frac{1}{d} \right\} \leqslant d \left(\frac{{\rm e}^{-\delta}}{(1-\delta)^{1-\delta}} \right)^{\frac{n}{d}} \label{eq:lmin lower} \nonumber \end{align} \tag{ 2 }$

and

$\begin{align} \newcommand{\e}{{\rm e}} \newcommand{\lmax}{\lambda_{{\rm max}}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left\{\lmax (X) > (1+\delta) \frac{1}{d} \right\} \leqslant d \left(\frac{{\rm e}^{\delta}}{(1+\delta)^{1+\delta}} \right)^{\frac{n}{d}}. \label{eq:lmax upper} \nonumber \end{align} \tag{ 3 }$

For any $\delta > 0$ , the quantities raised to $\frac{n}{d}$ are <1, and so as $n \rightarrow \infty$ the probability that the bounds in (2) and (3) both hold $\rightarrow 1$ . In turn, this means that for any fixed $\delta > 0$ ,

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P} \left\{\left| \frac{1}{d} - \frac{1}{n} \sum_i \left\langle \hat{x}_i, \hat{\mu} \right\rangle^2 \right| > \frac{\delta}{d} \right\} \rightarrow 0 \nonumber \end{align*}$

and so the random variable $\newcommand{\m}{\mathcal} \frac{1}{n} \sum_i \left\langle \hat{x}_i, \hat{\mu} \right\rangle^2$ converges in probability to $\frac{1}{d}$ . By the continuous mapping theorem, this means that $\newcommand{\m}{\mathcal} {\left(1 - \frac{1}{n} \sum_i \left\langle \hat{x}_i, \hat{\mu} \right\rangle^2\right)}^{1/2} \overset{p}{\rightarrow}\sqrt{\frac{d-1}{d}}$ .

We can now rewrite the random variable $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm} \norm{D_{\hat{\mu}}r(\pmb{x};\vec {0}, 1)}$ as the product of $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\sqrt{n}\, \gm}$ , which by proposition 9 converges in distribution to a Nakagami $\left(\frac{d}{2}, \left(\frac{d}{d-1}\right)^2\right)$ random variable, and $\newcommand{\m}{\mathcal} \left(1 - \frac{1}{n} \sum_i \left\langle \hat{x}_i, \hat{\mu} \right\rangle^2\right)^{1/2}$ , which we have just proved converges in probability to the constant random variable $\sqrt{\frac{d-1}{d}}$ .

Using Slutsky's theorem and a little algebra, this implies that the product converges in distribution to a Nakagami $\left(\frac{d}{2}, \frac{d}{d-1}\right)$ random variable, as claimed. □

We have now learned something interesting: the distribution of chordal distances to closure should be converging to a distribution which does not depend on the number of edges! This is surprising because the diameter of $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, 1)$ is clearly $\Theta(\sqrt{n}) \rightarrow \infty$ . This means that some arms might indeed be very far from closure—but they are very rare. We will look for this feature in the more specific probability inequalities to come.

We can also see how fast the tail of the distribution of d_chordal can be expected to decay. The survival function of the Nakagami distribution is an incomplete Gamma function. Using [5, 8.10.1], we can show that there is a constant $C(d) > 0$ so that if x is Nakagami $\left(\frac{d}{2}, \frac{d}{d-1}\right)$ , then

$\begin{align} \newcommand{\e}{{\rm e}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P} \left\{x < t \right\} \geqslant 1 - C(d) \, t^{d-2} {\rm e}^{-\frac{d-1}{2} t^2}. \label{eq:precise tail bound} \nonumber \end{align} \tag{ 4 }$

Again, this is confirmed by numerical experiment for n = 1000, the empirical distribution of the distance to closure tracks the corresponding Nakagami distribution very well, see figure 2.

**Figure 2.** For $d=2, 3, 4, 10$ , we generated 250 000 random elements of $\newcommand{\Arm}{{\rm Arm}} \Arm(1000, d, 1)$ and computed their chordal distance to $\newcommand{\Pol}{{\rm Pol}} \Pol(1000, d, 1)$ using theorem 8. This plot shows the histograms of chordal distance to closure together with the densities of Nakagami $\left(\frac{d}{2}, \frac{d}{d-1}\right)$ distributions.
Download figure:
Standard image High-resolution image

**Figure 2.** For $d=2, 3, 4, 10$ , we generated 250 000 random elements of $\newcommand{\Arm}{{\rm Arm}} \Arm(1000, d, 1)$ and computed their chordal distance to $\newcommand{\Pol}{{\rm Pol}} \Pol(1000, d, 1)$ using theorem 8. This plot shows the histograms of chordal distance to closure together with the densities of Nakagami $\left(\frac{d}{2}, \frac{d}{d-1}\right)$ distributions.
Download figure:
Standard image High-resolution image

4. Concentration inequalities for $\boldsymbol{ \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm}}$ and d_chordal

We now know what to expect in the large-n limit, at least for equilateral polygons. For most applications in physics (4) is sufficient; even for small n, experimental evidence shows that the bound is quite close to the truth, see figure 1.

However, for mathematical applications, it's helpful to have a hard bound on the chordal distance to closure even if we sacrifice some accuracy. Learning from (4), we see that we should aim for a tail bound for d_chordal which does not depend on n and is proportional to ${\rm e}^{-\alpha t^2}$ for some $\alpha < \frac{d-1}{2}$ . We will get exactly such a bound in corollary 19 at the end of the section. Our bounds will apply for finite n, and also apply to the non-equilateral case, where it is not even clear what the large-n limit should mean.

4.1. A bound connecting ${ \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm}}$ and d_chordal

To start with, we prove a hard bound on the relationship between the geometric median and our two measures of distance in polygon space. First, we note that our procedure of recentering and renormalizing changes each $\hat{x}_i$ by a controlled amount.

Lemma 11. If $\hat{x}_i \in S^{d-1}$ and $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \vec {p} \in \R^d$ is any vector with $\| \vec {p} \| < 1$ , then $\left\| \hat{x}_i - \frac{\hat{x}_i - \vec {p}}{\|\hat{x_i} - \vec {p}\|} \right\| \leqslant \sqrt{2} \| \vec {p} \|$ and $\newcommand{\arc}[1]{\gamma_{#1}} \angle (\hat{x}_i, \frac{\hat{x}_i - \vec {p}}{\|\hat{x}_i - \vec {p}\|}) \leqslant \arcsin \|\vec {p}\|< \frac{\pi}{2} \| \vec {p} \|$ .

Proof. This is a calculus exercise; it is straightforward to establish the (sharp) bound

$\begin{align*} \newcommand{\e}{{\rm e}} \displaystyle \left\| \hat{x}_i - \frac{\hat{x}_i - \vec {p}}{\|\hat{x_i} - \vec {p}\|} \right\| \leqslant \sqrt{2 - 2\sqrt{1 - \|\vec {p}\|^2}}. \nonumber \end{align*}$

Further, it is easy to check that the right-hand side is a convex function of $\| \vec {p} \|$ which is equal to 0 when $\|\vec {p}\|=0$ , and $\sqrt{2}$ when $\|\vec {p}\|=1$ , so it is bounded above by the line $\sqrt{2}\, \| \vec {p} \|$ . The angle bound is also straightforward. □

We now can give a bound on the distance between a given $\newcommand{\Arm}{{\rm Arm}} \pmb{x} \in \Arm(n, d, w)$ and $\newcommand{\Pol}{{\rm Pol}} \Pol(n, d, w)$ in terms of the norm of the geometric median $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ of the edge cloud $(\hat{x}_i, w_i)$ .

Proposition 12. If the edge cloud $(\hat{x}_i, w_i)$ has geometric median $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ with $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \|\gm\| < 1$ ,

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\gm}{\vec {\mu}} \newcommand{\Pol}{{\rm Pol}} \newcommand{\m}{\mathcal} \newcommand{\arc}[1]{\gamma_{#1}} \displaystyle d_{{\rm chordal}}(\pmb{x},\Pol(n,d,w)) < \sqrt{2 \sum \omega_i^2}\, \|\gm\| \quad {\rm and} \quad d_{{\rm max}\mbox{-}{\rm angular}}(\pmb{x},\Pol(n,d,w)) < \arcsin \|\gm\|. \nonumber \end{align*}$

Proof. Since $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \|\gm\| < 1$ , our polygon is median-closeable and $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc(\pmb{x})$ is a closed polygon with edge cloud $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \left(w_i, \frac{\hat{x}_i - \gm}{\|\hat{x}_i - \gm\|}\right)$ . Lemma 11 immediately yields the bound on $\newcommand{\m}{\mathcal} d_{{{\rm max\mbox{-}angular}}}$ ; to get the chordal distance bound, we write

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\gm}{\vec {\mu}} \newcommand{\m}{\mathcal} \displaystyle \sqrt{\sum \left\| w_i \,\hat{x}_i - w_i \frac{\hat{x}_i - \gm}{\|\hat{x}_i - \gm\|} \right\|^2 } \leqslant \sqrt{\sum w_i^2 \cdot 2 \| \gm \|^2 }. \nonumber \end{align*}$

□

4.2. Strategy for the tail bound

To derive our explicit tail bound on the norm of the geometric median, our strategy is as follows. First, we will prove two probabilistic bounds: an upper bound on $\newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gradAdx}{\nabla\!\Adx} \|\gradAdx(\vec {0})\|$ and a positive lower bound on $\newcommand{\m}{\mathcal} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\HAd}{\mathcal{H}\Ad} \newcommand{\HAdx}{\mathcal{H}\Adx} \newcommand{\lmin}{\lambda_{{\rm min}}} \lmin(\HAdx(\vec {0}))$ . These will come from scalar and matrix versions of Bernstein's inequality.

If we restrict $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx$ to a scalar function $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx(z)$ on a ray from the origin, these bounds yield an upper bound on $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} |\Adx'(0)|$ and a lower bound on $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx''(0)$ . We will get a uniform lower bound on $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx''(z)$ for $z \in [0, \frac{1}{50}]$ by showing that, $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx''(z) \geqslant \Adx''(0) - 7 z$ on this interval. We prove this using the special structure of $\newcommand{\m}{\mathcal} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\HAd}{\mathcal{H}\Ad} \newcommand{\HAdx}{\mathcal{H}\Adx} \HAdx$ .

By Taylor's theorem, there is some z_* in $[0, z]$ so that

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Ad}{{\rm Ad}} \displaystyle \Adx'(z) = \Adx'(0) + z \Adx''(z_*) \geqslant -|\Adx'(0)| + \lambda z. \nonumber \end{align*}$

This means that for $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} z > |\Adx'(0)|/\lambda$ , this directional derivative must be positive: in particular, since the geometric median $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ is by definition a point where $\newcommand{\m}{\mathcal} \newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gm}{\vec {\mu}} \gradAd(\gm) = \vec {0}$ , $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ can lie no farther than $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} |\Adx'(0)|/\lambda$ from the origin.

4.3. A probabilistic bound on ${ \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gradAdx}{\nabla\!\Adx} \|\gradAdx(\vec {0})\| = \norm{\sum \omega_i \hat{x}_i}}$

We want to bound the norm of the gradient $\newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gradAdx}{\nabla\!\Adx} \gradAdx(\vec {0})$ , which we recall from lemma 5 is equal to $\sum \omega_i \hat{x}_i$ . We start with a lemma which helps us understand the effect of variable weights $\omega_i$ .

Lemma 13. For any collection of n non-negative real numbers w_i, if we define $\omega_i = \frac{w_i}{\sum w_i}$ ,

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var}} \displaystyle n \geqslant 1 + n^2 \Var\,\omega_i = n \sum \omega_i^2 \geqslant 1, \nonumber \end{align*}$

where $\newcommand{\Var}{{\rm Var}} \Var\,\omega_i$ is the variance of $\{\omega_1, \dots, \omega_n\}$ . We have equality on the left precisely when all but one of the w_i equal zero and equality on the right precisely when all the w_i are equal.

Proof. Starting with the definition of variance, and remembering that $\sum \omega_i = 1$ ,

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var}} \displaystyle \Var\,\omega_i = \frac{1}{n} \sum \omega_i^2 - \left( \frac{1}{n} \sum \omega_i \right)^2 = \frac{1}{n} \sum \omega_i^2 - \frac{1}{n^2}. \nonumber \end{align*}$

Solving for $\sum \omega_i^2$ ,

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var}} \displaystyle \sum \omega_i^2 = \frac{1 + n^2 \Var\,\omega_i}{n} \nonumber \end{align*}$

which proves the central equality. Since $\newcommand{\Var}{{\rm Var}} \Var\,\omega_i \geqslant 0$ with equality precisely when all the $\omega_i$ are equal, the inequality on the right follows easily.

To prove the inequality on the left, we invoke the Bhatia–Davis inequality [1], which says that since $0 \leqslant \omega_i \leqslant 1$ and the mean of the $\omega_i$ is $\frac{1}{n}$ , we have $\newcommand{\Var}{{\rm Var}} \Var\,\omega_i \leqslant (1 - \frac{1}{n})(\frac{1}{n} - 0)$ with equality precisely when one $\omega_i = 1$ and the remainder are zero. □

Now we can give our first result:

Proposition 14. If we have n points $\hat{x}_i$ sampled independently and uniformly from S^d−1, and n weights $\omega_i \geqslant 0$ with $\sum_i \omega_i = 1$ and $\newcommand{\m}{\mathcal} \Omega = \max_i \omega_i$ , then for any t > 0

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle \mathcal{P}\left(\norm{\sum \omega_i \hat{x}_i} > t \right) \leqslant d \exp\left(- \frac{3 n t^2}{2 n t \, \Omega \sqrt{d} + 6 (1 + n^2 \Var\,\omega_i)} \right). \nonumber \end{align*}$

If the $\omega_i$ are all equal (the polygon is equilateral), this simplifies to

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle \mathcal{P}\left(\norm{\frac{1}{n} \sum \hat{x}_i} > t \right) \leqslant d \exp\left(- \frac{3 n t^2}{6+2 t \sqrt{d}}\right). \nonumber \end{align*}$

Proof. We will use Bernstein's inequality [8, theorem 1.2]: suppose $X_1, \dots, X_n$ are independent random variables with $\newcommand{\m}{\mathcal} X_i - \mathcal{E}(X_i) \leqslant b$ for each i, the variance of each X_i is given by $\sigma_i^2$ , and $X = \sum X_i$ (with variance $\sigma^2 = \sum \sigma_i^2$ ). Then for any t > 0,

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left( X > \mathcal{E}(X) + t \right) \leqslant \exp\left( -\frac{t^2}{2 \sigma^2 \left( 1 + \frac{b t}{3 \sigma^2} \right)} \right). \nonumber \end{align*}$

For any unit vector $\vec {v}$ , we can set $X_i = \left\langle \omega_i \hat{x}_i, \vec {v} \right\rangle$ . These random variables clearly have expectation 0 and $\newcommand{\m}{\mathcal} X_i - \mathcal{E}(X_i) \leqslant \omega_i \leqslant \Omega$ . Using cylindrical coordinates on S^d−1 with axis $\vec {v}$ , the variance is computed by the integral

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\dx}{\,\mathrm{d}x} \newcommand{\m}{\mathcal} \displaystyle \sigma_i^2 = \frac{1}{\sqrt{\pi}}\frac{\Gamma(\frac{d}{2})}{\Gamma(\frac{d - 1}{2})} \int_{-1}^1 (\omega_i x)^2 (1 - x^2)^{\frac{d - 3}{2}} \,\dx = \frac{\omega_i^2}{d}. \nonumber \end{align*}$

Using lemma 13, this implies

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var}} \displaystyle \sigma^2 = \frac{1}{d} \sum \omega_i^2 = \frac{1 + n^2 \Var\,\omega_i}{{ d} n}. \nonumber \end{align*}$

This proves that for any $\vec {v}$ ,

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left( \left\langle\sum\omega_i \hat{x}_i,\vec {v} \right\rangle > t \right) \leqslant \exp\left( -n {d} t^2 \frac{3}{6 + 2\, {d} n t \, \Omega + 6 n^2 \Var\,\omega_i} \right). \nonumber \end{align*}$

Applying this inequality d times for $\vec {v}=\hat{e}_1, \dots, \hat{e}_d$ , and using the union bound, we can bound the $L^\infty$ norm of $\sum \omega_i \hat{x}_i$ :

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle \mathcal{P}\left( \norm{\sum \omega_i \hat{x}_i}_\infty = \max_j \left(\sum\omega_i \hat{x}_i\right)_j > t \right) \leqslant d \exp\left( -n { d} t^2 \frac{3}{6 + 2\, { d} n t \, \Omega + 6 n^2 \Var\,\omega_i} \right). \nonumber \end{align*}$

But we know that for any $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \vec {u}\in\R^d$ we have $\newcommand{\norm}[1]{\left\| #1 \right\|} \frac{1}{\sqrt{d}} \norm{\vec {u}} \leqslant \norm{\vec {u}}_\infty$ , so

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle \mathcal{P}\left( \frac{1}{\sqrt{d}} \norm{\sum \omega_i \hat{x}_i} > t \right) \leqslant \mathcal{P}\left( \norm{\sum \omega_i \hat{x}_i}_\infty > t \right) \leqslant d \exp\left( -n {d} t^2 \frac{3}{6 + 2\, {d} n t \, \Omega + 6 n^2 \Var\,\omega_i} \right). \nonumber \end{align*}$

Replacing t by $\frac{t}{\sqrt{d}}$ yields the statement of the proposition. □

The terms Ω and $\newcommand{\Var}{{\rm Var\,}} \Var \omega_i$ in the statement of proposition 14 at first seem mysterious. However, if you read them in light of lemma 13, they become clearer.

At one extreme, if one $\omega_i$ is close to 1 and the remaining $\omega_j$ are small, the sum $\newcommand{\norm}[1]{\left\| #1 \right\|} \norm{\sum_i \omega_i \hat{x}_i} \sim 1$ regardless of n, and $\newcommand{\norm}[1]{\left\| #1 \right\|} \norm{\sum_i \omega_i \hat{x}_i}$ cannot concentrate on 0 as $n \rightarrow \infty$ . To see this in the statement of the proposition, observe that in this case Ω and $\newcommand{\Var}{{\rm Var\,}} \Var \omega_i$ approach their maximum values of $\Omega \sim 1$ and $\newcommand{\Var}{{\rm Var\,}} 1 + n^2 \Var \omega_i \sim n$ , the n's in numerator and denominator cancel, and the exponent no longer depends on n at all.

At the other extreme, if the $\omega_i$ are all equal, Ω and $\newcommand{\Var}{{\rm Var\,}} \Var \omega_i$ are minimized: $\Omega = \frac{1}{n}$ and $\newcommand{\Var}{{\rm Var\,}} {\Var \omega_i = 0}$ . In this case, the denominator in the exponent does not depend on n and $\newcommand{\norm}[1]{\left\| #1 \right\|} \norm{\sum \omega_i \hat{x}_i}$ concentrates on 0 as fast as possible. We can compare this result to that of Khoi [17], who showed in a different sense that the equilateral polygons are the 'most flexible' of all the fixed edgelength polygons.

In the middle, if the $\omega_i$ are variable, but the number of comparably large $\omega_i$ increases, Ω and $\newcommand{\Var}{{\rm Var\,}} \Var \omega_i$ act to slow the rate of concentration, but they do not stop it: $\newcommand{\norm}[1]{\left\| #1 \right\|} \norm{\sum_i \omega_i \hat{x}_i}$ still concentrates on 0 as $n \rightarrow \infty$ .

4.4. A probabilistic bound on ${ \newcommand{\m}{\mathcal} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\HAd}{\mathcal{H}\Ad} \newcommand{\HAdx}{\mathcal{H}\Adx} \newcommand{\lmin}{\lambda_{{\rm min}}} \lmin(\HAdx(\vec {0})) = \lmin\left(I - \sum \omega_i \hat{x}_i \hat{x}_i^T\right)}$

We now want to bound the lowest eigenvalue of the Hessian of $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx$ at the origin. Again using lemma 5, we see that $\newcommand{\m}{\mathcal} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\HAd}{\mathcal{H}\Ad} \newcommand{\HAdx}{\mathcal{H}\Adx} \HAdx(\vec {0}) = I - \sum \omega_i \hat{x}_i \hat{x}_i^T$ , where the quantities being summed are outer products of the vectors $\hat{x}_i$ . That is, they are the symmetric, positive semidefinite projection matrices which project to the lines spanned by the $\hat{x}_i$ . We now show

Proposition 15. If we have n points $\hat{x}_i$ sampled independently and uniformly from S^d−1, and n weights $0 \leqslant \omega_i$ with $\sum_i \omega_i = 1$ and $\newcommand{\m}{\mathcal} \Omega = \max_i \omega_i$ , then for any t > 0

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var}} \newcommand{\wxxt}{\omega_i \hat{x}_i \hat{x}_i^T} \newcommand{\lmin}{\lambda_{{\rm min}}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left( \lmin\left(I - \sum \wxxt \right) > \frac{d-1}{d} - t \right) \leqslant d \exp\left( -\frac{d}{d-1} \cdot \frac{3 {d} n t^2}{2 n t\, \Omega d +6(1 + n^2 \Var\,\omega_i} \right). \nonumber \end{align*}$

If the $\omega_i$ are all equal (the polygon is equilateral), this simplifies to

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\xxt}{\hat{x}_i \hat{x}_i^T} \newcommand{\lmin}{\lambda_{{\rm min}}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left( \lmin\left(I - \frac{1}{n}\sum \xxt \right) > \frac{d-1}{d} - t \right) \leqslant d \exp\left( -\frac{d}{d-1} \cdot \frac{3 {d} n t^2}{2 t d + 6} \right). \nonumber \end{align*}$

Proof. The statement is similar to the statement of proposition 14, so it should not be surprising that this also follows from a Bernstein inequality, this time for matrices [22, theorem 1.4]: suppose $X_1, \dots, X_n$ are independent random symmetric $d \times d$ matrices, $\newcommand{\m}{\mathcal} \mathcal{E}(X_i) = 0$ , $\newcommand{\lmax}{\lambda_{{\rm max}}} \lmax(X_i) \leqslant b$ , the 'matrix variance' of each X_i is given by $\newcommand{\m}{\mathcal} \sigma_i^2 = \mathcal{E}(X_i^2)$ , and $X = \sum X_i$ (with 'scalar variance' $\newcommand{\norm}[1]{\left\| #1 \right\|} \sigma^2 = \norm{\sum \sigma_i^2}$ ). Then for any t > 0,

$\begin{align} \newcommand{\e}{{\rm e}} \newcommand{\lmax}{\lambda_{{\rm max}}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left( \lmax(X) \geqslant t \right) \leqslant d \exp\left( -\frac{t^2}{2 \sigma^2 \left( 1 + \frac{b t}{3 \sigma^2} \right)} \right). \label{eq:matrix bernstein} \nonumber \end{align} \tag{ 5 }$

We will set $\newcommand{\xxt}{\hat{x}_i \hat{x}_i^T} X_i = \omega_i\left(\xxt - \frac{1}{d} I_d\right)$ . These are clearly symmetric $d \times d$ matrices and it is a standard computation to show that $\newcommand{\m}{\mathcal} \mathcal{E}(X_i) = 0$ .

We now prove that $\newcommand{\lmax}{\lambda_{{\rm max}}} \lmax(X_i) \leqslant \Omega \frac{d-1}{d}$ . For any matrix A, the eigenvalues of A + kI_d are simply k added to the eigenvalues of A (see [12, theorem 2.4.8.1]). So

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\xxt}{\hat{x}_i \hat{x}_i^T} \newcommand{\lmax}{\lambda_{{\rm max}}} \displaystyle \lmax(X_i) = \omega_i \left( \lmax(\xxt) - \frac{1}{d} \right) = \omega_i \frac{d-1}{d} \leqslant \Omega \frac{d-1}{d} \nonumber \end{align*}$

since the largest eigenvalue of a projection matrix like $\newcommand{\xxt}{\hat{x}_i \hat{x}_i^T} \xxt$ is 1.

Next, we want to show that $\newcommand{\m}{\mathcal} \sigma_i^2 = \mathcal{E}(X_i^2) = \omega_i^2 \frac{d-1}{d^2} I_d$ . A direct computation reveals

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\xxt}{\hat{x}_i \hat{x}_i^T} \displaystyle X_i^2 = \omega_i^2 \left( \left(1 - \frac{2}{d}\right) \xxt + \frac{1}{d^2} I_d \right) \nonumber \end{align*}$

and the result follows from our previous computation that $\newcommand{\m}{\mathcal} \newcommand{\xxt}{\hat{x}_i \hat{x}_i^T} \mathcal{E}\left(\xxt\right) = \frac{1}{d}I_d$ . Summing the $\sigma_i^2$ and taking the operator norm, we get

$\begin{align*} \newcommand{\e}{{\rm e}} \displaystyle \sigma^2 = \frac{d-1}{d^2} \sum \omega_i^2. \nonumber \end{align*}$

Plugging b and σ into (5) yields a bound on the probability that $\newcommand{\lmax}{\lambda_{{\rm max}}} \lmax(X) > t$ or, since $\newcommand{\lmax}{\lambda_{{\rm max}}} \newcommand{\wxxt}{\omega_i \hat{x}_i \hat{x}_i^T} {\lmax(X) = \lmax(\sum \wxxt) - \frac{1}{d}}$ , that $\newcommand{\lmax}{\lambda_{{\rm max}}} \newcommand{\wxxt}{\omega_i \hat{x}_i \hat{x}_i^T} \lmax(\sum \wxxt) > \frac{1}{d} + t$ . This completes the proof. □

We note that this concentration inequality is better than proposition 14: there is an extra factor of d in the numerator which means that the concentration gets faster as d increases. The effect of variable edgelengths is to slow (or stop) the concentration, just as in proposition 14; the same comments on the role of Ω and $\newcommand{\Var}{{\rm Var\,}} \Var \omega_i$ apply here.

4.5. A bound on the change in the radial second derivative

For any point $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \vec {s} \in \R^{d}$ , the second derivative of $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx$ along the ray through $\vec {s}$ is given by evaluating the Hessian as a quadratic form on the vector $\vec {s}$ itself. Our last proposition gave us a lower bound on the result at the origin; we now show that this can not change too fast as we move away from the origin.

Proposition 16. For $\newcommand{\norm}[1]{\left\| #1 \right\|} \norm{\vec {s}} < 1$ we have

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\HAdx}{\mathcal{H}\Adx} \newcommand{\HAd}{\mathcal{H}\Ad} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Ad}{{\rm Ad}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle \frac{\left\langle \HAdx(\vec {s}) \vec {s}, \vec {s} \right\rangle}{\left\langle\vec {s},\vec {s}\right\rangle} - \frac{\left\langle \HAdx(0) \vec {s},\vec {s} \right\rangle}{\left\langle \vec {s}, \vec {s}\right\rangle} \geqslant -\norm{\vec {s}} \frac{6 + \norm{\vec {s}} + \norm{\vec {s}}^2}{(1 - \norm{\vec {s}})^3}. \nonumber \end{align*}$

Since the fraction at right is increasing in $\newcommand{\norm}[1]{\left\| #1 \right\|} \norm{\vec {s}}$ , we can easily simplify the statement given a better upper bound on $\newcommand{\norm}[1]{\left\| #1 \right\|} \norm{\vec {s}}$ . In particular, for $\newcommand{\norm}[1]{\left\| #1 \right\|} \norm{\vec {s}} < \frac{1}{50}$ , the right-hand side $\newcommand{\norm}[1]{\left\| #1 \right\|} \geqslant - 7 \norm{\vec {s}}$ .

Proof. Using lemma 5, we see that

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\HAdx}{\mathcal{H}\Adx} \newcommand{\HAd}{\mathcal{H}\Ad} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Ad}{{\rm Ad}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle &\left\langle \HAdx(\vec {s}) \vec {s}, \vec {s} \right\rangle - \left\langle \HAdx(0) \vec {s},\vec {s} \right\rangle \nonumber \\ &\qquad \qquad \qquad =\left(\!\left( \sum \frac{\omega_i}{\norm{\hat{x}_i - \vec {s}}}\right) - 1\right) \left\langle \vec {s}, \vec {s} \right\rangle - \sum \omega_i \left( \frac{\left\langle\hat{x}_i - \vec {s},\vec {s}\right\rangle^2}{\norm{\hat{x}_i-\vec {s}}^3} - \left\langle \hat{x}_i, \vec {s}\right\rangle^2 \right). \nonumber \end{align*}$

Using the estimates $\newcommand{\norm}[1]{\left\| #1 \right\|} 1 - \norm{\vec {s}} \leqslant \norm{\hat{x}_i - \vec {s}} \leqslant 1 + \norm{\vec {s}}$ and recalling that $\sum \omega_i = 1$ , we can underestimate the right hand side by

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle -\frac{\norm{\vec {s}}^3}{1 + \norm{\vec {s}}} - \sum \omega_i \left( \frac{\left\langle\hat{x}_i - \vec {s},\vec {s}\right\rangle^2}{(1 - \norm{\vec {s}})^3} - \left\langle \hat{x}_i, \vec {s}\right\rangle^2 \right) \geqslant -\frac{\norm{\vec {s}}^3}{1 + \norm{\vec {s}}} - \frac{5 \norm{\vec {s}}^3 + \norm{\vec {s}}^4 + \norm{\vec {s}}^5}{(1 - \norm{\vec {s}})^3} \nonumber \end{align*}$

where the second part follows from finding a common denominator, expanding, and cancelling, using Cauchy–Schwartz carefully to underestimate the inner product terms as needed. Observing that $\newcommand{\norm}[1]{\left\| #1 \right\|} 1 + \norm{\vec {s}} > 1 > (1 - \norm{\vec {s}}){}^3$ allows us to underestimate $\newcommand{\norm}[1]{\left\| #1 \right\|} -\frac{\norm{\vec {s}}^3} {1 + \norm{\vec {s}}} \geqslant -\frac{\norm{\vec {s}}^3}{(1 - \norm{\vec {s}}){}^3}$ , completing the proof. □

4.6. Bounding the norm of the geometric median

We are now in a position to bound the norm of the geometric median! This will proceed in two stages: first, we will use the Poincaré–Hopf index theorem to show that $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm} < \frac{1}{50}$ under certain hypotheses. Then we can immediately bootstrap to get a sharper bound.

Proposition 17. If $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gradAdx}{\nabla\!\Adx} \frac{5}{1000} > ||\grad \Ad_x(\vec {0}) = \norm{\sum \omega_i \hat{x}_i}$ , $\newcommand{\m}{\mathcal} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\HAd}{\mathcal{H}\Ad} \newcommand{\HAdx}{\mathcal{H}\Adx} \newcommand{\lmin}{\lambda_{{\rm min}}} \frac{d-1}{d} - \frac{1}{100} < \lambda_\text{min}(\mathcal{H} \Ad_x(\vec {0}))$ , and $d \geqslant 2$ , then $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \frac{1}{50} \geqslant \norm{\vec {\mu}}$ .

Proof. Given our hypothesis on $\newcommand{\lmin}{\lambda_{{\rm min}}} \lmin$ of the Hessian, we know that the $\hat{x}_i$ are not all colinear. This means that $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ is the unique point inside S^d−1 where the vector field $\newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gradAdx}{\nabla\!\Adx} \gradAdx$ vanishes. We will now show that $\newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Adxy}{\Adx(\vec {y})} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gradAdx}{\nabla\!\Adx} \newcommand{\gradAdxy}{\nabla\!\Adxy} \gradAdxy$ has a zero inside the sphere of radius $\frac{1}{50}$ ; by uniqueness, this point must be the geometric median.

Along any ray from the origin, we may restrict $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx$ to a scalar function $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx(z)$ . Using proposition 16, on the interval $[0, \frac{1}{50}]$ our hypotheses imply

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Ad}{{\rm Ad}} \newcommand{\m}{\mathcal} \displaystyle \Adx'(0) \geqslant -\frac{5}{1000} \quad {{\rm and}} \quad \Adx''(z) \geqslant \frac{1}{2} - \frac{1}{100} - \frac{7}{50} = \frac{7}{20}. \nonumber \end{align*}$

By Taylor's theorem, there is some $z_* \in [0, \frac{1}{50}]$ so that

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Ad}{{\rm Ad}} \displaystyle \Adx'(\frac{1}{50}) = \Adx'(0) + \frac{1}{50} \Adx''(z_*) \geqslant -\frac{5}{1000} + \frac{1}{50}\cdot \frac{7}{20} = \frac{2}{1000} > 0. \nonumber \end{align*}$

This means that the directional derivative of $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx$ in the outward direction is positive on the boundary of the sphere of radius $\frac{1}{50}$ , or that $\newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Adxy}{\Adx(\vec {y})} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gradAdx}{\nabla\!\Adx} \newcommand{\gradAdxy}{\nabla\!\Adxy} \gradAdxy$ points outward on this sphere. In particular, this implies that the vector field has index 1 on the sphere, and so by the Poincaré–Hopf index theorem must vanish at some point inside the sphere. □

We can now prove our main theorem.

Theorem 18. If we have n points $\hat{x}_i$ sampled uniformly on S^d−1 ( $d \geqslant 2$ ), n weights $\omega_i > 0$ so that $\sum \omega_i = 1$ , and $\newcommand{\m}{\mathcal} \max \omega_i = \Omega$ , then for any $t < \frac{5}{1000}$ we have

$\begin{align} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var\,}} \newcommand{\gm}{\vec {\mu}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle \mathcal{P}\left( \norm{\gm} < \frac{t}{\frac{d-1}{d} - \frac{3}{20}} \right) \geqslant 1 - 2 d \exp\left(- \frac{3 n t^2}{2 n t \Omega \sqrt{d} + 6 (1 + n^2 \Var \omega_i)} \right). \label{eq:main bound} \nonumber \end{align} \tag{ 6 }$

If all the $\omega_i$ are equal (the polygon is equilateral)

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\gm}{\vec {\mu}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle \mathcal{P}\left( \norm{\gm} < \frac{t}{\frac{d-1}{d} - \frac{3}{20}} \right) \geqslant 1 - 2 d \exp\left( - \frac{3 n t^2}{2 \sqrt{d} t + 6} \right). \nonumber \end{align*}$

For d = 3, we have the further simplification

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\gm}{\vec {\mu}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle \mathcal{P}\left( \norm{\gm} < t \right) \geqslant 1 - 6 \exp\left( -\frac{n t^2}{9} \right). \nonumber \end{align*}$

Proof. We first define two random events: $\newcommand{\m}{\mathcal} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\HAd}{\mathcal{H}\Ad} \newcommand{\HAdx}{\mathcal{H}\Adx} \newcommand{\lmin}{\lambda_{{\rm min}}} \lmin(\HAdx(\vec {0})) > \frac{d-1}{d} - \frac{1}{100}$ (event A) and $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gradAdx}{\nabla\!\Adx} \norm{\gradAdx(\vec {0})} < t < \frac{5}{1000}$ (event B), which will happen for some choices of $\hat{x}_i$ . Suppose both events occur.

As in proposition 17, we restrict $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx$ to a scalar function $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx(z)$ on a ray; this time, the ray is assumed to pass through $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \gm$ . By Taylor's theorem, if we evaluate at $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} z = \norm{\gm}$ , there is some $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} 0\leqslant z_* \leqslant \norm{\gm}$ so that

$\begin{align} \newcommand{\e}{{\rm e}} \newcommand{\gm}{\vec {\mu}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Ad}{{\rm Ad}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle 0 = \Adx'(\norm{\gm}) = \Adx'(0) + \norm{\gm} \Adx''(z_*). \label{eq:taylor theorem setup} \nonumber \end{align} \tag{ 7 }$

Since we are assuming $A \land B$ , the hypotheses of proposition 17 are satisfied and $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm} < \frac{1}{50}$ . In turn, this means that proposition 16 holds at z_*, and

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\lmin}{\lambda_{{\rm min}}} \newcommand{\HAdx}{\mathcal{H}\Adx} \newcommand{\HAd}{\mathcal{H}\Ad} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\Ad}{{\rm Ad}} \newcommand{\m}{\mathcal} \displaystyle \Adx''(z_*) \geqslant \Adx''(0) - \frac{7}{50} \geqslant \lmin(\HAdx(\vec {0})) - \frac{7}{50}. \nonumber \end{align*}$

Since A, we have $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx''(z_*) > \frac{d-1}{d} - \frac{3}{20}$ . As before, since $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx'(\vec {0})$ is a directional derivative, it satisfies $\newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gradAdx}{\nabla\!\Adx} \Adx'(0) \geqslant -\|\gradAdx(\vec {0})\| > -t$ . We can plug both estimates into (7) and solve for $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm}$ , obtaining

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\gm}{\vec {\mu}} \newcommand{\m}{\mathcal} \newcommand{\norm}[1]{\left\| #1 \right\|} \displaystyle \norm{\gm} < \frac{t}{\frac{d-1}{d} - \frac{3}{20}}. \nonumber \end{align*}$

If we call this event C, we have shown that $\newcommand{\im}{{\rm Im}}A \land B \implies C$ , and hence that $\newcommand{\m}{\mathcal} \mathcal{P}(C) \geqslant \mathcal{P}(A \land B)$ . This means that

$\begin{align} \newcommand{\e}{{\rm e}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}(\lnot C) \leqslant \mathcal{P}(\lnot(A \land B)) = \mathcal{P}(\lnot A \lor \lnot B) \leqslant \mathcal{P}(\lnot A) + \mathcal{P}(\lnot B). \label{eq:logic} \nonumber \end{align} \tag{ 8 }$

Now $\newcommand{\m}{\mathcal} \mathcal{P}(\lnot A)$ was bounded above in proposition 15, while $\newcommand{\m}{\mathcal} \mathcal{P}(\lnot B)$ was bounded above in proposition 14. We now compare these upper bounds, noting that we have chosen $t_* = \frac{1}{100}$ in the statement of proposition 15 while the t in proposition 14 is smaller—less than $\frac{5}{1000} = \frac{1}{200}$ . The bounds are

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var\,}} \newcommand{\m}{\mathcal} \displaystyle d \exp\left( -\frac{d}{d-1} \cdot \frac{3 d n t_*^2}{2 n t_* \Omega d +6(1 + n^2 \Var \omega_i)} \right) \quad {{\rm and}}\quad d \exp\left(- \frac{3 n t^2}{2 n t \Omega \sqrt{d} + 6 (1 + n^2 \Var \omega_i)} \right). \nonumber \end{align*}$

Of course, it suffices to compare the absolute values of the fractions inside the exponential functions (since both are negative). We can simplify the comparison by rewriting these as

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var\,}} \newcommand{\m}{\mathcal} \displaystyle \frac{d}{d-1} \cdot \frac{3 n t_*}{2 n \Omega + \frac{6(1 + n^2 \Var \omega_i)}{d t_*}} \quad {{\rm and}}\quad \frac{3 n t}{2 n \Omega \sqrt{d} + \frac{6 (1 + n^2 \Var \omega_i)}{t}}. \nonumber \end{align*}$

It is now evident that if we compare the right fraction with the second fraction on the left, the numerator on the right is smaller and each term in the denominator is larger (recall t < t_*). Multiplying by $\frac{d}{d-1} > 1$ makes the left hand side even larger. Restoring the minus sign reverses this conclusion, and we see that our bound on $\newcommand{\m}{\mathcal} \mathcal{P}(\lnot B)$ is larger than our bound on $\newcommand{\m}{\mathcal} \mathcal{P}(\lnot A)$ , as claimed. Returning this conclusion to (8), we see $\newcommand{\m}{\mathcal} \mathcal{P}(\lnot C) \leqslant 2 \, \mathcal{P}(\lnot B)$ , which is the first statement of the theorem.

The simplification when all the $\omega_i = \frac{1}{n}$ is an immediate consequence. To simplify to d = 3, we observe that $\frac{t}{\frac{2}{3} - \frac{3}{20}} = \frac{60}{31} t$ ; substituting $t \rightarrow \frac{31}{60} t$ on the right hand side yields an expression in the form $\newcommand{\e}{{\rm e}} 1 - 6 \exp(-f(t) n t^2)$ , where $f(t)$ is a rational function bounded below by $\frac{1}{9}$ for $t \in [0, \frac{5}{1000}]$ . □

We now make a few remarks. First, if you carefully examine proposition 16, the lower bound on $\newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \Adx''(z)$ improves as $z \rightarrow 0$ . One can wring some extra information out of this, but the improvement in the final bound is minimal. Similarly, it is clear that one could set $t_* < \frac{1}{100}$ in our bound on $\newcommand{\m}{\mathcal} \mathcal{P}(\lnot A)$ without losing the conclusion, as long as t_* < t. Again, this does not significantly improve things.

5. Distances and angles

We now want to restate our main theorem 18 in terms of the chordal and max-angular distance from a random arm to the nearest closed polygon using proposition 12.

Corollary 19. If we have n points $\hat{x}_i$ sampled uniformly on S^d−1 ( $d \geqslant 2$ ), n weights $\omega_i > 0$ so that $\sum \omega_i = 1$ , and $\newcommand{\m}{\mathcal} \max \omega_i = \Omega$ , then for any $t < \frac{5}{1000} \cdot \frac{1}{\sqrt{2\sum\omega_i^2}}$ we have

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var \,}} \newcommand{\Pol}{{\rm Pol}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left(d_{\rm chordal}(\pmb{x},\Pol(n,d,w)) < \frac{t}{\frac{d-1}{d} - \frac{3}{20}}\right) \geqslant 1 - 2 d \exp\left( \frac{-3 t^2}{3 + \Omega \,t \sqrt{\frac{2 d n}{1 + n^2 \Var \omega_i}}} \right). \nonumber \end{align*}$

If all the $\omega_i$ are equal (the polygon is equilateral), for $t < \frac{5}{1000} \cdot \sqrt{\frac{n}{2}}$ we have

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Pol}{{\rm Pol}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left(d_{\rm chordal}(\pmb{x},\Pol(n,d,1)) < \frac{t}{\frac{d-1}{d} - \frac{3}{20}}\right) \geqslant 1 - 2 d \exp\left( \frac{-t^2}{1 + \frac{\sqrt{d}}{600}} \right). \nonumber \end{align*}$

In dimension 3, this simplifies (again, for $t < \frac{5}{1000} \cdot \sqrt{\frac{n}{2}}$ ), as

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Pol}{{\rm Pol}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left(d_{\rm chordal}(\pmb{x},\Pol(n,3,1)) < t \right) \geqslant 1 - 6 \exp\left( \frac{-t^2}{4} \right). \nonumber \end{align*}$

The problem with corollary 19 is that the hypotheses (on t) are disappointingly restrictive: for $\newcommand{\Arm}{{\rm Arm}} \Arm(n, 3, 1)$ , we need $n > 538\, 519$ to extend the domain of t to the point where the right-hand side becomes positive! On the other hand, numerical experiments (figure 3) comparing our bounds to experimental data and to the large-n Nakagami distribution proved in proposition 10 show that the conclusions of corollary 19 cannot be made much stronger. Further, these experiments suggest that, at least in the equilateral case, one should be able to entirely remove the upper bound on t—we leave this as

**Figure 3.** For $d=2, 3, 4, 10$ , we generated 250 000 random elements of $\newcommand{\Arm}{{\rm Arm}} \Arm(10, d, 1)$ . The plots show the implied bound from corollary 19 (solid), the empirical CDF of chordal distance to closure for those samples which were median-closeable (dots), and the CDF of the Nakagami $\left(\frac{d}{2}, \frac{d}{d-1}\right)$ distribution (dashed) given by proposition 10 for the large-n limit (which is only slightly different, even though n = 10 is quite small). Though the hypotheses of corollary 19 are only satisfied when $t<\frac{5}{1000}\sqrt{5}\approx 0.011\, 18$ , the data strongly suggests that the bound is valid on a much larger range. We see from the plots that the bound cannot be dramatically improved.
Download figure:
Standard image High-resolution image

Conjecture 20. The conclusions of corollary 19 hold for any t > 0.

We now proceed to prove corollary 19.

Proof of corollary 19. Proposition 12 tells us $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\Pol}{{\rm Pol}} \newcommand{\gm}{\vec {\mu}} d_{\rm chordal}(\pmb{x}, \Pol(n, d, w))<\sqrt{2 \sum \omega_i^2} \norm{\gm}$ , so to get a bound on the probability that $\newcommand{\Pol}{{\rm Pol}} d_{\rm chordal}(\pmb{x}, \Pol(n, d, w)) < \frac{t}{\frac{d-1}{d} - \frac{3}{20}}$ we need to make the substitution $t \rightarrow t \sqrt{2 \sum \omega_i^2}$ on the right hand side of (6). Recalling that lemma 13 shows $\newcommand{\Var}{{\rm Var\,}} \sum \omega_i^2 = \frac{1 + n^2 \Var \omega_i}{n}$ and carefully simplifying yields the first result.

For the second result, it follows immediately from the assumption that $\omega_i = \frac{1}{n}$ that the first result simplifies to

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Pol}{{\rm Pol}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left(d_{\rm chordal}(\pmb{x},\Pol(n,d,1)) < \frac{t}{\frac{d-1}{d} - \frac{3}{20}}\right) \geqslant 1 - 2 d \exp\left( \frac{-3 t^2}{3 + t\sqrt{\frac{2 d}{n}}} \right). \nonumber \end{align*}$

Using our upper bound on t, we see that the right hand side obeys

$\begin{align*} \newcommand{\e}{{\rm e}} \displaystyle 1 - 2 d \exp\left( \frac{-3 t^2}{3 + t\sqrt{\frac{2 d}{n}}} \right) > 1 - 2 d \exp\left( \frac{-3 t^2}{3 + \frac{\sqrt{d}}{200}} \right) \nonumber \end{align*}$

which immediately implies the second result.

For the third result, we simplify the fraction on the left hand side and substitute $t \rightarrow \frac{31}{60} t$ as we did above in the simplification of theorem 18; the complicated constant that results as the coefficient of t² in the exponent is slightly less than $-\frac{1}{4}$ . □

The statements for the maximum angular change in edge direction are similar, but somewhat easier to prove because the relationship between $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm}$ and the max-angular distance is simpler.

Corollary 21. If we have n points $\hat{x}_i$ sampled uniformly on S^d−1 ( $d \geqslant 2$ ), n weights $\omega_i > 0$ so that $\sum \omega_i = 1$ , and $\newcommand{\m}{\mathcal} \max \omega_i = \Omega$ , then for any $t < \frac{5}{1000}$ we have

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Var}{{\rm Var\,}} \newcommand{\Pol}{{\rm Pol}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left(d_{{\rm max}\mbox{-}{\rm angular}}(\pmb{x},\Pol(n,d,w)) < \frac{t}{\frac{d-1}{d} - \frac{3}{20}}\right) \geqslant 1 - 2 d \exp\left(- \frac{13 n t^2}{9 n t \, \Omega \sqrt{d} + 30 (1 + n^2 \Var \omega_i)} \right). \nonumber \end{align*}$

If all the $\omega_i$ are equal (the polygon is equilateral), for $t < \frac{5}{1000}$ we have

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Pol}{{\rm Pol}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left(d_{{\rm max}\mbox{-}{\rm angular}}(\pmb{x},\Pol(n,d,1)) < \frac{t}{\frac{d-1}{d} - \frac{3}{20}}\right) \geqslant 1 - 2 d \exp\left( \frac{-26 n t^2}{60 + \frac{9\sqrt{d}}{100}} \right). \nonumber \end{align*}$

In dimension 3, this simplifies (again, for $t < \frac{5}{1000}$ ), as

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Pol}{{\rm Pol}} \newcommand{\m}{\mathcal} \displaystyle \mathcal{P}\left(d_{{\rm max}\mbox{-}{\rm angular}}(\pmb{x},\Pol(n,3,1)) < t \right) \geqslant 1 - 6 \exp\left( -\frac{n t^2}{9} \right). \nonumber \end{align*}$

Proof. We know from proposition 12 that $\newcommand{\arc}[1]{\gamma_{#1}} \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\Pol}{{\rm Pol}} \newcommand{\gm}{\vec {\mu}} d_{{\rm max}\mbox{-}{\rm angular}}(\pmb{x}, \Pol(n, d, w)) < \arcsin \norm{\gm}$ . Since we're only going to apply this bound when $\newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm} < \frac{t}{\frac{d-1}{d} - \frac{3}{20}} < \frac{1}{70}$ (since $d \geqslant 2$ and $t \leqslant \frac{5}{1000}$ ), we can safely make the overestimate $\newcommand{\arc}[1]{\gamma_{#1}} \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \arcsin \norm{\gm} \leqslant \frac{14}{13} \norm{\gm}$ .

Substituting $t \rightarrow \frac{13}{14}t$ in (6) leads us to replace 3 by $3 (\frac{13}{14}){}^2 < 2.6$ in the coefficient of nt² in the numerator and 2 by $2 (\frac{13}{14}) > 1.8$ in the coefficient of t in the denominator. Simplifying gives us the first statement.

To reach the second statement, we first observe that $\omega_i = \frac{1}{n}$ means $\newcommand{\Var}{{\rm Var\,}} \Var \omega_i = 0$ and $\Omega = \frac{1}{n}$ . Substituting these into the first statement (and overestimating the t in the denominator by $\frac{5}{1000}$ ) yields the result.

Finally, the third statement (as in the proof of corollary 19) requires us to substitute $t \rightarrow \frac{31}{60} t$ to simplify the left-hand side. The resulting complicated coefficient of nt² on the right-hand side is about $-0.115\, 521 < -\frac{1}{9}$ . □

6. Discussion

From corollary 21 we see that closing a random arm is unlikely to change any edge very much. In particular, we should expect local features to be preserved by closure, as in the case of the local trefoil knot shown in figure 4. This suggests that closing up an arm is unlikely to destroy any local knots: in other words, the probability of local knotting in the standard measure on $\newcommand{\Arm}{{\rm Arm}} \Arm(n, 3, w)$ should be essentially the same as the probability of local knotting in the pushforward measure on $\newcommand{\Pol}{{\rm Pol}} \Pol(n, 3, w)$ via the map $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \pmb{x} \mapsto \gmc(\pmb{x})$ .

**Figure 4.** A 10 000 step equilateral arm in $\newcommand{\m}{\mathcal} \mathbb{R}^3$ containing a small trefoil (top left) and its geometric median closure (bottom right). The intermediate images show equally-spaced points along the geodesic between the arm and its closure in $\newcommand{\Arm}{{\rm Arm}} \Arm(10\, 000, 3, 1)$ . The failure to close of the arm is ≈101.118 and the geometric median has norm $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \|\gm\|\approx 0.015\, 1318$ . The chordal distance between the arm and its closure is ≈ $1.236\, 96$ and $\newcommand{\m}{\mathcal} d_{{\rm max}\mbox{-}{\rm angular}}\approx 0.015\, 1324$ , which agrees with the bound $\newcommand{\arc}[1]{\gamma_{#1}} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \arcsin \|\gm\|$ to eleven decimal places.
Download figure:
Standard image High-resolution image

$ \newcommand{\m}{\mathcal} \mathbb{R}^3$ — **Figure 4.** A 10 000 step equilateral arm in $\newcommand{\m}{\mathcal} \mathbb{R}^3$ containing a small trefoil (top left) and its geometric median closure (bottom right). The intermediate images show equally-spaced points along the geodesic between the arm and its closure in $\newcommand{\Arm}{{\rm Arm}} \Arm(10\, 000, 3, 1)$ . The failure to close of the arm is ≈101.118 and the geometric median has norm $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \|\gm\|\approx 0.015\, 1318$ . The chordal distance between the arm and its closure is ≈ $1.236\, 96$ and $\newcommand{\m}{\mathcal} d_{{\rm max}\mbox{-}{\rm angular}}\approx 0.015\, 1324$ , which agrees with the bound $\newcommand{\arc}[1]{\gamma_{#1}} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \arcsin \|\gm\|$ to eleven decimal places.
Download figure:
Standard image High-resolution image

Of course, this map is not defined on all of $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ , but we know from theorem 18 that it is defined on all but an exponentially small fraction of $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)=\prod S^{d-1}(w_i)$ ; pushing forward the restriction of the product measure to the domain of $\newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \newcommand{\gmc}{{\rm gmc}} \gmc$ produces what we will call the pushforward measure on $\newcommand{\Pol}{{\rm Pol}} \Pol(n, d, w)$ . On the other hand, the standard probability measure on $\newcommand{\Pol}{{\rm Pol}} \Pol(n, d, w)$ is simply the volume measure induced by the Riemannian metric it inherits from $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ . Since we have seen in corollaries 19 and 21 that almost all of $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ is within a fixed distance of $\newcommand{\Pol}{{\rm Pol}} \Pol(n, d, w)$ , it is reasonable to expect that this pushforward measure is close to the standard measure.

Indeed, this seems to be true. Rayleigh [20] showed that the distribution of end-to-end distances in a random element of $\newcommand{\Arm}{{\rm Arm}} \Arm(n, 3, 1)$ is

$\begin{align*} \newcommand{\e}{{\rm e}} \displaystyle \Phi_n(\ell) = \frac{1}{2 \pi^2 \ell} \int_0^\infty x \sin (\ell x)\, {\rm sinc}^n x\, {\rm d}x. \nonumber \end{align*}$

We note that a closed form for $\Phi_n$ is classical (see [14, 2.181]). Since a random closed polygon is formed from two random arms, conditioned on the hypothesis that their end-to-end distances are the same, the pdf of the length of the chord connecting vertices 0 and k in an polygon of n edges turns out to be given by

$\begin{align*} \newcommand{\e}{{\rm e}} \displaystyle {\rm Chord}_{n,k}(\ell) = \frac{1}{C(n)} 4 \pi \ell^2 \Phi_k(\ell) \Phi_{n-k}(\ell) \nonumber \end{align*}$

where the factor of $\newcommand{\e}{{\rm e}} 4 \pi \ell^2$ comes from the fact that vertex k lies on a sphere of radius $\newcommand{\e}{{\rm e}} \ell$ and $C(n)$ is the volume of polygon space (which is known; see [2] for an identification between polygon space and a certain polytope which yields an explicit, though complicated, formula for $C(n)$ ).

Therefore, the extent to which the distributions of the chordlengths match ${\rm Chord}_{n, k}$ gives a sense of how close a given distribution on $\newcommand{\Pol}{{\rm Pol}} \Pol(n, 3, w)$ is to the standard one. For n = 4 and 5, we can see in figure 5 that the pushforward measure from $\newcommand{\Arm}{{\rm Arm}} \Arm(n, 3, 1)$ is not particularly close to the standard measure. However, as n increases these statistics cannot distinguish between the pushforward measure and the standard measure; see figure 6.

**Figure 5.** For $n=4, 5$ , we generated 1000 000 random equilateral n-edge arms in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^3$ , computed their geometric median closures (when they existed), and then computed the distance from the first to the third vertex in the resulting closed n-gon. The histograms show the resulting distributions of chordlengths as well as the density of the chordlength for the standard distribution on $\newcommand{\Pol}{{\rm Pol}} \Pol(n, 3, 1)$ . Closure failed for 2474 quadrilaterals and for 117 pentagons.
Download figure:
Standard image High-resolution image

**Figure 5.** For $n=4, 5$ , we generated 1000 000 random equilateral n-edge arms in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^3$ , computed their geometric median closures (when they existed), and then computed the distance from the first to the third vertex in the resulting closed n-gon. The histograms show the resulting distributions of chordlengths as well as the density of the chordlength for the standard distribution on $\newcommand{\Pol}{{\rm Pol}} \Pol(n, 3, 1)$ . Closure failed for 2474 quadrilaterals and for 117 pentagons.
Download figure:
Standard image High-resolution image

**Figure 6.** We generated 1000 000 random equilateral 10-edge arms in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^3$ . All 1000 000 had geometric median closures, and these are the histograms of distances from the first vertex to the ith vertex in the resulting closed 10-gons, along with the chordlength densities for the standard measure on equilateral 10-gons.
Download figure:
Standard image High-resolution image

$ \newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^3$ — **Figure 6.** We generated 1000 000 random equilateral 10-edge arms in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^3$ . All 1000 000 had geometric median closures, and these are the histograms of distances from the first vertex to the ith vertex in the resulting closed 10-gons, along with the chordlength densities for the standard measure on equilateral 10-gons.
Download figure:
Standard image High-resolution image

Conjecture 22. As $n \to \infty$ , the pushforward measure from $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ to $\newcommand{\Pol}{{\rm Pol}} \Pol(n, d, w)$ converges to the standard measure.

Assuming the truth of this conjecture implies that, at least for large n, random elements of $\newcommand{\Pol}{{\rm Pol}} \Pol(n, d, w)$ look essentially like geometric median closures of random elements of $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ . Since corollary 21 implies that individual edges are practically unchanged by closure, this would mean that all local phenomena happen at essentially the same rate in $\newcommand{\Arm}{{\rm Arm}} \Arm(n, d, w)$ and $\newcommand{\Pol}{{\rm Pol}} \Pol(n, d, w)$ .

When d = 3, a particularly important local phenomenon is that of local knotting. Say that a subsegment $\varsigma$ of $\newcommand{\Arm}{{\rm Arm}} \pmb{x} \in \Arm(n, 3, w)$ is an r-local knot if it only intersects the boundary of a ball B of radius r at its endpoints and $(B, \varsigma)$ forms a knotted ball-arc pair. Let $K^{\rm Arm}(n, w, k, r)$ be the probability that a length-k arc of a random element of $\newcommand{\Arm}{{\rm Arm}} \Arm(n, 3, w)$ is an r-local knot, and similarly for $K^{\rm Pol}(n, w, k, r)$ .

Conjecture 23. For small r and large n and for $k \ll n$ , $K^{\rm Arm}(n, w, k, r)\simeq K^{\rm Pol}(n, w, k, r)$ .

Acknowledgments

This paper is a contribution to the Festschrift for Stu Whittington, a giant in the area of random polymers and random knots. We are indebted to Stu for years of insightful talks, perceptive questions, and remarkable mathematical results. His interest, enthusiasm, and explanation of the importance of these questions to the polymer science community have shaped our mathematical trajectory more than we can say.

We are also grateful for the continued support of the Simons Foundation (#524120 to Cantarella, #354225 to Shonkwiler), the German Research Foundation (DFG-Grant RE 3930/1–1, to Reiter), and the organizers of the 'Workshop on Topological Knots and Polymers' at Ochanomizu University, where key steps in the present work came together. In particular, we are indebted to Cristian Micheletti, Tetsuo Deguchi, Rob Kusner (for reducing everything to conformal geometry yet again!), Eric Rawdon, and Erik Schreyer for many helpful conversations. As always, we look to Yuanan Diao for inspiration—one of the motivations for this paper was the desire to find an alternate proof of [4].

Appendix. Proof of the hypergeometric formula for the expected distance to the sphere

Recall that $\newcommand{\Ed}{{\rm Ed}} \Ed(\vec {y})$ is the expected distance from the point $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \vec {y} \in \R^d$ to the unit sphere. Since it is spherically symmetric, $\newcommand{\Ed}{{\rm Ed}} \Ed(\vec {y})$ depends only on $\|\vec {y}\|$ .

Proposition A.1. $\newcommand{\Ed}{{\rm Ed}} \Ed(\vec {y})$ is given as a function of $r = \|\vec {y}\|$ by

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Ed}{{\rm Ed}} \displaystyle \Ed(r) = \, _2F_1 \left( -\frac{1}{2}, \frac{1-d}{2}; \frac{d}{2}; r^2 \right). \nonumber \end{align*}$

We can compute $\newcommand{\Ed}{{\rm Ed}} \operatorname{Ed}(r) := \operatorname{Ed}((0,\dots,0,r))$ by evaluating at $\vec {y} = (0, \dots, 0, r)$ :

$\begin{align} \newcommand{\e}{{\rm e}} \newcommand{\Vol}{{\rm Vol}} \newcommand{\dVol}{\thinspace{\rm dVol}} \newcommand{\Ed}{{\rm Ed}} \displaystyle \label{eq:ed simplification 1} \Ed(r) &=\frac{1}{\Vol \,S^{d-1}} \int_{\hat{x} \in S^{d-1}} \|\hat{x} - (0, \dots, 0, r)\| \dVol_{S^{d-1}} \nonumber \\ &= \frac{1}{\Vol \, S^{d-1}} \int_{\hat{x} \in S^{d-1}} \sqrt{1+r^2-2x_d r} \dVol_{S^{d-1}},\nonumber \end{align} \tag{ A.1 }$

where the integrand only depends on the last coordinate x_d of the point on S^d−1. Then the formula for $\newcommand{\Ed}{{\rm Ed}} \Ed$ will follow from a more general formula for functions on the sphere which only depend on a single coordinate:

Lemma A.2. Suppose $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \phi:S^{d-1} \to \R$ depends only on x_i; i.e. $\phi(\hat{x}) = \phi(x_i)$ . Then

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Vol}{{\rm Vol}} \newcommand{\dVol}{\thinspace{\rm dVol}} \newcommand{\dx}{\,\mathrm{d}x} \newcommand{\m}{\mathcal} \displaystyle \int_{\hat{x} \in S^{d-1}} \phi(\hat{x}) \dVol_{S^{d-1}} = \Vol \, S^{d-2} \int_{-1}^1 \phi(x_i)(1-x_i^2)^{\frac{d-3}{2}} \dx_i. \nonumber \end{align*}$

Proof. Let $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \pi_i: S^{d-1} \to \R$ be projection to the ith coordinate. Then the projection of $\nabla \pi_i$ to the tangent space of S^d−1 has norm $\sqrt{1-x_i^2}$ , and hence the smooth coarea formula implies that

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\dArea}{\thinspace{\rm dArea}} \newcommand{\dVol}{\thinspace{\rm dVol}} \newcommand{\dx}{\,\mathrm{d}x} \newcommand{\m}{\mathcal} \displaystyle \int_{\hat{x} \in S^{d-1}} \phi(\hat{x}) \dVol_{S^{d-1}} = \int_{x_i \in [-1,1]} \int_{\vec {y} \in \pi_i^{-1}(x_i)} \frac{\phi(\vec {y})}{\sqrt{1-x_i^2}} \dArea_{\pi_i^{-1}(x_i)} \dx_i. \nonumber \end{align*}$

Since $\pi_i^{-1}(x_i)$ is a $(d-2)$ -dimensional sphere of radius $\sqrt{1-x_i^2}$ , which has area form $\newcommand{\dArea}{\thinspace{\rm dArea}} \left(1-x_i^2\right){}^{\frac{d-2}{2}}\dArea_{S^{d-2}}$ , and since ϕ is constant on each level set, the above reduces to

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Vol}{{\rm Vol}} \newcommand{\dx}{\,\mathrm{d}x} \newcommand{\m}{\mathcal} \displaystyle \Vol \, S^{d-2} \int_{-1}^1 \phi(x_i)(1-x_i^2)^{\frac{d-3}{2}} \dx_i \nonumber \end{align*}$

as desired. □

Combining this result with (A.1) shows that

$\begin{align} \newcommand{\e}{{\rm e}} \newcommand{\Vol}{{\rm Vol}} \newcommand{\dx}{\,\mathrm{d}x} \newcommand{\Ed}{{\rm Ed}} \newcommand{\m}{\mathcal} \displaystyle \label{eq:ed simplification 2} \Ed(r) = \frac{\Vol \, S^{d-2}}{\Vol \, S^{d-1}} \int_{-1}^1 (1-x_d^2)^{\frac{d-3}{2}} \sqrt{1+r^2-2x_d r} \dx_d, \nonumber \end{align} \tag{ A.2 }$

which is the starting point of our derivation of the hypergeometric formula.

Proof of proposition A.1. Using $\newcommand{\Vol}{{\rm Vol}} \Vol \,S^k = \frac{2\pi^{\frac{k+1}{2}}}{\Gamma(\frac{k+1}{2})}$ and the gamma function duplication formula ${\Gamma(\zeta)\Gamma(\zeta+\frac{1}{2}) = 2^{1-2\zeta}\sqrt{\pi}\Gamma(2\zeta)}$ , we can write the ratio of sphere volumes as

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Vol}{{\rm Vol}} \displaystyle \frac{\Vol \, S^{d-2}}{\Vol \, S^{d-1}} = \frac{2^{2-d}\Gamma(d-1)}{\Gamma(\frac{d-1}{2})\Gamma(\frac{d-1}{2})}. \nonumber \end{align*}$

Substituting this into (A.2) and completing the square inside the square root yields

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\dx}{\,\mathrm{d}x} \newcommand{\m}{\mathcal} \displaystyle (1+r) \frac{2^{2-d}\Gamma(d-1)}{\Gamma(\frac{d-1}{2})\Gamma(\frac{d-1}{2})} \int_{-1}^1 \left(1-\frac{2r}{(1+r)^2}(1+x_d)\right)^{\frac{1}{2}} (1-x_d)^{\frac{d-3}{2}}(1+x_d)^{\frac{d-3}{2}} \dx_d. \nonumber \end{align*}$

Making the substitution $u=\frac{1+x_d}{2}$ produces

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\dx}{\,\mathrm{d}x} \newcommand{\Ed}{{\rm Ed}} \newcommand{\m}{\mathcal} \displaystyle \Ed(r) = (1+r) \frac{2^{2-d}\Gamma(d-1)}{\Gamma(\frac{d-1}{2})\Gamma(\frac{d-1}{2})} \int_{-1}^1 u^{\frac{d-3}{2}}(1-u)^{\frac{d-3}{2}}\left(1-\frac{4r}{(1+r)^2}u\right)^{\frac{1}{2}} \dx_d, \nonumber \end{align*}$

which is the standard integral representation of $(1+r)\, _2F_1\left(-\frac{1}{2}, \frac{d-1}{2}; d-1; \frac{4r}{(1+r){}^2}\right)$ . In turn, applying Kummer's quadratic transformation [10, 9.134.3] yields the desired formula

$\begin{align*} \newcommand{\e}{{\rm e}} \newcommand{\Ed}{{\rm Ed}} \displaystyle \Ed(r) = \, _2F_1\left(-\frac{1}{2}, \frac{1-d}{2}; \frac{d}{2}; r^2\right). \nonumber \end{align*}$

□

Open and closed random walks with fixed edgelengths in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^d$

Article metrics

Submit

Permissions

Author e-mails

Author affiliations

Author notes

ORCID iDs

Dates

Peer review information

Abstract

1. Introduction

2. Constructing a nearby closed polygon

3. Asymptotics of the geometric median and the distance to closure

4. Concentration inequalities for $\boldsymbol{ \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm}}$ and d_chordal

4.1. A bound connecting ${ \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm}}$ and d_chordal

4.2. Strategy for the tail bound

4.3. A probabilistic bound on ${ \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gradAdx}{\nabla\!\Adx} \|\gradAdx(\vec {0})\| = \norm{\sum \omega_i \hat{x}_i}}$

4.5. A bound on the change in the radial second derivative

4.6. Bounding the norm of the geometric median

5. Distances and angles

6. Discussion

Acknowledgments

Appendix. Proof of the hypergeometric formula for the expected distance to the sphere

Footnotes

Open and closed random walks with fixed edgelengths in \newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^d

Article metrics

Submit

Permissions

Share this article

Author e-mails

Author affiliations

Author notes

ORCID iDs

Dates

Peer review information

Abstract

1. Introduction

2. Constructing a nearby closed polygon

3. Asymptotics of the geometric median and the distance to closure

4. Concentration inequalities for \boldsymbol{ \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm}} and dchordal

4.1. A bound connecting { \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm}} and dchordal

4.2. Strategy for the tail bound

4.3. A probabilistic bound on { \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gradAdx}{\nabla\!\Adx} \|\gradAdx(\vec {0})\| = \norm{\sum \omega_i \hat{x}_i}}

4.5. A bound on the change in the radial second derivative

4.6. Bounding the norm of the geometric median

5. Distances and angles

6. Discussion

Acknowledgments

Appendix. Proof of the hypergeometric formula for the expected distance to the sphere

Footnotes

Open and closed random walks with fixed edgelengths in $\newcommand{\m}{\mathcal} \newcommand{\R}{\mathbb{R}} \R^d$

4. Concentration inequalities for $\boldsymbol{ \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm}}$ and d_chordal

4.1. A bound connecting ${ \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\m}{\mathcal} \newcommand{\gm}{\vec {\mu}} \norm{\gm}}$ and d_chordal

4.3. A probabilistic bound on ${ \newcommand{\norm}[1]{\left\| #1 \right\|} \newcommand{\grad}{\nabla} \newcommand{\Ad}{{\rm Ad}} \newcommand{\Adx}{\Ad_{\pmb{x}}} \newcommand{\gradAd}{\nabla\!\Ad} \newcommand{\gradAdx}{\nabla\!\Adx} \|\gradAdx(\vec {0})\| = \norm{\sum \omega_i \hat{x}_i}}$