
Sliced and Radon Wasserstein Barycenters of Measures


Abstract

This article details two approaches to compute barycenters of measures using 1-D Wasserstein distances along radial projections of the input measures. The first method makes use of the Radon transform of the measures; the second is the solution of a convex optimization problem over the space of measures. We establish several properties of these barycenters and explain their relationship. We present numerical approximation schemes based on a discrete Radon transform and on the resolution of a non-convex optimization problem. We explore the respective merits and drawbacks of each approach on applications to two image processing problems: color transfer and texture mixing.


Notes

  1. https://github.com/gpeyre/2014-JMIV-SlicedTransport

References

  1. Agueh, M., Carlier, G.: Barycenters in the Wasserstein space. SIAM J. Math. Anal. 43(2), 904–924 (2011)

  2. Averbuch, A., Coifman, R., Donoho, D., Israeli, M., Shkolnisky, Y., Sedelnikov, I.: A framework for discrete integral transformations: II. The 2D discrete Radon transform. SIAM J. Sci. Comput. 30(2), 785–803 (2008)

  3. Benamou, J.D., Brenier, Y.: A computational fluid mechanics solution to the Monge–Kantorovich mass transfer problem. Numer. Math. 84(3), 375–393 (2000)

  4. Benamou, J.D., Froese, B.D., Oberman, A.M.: A viscosity solution approach to the Monge–Ampère formulation of the Optimal Transportation Problem. arXiv:1208.4873v2 (2013, unpublished)

  5. Bertsekas, D.: The auction algorithm: a distributed relaxation method for the assignment problem. Ann. Operat. Res. 14, 105–123 (1988)

  6. Bigot, J., Klein, T.: Consistent estimation of a population barycenter in the Wasserstein space. Preprint arXiv:1212.2562v3 (2014)

  7. Boman, J., Lindskog, F.: Support theorems for the Radon transform and Cramér–Wold theorems. J. Theor. Prob. 22(3), 683–710 (2009)

  8. Bonneel, N., van de Panne, M., Paris, S., Heidrich, W.: Displacement interpolation using Lagrangian mass transport. ACM Trans. Graph. (SIGGRAPH ASIA'11) 30(6), 1–12 (2011)

  9. Brady, M.L.: A fast discrete approximation algorithm for the Radon transform. SIAM J. Comput. 27(1), 107–119 (1998)

  10. Cuturi, M., Doucet, A.: Fast computation of Wasserstein barycenters. arXiv:1310.4375v1 (2013, unpublished)

  11. Dellacherie, C., Meyer, P.A.: Probabilities and Potential. Math. Stud. 29. North-Holland, Amsterdam (1978)

  12. Delon, J.: Movie and video scale-time equalization application to flicker reduction. IEEE Trans. Image Process. 15(1), 241–248 (2006)

  13. Desolneux, A., Moisan, L., Ronsin, S.: A compact representation of random phase and Gaussian textures. In: Proc. the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1381–1384 (2012)

  14. Digne, J., Cohen-Steiner, D., Alliez, P., de Goes, F., Desbrun, M.: Feature-preserving surface reconstruction and simplification from defect-laden point sets. J. Math. Imaging Vis. 48(2), 369–382 (2013)

  15. Ferradans, S., Xia, G.S., Peyré, G., Aujol, J.F.: Optimal transport mixing of Gaussian texture models. In: Proc. SSVM'13 (2013)

  16. Galerne, B., Gousseau, Y., Morel, J.M.: Random phase textures: theory and synthesis. IEEE Trans. Image Process. 20(1), 257–267 (2011)

  17. Galerne, B., Lagae, A., Lefebvre, S., Drettakis, G.: Gabor noise by example. ACM Trans. Graph. (Proceedings of ACM SIGGRAPH 2012) 31(4), 73.1–73.9 (2012)

  18. Haker, S., Zhu, L., Tannenbaum, A., Angenent, S.: Optimal mass transport for registration and warping. Int. J. Comput. Vis. 60(3), 225–240 (2004)

  19. Helgason, S.: The Radon Transform. Birkhäuser, Boston (1980)

  20. Kantorovich, L.: On the transfer of masses. Doklady Akademii Nauk 37(2), 227–229 (1942). (in Russian)

  21. Kuhn, H.W.: The Hungarian method of solving the assignment problem. Naval Res. Logist. Quart. 2, 83–97 (1955)

  22. Matusik, W., Zwicker, M., Durand, F.: Texture design using a simplicial complex of morphable textures. ACM Trans. Graph. 24(3), 787–794 (2005)

  23. McCann, R.J.: A convexity principle for interacting gases. Adv. Math. 128(1), 153–179 (1997)

  24. Mérigot, Q.: A multiscale approach to optimal transport. Comput. Graph. Forum 30(5), 1583–1592 (2011)

  25. Papadakis, N., Peyré, G., Oudet, E.: Optimal transport with proximal splitting. SIAM J. Imaging Sci. 7(1), 212–238 (2014)

  26. Pitié, F., Kokaram, A.C., Dahyot, R.: N-Dimensional probability density function transfer and its application to color transfer. In: Proc. Tenth IEEE International Conference on Computer Vision (ICCV 2005), vol. 2, pp. 1434–1439 (2005)

  27. Rabin, J., Delon, J., Gousseau, Y.: Removing artefacts from color and contrast modifications. IEEE Trans. Image Process. 20(11), 3073–3085 (2011)

  28. Rabin, J., Peyré, G., Delon, J., Bernot, M.: Wasserstein barycenter and its application to texture mixing. In: Scale Space and Variational Methods in Computer Vision (SSVM’11), vol. 6667, pp. 435–446 (2011).

  29. Reinhard, E., Pouli, T.: Colour spaces for colour transfer. In: Proceedings of the Third International Conference on Computational Color Imaging (CCIW'11), pp. 1–15. Springer, Berlin (2011)

  30. Rubner, Y., Tomasi, C., Guibas, L.: A metric for distributions with applications to image databases. In: IEEE International Conference on Computer Vision (ICCV’98), pp. 59–66 (1998)

  31. Solodov, M.: Incremental gradient algorithms with stepsizes bounded away from zero. Comput. Optim. Appl. 11(1), 23–35 (1998)

  32. Villani, C.: Topics in Optimal Transportation. Graduate Studies in Mathematics Series. American Mathematical Society (2003)

Acknowledgments

We thank Marco Cuturi for applying his method to our dataset and for sharing his results. We thank Thouis R. Jones for useful feedback on our draft, and anonymous reviewers for their help in improving this paper. We also thank the authors of all the images used to demonstrate our color transfers. This work has been partially supported by NSF CGV-1111415. Gabriel Peyré acknowledges support from the European Research Council (ERC project SIGMA-Vision).

Corresponding author

Correspondence to Gabriel Peyré.

Appendices

Appendix 1: Proofs of Section 2

Proof of Proposition 1

From the definition (5), one verifies that

$$\begin{aligned} \text {W}_{\mathbb {R}^d}(\varphi _{s,u} \sharp \mu _1,\varphi _{s,u} \sharp \mu _2) = s \text {W}_{\mathbb {R}^d}(\mu _1,\mu _2), \end{aligned}$$
(49)

so that

$$\begin{aligned} \fancyscript{E}_{s,u}(\mu )&= \sum _{i \in I} \lambda _i \text {W}_{\mathbb {R}^d}( \varphi _{s,u} \sharp \mu _i, \mu )^2\\&= s^2 \sum _{i \in I} \lambda _i \text {W}_{\mathbb {R}^d}( \mu _i, \varphi _{s,u}^{-1} \sharp \mu )^2 = s^2 \fancyscript{E}_{1,0}(\tilde{\mu }), \end{aligned}$$

where we have introduced the following change of variable

$$\begin{aligned} \mu = \varphi _{s,u} \sharp \tilde{\mu } \quad \Longleftrightarrow \quad \tilde{\mu } = \varphi _{s,u}^{-1} \sharp \mu , \end{aligned}$$

(note that \(\varphi _{s,u}^{-1} = \varphi _{s^{-1},-s^{-1}u}\)). One thus has

$$\begin{aligned} \underset{ \mu }{{{\mathrm{argmin}}}}\; \fancyscript{E}_{s,u}(\mu )&= \varphi _{s,u} \sharp \underset{ \tilde{\mu }}{{{\mathrm{argmin}}}}\; \fancyscript{E}_{1,0}(\tilde{\mu }) \end{aligned}$$

which proves (7). Property (8) is proved similarly. Properties (9) and (11) directly follow from  (8). \(\square \)

Proof of Proposition 2

We aim to determine \((s^\star ,u^\star )\) such that

$$\begin{aligned} \mu ^\star \in \text {Bar}_{\mathbb {R}^d}^W(\mu _i,\lambda _i)_{i \in I} \quad \text {where} \quad \left\{ \begin{array}{l} \mu ^\star = \varphi ^\star \sharp \mu , \\ \mu _i = \varphi _i \sharp \mu , \end{array} \right. \end{aligned}$$

and where for simplicity we have denoted \(\varphi _i = \varphi _{s_i,u_i}\) and \(\varphi ^\star = \varphi _{s^\star ,u^\star }\). First, let us notice that

$$\begin{aligned} \varphi _{s,u}(x) = \nabla \left( \frac{s}{2}|| x+u/s ||^2 \right) , \end{aligned}$$

so that the set \(\fancyscript{T}\) of maps of the form \(\varphi _{s,u}\) is a subset of gradients of convex functions. This point is important since optimal maps between \(\mu _i\) and \(\mu ^\star \) are characterized as gradients of convex functions that push forward \(\mu _i\) onto \(\mu ^\star \); see [32]. Following [1], we thus only need to show that

$$\begin{aligned}&\sum _{i \in I} \lambda _i T_i = \mathrm {Id}_{\mathbb {R}^d} \quad \text {where} \quad T_i = \varphi ^\star \circ \varphi _i^{-1} = \varphi _{ \tilde{s}_i, \tilde{u}_i }\\&\quad \text {where} \quad \left\{ \begin{array}{l} \tilde{s}_i = s^\star s_i^{-1} \\ \tilde{u}_i = u^{\star }-s^\star s_i^{-1} u_i \end{array} \right. \end{aligned}$$

since \(T_i \sharp \mu _i = \mu ^\star \) and each \(T_i \in \fancyscript{T}\) is the gradient of a convex function. It follows that \(\mu ^\star \) is a barycenter if and only if

$$\begin{aligned} \sum _{i \in I} \lambda _i T_i&= \sum _{i \in I} \lambda _i \varphi _{\tilde{s}_i, \tilde{u}_i} \\&= \varphi _{\sum _{i \in I} \lambda _i \tilde{s}_i, \sum _{i \in I} \lambda _i \tilde{u}_i } = \mathrm {Id}_{\mathbb {R}^d} = \varphi _{1,0}. \end{aligned}$$

This in turn is equivalent to the relationships

$$\begin{aligned} \sum _{i \in I} \lambda _i \tilde{s}_i = 1 \quad \text {and} \quad \sum _{i \in I} \lambda _i \tilde{u}_i = 0, \end{aligned}$$

which corresponds to (14). \(\square \)
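As an illustration (not part of the original paper), the two conditions (14) can be solved in closed form for \((s^\star ,u^\star )\). The following Python sketch, with hypothetical helper names of our own, does exactly that and verifies the two relations numerically:

```python
import numpy as np

# Solve sum_i lam_i * (s*/s_i) = 1 and sum_i lam_i * (u* - (s*/s_i) u_i) = 0,
# i.e. the conditions (14) written with s~_i = s*/s_i, u~_i = u* - (s*/s_i) u_i.
def barycenter_of_similitudes(lam, s, u):
    lam, s, u = map(np.asarray, (lam, s, u))
    s_star = 1.0 / np.sum(lam / s)
    u_star = s_star * np.sum((lam / s)[:, None] * u, axis=0)
    return s_star, u_star

lam, s = np.array([0.5, 0.3, 0.2]), np.array([1.0, 2.0, 4.0])
u = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
s_star, u_star = barycenter_of_similitudes(lam, s, u)
s_t = s_star / s                      # the s~_i of the proof
u_t = u_star - s_t[:, None] * u       # the u~_i of the proof
assert np.isclose(lam @ s_t, 1.0) and np.allclose(lam @ u_t, 0.0)
```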

Proof of Proposition 3

The proof is given in [1] for \(\mu = \mu _j\) for some \(j \in I\), which is assumed to be absolutely continuous. It extends to an arbitrary measure \(\mu \). \(\square \)

Proof of Corollary 1

With the notation of Proposition 3, when \(\mu \) is the uniform (normalized) measure on \([0,1]\), one has \(T_i = C_{\mu _i}^+\). This is indeed a classical result for 1-D optimal transport; see for instance [1], Section 6.1. One then recognizes that formula (18) is the same as formula (15). \(\square \)
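In the discrete setting where each \(\mu _i\) is an empirical measure with \(N\) uniform atoms, the maps \(C_{\mu _i}^+\) reduce to sorting, and the barycenter is a weighted average of sorted samples. A minimal Python sketch of this construction (our illustration, under these assumptions):

```python
import numpy as np

def wasserstein_barycenter_1d(point_sets, lam):
    # Each 1-D empirical measure is given by its N samples; sorting evaluates
    # the inverse cumulative functions, which are then averaged with weights lam.
    return sum(l * np.sort(x) for l, x in zip(lam, point_sets))

x1 = np.random.randn(100)                # samples of mu_1
x2 = 3.0 + 2.0 * np.random.randn(100)    # samples of mu_2
bar = wasserstein_barycenter_1d([x1, x2], [0.5, 0.5])  # samples of the barycenter
```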

Proof of Proposition 4

One has

$$\begin{aligned} \nu ^\star&\in \underset{\nu \in \bar{\fancyscript{M}}_1^+(\varOmega ^d)}{{{\mathrm{argmin}}}}\; \sum _{i \in I} \lambda _i \text {W}_{\varOmega ^d}(\nu _i,\nu )^2 \\&= \underset{\nu \in \bar{\fancyscript{M}}_1^+(\varOmega ^d)}{{{\mathrm{argmin}}}}\; \int _{\mathbb {S}^{d-1}} \sum _{i \in I} \lambda _i \text {W}_{\mathbb {R}}( \nu _i^\theta , \nu ^\theta )^2 \mathrm {d}\theta . \end{aligned}$$

This is equivalent to the fact that for almost all \(\theta \in \mathbb {S}^{d-1}\), one has

$$\begin{aligned} \nu ^{\star , \theta } \in \text {Bar}_{\mathbb {R}}^W(\nu _i^\theta ,\lambda _i)_{i \in I}. \end{aligned}$$

\(\square \)
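Numerically, Proposition 4 means that once the measures on \(\varOmega ^d\) are discretized with \(N\) atoms per direction, each fiber \(\theta \) can be treated independently with the 1-D sorting construction above. A sketch under those assumptions (the array layout is ours):

```python
import numpy as np

def fiberwise_barycenter(nu, lam):
    # nu has shape (n_measures, n_dirs, N): nu[i, t] holds the N atoms of the
    # 1-D measure nu_i^theta_t.  Each fiber theta_t is handled on its own.
    sorted_nu = np.sort(nu, axis=-1)          # 1-D optimal coupling = sorting
    lam = np.asarray(lam)[:, None, None]
    return np.sum(lam * sorted_nu, axis=0)    # shape (n_dirs, N)
```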

Proof of Proposition 5

Proof of (20). Similarly to the proof of (7), the proof of (20) is obtained by using the following invariance of the Wasserstein distance on \(\varOmega ^d\)

$$\begin{aligned} \text {W}_{\varOmega ^d}( \psi _{s,u} \sharp \nu _1, \psi _{s,u} \sharp \nu _2 ) = s \text {W}_{\varOmega ^d}( \nu _1, \nu _2 ). \end{aligned}$$
(50)

Proof of (21). The statement \(\nu ^\star \in \text {Bar}_{\varOmega ^d}^W( \psi _{s_i,u_i} \sharp \nu , \lambda _i )_{i \in I}\) is equivalent to

$$\begin{aligned} \text {for almost all}\, \theta \in \mathbb {S}^{d-1}, \,\, (\nu ^\star )^\theta \in \text {Bar}_{\mathbb {R}}^W( \varphi _{s_i,\langle u_i,\,\theta \rangle } \sharp \nu ^\theta , \lambda _i )_{i \in I}. \end{aligned}$$

Using Proposition 2 for \(d=1\), one obtains that

$$\begin{aligned} \text {Bar}_{\mathbb {R}}^W( \varphi _{s_i,\langle u_i,\,\theta \rangle } \sharp \nu ^\theta , \lambda _i )_{i \in I} \;\ni \; \varphi _{s^\star ,\langle u^\star ,\,\theta \rangle } \sharp \nu ^\theta , \end{aligned}$$

which gives the desired result. \(\square \)

Appendix 2: Proofs of Section 3

Proof of Proposition 6

For all \(g \in \fancyscript{C}_0(\varOmega ^d)\), one has

$$\begin{aligned}&\int _{\mathbb {S}^{d-1}}\int _{\mathbb {R}} g(t, \theta ) \mathrm {d}(R(\mu )^\theta )(t) \mathrm {d}\theta = \int _{\varOmega ^d}g(t, \theta )\mathrm {d}(R(\mu ))(t, \theta ) \\&\qquad = \int _{\mathbb {R}^d}(R^*g)(x)\mathrm {d}\mu (x) \\&\qquad = \int _{\mathbb {R}^d}\int _{\mathbb {S}^{d-1}} g(P_\theta (x),\theta ) \mathrm {d}\theta \mathrm {d}\mu (x)\\&\qquad = \int _{\mathbb {S}^{d-1}} \int _{\mathbb {R}} g(y,\theta ) \mathrm {d}(P_\theta \sharp \mu )(y) \mathrm {d}\theta . \end{aligned}$$

\(\square \)
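For an empirical measure \(\mu = \frac{1}{N}\sum _k \delta _{X_k}\), Proposition 6 says the fiber \(R(\mu )^\theta = P_\theta \sharp \mu \) is simply the point set of dot products \(\langle X_k,\,\theta \rangle \). A small Python sketch of this (ours), for \(d=2\):

```python
import numpy as np

# Fibers of the Radon transform of an empirical measure: column t of the
# result carries the atoms of P_theta_t # mu, i.e. the projections <X_k, theta_t>.
N, n_dirs = 500, 64
X = np.random.randn(N, 2)
angles = np.pi * np.arange(n_dirs) / n_dirs
thetas = np.stack([np.cos(angles), np.sin(angles)], axis=1)
fibers = X @ thetas.T        # shape (N, n_dirs)
```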

Proof of Lemma 1

Proof of (26): For all \(g \in \fancyscript{C}_0(\varOmega ^d)\), one has

$$\begin{aligned} \int _{\varOmega ^d} g \,\mathrm {d}[ R (\varphi _{s,u} \sharp \mu )]&= \int _{\mathbb {R}^d} R^*(g) \,\mathrm {d}[ \varphi _{s,u} \sharp \mu ] \\&= \int _{\mathbb {R}^d} \int _{\mathbb {S}^{d-1}} g(\langle sx+u,\,\theta \rangle ,\theta ) \,\mathrm {d}\theta \,\mathrm {d}\mu (x) \\&= \int _{\mathbb {R}^d} \int _{\mathbb {S}^{d-1}} (g \circ \psi _{s,u})(\langle x,\,\theta \rangle ,\theta ) \,\mathrm {d}\theta \,\mathrm {d}\mu (x) \\&= \int _{\varOmega ^d} (g \circ \psi _{s,u}) \,\mathrm {d}[R(\mu )] \\&= \int _{\varOmega ^d} g \,\mathrm {d}[ \psi _{s,u} \sharp R(\mu )]. \end{aligned}$$

Proof of (27): First we notice, using (22), that

$$\begin{aligned}&R( f \circ \varphi _{s,u} )(t,\theta ) = \int _{{\mathbb R}^{d-1} } f\left( s(t\theta + U_\theta \gamma ) + u \right) \mathrm {d}\gamma \\&\qquad = \int _{{\mathbb R}^{d-1} } f\left( st\theta + U_\theta s\gamma + \langle u,\,\theta \rangle \theta + U_\theta (U_\theta )^{T}u \right) \mathrm {d}\gamma \\&\qquad = \int _{{\mathbb R}^{d-1} } f\left( (st+\langle u,\,\theta \rangle )\theta + U_\theta (s\gamma + (U_\theta )^{T}u) \right) \mathrm {d}\gamma \\&\qquad = s^{1-d} \int _{{\mathbb R}^{d-1} } f\left( \psi _{s,u} (t,\theta ) \theta + U_\theta \gamma ' \right) \mathrm {d}\gamma ' \end{aligned}$$

which proves

$$\begin{aligned} R( f \circ \varphi _{s,u} ) = s^{1-d} R(f) \circ \psi _{s,u}. \end{aligned}$$
(51)

We write \(H = (R^*R)^{-1}\) for the filtering operator with kernel \(h^+\). For smooth functions \(f \in \fancyscript{S}(\mathbb {R}^d)\), denoting \(\fancyscript{F}(f)=\hat{f}\), one has

$$\begin{aligned} \fancyscript{F}( H(f \circ \varphi _{s,u}) )&= c^{-1}|| \omega ||^{1-d} \hat{f}(s\omega ) e^{-\mathrm {i}\langle \omega ,\,u\rangle }, \\ \fancyscript{F}( H(f) \circ \varphi _{s,u} )&= c^{-1}|| s\omega ||^{1-d} \hat{f}(s\omega ) e^{-\mathrm {i}\langle \omega ,\,u\rangle }, \end{aligned}$$

and hence

$$\begin{aligned} H(f) \circ \varphi _{s,u} = s^{1-d} H(f \circ \varphi _{s,u}). \end{aligned}$$
(52)

Using (51) and (52), this shows that for all \(f \in \fancyscript{D}(\mathbb {R}^d)\),

$$\begin{aligned} \int _{\mathbb {R}^d} f \,\mathrm {d}[ R^+( \psi _{s,u} \sharp \nu ) ]&= \int _{\varOmega ^d} (RHf) \circ \psi _{s,u} \,\mathrm {d}\nu \\&= s^{d-1} \int _{\varOmega ^d} R(H(f) \circ \varphi _{s,u}) \,\mathrm {d}\nu \\&= \int _{\varOmega ^d} RH(f \circ \varphi _{s,u}) \,\mathrm {d}\nu \\&= \int _{\mathbb {R}^d} f \,\mathrm {d}[ \varphi _{s,u} \sharp R^+(\nu ) ]. \end{aligned}$$

Proof of (28): the proof is similar to that of (26). \(\square \)
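Property (26) can also be checked empirically: projecting \(\varphi _{s,u}(x) = sx+u\) onto \(\theta \) gives \(s\langle x,\,\theta \rangle + \langle u,\,\theta \rangle \), i.e. \(\psi _{s,u}\) acting on the fiber \(R(\mu )^\theta \). A quick numerical sanity check (our sketch, not from the paper's code):

```python
import numpy as np

s, u = 2.0, np.array([1.0, -0.5])
X = np.random.randn(200, 2)                  # atoms of an empirical measure mu
theta = np.array([np.cos(0.3), np.sin(0.3)])
lhs = (s * X + u) @ theta                    # fiber of R(phi_{s,u} # mu) at theta
rhs = s * (X @ theta) + u @ theta            # psi_{s,u} applied to the fiber of R(mu)
assert np.allclose(lhs, rhs)
```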

Proof of Proposition 8

Using Lemma 1, one has

$$\begin{aligned} \text {Bar}_{\mathbb {R}^d}^R(\varphi _{s,u} \sharp \mu _i,\lambda _i)_{i \in I}&= R^+ \text {Bar}_{\varOmega ^d}^W(R(\varphi _{s,u}\sharp \mu _i), \lambda _i)_{i \in I}\\&= R^+ \text {Bar}_{\varOmega ^d}^W(\psi _{s,u}\sharp (R(\mu _i)), \lambda _i)_{i \in I}\\&= R^+ \psi _{s,u} \sharp \text {Bar}_{\varOmega ^d}^W(R(\mu _i), \lambda _i)_{i \in I}\\&= \varphi _{s,u} \sharp R^+ \text {Bar}_{\varOmega ^d}^W(R(\mu _i), \lambda _i)_{i \in I}\\&= \varphi _{s,u} \sharp \text {Bar}_{\mathbb {R}^d}^R(\mu _i,\lambda _i)_{i \in I}, \end{aligned}$$

which proves (7) for \(\text {Bar}_{\mathbb {R}^d}^R\). Property (8) for \(\text {Bar}_{\mathbb {R}^d}^R\) is proved similarly using (28). \(\square \)

Proof of Proposition 9

One has

$$\begin{aligned} \text {Bar}_{\mathbb {R}^d}^R(\varphi _{s_i,u_i} \sharp \mu ,\lambda _i)_{i \in I}&= R^+ \text {Bar}_{\varOmega ^d}^W(R(\varphi _{s_i,u_i}\sharp \mu ), \lambda _i)_{i \in I}\\&= R^+ \text {Bar}_{\varOmega ^d}^W( \psi _{s_i,u_i}\sharp R(\mu ), \lambda _i)_{i \in I}\\&= R^+ \psi _{s^\star ,u^\star } \sharp \text {Bar}_{\varOmega ^d}^W(R(\mu ), \lambda _i)_{i \in I}\\&= \varphi _{s^\star ,u^\star } \sharp R^+ \text {Bar}_{\varOmega ^d}^W(R(\mu ), \lambda _i)_{i \in I}\\&= \varphi _{s^\star ,u^\star } \sharp \text {Bar}_{\mathbb {R}^d}^R(\mu ,\lambda _i)_{i \in I}, \end{aligned}$$

which proves (13) for \(\text {Bar}_{\mathbb {R}^d}^R\). \(\square \)

Appendix 3: Proofs of Section 4

Proof of Proposition 10

Property (34) is a re-statement of property (19). Property (35) corresponds to the change of variable \(\nu = R\mu \in \mathrm {Im}(R)\) in (32), which is a bijection thanks to the injectivity of \(R\); see Proposition 7. \(\square \)

Proof of Proposition 11

The proof is the same as that of Proposition 1, replacing the invariance (49) by

$$\begin{aligned} \text {SW}_{\mathbb {R}^d}(\varphi _{s,u} \sharp \mu _1,\varphi _{s,u} \sharp \mu _2)&= \text {W}_{\varOmega ^d}( R( \varphi _{s,u} \sharp \mu _1 ), R( \varphi _{s,u} \sharp \mu _2 ) ) \\&= \text {W}_{\varOmega ^d}( \psi _{s,u} \sharp R( \mu _1 ), \psi _{s,u} \sharp R( \mu _2 ) ) \\&= s \, \text {W}_{\varOmega ^d}( R( \mu _1 ), R( \mu _2 ) ) \\&= s \, \text {SW}_{\mathbb {R}^d}( \mu _1, \mu _2), \end{aligned}$$

where we have used the invariance (50) of the Wasserstein distance on \(\varOmega ^d\). \(\square \)

Proof of Proposition 12

One has,

$$\begin{aligned} \forall \,\theta \in \mathbb {S}^{d-1}, \quad P_\theta \sharp \varphi _{s,u} \sharp \mu = \varphi _{s,\langle u,\,\theta \rangle } \sharp P_\theta \sharp \mu . \end{aligned}$$

Thus, for an arbitrary \(\tilde{\mu } \in \fancyscript{M}_1^+(\mathbb {R}^d)\), one has

$$\begin{aligned}&\sum _{i\in I} \lambda _i \text {W}_{\mathbb {R}}( P_\theta \sharp (\varphi _{s_i,u_i} \sharp \mu ), P_\theta \sharp \tilde{\mu } )^2\\&\qquad = \sum _{i\in I} \lambda _i \text {W}_{\mathbb {R}}( \varphi _{s_i,\langle u_i,\,\theta \rangle } \sharp (P_\theta \sharp \mu ), P_\theta \sharp \tilde{\mu } )^2 \\&\qquad \geqslant \sum _{i\in I} \lambda _i \text {W}_{\mathbb {R}}( \varphi _{s_i,\langle u_i,\,\theta \rangle } \sharp (P_\theta \sharp \mu ), \varphi _{s^\star ,\langle u^\star ,\,\theta \rangle } \sharp (P_\theta \sharp \mu ) )^2 \\&\qquad = \sum _{i\in I} \lambda _i \text {W}_{\mathbb {R}}( P_\theta \sharp (\varphi _{s_i,u_i} \sharp \mu ), P_\theta \sharp (\varphi _{s^\star ,u^\star } \sharp \mu ) )^2 \end{aligned}$$

where the inequality follows from Proposition 2 applied with \(d=1\): \(\varphi _{s^\star ,\langle u^\star ,\,\theta \rangle } \sharp (P_\theta \sharp \mu )\) is a 1-D Wasserstein barycenter of the measures \(\varphi _{s_i,\langle u_i,\,\theta \rangle } \sharp (P_\theta \sharp \mu )\), and hence minimizes the weighted sum of squared 1-D Wasserstein distances. Integrating the resulting inequality with respect to \(\theta \in \mathbb {S}^{d-1}\) gives

$$\begin{aligned}&\sum _i \lambda _i \text {SW}_{\mathbb {R}^d}( \varphi _{s_i,u_i} \sharp \mu , \tilde{\mu } )^2 \\&\quad \geqslant \sum _i \lambda _i \text {SW}_{\mathbb {R}^d}( \varphi _{s_i,u_i} \sharp \mu , \varphi _{s^\star ,u^\star } \sharp \mu )^2. \end{aligned}$$

This inequality is an equality if and only if, for almost all \(\theta \in \mathbb {S}^{d-1}\), one has

$$\begin{aligned} P_\theta \sharp \tilde{\mu } = P_\theta \sharp (\varphi _{s^\star ,u^\star } \sharp \mu ) \end{aligned}$$

so that, using Proposition 7, this corresponds to \(\tilde{\mu } = \varphi _{s^\star ,u^\star } \sharp \mu \). Since the measure \(\tilde{\mu }\) is arbitrary, this gives the desired result. This proves (13) in the case \(\text {Bar}_{\mathbb {R}^d}^S\). \(\square \)

Appendix 4: Proof of Theorem 1

Notations. Without loss of generality, for a fixed \(Y \in \mathbb {R}^{d \times N}\), we study the smoothness of

$$\begin{aligned}&\forall \,X \in \mathbb {R}^{d\times N}, \quad \fancyscript{E}(X) = \frac{1}{2} \text {SW}_{\mathbb {R}^d}(\mu _X,\mu _Y)^2\\&\quad = \int _{\mathbb {S}^{d-1}} \fancyscript{E}_\theta (X) \,\mathrm {d}\theta \\&\quad \text {where} \quad \fancyscript{E}_\theta (X) = \frac{1}{2} \fancyscript{W}(X_\theta ,Y_\theta )^2. \end{aligned}$$

We have used, for \(x, y \in \mathbb {R}^N\), the shorthand notation

$$\begin{aligned} \fancyscript{W}(x,y) = \text {W}_{\mathbb {R}}(\mu _x,\mu _y). \end{aligned}$$

The result of Theorem 1 then follows by summation of such functionals.

We define \(\mathbb {U}(N,d)\) to be the set of point configurations in \(\mathbb {R}^{d \times N}\) with pairwise distinct points:

$$\begin{aligned}&\mathbb {U}(N,d) \nonumber \\&\quad = \left\{ X = (X_1, \ldots , X_N) \in {\mathbb R}^{d \times N} \;;\; \forall \,i \not =j,\, X_i \not = X_j \right\} . \end{aligned}$$

The hypothesis is that \(X \in \mathbb {U}(N,d)\). One has

$$\begin{aligned} \fancyscript{E}_\theta (X) = \frac{1}{2} || X_\theta - Y_\theta \circ \sigma _\theta ||^2 \quad \text {where} \quad \sigma _\theta = \sigma _X^\theta \circ (\sigma _Y^{\theta })^{-1} \end{aligned}$$

is a permutation depending on both \(X\) and \(Y\). Note that the permutations involved are not necessarily unique, and are taken to be arbitrary valid sorting permutations.

For \(X \in \mathbb {R}^{N \times d}\) and \(\varepsilon >0\) we introduce

$$\begin{aligned}&\Theta _\varepsilon (X)\\&\quad = \left\{ \theta \in \mathbb {S}^{d-1} \;;\; \forall \,|| \delta ||_{\mathbb {R}^{N \times d}} \leqslant \varepsilon , \quad X_\theta + \delta _\theta \in \mathbb {U}(N,1) \right\} . \end{aligned}$$

This is the set of directions for which any perturbation of \(X\) of amplitude smaller than \(\varepsilon \) has a projection with pairwise distinct points.

Overview of the proof. In the following, we thus aim to prove that \(\fancyscript{E}\) is \(C^1\), that

$$\begin{aligned}&\tilde{\nabla } \fancyscript{E}(X) = \int _{\mathbb {S}^{d-1}} \tilde{\nabla } \fancyscript{E}_\theta (X) \mathrm {d}\theta \\&\quad \quad \text {where} \quad \tilde{\nabla } \fancyscript{E}_\theta (X) = (X_\theta - Y_\theta \circ \sigma _\theta ) \theta \end{aligned}$$

is indeed equal to \(\nabla \fancyscript{E}(X)\), and that this gradient is Lipschitz continuous.

The general strategy of the proof is to split the integration between the directions \(\theta \in \Theta _\varepsilon (X)\), for which we can locally assume that the permutations \(\sigma _\theta \) are constant (see Lemma 2), which in turn defines a smooth quadratic energy, and the remaining directions in \(\Theta _\varepsilon (X)^c\), which are shown to have a negligible contribution to the energy and to the derivative (see Lemma 3).
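For concreteness, here is a Monte Carlo sketch (ours, not the paper's released code) of \(\fancyscript{E}(X)\) and the candidate gradient \(\tilde{\nabla } \fancyscript{E}(X)\), using the fact that the optimal 1-D assignment along each direction is obtained by sorting. Points are stored as the rows of X and Y (shape (N, d)), transposing the paper's \(d \times N\) convention, and the normalization of the sphere integral is absorbed into the average over sampled directions:

```python
import numpy as np

def sliced_w2_energy_grad(X, Y, n_dirs=128, seed=None):
    """Estimate E(X) = (1/2) SW2(mu_X, mu_Y)^2 and grad E_theta(X) =
    (X_theta - Y_theta o sigma_theta) theta, averaged over random directions."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    thetas = rng.standard_normal((n_dirs, d))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)  # points on S^{d-1}
    energy, grad = 0.0, np.zeros_like(X)
    for theta in thetas:
        x_t, y_t = X @ theta, Y @ theta
        sx, sy = np.argsort(x_t), np.argsort(y_t)
        diff = np.zeros(N)
        diff[sx] = x_t[sx] - y_t[sy]        # X_theta - Y_theta o sigma_theta
        energy += 0.5 * np.sum(diff ** 2)   # E_theta(X)
        grad += np.outer(diff, theta)       # grad E_theta(X)
    return energy / n_dirs, grad / n_dirs
```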

Preparatory results. The following lemma shows that if \(\theta \in \Theta _\varepsilon (X)\), the permutation \(\sigma _X^\theta \) is stable under small perturbations of \(X\).

Lemma 2

Let \(X \in \mathbb {U}(N,d)\). For all \(\theta \in \Theta _\varepsilon (X)\), for all \(\delta \) with \(|| \delta ||_{\mathbb {R}^{N \times d}} \leqslant \varepsilon \), the permutation \(\sigma _{X+\delta }^\theta \) that sorts \(( \langle X_i+\delta _i,\,\theta \rangle )_i\) is uniquely defined and satisfies \(\sigma _{X+\delta }^\theta = \sigma _X^\theta \).

Proof

If one has \(\sigma _{X+\delta }^\theta \ne \sigma _X^\theta \), then necessarily there exists some \(t \in [0,1]\) such that \(\sigma _{X+t\delta }^\theta \) is not uniquely defined, which is equivalent to \(X_\theta +t\delta _\theta \) not being in \(\mathbb {U}(N,1)\). Since \(|| t \delta ||_{\mathbb {R}^{N \times d}} \leqslant \varepsilon \), this shows that \(\theta \notin \Theta _\varepsilon (X)\). \(\square \)

In order to prove Theorem 1, we need the following lemma.

Lemma 3

For \(X \in \mathbb {U}(N, d)\), one has

$$\begin{aligned} \text {Vol}(\Theta _\varepsilon (X)^c) = \int _{\Theta _\varepsilon (X)^c} \mathrm {d}\theta = O( \varepsilon ). \end{aligned}$$
(53)

Proof

One has \(X_\theta + \delta _\theta \notin \mathbb {U}(N,1)\) if and only if there exists a pair of points \(u=X_i+\delta _i\) and \(v=X_j+\delta _j\) with \(i \ne j\) such that

$$\begin{aligned}&\theta \in A(u,v)\\&\quad \quad \text {where} \quad A(u,v) = \left\{ \xi \in \mathbb {S}^{d-1} \;;\; \langle \xi ,\,u-v\rangle =0 \right\} \end{aligned}$$

Note that \(A(u,v)\) is a great circle of the sphere \(\mathbb {S}^{d-1}\).

One can thus cover \(\Theta _\varepsilon (X)^c\) using the union of all such circles \(A(u,v)\), which shows

$$\begin{aligned}&\Theta _\varepsilon (X)^c \subset \bigcup _{i \ne j} A_\varepsilon (X_i,X_j) \quad \text {where} \quad A_\varepsilon (x,y) \\&\quad = \bigcup _{ {\begin{matrix} || u-x || \leqslant \varepsilon \\ || v-y || \leqslant \varepsilon \end{matrix}} } A(u,v) \end{aligned}$$

Note that the geodesic distance \(d\) on the sphere \(\mathbb {S}^{d-1}\) between two such circles is equal to the angle between the normals to the planes of the circles

$$\begin{aligned} d(A(u,v),A(x,y))&= \text {Angle}(u-v,x-y) \\&= \text {Angle}(x-y + \varepsilon w,x-y) \end{aligned}$$

where \(|| w ||\leqslant 2\). As \(\varepsilon \rightarrow 0\), after some computations, one has the following asymptotic decay of the angle

$$\begin{aligned} \text {Angle}(x-y + \varepsilon w,x-y) = O(\varepsilon /|| x-y ||) \end{aligned}$$

and thus \(d(A(u,v),A(x,y)) \leqslant C \varepsilon \) for some constant \(C\). This proves that \(\forall \,u,v\), one has

$$\begin{aligned} \left\{ \begin{array}{l} || u-x || \leqslant \varepsilon \\ || v-y || \leqslant \varepsilon \end{array} \right. \quad \Longrightarrow \quad A(u,v) \subset B_{C\varepsilon }(x,y) \end{aligned}$$

for some constant \(C>0\), where

$$\begin{aligned} B_{\varepsilon }(x,y) = \left\{ \xi \in \mathbb {S}^{d-1} \;;\; d(\xi ,A(x,y)) \leqslant \varepsilon \right\} \end{aligned}$$

One thus has

$$\begin{aligned} A_{\varepsilon }(x,y) \subset B_{C\varepsilon }(x,y). \end{aligned}$$

The volume of the spherical band \(B_{C\varepsilon }(x,y)\) of width \(C\varepsilon \) is proportional to \(\varepsilon \), and thus Vol\((A_\varepsilon (x,y)) = O(\varepsilon )\). Since \(\Theta _\varepsilon (X)^c\) is contained in a finite union of such sets, one obtains the result. \(\square \)

Proof of continuity. For each \(\theta \), the function \(\fancyscript{E}_\theta \) is continuous as a minimum of continuous functions. Since \(\fancyscript{E}\) is the integral of \(\fancyscript{E}_\theta \) over the compact set \(\mathbb {S}^{d-1}\), it is continuous as well.

Proof of differentiability. Let \(\delta \in \mathbb {R}^{N \times d}\) and \(\varepsilon = || \delta ||_{\mathbb {R}^{N \times d}}\). The definition of the Wasserstein distance reads

$$\begin{aligned} \fancyscript{W}((X+\delta )_\theta ,Y_\theta )^2 = || (X_\theta + \delta _\theta ) \circ \sigma _{X+\delta }^\theta - Y_\theta \circ \sigma _Y^\theta ||^2. \end{aligned}$$

For all \(\theta \in \Theta _\varepsilon (X)\), thanks to Lemma 2, \(\sigma _{X+\delta }^\theta = \sigma _{X}^\theta \). One can thus compute the variation of the 1-D Wasserstein distance with respect to \(\delta \) as

$$\begin{aligned}&\fancyscript{W}((X+\delta )_\theta ,Y_\theta )^2 = || X_\theta +\delta _\theta - Y_\theta \circ \sigma _\theta ||^2 \end{aligned}$$
(54)
$$\begin{aligned}&\quad = \fancyscript{W}(X_\theta ,Y_\theta )^2 + 2 \langle \tilde{\nabla } \fancyscript{E}_\theta (X) ,\, \delta \rangle _{\mathbb {R}^{N \times d}} + || \delta _\theta ||^2. \end{aligned}$$
(55)

Note that the fact that \(\sigma _Y^{\theta }\) might not be uniquely defined has no impact on the value of (55). One thus has

$$\begin{aligned}&\fancyscript{E}(X+\delta )-\fancyscript{E}(X) - \langle \tilde{\nabla } \fancyscript{E}(X),\,\delta \rangle _{\mathbb {R}^{N \times d}}\\&\quad = A(\delta ) + B(\delta ) + O(|| \delta ||_{\mathbb {R}^{N \times d}}^2) \end{aligned}$$

where

$$\begin{aligned}&A(\delta ) = \frac{1}{2} \int _{\Theta _\varepsilon (X)^c} \left( \fancyscript{W}(X_\theta +\delta _\theta ,Y_\theta )^2 - \fancyscript{W}(X_\theta ,Y_\theta )^2 \right) \mathrm {d}\theta \\&\quad \text {and} \quad B(\delta ) = -\int _{\Theta _\varepsilon (X)^c} \langle \tilde{\nabla } \fancyscript{E}_\theta (X) ,\, \delta \rangle _{\mathbb {R}^{N \times d}} \mathrm {d}\theta . \end{aligned}$$

Note that in the expression of \(B(\delta )\) the permutation \(\sigma _\theta \) involved in \(\tilde{\nabla } \fancyscript{E}_\theta (X)\) is not necessarily unique, and can be chosen arbitrarily.

One has,

$$\begin{aligned} |\langle \tilde{\nabla } \fancyscript{E}_\theta (X) ,\, \delta \rangle _{\mathbb {R}^{N \times d}}| \leqslant || X-Y\circ \sigma _\theta ||_{\mathbb {R}^{N \times d}} || \delta ||_{\mathbb {R}^{N \times d}} \end{aligned}$$

which implies, using Lemma 3

$$\begin{aligned}&|B(\delta )| \leqslant O( \text {Vol}(\Theta _\varepsilon (X)^c) || \delta ||_{\mathbb {R}^{N \times d}} )\nonumber \\&\quad = O(|| \delta ||_{\mathbb {R}^{N \times d}}^2) = o(|| \delta ||_{\mathbb {R}^{N \times d}}). \end{aligned}$$
(56)

The map \((\theta , \tilde{X}) \mapsto \fancyscript{E}_\theta (\tilde{X})\) is continuous, hence uniformly continuous on the compact set \(\mathbb {S}^{d-1} \times \bar{B}(X,\varepsilon )\), and thus

$$\begin{aligned} |\fancyscript{W}(X_\theta +\delta _\theta ,Y_\theta )^2 - \fancyscript{W}(X_\theta ,Y_\theta )^2| \leqslant C(\delta ) \end{aligned}$$

where \(C(\delta ) \rightarrow 0\) as \(\delta \rightarrow 0\). This shows that

$$\begin{aligned} |A(\delta )| \leqslant \text {Vol}(\Theta _\varepsilon (X)^c) C(\delta ) = o(|| \delta ||_{\mathbb {R}^{N \times d}}). \end{aligned}$$
(57)

Putting together (56) and (57) leads to

$$\begin{aligned} |\fancyscript{E}(X+\delta )-\fancyscript{E}(X) - \langle \tilde{\nabla } \fancyscript{E}(X),\,\delta \rangle | = o(|| \delta ||_{\mathbb {R}^{N \times d}}) \end{aligned}$$

which shows that \(\fancyscript{E}\) is differentiable with \(\nabla \fancyscript{E}= \tilde{\nabla } \fancyscript{E}\).
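The expansion (55) can be probed numerically with a finite-difference test along a fixed direction \(\theta \) (our sketch; the perturbation is taken small enough that the sorting permutations do not change):

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 3
X, Y = rng.standard_normal((N, d)), rng.standard_normal((N, d))
theta = rng.standard_normal(d); theta /= np.linalg.norm(theta)

def w2_sq(x, y):                       # squared 1-D Wasserstein distance
    return np.sum((np.sort(x) - np.sort(y)) ** 2)

delta = 1e-6 * rng.standard_normal((N, d))
lhs = w2_sq((X + delta) @ theta, Y @ theta) - w2_sq(X @ theta, Y @ theta)
sx, sy = np.argsort(X @ theta), np.argsort(Y @ theta)
diff = np.zeros(N); diff[sx] = (X @ theta)[sx] - (Y @ theta)[sy]
rhs = 2 * diff @ (delta @ theta)       # linear term of (55); the ||delta_theta||^2
assert np.isclose(lhs, rhs, atol=1e-9) # term is of second order and negligible here
```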

Proof of Lipschitz continuity of the gradient. For all \(\theta \in \Theta _0(X)\), \(\nabla \fancyscript{E}_\theta (X)\) is continuous and uniformly bounded, and thus \(\nabla \fancyscript{E}\) is continuous. One has, for \(\delta \in \mathbb {R}^{N \times d}\), and denoting \(\varepsilon =|| \delta ||\),

$$\begin{aligned}&\nabla \fancyscript{E}(X+\delta ) - \nabla \fancyscript{E}(X) = M( \Theta _\varepsilon (X) ) + M( \Theta _\varepsilon (X)^c )\\&\quad \text {where} \quad M(U) = \int _U ( \nabla \fancyscript{E}_\theta (X+\delta ) - \nabla \fancyscript{E}_\theta (X) ) \mathrm {d}\theta . \end{aligned}$$

One has

$$\begin{aligned} M( \Theta _\varepsilon (X) ) = \int _{ \Theta _\varepsilon (X) } \delta _\theta \theta \,\mathrm {d}\theta \end{aligned}$$

whereas

$$\begin{aligned} M( \Theta _\varepsilon (X)^c )&= \int _{ \Theta _\varepsilon (X)^c } \delta _\theta \theta \,\mathrm {d}\theta \\&\quad + \int _{ \Theta _\varepsilon (X)^c } ( Y_\theta \circ \sigma _\theta - Y_\theta \circ \tilde{\sigma }_\theta ) \theta \,\mathrm {d}\theta \end{aligned}$$

where \(\tilde{\sigma }_\theta = \sigma _{X+\delta }^\theta \circ (\sigma _Y^{\theta })^{-1}\). Using Lemma 3, one has \(\text {Vol}(\Theta _\varepsilon (X)^c) \leqslant C || \delta ||_{\mathbb {R}^{N \times d}}\) for some constant \(C>0\), and hence

$$\begin{aligned}&|| \nabla \fancyscript{E}(X+\delta ) - \nabla \fancyscript{E}(X) ||_{\mathbb {R}^{N \times d}} \\&\quad \leqslant (1 + 2 C || Y ||_{\mathbb {R}^{N \times d}})|| \delta ||_{\mathbb {R}^{N \times d}} \end{aligned}$$

which shows that \(\nabla \fancyscript{E}\) is \((1 + 2 C || Y ||_{\mathbb {R}^{N \times d}})\)-Lipschitz continuous.
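Since \(\nabla \fancyscript{E}\) is Lipschitz, a constant-step gradient descent on \(\fancyscript{E}\) behaves well; the following self-contained sketch (ours; the step size and iteration counts are heuristics, not tuned constants) flows a point cloud \(X\) toward \(Y\):

```python
import numpy as np

# Sliced-Wasserstein gradient flow of mu_X toward mu_Y; each iteration averages
# grad E_theta(X) = (X_theta - Y_theta o sigma_theta) theta over random directions.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 2))
Y = rng.standard_normal((300, 2)) + np.array([4.0, 0.0])
for _ in range(200):
    thetas = rng.standard_normal((32, 2))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    grad = np.zeros_like(X)
    for theta in thetas:
        sx, sy = np.argsort(X @ theta), np.argsort(Y @ theta)
        diff = np.zeros(len(X))
        diff[sx] = (X @ theta)[sx] - (Y @ theta)[sy]
        grad += np.outer(diff, theta)
    X -= grad / len(thetas)      # unit step on the averaged gradient
# mu_X is now close to mu_Y in sliced Wasserstein distance
```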

Cite this article

Bonneel, N., Rabin, J., Peyré, G. et al. Sliced and Radon Wasserstein Barycenters of Measures. J Math Imaging Vis 51, 22–45 (2015). https://doi.org/10.1007/s10851-014-0506-3
