Abstract
This article details two approaches to compute barycenters of measures using 1-D Wasserstein distances along radial projections of the input measures. The first method makes use of the Radon transform of the measures, and the second is the solution of a convex optimization problem over the space of measures. We show several properties of these barycenters and explain their relationship. We show numerical approximation schemes based on a discrete Radon transform and on the resolution of a non-convex optimization problem. We explore the respective merits and drawbacks of each approach on applications to two image processing problems: color transfer and texture mixing.
Similar content being viewed by others
References
Agueh, M., Carlier, G.: Barycenters in the wasserstein space. SIAM J. Math. Anal. 43(2), 904–924 (2011)
Averbuch, A., Coifman, R., Donoho, D., Israeli, M., Shkolnisky, Y., Sedelnikov, I.: A framework for discrete integral transformations: II. The 2D discrete radon transform. SIAM J. Sci. Comput. 30(2), 785–803 (2008)
Benamou, J.D., Brenier, Y.: A computational fluid mechanics solution of the monge-kantorovich mass transfer problem. Numer. Math. 84(3), 375–393 (2000)
Benamou, J.D., Froese, B.D., Oberman, A.M.: A viscosity solution approach to the Monge-Ampere formulation of the Optimal Transportation Problem. arXiv:1208.4873v2 (2013, unpublished)
Bertsekas, D.: The auction algorithm: a distributed relaxation method for the assignment problem. Ann. Operat. Res. 14, 105–123 (1988)
Bigot, J., Klein, T.: Consistent estimation of a population barycenter in the wasserstein space. Preprint arXiv:1212.2562v3 (2014)
Boman, J., Lindskog, F.: Support theorems for the radon transform and Cramèr–Wold theorems. J. Theor. Prob. 22(3), 683–710 (2009)
Bonneel, N., van de Panne, M., Paris, S., Heidrich, W.: Displacement interpolation using lagrangian mass transport. ACM Trans. Graph. (SIGGRAPH ASIA’11) 30(6), 1–12 (2011)
Brady, M.L.: A fast discrete approximation algorithm for the radon transform. J. Comput. 27(1), 107–119 (1998)
Cuturi, M., Doucet, A.: Fast computation of wasserstein barycenters. arXiv:1310.4375v1 (2013, unpublished)
Dellacherie, C., Meyer, P.A.: Probabilities and Potential Math. Stud. 29. North Holland, Amsterdam (1978)
Delon, J.: Movie and video scale-time equalization application to flicker reduction. IEEE Trans. Image Process. 15(1), 241–248 (2006)
Desolneux, A., Moisan, L., Ronsin, S.: A compact representation of random phase and Gaussian textures. In: Proc. the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1381–1384 (2012)
Digne, J., Cohen-Steiner, D., Alliez, P., Goes, F., Desbrun, M.: Feature-preserving surface reconstruction and simplification from defect-laden point sets. J. Math. Imaging Vis. 48(2), 369–382 (2013)
Ferradans, S., Xia, G.S., Peyré, G., Aujol, J.F.: Optimal transport mixing of gaussian texture models. In: Proc. SSVM’13 (2013)
Galerne, B., Gousseau, Y., Morel, J.M.: Random phase textures: theory and synthesis. IEEE Trans. Image Process. 20(1), 257–267 (2011)
Galerne, B., Lagae, A., Lefebvre, S., Drettakis, G.: Gabor noise by example. ACM Trans. Graph. (Proceedings of ACM SIGGRAPH 2012) 31(4), 73.1–73.9 (2012)
Haker, S., Zhu, L., Tannenbaum, A., Angenent, S.: Optimal mass transport for registration and warping. Int. J. Comput. Vis. 60(3), 225–240 (2004)
Helgason, S.: The Radon Transform. Birkhauser, Boston (1980)
Kantorovich, L.: On the transfer of masses. Doklady Akademii Nauk 37(2), 227–229 (1942). (in russian)
Kuhn, H.W.: The Hungarian method of solving the assignment problem. Naval Res. Logist. Quart. 2, 83–97 (1955)
Matusik, W., Zwicker, M., Durand, F.: Texture design using a simplicial complex of morphable textures. ACM Trans. Graph. 24(3), 787–794 (2005)
McCann, R.J.: A convexity principle for interacting gases. Adv. Math. 128(1), 153–179 (1997)
Mérigot, Q.: A multiscale approach to optimal transport. Comput. Graph. Forum 30(5), 1583–1592 (2011)
Papadakis, N., Peyré, G., Oudet, E.: Optimal transport with proximal splitting. SIAM J. Imaging Sci. 7(1), 212–238 (2014)
Pitié, F., Kokaram, A.C., Dahyot, R.: N-Dimensional probability density function transfer and its application to color transfer. In: Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on, vol. 2, pp. 1434–1439. (2005)
Rabin, J., Delon, J., Gousseau, Y.: Removing artefacts from color and contrast modifications. IEEE Trans. Image Process. 20(11), 3073–3085 (2011)
Rabin, J., Peyré, G., Delon, J., Bernot, M.: Wasserstein barycenter and its application to texture mixing. In: Scale Space and Variational Methods in Computer Vision (SSVM’11), vol. 6667, pp. 435–446 (2011).
Reinhard, E., Pouli, T.: Colour spaces for colour transfer. In: Proceedings of the Third international conference on Computational color imaging. CCIW’11, pp. 1–15. Springer, Berlin (2011)
Rubner, Y., Tomasi, C., Guibas, L.: A metric for distributions with applications to image databases. In: IEEE International Conference on Computer Vision (ICCV’98), pp. 59–66 (1998)
Solodov, M.: Incremental gradient algorithms with stepsizes bounded away from zero. Comput. Optim. Appl. 11(1), 23–35 (1998)
Villani, C.: Topics in Optimal Transportation. Graduate Studies in Mathematics Series. American Mathematical Society (2003)
Acknowledgments
We thank Marco Cuturi for applying his method to our dataset and for sharing his results. We thank Thouis R. Jones for useful feedback on our draft, and anonymous reviewers for their help in improving this paper. We also thank the authors of all the images used to demonstrate our color transfers. This work has been partially supported by NSF CGV-1111415. Gabriel Peyré acknowledges support from the European Research Council (ERC project SIGMA-Vision).
Author information
Authors and Affiliations
Corresponding author
Appendices
Appendix 1: Proofs of Section 2
Proof of Proposition 1
From the definition (5), one verifies that
so that
where we have introduced the following change of variable
(note that \(\varphi _{s,u}^{-1} = \varphi _{s^{-1},-s^{-1}u}\)). One thus has
which proves (7). Property (8) is proved similarly. Properties (9) and (11) directly follow from (8). \(\square \)
Proof of Proposition 2
We aim at determining \((s^\star ,u^\star )\) such that
and where for simplicity we have denoted \(\varphi _i = \varphi _{s_i,u_i}\) and \(\varphi ^\star = \varphi _{s^\star ,u^\star }\). First, let us notice that
so that the set \(\fancyscript{T}\) of maps of the form \(\varphi _{s,u}\) is a subset of gradients of convex functions. This point is important since optimal maps between \(\mu _i\) and \(\mu ^\star \) are characterized as the gradient of convex functions that push forward \(\mu _i\) onto \(\mu ^\star \), see [32]. Following [1], we thus only need to show that
since \(T_i \sharp \mu _i = \mu ^\star \) and \(T_i \in \fancyscript{T}\) is a gradient of a convex function. So that \(\mu ^\star \) is a barycenter if and only if
This in turn is equivalent to the relationships
which corresponds to (14). \(\square \)
Proof of Proposition 3
The proof is done in [1] for \(\mu = \mu _j\) for some \(j \in I\), which is supposed to be absolutely continuous. It extends to an arbitrary measure \(\mu \). \(\square \)
Proof of Corollary 1
When using \(\mu \), the uniform and normalized measure on \([0,1]\), with the notation of Proposition (3), one has \(T_i = C_{\mu _i}^+\). This is indeed a classical result for 1-D optimal transport, see for instance [1], Section 6.1. One then recognizes that formula (18) is the same as formula (15). \(\square \)
Proof of Proposition 4
One has
This is equivalent to the fact that for almost all \(\theta \in \mathbb {S}^{d-1}\), one has
\(\square \)
Proof of Proposition 5
Proof of (20). Similarly to the proof of (7), the proof of (20) is obtained by using the following invariance of the Wasserstein distance on \(\varOmega ^d\)
Proof of (21). One has that \(\nu ^\star \in \text {Bar}_{\varOmega ^d}^W( \psi _{s_i,u_i} \sharp \nu , \lambda _i )_{i \in I}\) is equivalent to
Using the property of proposition 2 for \(d=1\), one obtains that
which gives the desired result. \(\square \)
Appendix 2: Proofs of Section 3
Proof of Proposition 6
For all \(g \in \fancyscript{C}_0(\varOmega ^d)\), one has
\(\square \)
Proof of Lemma 1
Proof of (26): For all \(g \in \fancyscript{C}_0(\varOmega ^d)\), one has
Proof of (27): First we notice, using (22), that
which proves
We write \(H = (R^*R)^{-1}\) the filtering operator with kernel \(h^+\). One has, for smooth functions \(f \in \fancyscript{S}(\mathbb {R}^d)\), denoting \(\fancyscript{F}(f)=\hat{f}\),
and hence
This shows, using (51) and (52) that for all \(f \in \fancyscript{D}(\mathbb {R}^d)\),
Proof of (28): the proof is similar to the one of (26). \(\square \)
Proof of Proposition 8
Using Lemma 1, one has
which proves (7) for \(\text {Bar}_{\mathbb {R}^d}^R\). Property (8) for \(\text {Bar}_{\mathbb {R}^d}^R\) is proved similarly using (28). \(\square \)
Proof of Proposition 9
One has
which proves (13) for \(\text {Bar}_{\mathbb {R}^d}^R\). \(\square \)
Appendix 3: Proof of Section 4
Proof of Proposition 10
Property (34) is a re-statement of property (19). Property (35) corresponds to the change of variable \(\nu = R\mu \in \text { Im}(R)\) in (32), which is a bijection thanks to the injectivity of \(R\), see proposition 7. \(\square \)
Proof of Proposition 11
The proof is the same as Proposition 1, replacing the invariance (49) by
where we have used the invariance (50) of the Wasserstein distance on \(\varOmega ^d\). \(\square \)
Proof of Proposition 12
One has,
Thus, for an arbitrary \(\tilde{\mu } \in \fancyscript{M}_1^+(\mathbb {R}^d)\), one has
where the inequality comes from the properties of 1-D Wasserstein barycenters. Integrating the resulting inequality with respect to \(\theta \in \mathbb {S}^{d-1}\) gives
This inequality is an equality if and only for almost all \(\theta \in \mathbb {S}^{d-1}\), one has
so that, using Proposition (7), this corresponds to \(\tilde{\mu } = \varphi _{s^\star ,u^\star } \sharp \mu \). Since the measure \(\tilde{\mu }\) is arbitrary, this gives the desired result. This proves (13) in the case \(\text {Bar}_{\mathbb {R}^d}^S\). \(\square \)
Appendix 4: Proof of Theorem 1
Notations. Without loss of generality, for a fixed \(Y \in \mathbb {R}^{d \times N}\), we study the smoothness of
We have used, for \(x, y \in \mathbb {R}^N\), the shorthand notation
The result of Theorem 1 then follows by summations of such functionals.
We define \(\mathbb {U}(N,d)\) to be vectors of \(\mathbb {R}^{d \times N}\) with distinct entries:
The hypothesis is that \(X \in \mathbb {U}(N,d)\). One has
is a permutation depending on both \(X\) and \(Y\). Note that the permutation involved are not necessarily unique, and are assumed to be arbitrary valid sorting permutations.
For \(X \in \mathbb {R}^{N \times d}\) and \(\varepsilon >0\) we introduce
This is the set of directions for which any perturbation of \(X\) of amplitude smaller than \(\varepsilon \) has a projection with disjoint points.
Overview of the proof. In the following, we thus aim at proving that \(\fancyscript{E}\) is \(C^1\), that
is indeed equal to \(\nabla \fancyscript{E}(X)\), and that this gradient is Lipschitz continuous.
The general strategy of the proof is to split the integration between the directions \(\theta \in \Theta _\varepsilon (X)\), for which we can locally assume that the permutations \(\sigma _\theta \) are constant (see Lemma 2), which in turn defines a smooth quadratic energy, and the remaining directions in \(\Theta _\varepsilon (X)^c\), which are shown to have a negligible contribution to the energy and to the derivative (see Lemma 3).
Preparatory results. The following lemma shows that if \(\theta \in \Theta _\varepsilon (X)\) the permutations \(\sigma _X^\theta \) are stable to small perturbations of \(X\).
Lemma 2
Let \(X \in \mathbb {U}(N,d)\). For all \(\theta \in \Theta _\varepsilon (X)\), for all \(\delta \) with \(|| \delta ||_{\mathbb {R}^{N \times d}} \leqslant \varepsilon \), the permutation \(\sigma _{X+\delta }^\theta \) that sorts \(( \langle X_i+\delta _i,\,\theta \rangle )_i\) is uniquely defined and satisfies \(\sigma _{X+\delta }^\theta = \sigma _X^\theta \).
Proof
If one has \(\sigma _{X+\delta }^\theta \ne \sigma _X^\theta \), then necessarily there exists some \(t \in [0,1]\) such that \(\sigma _{X+t\delta }^\theta \) is not uniquely defined, which is equivalent to \(X_\theta +t\delta _\theta \) not being in \(\mathbb {U}(N,1)\). Since \(|| t \delta ||_{\mathbb {R}^{N \times d}} \leqslant \varepsilon \), this shows that \(\theta \notin \Theta _\varepsilon (X)\). \(\square \)
In order to prove Theorem 1, we need the following lemma.
Lemma 3
For \(X \in \mathbb {U}(N, d)\), one has
Proof
One has \(X_\theta + \delta _\theta \notin \mathbb {U}(N,1)\) if and only there exists a pair of points \(u=X_i+\delta _i\) and \(v=X_j+\delta _j\) with \(i \ne j\) such that
Note that \(A(u,v)\) is a great circle of the sphere \(\mathbb {S}^{d-1}\).
One can thus covers \(\Theta _\varepsilon (X)^c\) using the union of all such circles \(A(u,v)\), which shows
Note that the geodesic distance \(d\) on the sphere \(\mathbb {S}^{d-1}\) between two circles is equal to the angle between the normal to the planes of the circles
where \(|| w ||\leqslant 2\). As \(\varepsilon \rightarrow 0\), after some computations, one has the following asymptotic decay of the angle
and thus \(d(A(u,v),A(x,y)) \leqslant C \varepsilon \) for some constant \(C\). This proves that \(\forall \,u,v\), one has
for some constant \(C>0\), where
One thus has
The volume of the spherical band \(B_{C\varepsilon }(x,y)\) of width \(C\varepsilon \) is proportional to \(\varepsilon \), and thus Vol\((A_\varepsilon (x,y)) = O(\varepsilon )\). Since \(\Theta _\varepsilon (X)^c\) is a finite union of such sets, one obtains the result. \(\square \)
Proof of continuity. For each \(\theta \), the function \(\fancyscript{E}_\theta \) is continuous as a minimum of continuous functions. The function \(\fancyscript{E}\) being an integral of \(\fancyscript{E}_\theta \) on a compact set \(\mathbb {S}^{d-1}\), it is thus continuous.
Proof of differentiability. Let \(\delta \in \mathbb {R}^{N \times d}\) and \(\varepsilon = || \delta ||_{\mathbb {R}^{N \times d}}\). The definition of the Wasserstein distance reads
For all \(\theta \in \Theta _\varepsilon (X)\), thanks to Lemma 2, \(\sigma _{X+\delta }^\theta = \sigma _{X}^\theta \). One can thus compute the variation of the 1-D Wasserstein distance with respect to \(\delta \) as
Note that the fact that \(\sigma _Y^{\theta }\) might not be uniquely defined has no impact on the value of (55). One thus has
where
Note that in the expression of \(B(\delta )\) the permutation \(\sigma _\theta \) involved in \(\tilde{\nabla } \fancyscript{E}_\theta (X)\) is not necessary unique, and can be chosen arbitrarily.
One has,
which implies, using Lemma 3
Since \((\theta ,X) \mapsto \fancyscript{E}_\theta (X)\) is continuous and defined on a compact set, it is uniformly continuous, and thus
where \(C(\delta ) \rightarrow 0\) where \(\delta \rightarrow 0\). This shows that
Putting together (56) and (57) leads to
which shows that \(\fancyscript{E}\) is differentiable with \(\nabla \fancyscript{E}= \tilde{\nabla } \fancyscript{E}\).
Proof of Lipschitzianity of the gradient. For all \(\theta \in \varTheta _0(X)\), \(\nabla \fancyscript{E}_\theta (X)\) is continuous and uniformly bounded, and thus \(\nabla \fancyscript{E}\) is continuous. One has, for \(\delta \in \mathbb {R}^{N \times d}\), and denoting \(\varepsilon =|| \delta ||\),
One has
whereas
where \(\tilde{\sigma }_\theta = \sigma _{Y_\theta } \circ \,\, \sigma _{X_\theta + \delta _\theta }^{-1}\). Using Lemma (3), one has for some constant \(C>0\), \(\text {Vol}(\Theta _\varepsilon (X)^c) \leqslant C || \delta ||_{\mathbb {R}^{N \times d}}\) and hence
which shows that \(\nabla \fancyscript{E}\) is \((1 + 2 C || Y ||_{\mathbb {R}^{N \times d}})\)-Lipschitz continuous.
Rights and permissions
About this article
Cite this article
Bonneel, N., Rabin, J., Peyré, G. et al. Sliced and Radon Wasserstein Barycenters of Measures. J Math Imaging Vis 51, 22–45 (2015). https://doi.org/10.1007/s10851-014-0506-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10851-014-0506-3