Skip to main content

Advertisement

Log in

Procrustes Analysis with Deformations: A Closed-Form Solution by Eigenvalue Decomposition

  • Published:
International Journal of Computer Vision Aims and scope Submit manuscript

Abstract

Generalized Procrustes Analysis (GPA) is the problem of bringing multiple shapes into a common reference by estimating transformations. GPA has been extensively studied for the Euclidean and affine transformations. We introduce GPA with deformable transformations, which forms a much wider and difficult problem. We specifically study a class of transformations called the Linear Basis Warps, which contains the affine transformation and most of the usual deformation models, such as the Thin-Plate Spline (TPS). GPA with deformations is a nonconvex underconstrained problem. We resolve the fundamental ambiguities of deformable GPA using two shape constraints requiring the eigenvalues of the shape covariance. These eigenvalues can be computed independently as a prior or posterior. We give a closed-form and optimal solution to deformable GPA based on an eigenvalue decomposition. This solution handles regularization, favoring smooth deformation fields. It requires the transformation model to satisfy a fundamental property of free-translations, which asserts that the model can implement any translation. We show that this property fortunately holds true for most common transformation models, including the affine and TPS models. For the other models, we give another closed-form solution to GPA, which agrees exactly with the first solution for models with free-translation. We give pseudo-code for computing our solution, leading to the proposed DefGPA method, which is fast, globally optimal and widely applicable. We validate our method and compare it to previous work on six diverse 2D and 3D datasets, with special care taken to choose the hyperparameters from cross-validation.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7

Similar content being viewed by others

References

  • Absil, P.-A., Mahony, R., & Sepulchre, R. (2009). Optimization algorithms on matrix manifolds. Princeton University Press.

  • Allen, B., Curless, B., & Popović, Z. (2003). The space of human body shapes: Reconstruction and parameterization from range scans. ACM Transactions on Graphics (TOG), 22(3), 587–594.

    Article  Google Scholar 

  • Anguelov, D., Srinivasan, P., Koller, D., Thrun, S., Rodgers, J., & Davis, J. (2005). SCAPE: Shape completion and animation of people. In ACM SIGGRAPH 2005 papers (pp. 408–416).

  • Arun, K. S., Huang, T. S., & Blostein, S. D. (1987). Least-squares fitting of two 3-d point sets. IEEE Transactions on Pattern Analysis and Machine Intelligence, 5, 698–700.

    Article  Google Scholar 

  • Bai, F., Vidal-Calleja, T., & Grisetti, G. (2021). Sparse pose graph optimization in cycle space. IEEE Transactions on Robotics, 37(5), 1381–1400. https://doi.org/10.1109/TRO.2021.3050328

    Article  Google Scholar 

  • Bartoli, A. (2006). Towards 3d motion estimation from deformable surfaces. In Proceedings 2006 IEEE international conference on robotics and automation, 2006. ICRA 2006. (pp. 3083–3088). IEEE.

  • Bartoli, A., Perriollat, M., & Chambon, S. (2010). Generalized thin-plate spline warps. International Journal of Computer Vision, 88(1), 85–110.

    Article  Google Scholar 

  • Bartoli, A., Pizarro, D., & Loog, M. (2013). Stratified generalized procrustes analysis. International Journal of Computer Vision, 101(2), 227–253.

    Article  MathSciNet  Google Scholar 

  • Benjemaa, R., & Schmitt, F. (1998). A solution for the registration of multiple 3d point sets using unit quaternions. In European conference on computer vision (pp. 34–50). Springer.

  • Bilic, P., Christ, P. F., Vorontsov, E., Chlebus, G., Chen, H., Dou, Q., Fu, C.-W., Han, X., Heng, P.-A., & Hesser, J., et al. (2019). The liver tumor segmentation benchmark (lits). arXiv preprint arXiv:1901.04056.

  • Birtea, P., Caşu, I., & Comănescu, D. (2019). First order optimality conditions and steepest descent algorithm on orthogonal Stiefel manifolds. Optimization Letters, 13(8), 1773–1791.

    Article  MathSciNet  Google Scholar 

  • Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.

  • Bookstein, F. L. (1989). Principal warps: Thin-plate splines and the decomposition of deformations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(6), 567–585.

    Article  Google Scholar 

  • Bookstein, F. L. (1997). Morphometric tools for landmark data: geometry and biology. Cambridge University Press.

  • Bouix, S., Pruessner, J. C., Collins, D. L., & Siddiqi, K. (2005). Hippocampal shape analysis using medial surfaces. Neuroimage, 25(4), 1077–1089.

    Article  Google Scholar 

  • Bro-Nielsen, M., & Gramkow, C. (1996). Fast fluid registration of medical images. In International conference on visualization in biomedical computing (pp. 265–276). Springer.

  • Brockett, R. W. (1989). Least squares matching problems. Linear Algebra and its Applications, 122, 761–777.

    Article  MathSciNet  Google Scholar 

  • Brown, B.J., & Rusinkiewicz, S. (2007). Global non-rigid alignment of 3-d scans. In ACM SIGGRAPH 2007 papers (pp. 21–es).

  • Christensen, G., & He, J. (2001). Consistent nonlinear elastic image registration. In Proceedings IEEE workshop on mathematical methods in biomedical image analysis (MMBIA 2001) (pp. 37–43). IEEE.

  • Cootes, T. F., Taylor, C. J., Cooper, D. H., & Graham, J. (1995). Active shape models-their training and application. Computer Vision and Image Understanding, 61(1), 38–59.

    Article  Google Scholar 

  • Davis, T. A. (2006). Direct methods for sparse linear systems. SIAM

  • Dryden, I. L., & Mardia, K. V. (2016). Statistical shape analysis: With applications in R (Vol. 995). Wiley.

  • Duchon, J. (1976). Interpolation des fonctions de deux variables suivant le principe de la flexion des plaques minces. Revue française d’automatique, informatique, recherche opérationnelle. Analyse numérique, 10(R3), 5–12.

  • Eggert, D. W., Lorusso, A., & Fisher, R. B. (1997). Estimating 3-d rigid body transformations: A comparison of four major algorithms. Machine Vision and Applications, 9(5–6), 272–290.

    Article  Google Scholar 

  • Fletcher, P. T., Lu, C., Pizer, S. M., & Joshi, S. (2004). Principal geodesic analysis for the study of nonlinear statistics of shape. IEEE Transactions on Medical Imaging, 23(8), 995–1005.

    Article  Google Scholar 

  • Fornefett, M., Rohr, K., & Stiehl, H. S. (2001). Radial basis functions with compact support for elastic registration of medical images. Image and Vision Computing, 19(1–2), 87–96.

    Article  Google Scholar 

  • Freifeld, O., & Black, M. J. (2012). Lie bodies: A manifold representation of 3d human shape. In European conference on computer vision (pp. 1–14). Springer.

  • Gallardo, M., Collins, T., & Bartoli, A. (2017). Dense non-rigid structure-from-motion and shading with unknown albedos. In Proceedings of the IEEE international conference on computer vision (pp. 3884–3892).

  • Golub, G., & Pereyra, V. (2003). Separable nonlinear least squares: The variable projection method and its applications. Inverse Problems, 19(2), R1.

    Article  MathSciNet  Google Scholar 

  • Goodall, C. (1991). Procrustes methods in the statistical analysis of shape. Journal of the Royal Statistical Society: Series B (Methodological), 53(2), 285–321.

    MathSciNet  MATH  Google Scholar 

  • Gower, J. C. (1975). Generalized procrustes analysis. Psychometrika, 40(1), 33–51.

    Article  MathSciNet  Google Scholar 

  • Green, B. F. (1952). The orthogonal approximation of an oblique structure in factor analysis. Psychometrika, 17(4), 429–440.

    Article  MathSciNet  Google Scholar 

  • Hardy, G., Collection, K. M. R., Littlewood, J., Pólya, G., Pólya, G., & Littlewood, D. (1952). Inequalities. Cambridge Mathematical Library: Cambridge University Press. ISBN 9780521358804.

  • Horn, B. K. (1987). Closed-form solution of absolute orientation using unit quaternions. JOSA A, 4(4), 629–642.

    Article  Google Scholar 

  • Horn, B. K., Hilden, H. M., & Negahdaripour, S. (1988). Closed-form solution of absolute orientation using orthonormal matrices. JOSA A, 5(7), 1127–1135.

    Article  MathSciNet  Google Scholar 

  • Iserles, A., Munthe-Kaas, H. Z., Nørsett, S. P., & Zanna, A. (2000). Lie-group methods. Acta numerica, 9, 215–365.

    Article  MathSciNet  Google Scholar 

  • Jermyn, I. H., Kurtek, S., Klassen, E., & Srivastava, A. (2012). Elastic shape matching of parameterized surfaces using square root normal fields. In European conference on computer vision (pp. 804–817). Springer.

  • Joshi, S. H., Klassen, E., Srivastava, A., & Jermyn, I. (2007). A novel representation for Riemannian analysis of elastic curves in rn. In 2007 IEEE conference on computer vision and pattern recognition (pp. 1–7). IEEE.

  • Kanatani, K.-I. (1994). Analysis of 3-d rotation fitting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(5), 543–549.

    Article  Google Scholar 

  • Kendall, D. G. (1984). Shape manifolds, procrustean metrics, and complex projective spaces. Bulletin of the London Mathematical Society, 16(2), 81–121.

    Article  MathSciNet  Google Scholar 

  • Kendall, D. G., Barden, D., Carne, T. K., & Le, H. (2009). Shape and shape theory, (Vol. 500). Wiley.

  • Kent, J. T. (1994). The complex Bingham distribution and shape analysis. Journal of the Royal Statistical Society: Series B (Methodological), 56(2), 285–299.

    MathSciNet  MATH  Google Scholar 

  • Kilian, M., Mitra, N. J., & Pottmann, H. (2007). Geometric modeling in shape space. In ACM SIGGRAPH 2007 papers (pp. 64–es).

  • Kim, Y. J., Chung, S.-T., Kim, B., & Cho, S. (2008). 3d face modeling based on 3D dense morphable face shape model. International Journal of Computer Science and Engineering, 2(3), 107–113.

    Google Scholar 

  • Krishnan, S., Lee, P. Y., Moore, J. B., & Venkatasubramanian, S., et al. (2005). Global registration of multiple 3d point sets via optimization-on-a-manifold. In Symposium on geometry processing (pp. 187–196).

  • Kristof, W., & Wingersky, B. (1971). A generalization of the orthogonal procrustes rotation procedure to more than two matrices. In Proceedings of the annual convention of the American Psychological Association. American Psychological Association.

  • Kurtek, S., Klassen, E., Ding, Z., & Srivastava, A. (2010). A novel Riemannian framework for shape analysis of 3d objects. In 2010 IEEE computer society conference on computer vision and pattern recognition (pp. 1625–1632). IEEE.

  • Kurtek, S., Klassen, E., Gore, J. C., Ding, Z., & Srivastava, A. (2011). Elastic geodesic paths in shape space of parameterized surfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(9), 1717–1730.

    Article  Google Scholar 

  • Laga, H. (2018). A survey on non-rigid 3D shape analysis.

  • Laga, H., Xie, Q., Jermyn, I. H., & Srivastava, A. (2017). Numerical inversion of SRNF maps for elastic shape analysis of genus-zero surfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12), 2451–2464.

    Article  Google Scholar 

  • Matei, B., & Meer, P. (1999). Optimal rigid motion estimation and performance evaluation with bootstrap. In Proceedings. 1999 IEEE computer society conference on computer vision and pattern recognition (cat. no PR00149) (Vol. 1, pp. 339–345). IEEE.

  • Meyer, C. D. (2000). Matrix analysis and applied linear algebra (Vol. 71). SIAM

  • Ohta, N., & Kanatani, K. (1998). Optimal estimation of three-dimensional rotation and reliability evaluation. IEICE Transactions on Information and Systems, 81(11), 1247–1252.

    Google Scholar 

  • Osher, S., & Fedkiw, R. P. (2003). Level set methods and dynamic implicit surfaces (Vol. 153). Springer.

  • Rohlf, F. J., & Slice, D. (1990). Extensions of the Procrustes method for the optimal superimposition of landmarks. Systematic Biology, 39(1), 40–59.

    Google Scholar 

  • Rosen, D. M., Carlone, L., Bandeira, A. S., & Leonard, J. J. (2019). Se-sync: A certifiably correct algorithm for synchronization over the special Euclidean group. The International Journal of Robotics Research, 38(2–3), 95–125.

    Article  Google Scholar 

  • Rueckert, D., Sonoda, L. I., Hayes, C., Hill, D. L., Leach, M. O., & Hawkes, D. J. (1999). Nonrigid registration using free-form deformations: Application to breast MR images. IEEE Transactions on Medical Imaging, 18(8), 712–721.

    Article  Google Scholar 

  • Schönemann, P. H. (1966). A generalized solution of the orthogonal procrustes problem. Psychometrika, 31(1), 1–10.

  • Song, J., Bai, F., Zhao, L., Huang, S., & Xiong, R. (2020). Efficient two step optimization for large embedded deformation graph based slam. In 2020 IEEE international conference on robotics and automation (ICRA), (pp. 9419–9425). IEEE.

  • Sumner, R. W., Schmid, J., & Pauly, M. (2007). Embedded deformation for shape manipulation. In ACM SIGGRAPH 2007 papers (pp. 80–es).

  • Szeliski, R., & Coughlan, J. (1997). Spline-based image registration. International Journal of Computer Vision, 22(3), 199–218.

    Article  Google Scholar 

  • Ten Berge, J. M. (1977). Orthogonal procrustes rotation for two or more matrices. Psychometrika, 42(2), 267–276.

    Article  MathSciNet  Google Scholar 

  • Tomasi, C., & Kanade, T. (1992). Shape and motion from image streams under orthography: A factorization method. International Journal of Computer Vision, 9(2), 137–154.

    Article  Google Scholar 

  • Umeyama, S. (1991). Least-squares estimation of transformation parameters between two point patterns. IEEE Transactions on Pattern Analysis& Machine Intelligence, 4, 376–380.

    Article  Google Scholar 

  • Walker, M. W., Shao, L., & Volz, R. A. (1991). Estimating 3-d location parameters using dual number quaternions. CVGIP: Image Understanding, 54(3), 358–367.

    Article  Google Scholar 

  • Wen, G., Wang, Z., Xia, S., & Zhu, D. (2006). Least-squares fitting of multiple m-dimensional point sets. The Visual Computer, 22(6), 387–398.

    Article  Google Scholar 

  • Williams, J., & Bennamoun, M. (2001). Simultaneous registration of multiple corresponding point sets. Computer Vision and Image Understanding, 81(1), 117–142.

    Article  Google Scholar 

  • Younes, L. (2012). Spaces and manifolds of shapes in computer vision: An overview. Image and Vision Computing, 30(6–7), 389–397.

    Article  Google Scholar 

  • Younes, L., Michor, P. W., Shah, J. M., & Mumford, D. B. (2008). A metric on shape space with explicit geodesics. Rendiconti Lincei-Matematica e Applicazioni, 19(1), 25–57.

    Article  MathSciNet  Google Scholar 

Download references

Acknowledgements

This work was supported by ANR via the TOPACS project (ANR-19-CE45-0015). We thank the authors of the public datasets which we could use in our experiments. We appreciate the valuable comments of the anonymous reviewers that have helped improving the quality of the manuscript.

Author information

Authors and Affiliations

Authors

Additional information

Communicated by Xavier Pennec.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

The Thin-Plate Spline

1.1 General Mapping Definition

The Thin-Plate Spline (TPS) is a nonlinear mapping driven by a set of chosen control centers. Let \({\varvec{c}}_1, {\varvec{c}}_2, \dots , {\varvec{c}}_l\), with \({\varvec{c}}_i \in {\mathbb {R}}^{d}\) \((d=2,3)\), be the l control centers of TPS. We present the TPS model in 3D; the 2D case can be analogously derived by removing the z coordinate and letting the corresponding TPS kernel function be \(\phi \left( r\right) = r^2 \log \left( r^2\right) \) (Bookstein 1997). Strictly speaking, the TPS only occurs in 2D, but is part of a general family of functions called the harmonic splines. For simplicity, we call the function TPS independently of the dimension.

We denote a point in 3D by \({\varvec{p}}^{\intercal } = \left[ x, y,z\right] \). The TPS function \(\tau ({\varvec{p}})\) is a \({\mathbb {R}}^3 \rightarrowtail {\mathbb {R}}\) mapping:

$$\begin{aligned} \tau ({\varvec{p}}) = \sum _{k=1}^l w_k\, \phi \left( \Vert {\varvec{p}} - {\varvec{c}}_k \Vert \right) + a_1 x + a_2 y + a_3 z + a_4 \end{aligned}$$

where \(\phi \left( \cdot \right) \) is the TPS kernel function. In 3D, we have \(\phi \left( r\right) = - \vert r \vert \).

We collect the parameters in a single vector \(\varvec{\eta }^{\intercal } = \left[ {\varvec{w}}^{\intercal }, {\varvec{a}}^{\intercal }\right] \), with the coefficient in the nonlinear part as \({\varvec{w}}^{\intercal } = \left[ w_1, w_2, \cdots , w_l\right] \), and that in the linear part as \({\varvec{a}}^{\intercal } = \left[ a_1, a_2, a_3, a_4\right] \). Let \(\tilde{\varvec{p}}^{\intercal } = \left[ x, y, z, 1\right] \). Collect the nonlinear components in a vector:

$$\begin{aligned} \varvec{\phi }_{{\varvec{p}}}^{\intercal } = \left[ \phi \left( \Vert {\varvec{p}} - {\varvec{c}}_1 \Vert \right) ,\, \phi \left( \Vert {\varvec{p}} - {\varvec{c}}_2 \Vert \right) ,\, \cdots ,\, \phi \left( \Vert {\varvec{p}} - {\varvec{c}}_l \Vert \right) \right] . \end{aligned}$$

Then the TPS function compacts in the vector form as:

$$\begin{aligned} \tau ({\varvec{p}}) = \varvec{\phi }_{{\varvec{p}}}^{\intercal } {\varvec{w}} + \tilde{\varvec{p}}^{\intercal } {\varvec{a}} . \end{aligned}$$

We define \(\tilde{\varvec{c}}_i^{\intercal } = \left[ {\varvec{c}}_i^{\intercal }, 1\right] \) \((i \in \left[ 1:l \right] )\), and \(\tilde{\varvec{C}} = \left[ \tilde{\varvec{c}}_1, \tilde{\varvec{c}}_2, \dots , \tilde{\varvec{c}}_l\right] \). The transformation parameters in the nonlinear part of the TPS model are constrained with respect to the control centers, i.e., \( \tilde{\varvec{C}} {\varvec{w}} = {\varvec{0}} \). To solve the parameters of TPS, the l control centers \({\varvec{c}}_1, {\varvec{c}}_2, \dots , {\varvec{c}}_l\), are mapped to l scalar outputs \(y_1, y_2, \dots , y_l\), by \(\tau ({\varvec{c}}_k) = y_k\) \((k \in \left[ 1:l \right] )\). We collect the outputs in a vector \({\varvec{y}}^{\intercal } = \left[ y_1, y_2, \dots , y_l\right] \).

1.2 Standard Form as a Linear Regression Model

While there are \(l+4\) parameters in \(\varvec{\eta }\), we will show in the following that the 4 constraints in \( \tilde{\varvec{C}} {\varvec{w}} = {\varvec{0}} \) can be absorbed in a reparameterization of the TPS model, by using l parameters only.

To proceed, we define a matrix:

$$\begin{aligned} {\varvec{K}}_{\lambda } = \begin{bmatrix} \lambda &{} \phi _{12} &{} \dots &{} \phi _{1l} \\ \phi _{21} &{} \lambda &{} \dots &{} \phi _{2l} \\ \vdots &{} \vdots &{} &{} \vdots \\ \phi _{l1} &{} \phi _{l2} &{} \dots &{} \lambda \end{bmatrix} , \end{aligned}$$

where we use the shorthand \( \phi _{ij} = \phi \left( \Vert {\varvec{c}}_i - {\varvec{c}}_j \Vert \right) \). The original diagonal elements of \({\varvec{K}}_{\lambda }\) are all zeros, i.e., \(\phi _{11} = \phi _{22} = \cdots = \phi _{ll} = 0\). \(\lambda \) is an adjustable parameter acting as an internal smoothing parameter to improve the conditioning of the matrix.

The l TPS mappings of control centers \(\tau ({\varvec{c}}_k) = y_k\) \((k \in \left[ 1:l \right] )\), and the 4 parameter constraints \( \tilde{\varvec{C}} {\varvec{w}} = {\varvec{0}} \) can be collected in the following matrix form:

$$\begin{aligned} \begin{bmatrix} {\varvec{K}}_{\lambda } &{} \tilde{\varvec{C}}^{\intercal } \\ \tilde{\varvec{C}} &{} {\varvec{O}} \end{bmatrix} \begin{bmatrix} {\varvec{w}} \\ {\varvec{a}} \end{bmatrix} = \begin{bmatrix} {\varvec{y}} \\ {\varvec{0}} \end{bmatrix} , \end{aligned}$$

which admits a closed-form solution:

$$\begin{aligned} \begin{bmatrix} {\varvec{w}} \\ {\varvec{a}} \end{bmatrix} = \varvec{{\mathcal {E}}}_{\lambda } {\varvec{y}} , \end{aligned}$$

with the matrix \(\varvec{{\mathcal {E}}}_{\lambda } \in {\mathbb {R}}^{\left( l+4\right) \times l}\) defined as:

$$\begin{aligned} \varvec{{\mathcal {E}}}_{\lambda } = \begin{bmatrix} {\varvec{K}}_{\lambda }^{-1} - {\varvec{K}}_{\lambda }^{-1} \tilde{\varvec{C}}^{\intercal } \left( \tilde{\varvec{C}} {\varvec{K}}_{\lambda }^{-1} \tilde{\varvec{C}}^{\intercal }\right) ^{-1} \tilde{\varvec{C}} {\varvec{K}}_{\lambda }^{-1} \\ \left( \tilde{\varvec{C}} {\varvec{K}}_{\lambda }^{-1} \tilde{\varvec{C}}^{\intercal }\right) ^{-1} \tilde{\varvec{C}} {\varvec{K}}_{\lambda }^{-1} \end{bmatrix} . \end{aligned}$$

Therefore the TPS function writes:

$$\begin{aligned} \tau ({\varvec{p}}) = \begin{bmatrix} \varvec{\phi }_{{\varvec{p}}} \\ \tilde{\varvec{p}} \end{bmatrix}^{\intercal } \begin{bmatrix} {\varvec{w}} \\ {\varvec{a}} \end{bmatrix} = \begin{bmatrix} \varvec{\phi }_{{\varvec{p}}} \\ \tilde{\varvec{p}} \end{bmatrix}^{\intercal } \varvec{{\mathcal {E}}}_{\lambda } {\varvec{y}} = {\varvec{y}}^{\intercal } \varvec{{\mathcal {E}}}_{\lambda }^{\intercal } \begin{bmatrix} \varvec{\phi }_{{\varvec{p}}} \\ \tilde{\varvec{p}} \end{bmatrix} . \end{aligned}$$

Let \(\varvec{\beta } ({\varvec{p}}) = \varvec{{\mathcal {E}}}_{\lambda }^{\intercal } \begin{bmatrix} \varvec{\phi }_{{\varvec{p}}} \\ \tilde{\varvec{p}} \end{bmatrix}\). Then \( \tau ({\varvec{p}}) = {\varvec{y}}^{\intercal } \varvec{\beta } ({\varvec{p}}) \).

The 3D TPS warp model \({\mathcal {W}}\left( {\varvec{p}}\right) \,:\, {\mathbb {R}}^3 \rightarrowtail {\mathbb {R}}^3\) is obtained by stacking 3 TPS functions together, one for each dimension:

$$\begin{aligned} {\mathcal {W}} ({\varvec{p}}) = \begin{bmatrix} \tau _{x}({\varvec{p}}) \\ \tau _{y}({\varvec{p}}) \\ \tau _{z}({\varvec{p}}) \end{bmatrix} = \begin{bmatrix} {\varvec{y}}_{x}^{\intercal }\, \varvec{\beta } ({\varvec{p}}) \\ {\varvec{y}}_{y}^{\intercal }\, \varvec{\beta } ({\varvec{p}}) \\ {\varvec{y}}_{z}^{\intercal }\, \varvec{\beta } ({\varvec{p}}) \end{bmatrix} = \left[ {\varvec{y}}_{x}, {\varvec{y}}_{y}, {\varvec{y}}_{z} \right] ^{\intercal } \varvec{\beta } ({\varvec{p}}) .\nonumber \\ \end{aligned}$$
(33)

This is the standard form of the general warp model defined in Eq. (16), with \({\varvec{W}} = \left[ {\varvec{y}}_{x}, {\varvec{y}}_{y}, {\varvec{y}}_{z} \right] \in {\mathbb {R}}^{l \times 3}\) being a set of new transformation parameters that are free of constraints.

1.3 Regularization by Bending Energy Matrix

Let \(\varvec{\bar{{\mathcal {E}}}}_{\lambda } = {\varvec{K}}_{\lambda }^{-1} - {\varvec{K}}_{\lambda }^{-1} \tilde{\varvec{C}}^{\intercal } \left( \tilde{\varvec{C}} {\varvec{K}}_{\lambda }^{-1} \tilde{\varvec{C}}^{\intercal }\right) ^{-1} \tilde{\varvec{C}} {\varvec{K}}_{\lambda }^{-1}\). The \(l \times l\) matrix \(\varvec{\bar{{\mathcal {E}}}}_{\lambda }\) is the bending energy matrix of the TPS warp (Bookstein 1989), which satisfies \( \varvec{\bar{{\mathcal {E}}}}_{\lambda } {\varvec{K}}_{\lambda } \varvec{\bar{{\mathcal {E}}}}_{\lambda } = \varvec{\bar{{\mathcal {E}}}}_{\lambda } \). The bending energy matrix \(\varvec{\bar{{\mathcal {E}}}}_{\lambda }\) is positive semidefinite, with rank \(l - 4\). The bending energy of a single TPS function is proportional to \({\varvec{w}}^{\intercal } {\varvec{K}}_{\lambda } {\varvec{w}}\), with \({\varvec{w}} = \varvec{\bar{{\mathcal {E}}}}_{\lambda } {\varvec{y}}\), which is given by:

$$\begin{aligned} \int _{{\mathbb {R}}^d} \Vert \frac{\partial ^2}{\partial {\varvec{p}}^2} \tau ({\varvec{p}}) \Vert _F^2 \, d {\varvec{p}} \propto {\varvec{w}}^{\intercal } {\varvec{K}}_{\lambda } {\varvec{w}} = {\varvec{y}}^{\intercal } \varvec{\bar{{\mathcal {E}}}}_{\lambda } {\varvec{y}} . \end{aligned}$$

For the 3D TPS warp model, the overall bending energy is:

$$\begin{aligned}&\, \int _{{\mathbb {R}}^d} \left\| \frac{\partial ^2}{\partial {\varvec{p}}^2} {\mathcal {W}} ({\varvec{p}}, {\varvec{W}}) \right\| _F^2 \, d {\varvec{p}}\\&\quad = \, \int _{{\mathbb {R}}^d} \left( \left\| \frac{\partial ^2}{\partial {\varvec{p}}^2} \tau _x ({\varvec{p}}) \right\| _F^2 + \left\| \frac{\partial ^2}{\partial {\varvec{p}}^2} \tau _y ({\varvec{p}}) \right\| _F^2\right. \\ {}&\qquad \left. + \left\| \frac{\partial ^2}{\partial {\varvec{p}}^2} \tau _z ({\varvec{p}}) \right\| _F^2 \right) d {\varvec{p}}\\&\quad = \, {\varvec{y}}_x^{\intercal } \varvec{\bar{{\mathcal {E}}}}_{\lambda } {\varvec{y}}_x + {\varvec{y}}_y^{\intercal } \varvec{\bar{{\mathcal {E}}}}_{\lambda } {\varvec{y}}_y + {\varvec{y}}_z^{\intercal } \varvec{\bar{{\mathcal {E}}}}_{\lambda } {\varvec{y}}_z\\&\quad = \, \mathbf {tr}\left( {\varvec{W}}^{\intercal } \varvec{\bar{{\mathcal {E}}}}_{\lambda } {\varvec{W}} \right) = \Vert \sqrt{ \varvec{\bar{{\mathcal {E}}}}_{\lambda }} {\varvec{W}} \Vert _F^2 \end{aligned}$$

where \(\sqrt{ \varvec{\bar{{\mathcal {E}}}}_{\lambda } }\) is the matrix square root of \(\varvec{\bar{{\mathcal {E}}}}_{\lambda }\). Then the bending energy of the TPS warp writes into the standard form \( {\mathcal {R}} ({\varvec{W}}) = \Vert \sqrt{ \varvec{\bar{{\mathcal {E}}}}_{\lambda } } {\varvec{W}} \Vert _F^2 \), given in Eq. (18).

1.4 The Thin-Plate Spline Satisfies \(\varvec{{\mathcal {B}}}\left( \cdot \right) ^{\intercal } {\varvec{x}} = {\varvec{1}}\) and \({\varvec{Z}} {\varvec{x}} = {\varvec{0}}\)

Let an arbitrary point cloud \({\varvec{D}}\) be \( {\varvec{D}} = \left[ {\varvec{p}}_1,\, {\varvec{p}}_2, \dots , {\varvec{p}}_m \right] \). Then by Eq. (33) and the point-cloud operation defined in Sect. 5.1.2, we write:

$$\begin{aligned} \varvec{{\mathcal {B}}}({\varvec{D}})&= \left[ \varvec{\beta }({\varvec{p}}_1),\, \varvec{\beta }({\varvec{p}}_2), \dots , \varvec{\beta }({\varvec{p}}_m) \right] \\&= \varvec{{\mathcal {E}}}_{\lambda }^{\intercal } \begin{bmatrix} \varvec{\phi }_{{\varvec{p}}_1} &{} \varvec{\phi }_{{\varvec{p}}_2} &{} \dots &{} \varvec{\phi }_{{\varvec{p}}_m} \\ \tilde{\varvec{p}}_1 &{} \tilde{\varvec{p}}_2 &{} \dots &{} \tilde{\varvec{p}}_m \\ \end{bmatrix}\\&= \varvec{{\mathcal {E}}}_{\lambda }^{\intercal } \begin{bmatrix} {\varvec{M}}_{{\varvec{D}}} \\ {\varvec{1}}^{\intercal } \end{bmatrix} . \end{aligned}$$

Hence \(\varvec{{\mathcal {B}}}({\varvec{D}})^{\intercal } \) has structure:

$$\begin{aligned} \varvec{{\mathcal {B}}}({\varvec{D}})^{\intercal } = \begin{bmatrix} {\varvec{M}}_{{\varvec{D}}}^{\intercal }&{\varvec{1}} \end{bmatrix} \varvec{{\mathcal {E}}}_{\lambda } . \end{aligned}$$

1.4.1 \(\varvec{{\mathcal {B}}}(\cdot )^{\intercal } {\varvec{x}} = {\varvec{1}}\)

It suffices to show such an \({\varvec{x}}\) satisfies \( \varvec{{\mathcal {E}}}_{\lambda } {\varvec{x}} = \begin{bmatrix} {\varvec{0}} \\ 1 \end{bmatrix} \). Because:

$$\begin{aligned} \varvec{{\mathcal {E}}}_{\lambda } \tilde{\varvec{C}}^{\intercal } = \begin{bmatrix} {\varvec{O}} \\ {\varvec{I}} \end{bmatrix} \Longrightarrow \varvec{{\mathcal {E}}}_{\lambda } \tilde{\varvec{C}}^{\intercal } \begin{bmatrix} {\varvec{0}} \\ 1 \end{bmatrix} = \begin{bmatrix} {\varvec{O}} \\ {\varvec{I}} \end{bmatrix} \begin{bmatrix} {\varvec{0}} \\ 1 \end{bmatrix} = \begin{bmatrix} {\varvec{0}} \\ 1 \end{bmatrix} , \end{aligned}$$

such an \({\varvec{x}}\) indeed exists, which is \({\varvec{x}} = \tilde{\varvec{C}}^{\intercal } \begin{bmatrix} {\varvec{0}} \\ 1 \end{bmatrix} \).

1.4.2 \({\varvec{Z}} {\varvec{x}} = {\varvec{0}}\)

We need to show:

$$\begin{aligned} {\varvec{Z}} {\varvec{x}} = {\varvec{0}} \iff \Vert {\varvec{Z}} {\varvec{x}} \Vert ^2 = {\varvec{x}}^{\intercal } {\varvec{Z}}^{\intercal } {\varvec{Z}} {\varvec{x}} = 0 . \end{aligned}$$

The proof is immediate by noting that \({\varvec{Z}}^{\intercal } {\varvec{Z}} = \varvec{\bar{{\mathcal {E}}}}_{\lambda } \), and \( \varvec{\bar{{\mathcal {E}}}}_{\lambda } {\varvec{x}} = \varvec{\bar{{\mathcal {E}}}}_{\lambda } \tilde{\varvec{C}}^{\intercal } \begin{bmatrix} {\varvec{0}} \\ 1 \end{bmatrix} = {\varvec{O}} \begin{bmatrix} {\varvec{0}} \\ 1 \end{bmatrix} = {\varvec{0}} \).

1.5 The Thin-Plate Spline is Invariant to the Coordinate Transformation

Given a datum shape \({\varvec{D}}\), we apply the same rigid transformation \(({\varvec{R}},\, {\varvec{t}})\) to each datum point \({\varvec{p}}_i\), and denote the transformed datum point as \({\varvec{p}}_i' = {\varvec{R}} {\varvec{p}}_i + {\varvec{t}}\). Each control point \({\varvec{c}}_i\) is transformed to \( {\varvec{c}}_i' = {\varvec{R}} {\varvec{c}}_i + {\varvec{t}} \). The matrix \({\varvec{K}}_{\lambda }\) is invariant by replacing \({\varvec{c}}_i\) with \({\varvec{c}}_i'\) as its elements are radial basis functions. By replacing \(\tilde{\varvec{C}}\) by \(\tilde{\varvec{C}}' = \left[ \tilde{\varvec{c}}_1', \tilde{\varvec{c}}_2', \dots , \tilde{\varvec{c}}_l'\right] \), the matrix \(\varvec{{\mathcal {E}}}_{\lambda }\) becomes:

$$\begin{aligned} \varvec{{\mathcal {E}}}_{\lambda }' = \begin{bmatrix} {\varvec{I}} &{} {\varvec{O}} \\ {\varvec{O}} &{} ({\varvec{T}}^{\intercal })^{-1} \end{bmatrix} \varvec{{\mathcal {E}}}_{\lambda } , \ \mathrm {with} \ {\varvec{T}} = \begin{bmatrix} {\varvec{R}} &{} {\varvec{t}} \\ {\varvec{0}}^{\intercal } &{} 1 \end{bmatrix} . \end{aligned}$$

\(\varvec{\phi }_{{\varvec{p}}_i}\) is also invariant by simultaneously replacing \({\varvec{c}}_i\) with \({\varvec{c}}_i'\) and \({\varvec{p}}_i\) with \({\varvec{p}}_i'\). Notice that \(\tilde{\varvec{p}}_i' = {\varvec{T}} \tilde{\varvec{p}}_i\), thus:

$$\begin{aligned} \varvec{\beta } \left( {\varvec{p}}_i'\right) = {\varvec{{\mathcal {E}}}_{\lambda }'}^{\intercal } \begin{bmatrix} \varvec{\phi }_{{\varvec{p}}_i'} \\ \tilde{\varvec{p}}_i' \end{bmatrix} = \varvec{{\mathcal {E}}}_{\lambda }^{\intercal } \begin{bmatrix} \varvec{\phi }_{{\varvec{p}}_i} \\ \tilde{\varvec{p}}_i \end{bmatrix} = \varvec{\beta } \left( {\varvec{p}}_i\right) . \end{aligned}$$

This proves \(\varvec{{\mathcal {B}}}\left( {\varvec{D}}'\right) = \varvec{{\mathcal {B}}}\left( {\varvec{D}}\right) \) where \({\varvec{D}}' = {\varvec{R}} {\varvec{D}} + {\varvec{t}} {\varvec{1}}^{\intercal }\). Denote \(\varvec{\bar{{\mathcal {E}}}}_{\lambda }'\) to be the top \(l \times l\) sub-block of \(\varvec{{\mathcal {E}}}_{\lambda }'\). It is obvious that \(\varvec{\bar{{\mathcal {E}}}}_{\lambda }' = \varvec{\bar{{\mathcal {E}}}}_{\lambda }\) thus the regularization matrix \({\varvec{Z}}' = \sqrt{ \varvec{\bar{{\mathcal {E}}}}_{\lambda }' } = \sqrt{ \varvec{\bar{{\mathcal {E}}}}_{\lambda } } = {\varvec{Z}} \) remains unchanged.

Lemmas and Proof of Theorem 2

1.1 Lemmas on Positive Semidefinite Matrices

Lemma 7

For any \({\varvec{M}} = \sum _{i=1}^{n} {\varvec{M}}_i\), if \({\varvec{M}}_i \succeq {\varvec{O}}\) for each \(i \in \left[ 1:n \right] \), then \({\varvec{M}} \succeq {\varvec{O}}\).

Lemma 8

For any \({\varvec{M}} \succeq {\varvec{O}}\), we have \({\varvec{x}}^{\intercal } {\varvec{M}} {\varvec{x}} = 0 \iff {\varvec{M}} {\varvec{x}} = {\varvec{0}}\).

Proof

Given \({\varvec{M}} \succeq {\varvec{O}}\), there exist a matrix \(\sqrt{{\varvec{M}}}\), such that \({\varvec{M}} = \sqrt{{\varvec{M}}} \sqrt{{\varvec{M}}}\). Therefore:

$$\begin{aligned} {\varvec{x}}^{\intercal } {\varvec{M}} {\varvec{x}} = \Vert \sqrt{{\varvec{M}}} {\varvec{x}} \Vert _2^2 = 0 \iff \sqrt{{\varvec{M}}} {\varvec{x}} = {\varvec{0}}\\\quad \Longrightarrow {\varvec{M}} {\varvec{x}} = \sqrt{{\varvec{M}}} \sqrt{{\varvec{M}}} {\varvec{x}} = {\varvec{0}} . \end{aligned}$$

This completes the proof of the sufficiency. The necessity is obvious. \(\square \)

Lemma 9

For any \({\varvec{M}} = \sum _{i=1}^{n} {\varvec{M}}_i \succeq {\varvec{O}}\), with each \({\varvec{M}}_i \succeq {\varvec{O}}\), we have \({\varvec{M}} {\varvec{x}} = {\varvec{0}} \iff {\varvec{M}}_i {\varvec{x}} = {\varvec{0}} \) for each \(i \in \left[ 1:n \right] \).

Proof

Because \({\varvec{M}} \succeq {\varvec{O}}\) and \({\varvec{M}}_i \succeq {\varvec{O}}\), by Lemma 8, it suffices to show \({\varvec{x}}^{\intercal } {\varvec{M}} {\varvec{x}} = 0 \iff {\varvec{x}}^{\intercal } {\varvec{M}}_i\, {\varvec{x}} = 0\), which is true because \( {\varvec{x}}^{\intercal } {\varvec{M}} {\varvec{x}} = \sum _{i=1}^{n} {\varvec{x}}^{\intercal } {\varvec{M}}_i\, {\varvec{x}} \) with each summand \({\varvec{x}}^{\intercal } {\varvec{M}}_i\, {\varvec{x}} \ge 0\). Therefore \({\varvec{x}}^{\intercal } {\varvec{M}} {\varvec{x}} = 0 \iff {\varvec{x}}^{\intercal } {\varvec{M}}_i\, {\varvec{x}} = 0\) for each \(i \in \left[ 1:n \right] \). \(\square \)

1.2 Proof of Theorem 2

We assume \(\mu _i\) being a tuning parameter \(\mu _i \ge 0\). Then the following chain of generalized equality holds:

$$\begin{aligned} {\varvec{I}} \succeq \varvec{{\mathcal {B}}}_i^{\intercal } \left( \varvec{{\mathcal {B}}}_i \varvec{{\mathcal {B}}}_i^{\intercal } \right) ^{-1} \varvec{{\mathcal {B}}}_i \succeq \varvec{{\mathcal {B}}}_i^{\intercal } \left( \varvec{{\mathcal {B}}}_i \varvec{{\mathcal {B}}}_i^{\intercal } + \mu _i {\varvec{Z}}_i^{\intercal } {\varvec{Z}}_i\right) ^{-1} \varvec{{\mathcal {B}}}_i \succeq {\varvec{O}} . \end{aligned}$$

Therefore \({\varvec{I}} - {\varvec{Q}}_i \succeq {\varvec{O}}\), and thus \( \varvec{{\mathcal {P}}}_{\mathrm {II}} = \sum _{i=1}^{n} \left( {\varvec{I}} - {\varvec{Q}}_i\right) \succeq {\varvec{O}} \).

Proof

We shall prove the result by showing: \((a)\iff (b)\), \((a)\iff (c)\), \((a)\iff (d)\), and \((c)\iff (e)\).

\((a)\iff (b)\). Because \(\varvec{{\mathcal {P}}}_{\mathrm {II}} = n {\varvec{I}} - \varvec{{\mathcal {Q}}}_{\mathrm {II}} \), thus \(\varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} = n {\varvec{1}} - \varvec{{\mathcal {Q}}}_{\mathrm {II}} {\varvec{1}} \), which means \(\varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} = {\varvec{0}} \iff \varvec{{\mathcal {Q}}}_{\mathrm {II}} {\varvec{1}} = n {\varvec{1}}\).

\((a)\iff (c)\). Note that \(\varvec{{\mathcal {P}}}_{\mathrm {II}} = \sum _{i=1}^{n} \left( {\varvec{I}} - {\varvec{Q}}_i\right) \succeq {\varvec{O}}\) with \({\varvec{I}} - {\varvec{Q}}_i \succeq {\varvec{O}}\). Therefore by Lemma 9, we have \( \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} = {\varvec{0}} \iff \left( {\varvec{I}} - {\varvec{Q}}_i\right) {\varvec{1}} = {\varvec{0}} \iff {\varvec{Q}}_i {\varvec{1}} = {\varvec{1}} \).

\((a)\iff (d)\). Applying a translation \({\varvec{t}}\) to \({\varvec{S}}\), we have:

$$\begin{aligned} \mathbf {tr}\left( {\varvec{S}} \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{S}}^{\intercal }\right) = \mathbf {tr}\left( \left( {\varvec{S}} + {\varvec{t}} {\varvec{1}}^{\intercal }\right) \varvec{{\mathcal {P}}}_{\mathrm {II}} \left( {\varvec{S}} + {\varvec{t}} {\varvec{1}}^{\intercal }\right) ^{\intercal } \right) \iff \nonumber \\\quad - 2 \mathbf {tr}\left( {\varvec{S}} \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} {\varvec{t}}^{\intercal } \right) = \mathbf {tr}\left( {\varvec{t}} {\varvec{1}}^{\intercal } \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} {\varvec{t}}^{\intercal }\right) . \end{aligned}$$
(34)

Sufficiency: if \(\varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} = {\varvec{0}}\), then \(- 2 \mathbf {tr}\left( {\varvec{S}} \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} {\varvec{t}}^{\intercal } \right) = \mathbf {tr}\left( {\varvec{t}} {\varvec{1}}^{\intercal } \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} {\varvec{t}}^{\intercal }\right) = 0 \).

Necessity: note that \(\varvec{{\mathcal {P}}}_{\mathrm {II}}\) is positive semidefinite, so \(\mathbf {tr}\left( {\varvec{t}} {\varvec{1}}^{\intercal } \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} {\varvec{t}}^{\intercal }\right) \ge 0\), which means \(\mathbf {tr}\left( {\varvec{S}} \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} {\varvec{t}}^{\intercal }\right) \le 0\). Since Eq. (34) holds for any \({\varvec{t}}\), flip the sign of \({\varvec{t}}^{\intercal }\), we obtain \(\mathbf {tr}\left( {\varvec{S}} \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} {\varvec{t}}^{\intercal }\right) \ge 0\). Therefore:

$$\begin{aligned} 0 \le \mathbf {tr}\left( {\varvec{S}} \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} {\varvec{t}}^{\intercal }\right) \le 0 \iff \\\quad \mathbf {tr}\left( {\varvec{S}} \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} {\varvec{t}}^{\intercal }\right) = 0 \iff \mathbf {tr}\left( {\varvec{t}} {\varvec{1}}^{\intercal } \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} {\varvec{t}}^{\intercal }\right) = 0 . \end{aligned}$$

Note that \({\varvec{1}}^{\intercal } \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}}\) is a scalar, hence:

$$\begin{aligned} \mathbf {tr}\left( {\varvec{t}} {\varvec{1}}^{\intercal } \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} {\varvec{t}}^{\intercal }\right) = {\varvec{1}}^{\intercal } \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} \, \mathbf {tr}\left( {\varvec{t}} {\varvec{t}}^{\intercal }\right) = {\varvec{1}}^{\intercal } \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} \Vert {\varvec{t}} \Vert _2^2 = 0, \end{aligned}$$

for any \({\varvec{t}}\), if and only if \( {\varvec{1}}^{\intercal } \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} = 0 \iff \varvec{{\mathcal {P}}}_{\mathrm {II}} {\varvec{1}} ={\varvec{0}} \) by Lemma 8.

\((c)\iff (e)\). Let \( \varvec{\bar{Q}}_i = \varvec{{\mathcal {B}}}_i^{\intercal } \left( \varvec{{\mathcal {B}}}_i \varvec{{\mathcal {B}}}_i^{\intercal } \right) ^{-1} \varvec{{\mathcal {B}}}_i \), and \(\left( \varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{\dagger } = \left( \varvec{{\mathcal {B}}}_i \varvec{{\mathcal {B}}}_i^{\intercal } \right) ^{-1} \varvec{{\mathcal {B}}}_i \). Applying the Woodbury matrix identity, the matrix \({\varvec{Q}}_i\) can be expanded as:

$$\begin{aligned} \begin{aligned} {\varvec{Q}}_i&= \varvec{{\mathcal {B}}}_i^{\intercal } \left( \varvec{{\mathcal {B}}}_i \varvec{{\mathcal {B}}}_i^{\intercal } + \mu _i {\varvec{Z}}_i^{\intercal } {\varvec{Z}}_i\right) ^{-1} \varvec{{\mathcal {B}}}_i\\&= \varvec{\bar{Q}}_i - \mu _i \underbrace{ \left( {\varvec{Z}}_i\left( \varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{\dagger }\right) ^{\intercal } \left( {\varvec{I}} + \mu _i {\varvec{Z}}_i \left( \varvec{{\mathcal {B}}}_i \varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{-1} {\varvec{Z}}_i^{\intercal }\right) ^{-1} {\varvec{Z}}_i \left( \varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{\dagger } }_{\varvec{{\Delta }}_i} . \end{aligned} \end{aligned}$$

Then we write:

$$\begin{aligned} {\varvec{I}} - {\varvec{Q}}_i = \left( {\varvec{I}} - \varvec{\bar{Q}}_i \right) + \mu _i {\varvec{{\Delta }}_i} \succeq {\varvec{O}}, \end{aligned}$$

with \({\varvec{I}} - \varvec{\bar{Q}}_i \succeq {\varvec{O}}\) and \({\varvec{{\Delta }}_i} \succeq {\varvec{O}}\). Noting that \(\mu _i \ge 0\), by Lemma 9, we have \(\left( {\varvec{I}} - {\varvec{Q}}_i\right) {\varvec{1}} = {\varvec{0}}\) if and only if \(\left( {\varvec{I}} - \varvec{\bar{Q}}_i\right) {\varvec{1}} = {\varvec{0}}\) and \( {\varvec{{\Delta }}_i} {\varvec{1}} = {\varvec{0}} \). In other words, \({\varvec{Q}}_i {\varvec{1}} = {\varvec{1}}\) if and only if \(\varvec{\bar{Q}}_i {\varvec{1}} = {\varvec{1}}\) and \({\varvec{{\Delta }}_i} {\varvec{1}} = {\varvec{0}}\).

Note that \(\varvec{\bar{Q}}_i = \varvec{{\mathcal {B}}}_i^{\intercal } \left( \varvec{{\mathcal {B}}}_i \varvec{{\mathcal {B}}}_i^{\intercal } \right) ^{-1} \varvec{{\mathcal {B}}}_i\) is the orthogonal projection into the range space of \(\varvec{{\mathcal {B}}}_i^{\intercal }\), which states \(\varvec{\bar{Q}}_i {\varvec{1}} = {\varvec{1}} \iff {\varvec{1}} \in \mathbf {Range}\left( \varvec{{\mathcal {B}}}_i^{\intercal } \right) \iff \exists {\varvec{x}}\) such that \(\varvec{{\mathcal {B}}}_i^{\intercal } {\varvec{x}} = {\varvec{1}}\). Therefore such an \({\varvec{x}}\) is \({\varvec{x}} = \left( \varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{\dagger } {\varvec{1}}\).

Since \({\varvec{Z}}_i \left( \varvec{{\mathcal {B}}}_i \varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{-1} {\varvec{Z}}_i^{\intercal } \succeq {\varvec{O}}\), we know \({\varvec{I}} + \mu _i {\varvec{Z}}_i \left( \varvec{{\mathcal {B}}}_i \varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{-1} {\varvec{Z}}_i^{\intercal }\) is exactly positive definite. Therefore \({\varvec{{\Delta }}_i} {\varvec{1}} = {\varvec{0}} \iff {\varvec{1}}^{\intercal } {\varvec{{\Delta }}_i} {\varvec{1}} = 0\) happens if and only if \({\varvec{Z}}_i \left( \varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{\dagger } {\varvec{1}} = {\varvec{0}}\) which is \({\varvec{Z}}_i {\varvec{x}} = {\varvec{0}}\). \(\square \)

Proof of Theorem 3

Proof

\((b) \iff (c)\). We define \(\varvec{\bar{Q}}_i = \varvec{\Gamma }_i \varvec{{\mathcal {B}}}_i^{\intercal } \left( \varvec{{\mathcal {B}}}_i \varvec{\Gamma }_i \varvec{\Gamma }_i \varvec{{\mathcal {B}}}_i^{\intercal } \right) ^{-1} \varvec{{\mathcal {B}}}_i \varvec{\Gamma }_i\). By the Woodbury matrix identity:

$$\begin{aligned} \begin{aligned}&\varvec{\Gamma }_i \varvec{{\mathcal {B}}}_i^{\intercal } \left( \varvec{{\mathcal {B}}}_i \varvec{\Gamma }_i \varvec{\Gamma }_i \varvec{{\mathcal {B}}}_i^{\intercal } + \mu _i {\varvec{Z}}_i^{\intercal } {\varvec{Z}}_i \right) ^{-1} \varvec{{\mathcal {B}}}_i \varvec{\Gamma }_i\\&\quad = \varvec{\bar{Q}}_i - \mu _i \underbrace{ \left( {\varvec{Z}}_i\left( \varvec{\Gamma }_i\varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{\dagger }\right) ^{\intercal } \left( {\varvec{I}} + \mu _i {\varvec{Z}}_i \left( \varvec{{\mathcal {B}}}_i \varvec{\Gamma }_i \varvec{\Gamma }_i \varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{-1} {\varvec{Z}}_i^{\intercal }\right) ^{-1} {\varvec{Z}}_i \left( \varvec{\Gamma }_i\varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{\dagger } }_{\varvec{{\Delta }}_i} . \end{aligned} \end{aligned}$$

Then \({\varvec{P}}_i\) can be written as: \( {\varvec{P}}_i = \left( \varvec{\Gamma }_i - \varvec{\bar{Q}}_i \right) + \mu _i {\varvec{{\Delta }}_i} \) with \(\varvec{\Gamma }_i - \varvec{\bar{Q}}_i \succeq {\varvec{O}}\) and \(\varvec{{\Delta }}_i \succeq {\varvec{O}}\). Thus \({\varvec{P}}_i \succeq {\varvec{O}}\). By Lemma 9, \({\varvec{P}}_i {\varvec{1}} = {\varvec{0}}\) if and only if \(\left( \varvec{\Gamma }_i - \varvec{\bar{Q}}_i \right) {\varvec{1}} = {\varvec{0}}\) and \(\varvec{{\Delta }}_i {\varvec{1}} = {\varvec{0}}\)

Note that \(\varvec{\bar{Q}}_i\) is the orthogonal projection to \(\mathbf {Range}\left( \varvec{\Gamma }_i \varvec{{\mathcal {B}}}_i^{\intercal } \right) \). Thus we have:

$$\begin{aligned} \left( \varvec{\Gamma }_i - \varvec{\bar{Q}}_i \right) {\varvec{1}} = {\varvec{0}} \iff \varvec{\bar{Q}}_i \varvec{\Gamma }_i {\varvec{1}} = \varvec{\Gamma }_i {\varvec{1}} \iff \varvec{\Gamma }_i {\varvec{1}} \in \mathbf {Range}\left( \varvec{\Gamma }_i \varvec{{\mathcal {B}}}_i^{\intercal } \right) , \end{aligned}$$

if and only if there exists \({\varvec{x}}\) such that \(\varvec{\Gamma }_i \varvec{{\mathcal {B}}}_i^{\intercal } {\varvec{x}} = \varvec{\Gamma }_i {\varvec{1}}\). Such an \({\varvec{x}}\) is \({\varvec{x}} = \left( \varvec{\Gamma }_i \varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{\dagger } {\varvec{1}}\).

Note that \(\varvec{{\Delta }}_i \succeq {\varvec{O}}\) and \({\varvec{I}} + \mu _i {\varvec{Z}}_i \left( \varvec{{\mathcal {B}}}_i \varvec{\Gamma }_i \varvec{\Gamma }_i \varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{-1} {\varvec{Z}}_i^{\intercal } \succ {\varvec{O}}\), thus we have:

$$\begin{aligned} \varvec{{\Delta }}_i {\varvec{1}} = {\varvec{0}} \iff {\varvec{1}}^{\intercal } \varvec{{\Delta }}_i {\varvec{1}} = 0 \iff {\varvec{Z}}_i {\varvec{x}} = {\varvec{Z}}_i \left( \varvec{\Gamma }_i\varvec{{\mathcal {B}}}_i^{\intercal }\right) ^{\dagger } {\varvec{1}} = {\varvec{0}} . \end{aligned}$$

\((a) \iff (b)\). Since \({\varvec{P}}_i \succeq {\varvec{O}}\), the proof is immediate by Lemma 9.

This completes the proof. \(\square \)

Rigid Procrustes Analysis with Two Shapes

The pairwise Procrustes problem aims to solve for a similarity transformation (a scale factor \(s>0\), a rotation/reflection \({\varvec{R}} \in \mathrm {O}(3)\), and a translation \({\varvec{t}} \in {\mathbb {R}}^3\)), that minimizes the registration error between the two point-clouds \({\varvec{D}}_1\) and \({\varvec{D}}_2\) with known correspondences:

$$\begin{aligned} \mathop {{\mathrm{arg}}\,{\mathrm{min}}}\limits _{s > 0,\, {\varvec{R}} \in \mathrm {O(3)}}\ \Vert s {\varvec{R}} {\varvec{D}}_1 + {\varvec{t}} {\varvec{1}}^{\intercal } - {\varvec{D}}_2 \Vert _F^2 . \end{aligned}$$

This problem admits a closed-form solution (Horn 1987; Horn et al. 1988; Kanatani 1994).

Let \( \varvec{\bar{D}}_1 = {\varvec{D}}_1 - \frac{1}{m} {\varvec{D}}_1 {\varvec{1}} {\varvec{1}}^{\intercal } \), and \( \varvec{\bar{D}}_2 = {\varvec{D}}_2 - \frac{1}{m} {\varvec{D}}_2 {\varvec{1}} {\varvec{1}}^{\intercal } \). Let the SVD of \(\varvec{\bar{D}}_1 \varvec{\bar{D}}_2^{\intercal }\) be \( \varvec{\bar{D}}_1 \varvec{\bar{D}}_2^{\intercal } = \varvec{\bar{U}} \varvec{\bar{\Sigma }} \varvec{\bar{V}}^{\intercal } \). Then the optimal rotation is \({\varvec{R}}^{\star } = \varvec{\bar{V}} \varvec{\bar{U}}^{\intercal }\) if \({\varvec{R}}^{\star } \in \mathrm {O}(3)\). If we require \({\varvec{R}}^{\star } \in \mathrm {SO}(3)\), the optimal rotation is:

$$\begin{aligned} {\varvec{R}}^{\star } = \varvec{\bar{V}}\, \mathrm {diag} \left( 1, 1, \det \left( \varvec{\bar{V}} \varvec{\bar{U}}^{\intercal }\right) \right) \varvec{\bar{U}}^{\intercal } . \end{aligned}$$

The optimal scale factor and the optimal translation are then given by:

$$\begin{aligned} s^{\star }&= \frac{\mathbf {tr}\left( {\varvec{R}}^{\star } \varvec{\bar{D}}_1 \varvec{\bar{D}}_2^{\intercal } \right) }{\mathbf {tr}\left( \varvec{\bar{D}}_1 \varvec{\bar{D}}_1^{\intercal } \right) }\\ {\varvec{t}}^{\star }&= \frac{1}{m} {\varvec{D}}_2 {\varvec{1}} - \frac{1}{m} s^{\star } {\varvec{R}}^{\star } {\varvec{D}}_1 {\varvec{1}} . \end{aligned}$$

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Bai, F., Bartoli, A. Procrustes Analysis with Deformations: A Closed-Form Solution by Eigenvalue Decomposition. Int J Comput Vis 130, 567–593 (2022). https://doi.org/10.1007/s11263-021-01571-8

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11263-021-01571-8

Keywords

Navigation