1 Introduction

Image stitching is a widely studied problem in computer vision and graphics: it generates a single wide field-of-view image from a set of narrow field-of-view images. Several warping models, including homography-based warps [1, 2], spatially varying warping models [3, 4, 5], hybrid models [6, 7, 8, 9], and parallax-tolerant models [10, 11, 12], as well as image stitching software such as Adobe Photoshop and AutoStitch, fail to perform well on non-ideal input data. The main challenges for any stitching algorithm are parallax error, occlusions, motion blur, and the presence of moving objects. In particular, for stitching frames of an unconstrained video (e.g., shaky or jittery footage), state-of-the-art techniques fail to produce satisfactory results. The reason is that image stitching methods assume specific underlying motion models, which makes the task highly challenging in the presence of large parallax.

Common image stitching approaches follow the pipeline of estimating transformations between the images, aligning the images with a warping model, and stitching them using seam-based or blending techniques. We present a novel mesh-based warping model, termed “GreenWarps”, that aligns the images using Green coordinates [13] and a demons-based diffeomorphic warping model [14]. The GreenWarps model consists of two stages, namely, pre-warping and “DiffeoMeshes”. The first stage produces a global conformal mapping between the images to be stitched; conformal mappings induce no shear, thereby providing shape-preserving, distortion-free deformations. The second stage, termed “DiffeoMeshes”, performs a mesh deformation driven by semi-dense correspondences between the two images and refines the alignment obtained from the first stage. Both stages warp the deformed meshes using Green coordinates, instead of warping the images with computed transformations as in previous approaches. Since our method does not assume any motion model, it is robust to large parallax.

2 Proposed Framework

The steps of the proposed “GreenWarps” method are: (i) estimate SIFT correspondences, (ii) pre-warp based on Green coordinates, (iii) deform the mesh using DiffeoMeshes and warp based on Green coordinates, and (iv) blend the images to obtain the stitched result. Similar to spatially varying warps, GreenWarps performs a shape-preserving deformation of the mesh to align images to the reference image. Notably, our approach computes no transformation matrix at any point during alignment or warping, which ensures that it assumes no motion model. Warping in both stages (pre-warping and DiffeoMeshes) is performed using Green coordinates.
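For concreteness, step (i) can be realized with standard tools. The sketch below uses OpenCV's SIFT detector and Lowe's ratio test; the 0.75 ratio threshold and the function name sift_correspondences are illustrative choices, not values specified by the paper.

```python
import cv2
import numpy as np

def sift_correspondences(ref_gray, unaligned_gray, ratio=0.75):
    """Return matched point arrays (pts_u, pts_r) between U and R."""
    sift = cv2.SIFT_create()
    kp_r, des_r = sift.detectAndCompute(ref_gray, None)
    kp_u, des_u = sift.detectAndCompute(unaligned_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(des_u, des_r, k=2)     # match U against R
    pts_u, pts_r = [], []
    for m, n in knn:
        if m.distance < ratio * n.distance:       # Lowe's ratio test
            pts_u.append(kp_u[m.queryIdx].pt)
            pts_r.append(kp_r[m.trainIdx].pt)
    return np.float32(pts_u), np.float32(pts_r)
```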

Among the images to be stitched, we take one as the reference image (R) and the other as the unaligned image (U). The unaligned image is first divided into image grids, where each grid has 4 vertices. The pre-warping stage takes a \(2\times 2\) mesh grid of U. Every point \(X_k\) of the unaligned image is defined in terms of the Green coordinates [13] of its corresponding mesh grid as \(X_k = \phi _k(X_k)^TV_k+\psi _k(X_k)^TN_k\), where \(\phi _k(X_k), \psi _k(X_k)\) are the Green coordinate vectors associated with the 4 vertices and edges of the mesh grid containing the point \(X_k\), \(V_k\) is a vector of the 4 vertices, and \(N_k=[n(t_k^1)\ n(t_k^2)\ n(t_k^3)\ n(t_k^4)]\) is a vector of normals of the edges \(t_k^i\) of that grid. An as-similar-as-possible mesh deformation [3], based on the corresponding SIFT features, generates the deformed vertices \(\hat{V}\). The Green coordinates (for every pixel in the image) are first estimated from the initial mesh as derived in [15]. The image is then warped to the deformed vertices using these computed Green coordinates: the position of any point of the unaligned image in the pre-warped image, given the deformed vertices \(\hat{V}\) and updated normals \(\hat{N}\), is obtained as \(\hat{X}_k = \phi _k(X_k)^T\hat{V}_k+\psi _k(X_k)^Tm_k\hat{N}_k\), where \(m_k\) is the normalized edge length [13]. Warping based on Green coordinates, as in [13], yields a conformal mapping that preserves the shape of image structures. Thus, Green coordinates provide a natural transformation of the image for alignment without assuming any motion model. Perspective distortion, a problem in many previous approaches [10, 12, 16], is absent in our approach.
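A minimal numpy sketch of evaluating \(\hat{X}_k = \phi _k(X_k)^T\hat{V}_k+\psi _k(X_k)^Tm_k\hat{N}_k\) for one point is given below, assuming its Green coordinates \(\phi, \psi\) have already been computed from the initial mesh as in [15]. The counter-clockwise vertex ordering and the outward-normal convention are illustrative assumptions.

```python
import numpy as np

def edge_vectors(verts):
    """Edge vectors of a closed quad cage; verts: (4, 2), CCW order."""
    return np.roll(verts, -1, axis=0) - verts

def warp_point(phi, psi, v_orig, v_def):
    """phi, psi: (4,) Green coordinates of one point w.r.t. its quad;
    v_orig, v_def: (4, 2) original and deformed quad vertices."""
    e_o, e_d = edge_vectors(v_orig), edge_vectors(v_def)
    # m_k: ratio of deformed to original edge length (the "normalized
    # edge length" of [13]), which keeps the mapping conformal
    m = np.linalg.norm(e_d, axis=1) / np.linalg.norm(e_o, axis=1)
    # unit outward normals of the deformed edges (CCW convention assumed)
    n_hat = np.stack([e_d[:, 1], -e_d[:, 0]], axis=1)
    n_hat /= np.linalg.norm(n_hat, axis=1, keepdims=True)
    return phi @ v_def + psi @ (m[:, None] * n_hat)
```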

The second stage of our approach, termed DiffeoMeshes, refines the alignment by estimating a per-pixel displacement (spatial transformation) over the overlap region of the pre-warped and reference images. Let the overlap regions of the pre-warped and reference images be \(M_U\) and \(M_R\), respectively. A mesh deformation is then performed based on the estimated spatial transformation. The demons-based diffeomorphic transformation \(s\) is estimated by minimizing the following energy [14]:

$$\begin{aligned} E_{diff}(s) = Sim(M_U, M_R\circ s) + Reg(s) \end{aligned}$$
(1)

The similarity (correspondence) term is \(Sim(M_U, M_R\circ s) = \sum _{p=1}^L||M_U(p)-(M_R\circ s)(p)||_2^2\) and the regularization term is \(Reg(s) = \sum _{p=1}^L||\nabla s(p)||_2\), where \(\circ \) denotes the per-pixel spatial warping operation and \(L=|M_U|=|M_R|\), with \(|.|\) the cardinality function. Existing demons-based diffeomorphic registrations [14, 17, 18] use Gaussian smoothing for regularization. Our method instead uses TV-based regularization [19], which preserves edges while updating the transformation. The diffeomorphic transformation is obtained by iteratively alternating between minimizing the correspondence energy and the regularization energy.
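The alternating minimization of Eq. (1) can be sketched as a demons-style descent step on the similarity term followed by a TV smoothing of the displacement field. This is an additive-demons simplification (the diffeomorphic variant of [14] composes exponentiated velocity updates instead); denoise_tv_chambolle from scikit-image stands in for the TV regularizer of [19], and the iteration count and weight are illustrative, not values from the paper.

```python
import numpy as np
from scipy.ndimage import map_coordinates
from skimage.restoration import denoise_tv_chambolle

def demons_tv(m_u, m_r, n_iter=50, tv_weight=0.1):
    """m_u, m_r: float overlap regions (H, W); returns displacement s."""
    h, w = m_u.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    s = np.zeros((2, h, w))                      # displacement field
    for _ in range(n_iter):
        # warp the reference by the current field: (M_R o s)(p)
        warped = map_coordinates(m_r, [ys + s[0], xs + s[1]], order=1)
        diff = m_u - warped                      # residual of Sim term
        gy, gx = np.gradient(warped)
        norm2 = gy**2 + gx**2 + diff**2 + 1e-8
        # demons force: descent step on the correspondence energy
        s[0] += diff * gy / norm2
        s[1] += diff * gx / norm2
        # TV regularization step (edge-preserving, unlike Gaussian)
        s[0] = denoise_tv_chambolle(s[0], weight=tv_weight)
        s[1] = denoise_tv_chambolle(s[1], weight=tv_weight)
    return s
```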

Let the mesh grid vertices in the second stage before and after deformation be \(\mathcal {V}\) and \(\hat{\mathcal {V}}\), respectively. DiffeoMeshes minimizes the energy \(E(\hat{\mathcal {V}})=E_d(\hat{\mathcal {V}})+w_sE_s(\hat{\mathcal {V}})\), where \(E_d\) is the data term and \(E_s\) is the smoothness term. The data term minimizes the distance between each measured point and its location interpolated in the mesh under the diffeomorphic transformation: \(E_d(\hat{\mathcal {V}}) =\sum _{p=1}^{N_d} ||s(p)||_2^2\), where \(N_d\) is the number of pixels selected from the overlap region of the pre-warped and reference images and \(s(p)\) is the diffeomorphic transformation at pixel p. Only edge pixels with exact matches are used for building these mesh constraints (semi-dense correspondences). \(E_s(\hat{\mathcal {V}})\) is the same smoothness term as in [3]; it penalizes the deviation of each deformed mesh grid from a similarity transformation of its input grid. The solution is obtained with a Jacobi-based linear solver. Once the deformed mesh vertices are obtained, the refined alignment is produced by warping with Green coordinates, as in the first stage. The aligned images are then blended using the multi-band blending method of [20].
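Because both energy terms are quadratic in the deformed vertices, minimizing \(E(\hat{\mathcal {V}})\) reduces to a sparse linear system \(A\hat{v}=b\). A minimal sketch of a Jacobi iteration for such a system is given below; assembling \(A\) and \(b\) from the data and smoothness terms is paper-specific and assumed done elsewhere.

```python
import numpy as np
import scipy.sparse as sp

def jacobi_solve(A, b, x0=None, n_iter=200):
    """A: sparse (n, n) with nonzero diagonal; returns approx. solution."""
    d = A.diagonal()                     # diagonal part D of A
    R = A - sp.diags(d)                  # off-diagonal remainder
    x = np.zeros_like(b) if x0 is None else x0.copy()
    for _ in range(n_iter):
        x = (b - R @ x) / d              # x_{k+1} = D^{-1} (b - R x_k)
    return x
```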

Fig. 1. Comparison of our method with (a) SPHP [8], (b) Parallax tolerant [10], (c) APAP [4], (d) SEAGULL [12]. In each example, the first column shows the input images, the second column the output of the competing method, and the third column the output of the proposed method. (Color figure online)

Table 1. Comparison of performance on two standard large-parallax datasets [10, 12] and one dataset consisting of frames from unconstrained videos.

3 Experimental Results

Experiments were performed on two parallax-tolerant image stitching datasets [10, 12] and a new dataset consisting of 2-3 frames from unconstrained videos. Parallax error and the presence of moving objects are the main challenges in these datasets. Our method is evaluated against the state-of-the-art methods [3, 4, 8, 21]. Alignment quality is measured by the mean geometric error (\(E_{mg}\)) and the correlation error (\(E_{corr}\)): \(E_{mg}\) is the average distance between corresponding feature points after alignment, and \(E_{corr}\) is the average of one minus the normalized cross-correlation (NCC) over local neighborhoods in the overlap region. Lower values indicate better performance. Table 1 reports the alignment errors averaged over each of the 3 datasets in comparison to the state-of-the-art methods; as seen in the table, our method outperforms them on every dataset. Qualitative results are shown in Fig. 1, comparing against the methods [4, 8, 10, 12]. The red boxes indicate regions with alignment errors, whereas the blue boxes show the corresponding regions accurately aligned. The superiority of the method is evident from both the qualitative and quantitative results.
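For reference, both error measures can be computed as sketched below; the 16-pixel window size for \(E_{corr}\) is an illustrative assumption, as the section does not specify the neighborhood size.

```python
import numpy as np

def mean_geometric_error(pts_a, pts_b):
    """E_mg: average distance between corresponding points, (N, 2) each."""
    return float(np.mean(np.linalg.norm(pts_a - pts_b, axis=1)))

def ncc(a, b, eps=1e-8):
    """Normalized cross-correlation of two equal-shape patches."""
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / (np.sqrt((a * a).sum() * (b * b).sum()) + eps)

def corr_error(img_a, img_b, win=16):
    """E_corr: mean of (1 - NCC) over local windows of the overlap."""
    h, w = img_a.shape
    errs = [1.0 - ncc(img_a[y:y + win, x:x + win],
                      img_b[y:y + win, x:x + win])
            for y in range(0, h - win + 1, win)
            for x in range(0, w - win + 1, win)]
    return float(np.mean(errs))
```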