Abstract
Image stitching is a hard problem in the presence of large parallax. In particular, for sequences of frames from unconstrained, considerably shaky videos, recent methods fail to align the images accurately. The proposed method, “GreenWarps”, aims to accurately align frames/images with large parallax. It consists of two novel stages: pre-warping and diffeomorphic mesh warping. The first stage warps the unaligned image to the reference image using Green coordinates. The second stage refines the alignment using a demons-based diffeomorphic warping method for mesh deformation, termed “DiffeoMeshes”. In both stages, warping is performed using Green coordinates without assuming any motion model. Together, the two stages provide accurate alignment of the images. Experiments were performed on two standard image stitching datasets and on one dataset of images extracted from unconstrained videos. The results show superior performance of our method compared to the state-of-the-art methods.
1 Introduction
Image stitching is a widely studied problem in computer vision and graphics: it generates a single wide field-of-view image from a set of narrow field-of-view images. Several warping models, including homography-based warps [1, 2], spatially varying warping models [3,4,5], hybrid models [6,7,8,9] and parallax-tolerant models [10,11,12], as well as image stitching software such as Adobe Photoshop and AutoStitch, fail to perform well on non-ideal input data. The main challenges for any stitching algorithm are parallax error, occlusions, motion blur and the presence of moving objects. In particular, for stitching frames of an unconstrained video (e.g., shaky or jittery footage), state-of-the-art techniques fail to provide satisfactory results. The reason is that image stitching methods assume specific underlying motion models, which makes the task highly challenging in the presence of large parallax.
Most image stitching algorithms follow the pipeline of estimating transformations between the images, aligning the images with a warping model and compositing them using seam-cutting or blending techniques. We present a novel mesh-based warping model, termed “GreenWarps”, that utilizes Green coordinates [13] and a demons-based diffeomorphic warping model [14] to align the images. GreenWarps consists of two stages: pre-warping and “DiffeoMeshes”. The first stage produces a global conformal mapping between the images to be stitched. Conformal mappings induce no shear at all, thereby providing shape-preserving, distortion-free deformations. The second stage, termed “DiffeoMeshes”, performs a mesh deformation based on semi-dense correspondences between the two images and refines the alignment obtained in the first stage. Both stages use Green coordinates to warp the deformed meshes, instead of warping the images with computed transformation matrices as in previous approaches. Since our method does not assume any motion model, it is robust to large parallax.
2 Proposed Framework
The steps of the proposed GreenWarps method are: (i) estimate SIFT correspondences, (ii) pre-warp based on Green coordinates, (iii) deform the mesh using DiffeoMeshes and warp based on Green coordinates, and (iv) blend the images to obtain the stitched result. Similar to spatially varying warps, GreenWarps performs a shape-preserving deformation of the mesh to align images to the reference image. Notably, our approach does not compute any transformation matrix during alignment or warping, which ensures that no motion model is assumed. Warping in both stages (pre-warping and DiffeoMeshes) is performed using Green coordinates.
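As a concrete illustration of step (i), the snippet below sketches correspondence selection with Lowe's ratio test, the standard filter applied to SIFT matches. The toy 2-D descriptors and the 0.7 ratio threshold are assumptions for the sketch; the paper does not specify its matcher settings.

```python
import math

def ratio_test_matches(desc_u, desc_r, ratio=0.7):
    """Match each descriptor of the unaligned image (desc_u) to its nearest
    neighbour among the reference descriptors (desc_r), keeping a match only
    when the nearest distance is below `ratio` times the second-nearest
    distance (Lowe's ratio test). Returns (index_in_u, index_in_r) pairs."""
    matches = []
    for i, du in enumerate(desc_u):
        dists = sorted((math.dist(du, dr), j) for j, dr in enumerate(desc_r))
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:  # keep only unambiguous nearest neighbours
            matches.append((i, j1))
    return matches
```

With real 128-dimensional SIFT descriptors the same logic applies unchanged; only the distance computation grows.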
Among the images to be stitched, we take one as the reference image (R) and the other as the unaligned image (U). The unaligned image is first divided into image grids, where each grid has 4 vertices. The pre-warping stage takes a \(2\times 2\) mesh grid of U. Every point \(X_k\) of the unaligned image is expressed in terms of the Green coordinates [13] of its mesh grid as \(X_k = \phi _k(X_k)^TV_k+\psi _k(X_k)^TN_k\), where \(\phi _k(X_k)\) and \(\psi _k(X_k)\) are the Green coordinate vectors associated with the 4 vertices and edges of the mesh grid containing \(X_k\), \(V_k\) is the vector of the 4 vertices, and \(N_k=[n(t_k^1)\ n(t_k^2)\ n(t_k^3)\ n(t_k^4)]\) is the vector of normals of the edges \(t_k^i\) of that grid. An as-similar-as-possible mesh deformation [3] is performed, generating the deformed vertices \(\hat{V}\) from the corresponding SIFT features. The Green coordinates of every pixel are first estimated from the initial mesh as derived in [15]. Warping based on the deformed vertices is then performed using these coordinates: the position \(\hat{X}_k\) of any point of the unaligned image in the pre-warped image, given the deformed vertices \(\hat{V}\) and updated normals \(\hat{N}\), is \(\hat{X}_k = \phi _k(X_k)^T\hat{V}_k+\psi _k(X_k)^Tm_k\hat{N}_k\), where \(m_k\) is the normalized edge length [13]. Warping with Green coordinates, as in [13], yields a conformal mapping that preserves the shape of structures. Green coordinates thus provide a natural transformation of the image for alignment without assuming any motion model, and the perspective distortion that affects many previous approaches [10, 12, 16] is absent.
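The closed-form expressions for \(\phi\) and \(\psi\) are derived in [13, 15]; as a self-contained sketch, the code below approximates the 2-D Green coordinates of a point with respect to a polygonal cage by midpoint quadrature of the defining boundary integrals (\(\phi\) from the normal derivative of the 2-D Green's function weighted by hat basis functions, \(\psi_j = -\int_{t_j} G\,d\sigma\)), then verifies the reproduction identity \(X = \phi^TV + \psi^TN\) on the undeformed cage. This quadrature is an illustrative stand-in for the closed form used in the paper; deforming the cage and re-evaluating `apply_coords` with the new vertices gives the warp.

```python
import math

def green_coords(cage, p, samples=400):
    """Approximate the 2-D Green coordinates of point p w.r.t. a CCW polygon
    cage by midpoint quadrature of the boundary integrals.
    Returns (phi, psi): one phi per vertex, one psi per edge."""
    n = len(cage)
    phi = [0.0] * n
    psi = [0.0] * n
    for j in range(n):
        v1, v2 = cage[j], cage[(j + 1) % n]
        dx, dy = v2[0] - v1[0], v2[1] - v1[1]
        elen = math.hypot(dx, dy)
        nx, ny = dy / elen, -dx / elen        # outward unit normal (CCW cage)
        h = elen / samples
        for k in range(samples):
            t = (k + 0.5) / samples           # midpoint sample on the edge
            qx, qy = v1[0] + t * dx, v1[1] + t * dy
            rx, ry = qx - p[0], qy - p[1]
            r2 = rx * rx + ry * ry
            G = math.log(r2) / (4 * math.pi)              # (1/2pi) log r
            dGdn = (rx * nx + ry * ny) / (2 * math.pi * r2)
            # hat basis: weight (1 - t) to vertex j, t to vertex j+1
            phi[j] += (1 - t) * dGdn * h
            phi[(j + 1) % n] += t * dGdn * h
            psi[j] -= G * h                   # psi_j = -integral of G over edge j
    return phi, psi

def apply_coords(cage, phi, psi):
    """Evaluate sum_i phi_i v_i + sum_j psi_j n_j for the given cage."""
    n = len(cage)
    x = sum(phi[i] * cage[i][0] for i in range(n))
    y = sum(phi[i] * cage[i][1] for i in range(n))
    for j in range(n):
        v1, v2 = cage[j], cage[(j + 1) % n]
        dx, dy = v2[0] - v1[0], v2[1] - v1[1]
        elen = math.hypot(dx, dy)
        x += psi[j] * dy / elen               # outward unit normal again
        y += psi[j] * -dx / elen
    return (x, y)
```

Because \(\sum_i \phi_i = 1\) (partition of unity), the coordinates are affine-invariant, which is why re-evaluating with deformed vertices produces a sensible, shape-preserving warp.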
The second stage of our approach, termed DiffeoMeshes, refines the alignment by estimating a per-pixel displacement (spatial transformation) over the overlapping region of the pre-warped and reference images. Let the overlap regions of the pre-warped and reference images be \(M_U\) and \(M_R\) respectively. A mesh deformation is then performed based on the estimated spatial transformation. The demons-based diffeomorphic transformation \(s\) is estimated by minimizing the following optimization function [14]: \(E(s) = Sim(M_U, M_R\circ s) + w_r\sum _{p=1}^L||\nabla s(p)||_2\).
The similarity (correspondence) term is \(Sim(M_U, M_R\circ s) = \sum _{p=1}^L||M_U(p)-M_R(p)\circ s(p)||_2^2\), and the regularization term is \(\sum _{p=1}^L||\nabla s(p)||_2\), where \(\circ \) denotes the per-pixel spatial warping function and \(L=|M_U|=|M_R|\), with \(|.|\) the cardinality function. Demons-based diffeomorphic registration methods [14, 17, 18] use Gaussian smoothing for regularization; our method instead uses TV-based regularization [19], which preserves edges while updating the transformation. The diffeomorphic transformation is obtained by an iterative alternating minimization of the correspondence energy and the regularization energy.
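The alternating scheme can be sketched on a 1-D toy problem: each iteration takes a demons force step on the correspondence energy, then regularizes the displacement field. For brevity the sketch substitutes a simple 3-tap neighbourhood smoothing for the TV proximal step of [19]; the signals, sizes and iteration count are illustrative assumptions.

```python
import math

def demons_1d(fixed, moving, iters=200):
    """Toy 1-D demons registration: alternately (a) apply the demons force
    on the correspondence energy |fixed - moving∘s|^2 and (b) smooth the
    displacement s (a stand-in for the TV regularization used in the paper)."""
    n = len(fixed)

    def warp(s):
        # Evaluate moving(x + s(x)) with linear interpolation, clamped borders.
        out = []
        for x in range(n):
            t = min(max(x + s[x], 0.0), n - 1.0)
            i = int(t)
            f = t - i
            out.append(moving[i] * (1 - f) + moving[min(i + 1, n - 1)] * f)
        return out

    s = [0.0] * n
    for _ in range(iters):
        w = warp(s)
        # Central-difference gradient of the warped moving signal.
        g = [(w[min(x + 1, n - 1)] - w[max(x - 1, 0)]) / 2.0 for x in range(n)]
        for x in range(n):
            d = fixed[x] - w[x]
            denom = g[x] * g[x] + d * d       # demons normalization bounds the step
            if denom > 1e-12:
                s[x] += d * g[x] / denom
        # Regularize the field (paper: TV-based; here: simple smoothing).
        s = [(s[max(x - 1, 0)] + s[x] + s[min(x + 1, n - 1)]) / 3.0
             for x in range(n)]
    return s, warp(s)

def ssd(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))
```

Registering a Gaussian bump to a copy shifted by 3 pixels, the recovered displacement approaches 3 near the bump and the sum-of-squared-differences drops sharply.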
Let the mesh grid vertices at the second stage before and after deformation be \(\mathcal {V}\) and \(\hat{\mathcal {V}}\). DiffeoMeshes minimizes the energy \(E(\hat{\mathcal {V}})=E_d(\hat{\mathcal {V}})+w_sE_s(\hat{\mathcal {V}})\), where \(E_d\) is the data term and \(E_s\) the smoothness term. The data term minimizes the distance between each measured point and its interpolated location in the mesh under the diffeomorphic transformation: \(E_d(\hat{\mathcal {V}}) =\sum _{p=1}^{N_d} ||s(p)||_2^2\), where \(N_d\) is the number of pixels selected from the overlap region of the pre-warped and reference images and \(s(p)\) is the diffeomorphic transformation at pixel \(p\). Only edge pixels with exact matches are used to build the mesh (semi-dense correspondences). \(E_s(\hat{\mathcal {V}})\) is the same as in [3]: it minimizes the deviation of each deformed mesh grid from a similarity transformation of its input mesh. The resulting linear system is solved with a Jacobi-based linear solver. Once the deformed mesh vertices are obtained, the refined alignment is produced by warping with Green coordinates as in the first stage, and the aligned images are then blended using the multi-band blending method of [20].
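The quadratic energy \(E_d + w_sE_s\) yields a sparse, diagonally dominant linear system in the deformed vertices, which is why a Jacobi iteration suffices. The 1-D toy below shows the structure of such a solver: hypothetical anchors `targets` stand in for the data constraints, and a first-difference penalty stands in for the similarity-transform smoothness term of [3].

```python
def jacobi_mesh_solve(targets, w_d=1.0, w_s=4.0, iters=500):
    """Minimize sum_i w_d*(x_i - t_i)^2 + w_s*sum_i (x_{i+1} - x_i)^2 by
    Jacobi iteration on the normal equations. Diagonal dominance (w_d > 0)
    guarantees convergence."""
    n = len(targets)
    x = list(targets)                      # warm start at the anchors
    for _ in range(iters):
        new = []
        for i in range(n):
            nbrs = []
            if i > 0:
                nbrs.append(x[i - 1])
            if i < n - 1:
                nbrs.append(x[i + 1])
            # Jacobi update: solve row i with neighbours held fixed.
            new.append((w_d * targets[i] + w_s * sum(nbrs))
                       / (w_d + w_s * len(nbrs)))
        x = new
    return x
```

In the full 2-D problem each vertex couples to its grid neighbours through the similarity term, but the row-wise update has the same shape.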
3 Experimental Results
Experiments were performed on two parallax-tolerant image stitching datasets [10, 12] and on a new dataset consisting of 2–3 frames each from unconstrained videos. Parallax error and the presence of moving objects are the main challenges in these images. Our method is compared with the state-of-the-art methods [3, 4, 8, 21]. Alignment quality is measured by the mean geometric error (\(E_{mg}\)) and the correlation error (\(E_{corr}\)): \(E_{mg}\) is the average distance between corresponding feature points after alignment, and \(E_{corr}\) is the average of one minus the normalized cross-correlation (NCC) over a neighborhood in the overlapped region. Lower values indicate better performance. Table 1 reports the alignment errors averaged over each of the 3 datasets in comparison to the state-of-the-art methods. As the table shows, our method outperforms the state-of-the-art methods on every dataset. Qualitative results are shown in Fig. 1, with comparisons to the methods [4, 8, 10, 12]: red boxes indicate erroneously aligned regions, while blue boxes show the corresponding regions accurately aligned. Both the quantitative and qualitative results demonstrate the superiority of the method.
References
1. Brown, M., Lowe, D.G.: Automatic panoramic image stitching using invariant features. Int. J. Comput. Vis. 74(1), 59–73 (2007)
2. Szeliski, R., Shum, H.Y.: Creating full view panoramic image mosaics and environment maps. In: SIGGRAPH (1997)
3. Liu, F., Gleicher, M., Jin, H., Agarwala, A.: Content-preserving warps for 3D video stabilization. ACM Trans. Graph. (TOG) 28(3), 44 (2009)
4. Zaragoza, J., Chin, T.J., Brown, M.S., Suter, D.: As-projective-as-possible image stitching with moving DLT. In: CVPR (2013)
5. Lin, C.C., Pankanti, S.U., Natesan Ramamurthy, K., Aravkin, A.Y.: Adaptive as-natural-as-possible image stitching. In: CVPR (2015)
6. Yan, W., Hou, C., Lei, J., Fang, Y., Gu, Z., Ling, N.: Stereoscopic image stitching based on a hybrid warping model. IEEE Trans. Circuits Syst. Video Technol. 27(9), 1934–1946 (2017)
7. Gao, J., Kim, S.J., Brown, M.S.: Constructing image panoramas using dual-homography warping. In: CVPR (2011)
8. Chang, C.H., Sato, Y., Chuang, Y.Y.: Shape-preserving half-projective warps for image stitching. In: CVPR (2014)
9. Lin, K., Jiang, N., Liu, S., Cheong, L.F., Lu, M.D.J.: Direct photometric alignment by mesh deformation. In: CVPR (2017)
10. Zhang, F., Liu, F.: Parallax-tolerant image stitching. In: CVPR (2014)
11. Chen, Y.-S., Chuang, Y.-Y.: Natural image stitching with the global similarity prior. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 186–201. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1_12
12. Lin, K., Jiang, N., Cheong, L.-F., Do, M., Lu, J.: SEAGULL: seam-guided local alignment for parallax-tolerant image stitching. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 370–385. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_23
13. Lipman, Y., Levin, D., Cohen-Or, D.: Green coordinates. ACM Trans. Graph. (TOG) 27(3), 78 (2008)
14. Vercauteren, T., Pennec, X., Perchant, A., Ayache, N.: Diffeomorphic demons: efficient non-parametric image registration. NeuroImage 45(1), S61–S72 (2009)
15. Lipman, Y., Levin, D.: Derivation and analysis of Green coordinates. Comput. Methods Funct. Theory 10(1), 167–188 (2010)
16. Li, J., Wang, Z., Lai, S., Zhai, Y., Zhang, M.: Parallax-tolerant image stitching based on robust elastic warping. IEEE Trans. Multimedia 20, 1672–1687 (2017)
17. Thirion, J.P.: Image matching as a diffusion process: an analogy with Maxwell’s demons. Med. Image Anal. 2(3), 243–260 (1998)
18. Santos-Ribeiro, A., Nutt, D.J., McGonigle, J.: Inertial demons: a momentum-based diffeomorphic registration framework. In: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (eds.) MICCAI 2016. LNCS, vol. 9902, pp. 37–45. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46726-9_5
19. Chambolle, A.: An algorithm for total variation minimization and applications. J. Math. Imaging Vis. 20(1), 89–97 (2004)
20. Burt, P.J., Adelson, E.H.: A multiresolution spline with application to image mosaics. ACM Trans. Graph. (TOG) 2(4), 217–236 (1983)
21. Li, N., Xu, Y., Wang, C.: Quasi-homography warps in image stitching. arXiv preprint arXiv:1701.08006 (2017)
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Jacob, G.M., Das, S. (2019). GreenWarps: A Two-Stage Warping Model for Stitching Images Using Diffeomorphic Meshes and Green Coordinates. In: Leal-Taixé, L., Roth, S. (eds) Computer Vision – ECCV 2018 Workshops. ECCV 2018. Lecture Notes in Computer Science(), vol 11132. Springer, Cham. https://doi.org/10.1007/978-3-030-11018-5_67
Print ISBN: 978-3-030-11017-8
Online ISBN: 978-3-030-11018-5