Elsevier

Signal Processing

Volume 77, Issue 3, September 1999, Pages 309-334

Accurate and simple geometric calibration of multi-camera systems

https://doi.org/10.1016/S0165-1684(99)00042-0

Abstract

In this paper we present a low-cost, accurate and flexible approach to the calibration of multi-camera acquisition systems for 3D scene modeling. The adopted calibration target-set is just a marked planar surface, which is imaged in several positions in order to emulate a larger 3D target-frame. In order to obtain a better camera parameter estimation, the proposed approach is able to refine the a priori knowledge of the target-set through a process of self-calibration. This allows us to start with rough measurements of the coordinates of the calibration targets. We formalize our parameter estimation problem as a particular case of the more general class of inverse problems. In particular, we derive an analytic prediction of the calibration performance, based on error propagation analysis, whose correctness is demonstrated through simulation experiments. Finally, the results of a series of calibration experiments on real data are presented, which confirm the effectiveness of the approach in a variety of experimental conditions.


Introduction

Multi-camera acquisition systems are today often employed for 3D scene reconstruction in a variety of applications, ranging from industrial quality control to content creation for virtual reality. Over the past decades, a variety of methods have been developed for estimating the 3D structure of a scene through the joint analysis of a set of its views. Most of these methods rely on a priori knowledge of a set of parameters that specifies the geometrical model of the acquisition system. The estimation of such parameters is generally called camera calibration and represents a crucial step in the global reconstruction chain. Indeed, the quality of the reconstruction depends strongly on the accuracy of the calibration process, and a 3D reconstruction of “metric” quality often requires a long and cumbersome calibration procedure. The aim of this article is to approach the calibration problem in a general fashion, with the goal of keeping the complexity and setup of the calibration procedure as simple as possible without giving up accuracy in the estimation results.

It is well-known that the 2D coordinates of some image features, as acquired with two or more cameras, can be used for recovering the 3D position of the scene details that originated them, through a process of “geometric triangulation”. In order to do so, we need to know the physical (optical and electrical) and geometrical (positional) characteristics of the cameras, and we need to make sure that the correspondences between image features are correctly determined. The “matching” of image features is usually a critical problem, as the search space for stereo-correspondences is two-dimensional (the image plane). However, knowledge of the model parameters of the acquisition system can be used to reduce it to a one-dimensional search by exploiting the epipolar geometry of the camera setup. In fact, given a point on the first image, the stereo-corresponding point on the second one is bound to lie on the epipolar line, which is the projection of the first optical ray onto the second image plane [1]. This epipolar constraint, however, is generally not enough to guarantee the correctness of a stereo correspondence. A search for feature correspondences along the epipolar line is, in fact, often performed by comparing the luminance profiles in the neighborhood of the candidate matches on the two views, under some constraints on the relative ordering between them [13]. The risk of matching ambiguities can be reduced through the adoption of global consistency constraints, implemented through dynamic programming [13]. Alternatively, we can geometrically remove the ambiguity by introducing a third camera. In this case, in fact, each point of a matched triplet is bound to lie on the intersection of the epipolar lines corresponding to the other two points.
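
The epipolar constraint can be made concrete with a short numerical sketch. The following Python fragment is illustrative only: it assumes the fundamental matrix F of the camera pair is already available (e.g. derived from the calibrated geometry), computes the epipolar line l2 = F x1 associated with a point x1 in the first image, and measures how far a candidate match in the second image lies from that line.

```python
import numpy as np

def epipolar_line(F, x1):
    """Epipolar line in image 2 for a point x1 = (u, v) in image 1."""
    # Homogeneous line (a, b, c) such that a*u + b*v + c = 0.
    return F @ np.array([x1[0], x1[1], 1.0])

def point_line_distance(l, x2):
    """Perpendicular distance of image point x2 = (u, v) from line l."""
    a, b, c = l
    return abs(a * x2[0] + b * x2[1] + c) / np.hypot(a, b)

# Toy example with a rectified stereo pair: this F maps each point to the
# horizontal line of the same image row (pure horizontal epipolar geometry).
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
l2 = epipolar_line(F, (120.0, 80.0))
# A correct match lies on the same row, so its distance to l2 is ~0.
print(point_line_distance(l2, (95.0, 80.0)))  # 0.0
```

In practice the distance to the epipolar line is thresholded to prune candidate matches before any luminance-profile comparison.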

The use of more than three cameras can be justified by the need to make the 3D reconstruction strategy more robust, or to expand the class of 3D information that can safely be extracted from a joint analysis of the available views. For example, horizon contours (i.e. extremal boundaries generated by smooth self-occlusions) [14], [18] are known to provide valuable information on the local 3D structure of the surface (position, tangent plane and curvature) near the visible rims of the objects, provided that at least three cameras are available. However, in order to make this reconstruction approach robust, the adoption of at least four cameras with known geometry helps.

It is important to mention that a number of single-camera methods are also available in the literature [19], [23]. Such methods, often based on the analysis of a video sequence acquired with a single camera, are usually non-calibrated, in the sense that the parameters of a camera model are “implicitly” estimated together with the 3D structure of the imaged scene. However, if the goal is a high-accuracy “metric” 3D reconstruction, then a preliminary partial calibration (estimation of the intrinsic camera parameters) of the camera system becomes important, especially when the camera resolution is modest [17]. In conclusion, the acquisition systems we are interested in are multi-camera systems and, although the approach we will illustrate can be applied to a broad range of cameras, we will focus on cameras of modest (standard TV) resolution in order to verify what performance can be achieved with such low-cost devices.

One obvious way to obtain information on the camera parameters is to fix them beforehand. This can be done through a mechanical adjustment (using high-precision mechanical supports) of the position and orientation of each camera, and through the use of “metric” lenses and sensors (optics with a priori known characteristics). This solution, however, is normally not applicable because of its complexity and high cost. A more flexible approach is to estimate the parameters of the acquisition system through a photogrammetric analysis of matched image features [2], [4], [17], [22]. In general the estimation procedure consists of a joint analysis of one or more views of a number of points (targets), which could be fiducial marks placed in the scene volume or even some natural point-like features that belong to the scene to be reconstructed. This procedure can be implemented in a variety of ways, depending on the structure of the calibration targets and on the a priori information available on them. One common approach to camera parameter estimation is to make use of an artificial target-set, whose targets are attached to a rigid frame that occupies part of the 3D viewing space, with a priori known geometrical characteristics. As the exact 3D coordinates of the targets are assumed available (for example because they have been previously measured through some high-precision procedure), they can be used together with the image coordinates of their views in order to estimate the parameters of the acquisition system. This approach is commonly referred to as (simple) calibration, and is characterized by a complete knowledge of the calibration target-set. The opposite situation is produced by a set of targets that are scattered in the scene volume in locations that are completely unknown.
This extreme situation occurs when, instead of using a pre-measured calibration target-frame, we use a set of targets that have been artificially added to the scene, or natural point-like features that are already present in the scene to be reconstructed. This type of blind calibration problem is usually referred to as self-calibration and, due to the much larger number of unknowns, in total absence of a priori information on the targets, it is an underdetermined problem [7], in the sense that it does not allow us to recover the whole geometry of the camera system. Between the two extremes of simple and blind calibration there is a whole range of situations in which only some information on the targets or on the cameras is available, in a variety of forms: for example statistical information (nominal target coordinates and a measure of their uncertainty), rigidity constraints, etc. We will see that this partial information can be successfully exploited for making the self-calibration problem solvable.
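
The fully-informed end of this spectrum can be illustrated with a classical estimator. The sketch below (a generic illustration, not the authors' exact method) recovers the 3x4 projection matrix P of a single pinhole camera from trusted 3D target coordinates and their measured image projections, via the Direct Linear Transformation (DLT) solved as a homogeneous least-squares problem:

```python
import numpy as np

def dlt_projection_matrix(X, x):
    """Estimate a 3x4 projection matrix from N >= 6 correspondences.
    X: (N, 3) known 3D target coordinates; x: (N, 2) image points."""
    rows = []
    for (Xw, Yw, Zw), (u, v) in zip(X, x):
        P3 = [Xw, Yw, Zw, 1.0]
        rows.append([*P3, 0, 0, 0, 0, *[-u * c for c in P3]])
        rows.append([0, 0, 0, 0, *P3, *[-v * c for c in P3]])
    A = np.array(rows)
    _, _, Vt = np.linalg.svd(A)       # least-squares null vector of A
    return Vt[-1].reshape(3, 4)       # P is defined up to scale

def project(P, Xw):
    """Pinhole projection of a 3D point to pixel coordinates."""
    ph = P @ np.append(Xw, 1.0)
    return ph[:2] / ph[2]

# Synthetic check: project random targets with a known P, then recover it.
rng = np.random.default_rng(0)
P_true = np.hstack([np.eye(3), [[0.1], [0.2], [2.0]]])
X = rng.uniform(-1, 1, (8, 3))
x = np.array([project(P_true, Xw) for Xw in X])
P_est = dlt_projection_matrix(X, x)
err = max(np.linalg.norm(project(P_est, Xw) - xi) for Xw, xi in zip(X, x))
print(err)  # reprojection error close to zero
```

With noisy image coordinates the same linear system yields a least-squares estimate, which is typically refined by a nonlinear optimization of the reprojection error.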

It is important to emphasize that the estimated parameters of the acquisition system are expected to remain accurate only for measurements within the 3D volume “spanned” by the specific calibration target-set [8]. In fact, roughly speaking, the target-set plays the role of a training set for the simple calibration procedure; therefore it should be chosen in such a way as to be “statistically representative” of the scene to be reconstructed. As a consequence, in order to achieve high accuracy in the calibration and in the 3D reconstruction, it is important for the targets to properly “fill up” the entire volume that will later be occupied by the object to be measured. This implies that the size of an adequate calibration target-frame should be comparable with that of the scene to be reconstructed, with obvious difficulties in the calibration procedure. In order to overcome this difficulty, we virtually “enlarge” a target-set of modest size through the acquisition of a number of its views in different positions. The positions of the target-frame are chosen in such a way that the union of all targets fills up the volume of interest, so as to be more representative of the scene to be reconstructed. Of course, every time we move the target-frame we introduce six new positional unknowns, unless we are able to force the frame into pre-determined positions through some high-precision positioning device. Due to the cost of high-precision mechanical positioners, the only feasible alternative is to move the pattern freely between acquisitions and to proceed with an a posteriori determination of these motion parameters by embedding their estimation into the calibration process itself. Notice that this way of proceeding corresponds to performing a partial self-calibration, as some information on the global 3D set of targets (position of the target-frame) is not available and, therefore, must be estimated.
However, we will keep referring to this method as a simple calibration technique, meaning that the available 3D coordinates of the targets within the target-frame are taken for granted and trusted.
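
The six positional unknowns introduced by each free repositioning of the planar frame can be sketched as follows. This is an illustrative parameterization (a Rodrigues rotation vector plus a translation, names of our choosing), not necessarily the one used in the paper:

```python
import numpy as np

def rodrigues(r):
    """Rotation matrix from a rotation vector r (axis * angle), 3 parameters."""
    theta = np.linalg.norm(r)
    if theta < 1e-12:
        return np.eye(3)
    k = r / theta
    K = np.array([[0, -k[2], k[1]],
                  [k[2], 0, -k[0]],
                  [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def place_target(points_plane, r, t):
    """Map target points, given in the frame of the planar pattern (z = 0),
    into world coordinates for one placement: 3 rotation + 3 translation
    unknowns per placement."""
    R = rodrigues(np.asarray(r, float))
    return points_plane @ R.T + np.asarray(t, float)

# A 3x3 grid of targets on the plane z = 0, imaged in two free placements:
grid = np.array([[i, j, 0.0] for i in range(3) for j in range(3)])
placement_1 = place_target(grid, r=[0.0, 0.0, 0.0], t=[0.0, 0.0, 2.0])
placement_2 = place_target(grid, r=[0.0, np.pi / 6, 0.0], t=[0.5, 0.0, 2.0])
# The union of the placements spans a 3D volume even though each
# individual placement is planar.
```

Stacking the six pose parameters of every placement into the unknown vector, alongside the camera parameters, is what turns the procedure into a partial self-calibration.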

As we can easily expect, the quality of the simple calibration process is strongly influenced by the accuracy of the camera model. Because of that, the ideal projective camera [1], also known as the “pin-hole” camera, is usually not accurate enough to guarantee high accuracy in the 3D measurements. In particular, accounting for the non-ideal behavior of the camera lenses can become a crucial aspect in applications of 3D reconstruction. However, although the accuracy of the camera model can be improved through the introduction of an adequate number of parameters [21], there is no point in using too sophisticated a model, as one of the main sources of inaccuracy in calibration methods is, in fact, the accuracy with which the 3D coordinates of the targets are known. Due to the high cost of accurate measurement procedures, the only option we have for improving the performance of the parameter estimation process is to improve the calibration performance through a sort of self-calibration approach [6], which allows us to go beyond the accuracy of the available target measurements.
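
The kind of model extension alluded to here can be sketched as an ideal pinhole projection augmented with a low-order radial distortion term. The parameter names (f, cx, cy, k1, k2) are illustrative, not the authors' exact parameterization:

```python
import numpy as np

def project_with_distortion(Xc, f, cx, cy, k1, k2):
    """Project a camera-frame 3D point Xc = (x, y, z) to pixel coordinates,
    applying a two-coefficient radial distortion to the normalized point."""
    x, y, z = Xc
    xn, yn = x / z, y / z                      # ideal (undistorted) pinhole
    r2 = xn * xn + yn * yn
    d = 1.0 + k1 * r2 + k2 * r2 * r2           # radial distortion factor
    return (f * d * xn + cx, f * d * yn + cy)

# With k1 = k2 = 0 the model reduces to the ideal pin-hole camera:
u, v = project_with_distortion((0.1, -0.05, 1.0), f=800.0, cx=320.0,
                               cy=240.0, k1=0.0, k2=0.0)
print(u, v)  # 400.0 200.0
```

Each added coefficient improves the fit to the real lens but also adds an unknown to the estimation, which is why the text warns against over-parameterizing the model.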

As already said above, a blind calibration strategy (self-calibration in total absence of information on the targets) is an extremely ill-conditioned problem. However, some approximate a priori information on the target-set is usually available, or it can be easily obtained through rough measurements. If such measurements can be assumed to be fairly unbiased, even if they are rough, we can devise a self-calibration strategy that refines the rough measurements of the targets' coordinates while estimating the parameters of the acquisition system. In general, however, as we need to maximize the accuracy of the targets' coordinates, the noise affecting the data (additive noise and the consequent error in the localization of the image coordinates of the targets) [6] must be of modest magnitude. In the following, we will refer to this approach as a self-calibration method, in the sense that the accuracy of the targets' coordinates is improved by the estimation process, with a consequent improvement in calibration accuracy.

In this article we propose a simple and effective technique for calibrating CCD-based multi-camera acquisition systems, which is capable of highly accurate results even when using a low-cost planar calibration target-set of modest size and low-cost imaging devices, such as standard TV-resolution cameras connected to commercial frame-grabbers. The key features of the method are the above-described “multi-view, multi-camera” (MVMC) approach,1 based on the analysis of a number of views of a calibration target-set placed in different positions, combined with a self-calibration approach, which enables it to refine (when necessary) rough information on the targets' coordinates. Our goal is to show that accurate calibration can be a task of fairly modest difficulty and cost.

To devise and develop the method proposed in this manuscript, we formalized the simple calibration and self-calibration methods as two particular instances of the more general class of inverse problems [16], which differ only in the input data. We derived an analytic prediction of the calibration performance, based on error propagation analysis, whose correctness is demonstrated in the manuscript through simulation experiments. Furthermore, a series of calibration experiments on real data has been carried out in order to evaluate the accuracy and robustness of the proposed algorithm in a variety of experimental conditions. In particular, we conducted a series of experiments comparing the performance of the self-calibration approach with that achievable through simple calibration.

The article is organized as follows: in Section 2 we summarize some basic concepts and define the notation needed to approach the considered problem. In particular, the camera model adopted for this work is illustrated in detail. Furthermore, we present the simple calibration and self-calibration problems (Section 2.2) as particular cases of inverse problems. In Section 2.3 we discuss an approach to inverse problems that can be used for analytically predicting the performance of the simple calibration and self-calibration methods. In Sections 3 (Multi-view multi-camera approach) and 4 (Implementation) we illustrate our approach to simple calibration and self-calibration, based on multiple acquisitions of a planar target-set. This approach allows great flexibility in the exploitation of the a priori knowledge on the acquisition system and on the target-set. Furthermore, it allows us to obtain a level of accuracy comparable to that achieved with 3D target-sets. Sections 5 (Simulation results) and 6 (Experimental results) are devoted to the presentation of the results of simulation experiments on synthetic data and of calibration experiments on real data. Such experiments confirm the validity of the analytical prediction of the performance of our method, and prove the effectiveness and flexibility of the approach in a variety of experimental conditions. Section 7 concludes the manuscript with final remarks and suggestions for future improvements.

Section snippets

Preliminaries

The goal of this section is to present and formalize the camera simple-calibration and self-calibration problems as instances of the general theory of inverse problems. In order to do so, we first describe the adopted camera model and the parameter estimation approach, so that the latter can be discussed as a particular case of an inverse problem.

Multi-view multi-camera approach

As already said in the previous section, the aim of simple camera calibration and self-calibration is to estimate the model parameters m from the observed data p, i.e. to solve the inverse problem m = g⁻¹(p). The observed data p is the set of all image coordinates of the calibration targets, as seen in all available views, i.e. p = {p_i^(j); 1 ⩽ i ⩽ N, 1 ⩽ j ⩽ M}, where N and M are the total number of targets and cameras, respectively.

Implementation

In both simple-calibration and self-calibration the observed data vector p contains the image coordinates of all the targets in all the available images. In order to guarantee an accurate localization (with sub-pixel accuracy) of the fiducial points, the image coordinates of the center (or some other relevant point such as a corner or an edge crossing) of the target must be detected with an appropriate subpixel technique. For example, the points to be detected could be the centers (or the

Simulation results

In this section we present the results of a series of simulations on synthetic data, which have been carried out in order to verify the correctness of Eq. (A.5), for the performance evaluation of the proposed calibration (self-calibration) algorithm.

If we have very limited a priori information on the model parameters, or even no information at all, the a posteriori covariance matrix C_{M|P} becomes

C_{M|P} = (Gᵀ C_P⁻¹ G)⁻¹,   G = ∂g/∂m |_{m = m_ML}.

The diagonal elements of C_{M|P} quantify the dispersion of the model
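
The quoted expression can be sketched numerically. The fragment below is an illustrative toy (a linear forward map with our own parameter names, not the paper's calibration Jacobian): it evaluates (Gᵀ C_P⁻¹ G)⁻¹ and reads the predicted parameter variances off the diagonal.

```python
import numpy as np

def posterior_covariance(G, C_p):
    """A posteriori model covariance from the Jacobian G of the forward
    map g (evaluated at the estimate) and the data covariance C_p."""
    Cp_inv = np.linalg.inv(C_p)
    return np.linalg.inv(G.T @ Cp_inv @ G)

# Toy setup: 4 observations of 2 parameters, i.i.d. noise of variance sigma^2.
G = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [1.0, -1.0]])
sigma2 = 0.01
C_p = sigma2 * np.eye(4)
C_m = posterior_covariance(G, C_p)
# Diagonal entries: predicted variances of the parameter estimates.
print(np.diag(C_m))
```

With i.i.d. noise the expression reduces to sigma^2 (Gᵀ G)⁻¹, so adding redundant observations visibly shrinks the predicted dispersion.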

Experimental results

In order to test the reliability of our (simple/self) calibration methods, we performed a series of tests in a variety of experimental conditions. In this section we present the results of two series of tests: the former relates to a high-quality target-frame whose targets are not just known in their nominal 3D coordinates, but have been accurately measured through a photogrammetric procedure in order to quantify the positional displacement relative to the nominal coordinates. For the

Conclusions

In this paper we presented a simple and effective technique for calibrating CCD-based multi-camera acquisition systems. The proposed method was proven to be capable of highly accurate results even when using very simple calibration target-sets and low-cost imaging devices, such as standard TV-resolution cameras connected to commercial frame-grabbers. In fact, the performance of our calibration approach is found to be about the same as that of other traditional calibration methods based on 3D

References (24)

  • G. Ferrigno et al., Pattern recognition in 3D automatic human motion analysis, ISPRS Journal of Photogrammetry and Remote Sensing (1990)
  • N. Ayache, Artificial Vision for Mobile Robots, MIT Press,...
  • Y.I. Aziz, H.M. Karara, Direct linear transformation into object space coordinates in close-range photogrammetry, in:...
  • D. Barbe, Imaging devices using the charge-coupled concept, in: Proc. IEEE, Vol. 63, No. 1, January...
  • H.A. Beyer, Some aspects of the geometric calibration of CCD-cameras, ISPRS Intercomm. Conference on Fast Processing of...
  • H.A. Beyer, Geometric and radiometric analysis of a CCD-camera based photogrammetric close-range system, Ph.D. Thesis,...
  • W. Faig, Manual of Photogrammetry, 4th edition, American Society of Photogrammetry,...
  • O. Faugeras, Stratification of three-dimensional vision: Projective, affine, and metric representations, Journal of the...
  • A. Gruen, H. Beyer, System calibration through self-calibration, Invited paper, Workshop on Camera Calibration and...
  • C.F. Laizet, Determination of video cameras parameters in stereoscopic mode, Fourth European Workshop on...
  • R. Lenz, U. Lenz, New developments in high resolution image acquisition with CCD area sensors, Optical 3-D Measurement...

    Work supported in part by the ACTS Project “PANORAMA”, Project No. AC092.
