Abstract
Edge detection remains a crucial stage in numerous image processing applications and is useful for the visual perception of a human. One of the most challenging goals in contour extraction is therefore to design algorithms that process visual information as humans do. Hence, to ensure its reliability, an edge detection technique needs to be rigorously assessed before being used in a computer vision tool. To achieve this task, a supervised evaluation computes a score between a ground truth edge map and a candidate image. Theoretically, by varying the hysteresis thresholds of the thin edges, the minimum score of the measure corresponds to the best edge map compared to the ground truth. In this study, a new supervised edge map quality measure is proposed, in which the minimum score of the measure is associated with an edge map where the main structures of the desired objects are distinctly visible.
1 Introduction on Edge Detection and Thresholding
Edge detection is an important field in image processing because this process frequently attempts to capture the most important structures in the image. Hence, edge detection represents a fundamental step in computer vision approaches. Furthermore, edge detection itself can be used to qualify a region segmentation technique. Additionally, edge detection assessment remains very useful in image segmentation, registration, reconstruction or interpretation. However, it is hard to design an edge detector able to extract exact edges with good localization and orientation from an image. In the literature, different techniques have emerged and, due to its importance, edge detection continues to be an active research area [1]. The best-known and most useful edge detection methods are based on first-order fixed operators computing the gradient [2, 3]. Oriented operators compute the maximum energy in one orientation [4,5,6] or two directions [7]. Typically, these methods are composed of three steps:
1. Computation of the gradient magnitude and its orientation \(\eta \), see Fig. 1.
2. Non-maximum suppression to obtain thin edges: the selected pixels are those whose gradient magnitude is a local maximum along the gradient direction \(\eta \), which is perpendicular to the edge orientation.
3. Thresholding of the thin contours to obtain an edge map.
Thus, Fig. 1 presents the different possibilities of gradient computation and the associated orientations for the edge detection algorithms compared in this paper.
This final step remains a difficult stage in image processing, yet it represents a crucial operation when comparing several segmentation algorithms. In edge detection, the hysteresis process uses the connectivity information of the pixels belonging to thin contours and thus remains more elaborate than binary thresholding. This technique uses two threshold levels (low: \(\tau _L\) and high: \(\tau _H\)). A pixel is retained as an edge point if it belongs to a contour chain in which all pixel values are higher than \(\tau _L\) and at least one pixel value is higher than \(\tau _H\), as represented with a signal in Fig. 1. Thus, the lower the thresholds are, the more undesirable pixels are preserved.
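The hysteresis rule described above can be sketched with connected-component labeling: a chain of pixels above \(\tau _L\) is kept only if it contains at least one pixel above \(\tau _H\). The following Python snippet is an illustrative sketch (not the authors' implementation); the function name and toy array are hypothetical.

```python
import numpy as np
from scipy import ndimage

def hysteresis_threshold(thin_edges, tau_low, tau_high):
    """Keep a contour chain if all its pixels exceed tau_low
    and at least one of them exceeds tau_high."""
    weak = thin_edges > tau_low           # candidate edge pixels
    strong = thin_edges > tau_high        # seed pixels
    # Label 8-connected chains of candidate pixels.
    labels, n = ndimage.label(weak, structure=np.ones((3, 3)))
    # Retain only the chains containing at least one strong pixel.
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(labels[strong])] = True
    keep[0] = False                       # background label
    return keep[labels]

# Toy gradient-magnitude image: the upper chain reaches tau_high,
# the lower one does not and is therefore discarded entirely.
mag = np.array([[0.0, 0.6, 0.9, 0.6, 0.0],
                [0.0, 0.0, 0.0, 0.0, 0.0],
                [0.0, 0.5, 0.6, 0.5, 0.0]])
edge_map = hysteresis_threshold(mag, tau_low=0.4, tau_high=0.8)
```

Note how the weak pixels (0.6) of the upper chain survive because the chain contains a strong pixel (0.9), while the lower chain is removed as a whole.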
Usually, in order to compare several edge detection methods, the user tries several thresholds and selects those that visually produce the best edge maps. However, this assessment suffers from a main drawback: since segmentations are compared using thresholds (deliberately) chosen by the user, the evaluation is subjective and not reproducible. Hence, the purpose is to use dissimilarity measures without any user intervention for an objective assessment. Finally, for a valuable edge detection assessment, the evaluation process should produce a result that correlates with the perceived quality of the edge image, which relies on human judgment [8,9,10]. In other words, a reliable edge map should characterize all the relevant structures of an image as closely as possible, without any disappearance of desired contours. Nevertheless, a minimum of spurious pixels may be created by the edge detector, disturbing at the same time the visibility of the main/desired objects to detect.
In this paper, a novel technique is presented to compare edge detection methods by using hysteresis thresholds in a supervised way, consistent with the visual perception of a human. Indeed, by comparing a ground truth contour map with a candidate edge map, several assessments can be compared by varying the parameters of the hysteresis thresholds. This study shows the importance of penalizing false negative points more strongly than false positive points, leading to a new edge detection evaluation algorithm. Experiments using synthetic and real images demonstrate that the proposed method obtains contour maps closer to the ground truth without requiring tuning parameters, and outperforms other assessment methods in an objective way.
2 Supervised Measures for Image Contour Evaluations
A supervised evaluation criterion computes a dissimilarity measure between a segmentation result and a ground truth obtained from synthetic data or an expert judgment (i.e. manual segmentation) [11,12,13,14]. In this paper, the closer to 0 the score of the evaluation is, the better the segmentation is qualified. This work focuses on comparisons of supervised edge detection evaluations and proposes a new measure aiming at an objective assessment.
2.1 Error Measures Involving Only Statistics
To assess an edge detector, the confusion matrix remains a cornerstone of boundary detection evaluation methods. Let \(G_t\) be the reference contour map corresponding to the ground truth and \(D_c\) the detected contour map of an original image I. Comparing \(G_t\) and \(D_c\) pixel per pixel, the first criterion to be assessed is the common presence of edge/non-edge points. A basic evaluation is composed of statistics; to that end, \(G_t \) and \(D_c \) are combined. Then, denoting \(|\cdot |\) the cardinality of a set, all points are divided into four sets (see Fig. 3):
- True Positive points (TPs), common points of \(G_t \) and \(D_c\): \(TP = {\left| D_{c}\cap G_t\right| }\),
- False Positive points (FPs), spurious detected edges of \(D_c \): \(FP = {\left| D_{c}\cap \lnot G_t\right| }\),
- False Negative points (FNs), missing boundary points of \(D_c \): \(FN = {\left| \lnot D_{c} \cap G_t\right| }\),
- True Negative points (TNs), common non-edge points: \(TN = {\left| \lnot D_{c}\cap \lnot G_t\right| }\).
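These four sets can be computed directly from boolean edge maps; a minimal Python illustration (array contents are hypothetical):

```python
import numpy as np

def confusion_counts(g_t, d_c):
    """Pixel-wise comparison of a ground truth edge map g_t and a
    detected edge map d_c (both boolean arrays of the same shape)."""
    tp = int(np.sum(d_c & g_t))      # common edge points
    fp = int(np.sum(d_c & ~g_t))     # spurious detections
    fn = int(np.sum(~d_c & g_t))     # missing boundary points
    tn = int(np.sum(~d_c & ~g_t))    # common non-edge points
    return tp, fp, fn, tn

# Toy example: a vertical GT edge, one detected pixel misplaced.
g_t = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=bool)
d_c = np.array([[0, 1, 0], [1, 0, 0], [0, 1, 0]], dtype=bool)
tp, fp, fn, tn = confusion_counts(g_t, d_c)  # → (2, 1, 1, 5)
```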
Several edge detection evaluations involving the confusion matrix are presented in Table 1. Computing only FPs and FNs [7] or their sum enables a segmentation assessment to be performed. The complemented Performance measure \(P_m^*\) considers directly and simultaneously the three entities TP, FP and FN to assess a binary image, and decreases with improved quality of detection.
Another way to display evaluations is to create Receiver Operating Characteristic (ROC) curves [19] or Precision-Recall (PR) curves [18], involving True Positive Rates (TPR) and False Positive Rates (FPR): \( {TPR }= {{TP}\over {TP + FN}} \) and \({FPR } = {\frac{FP}{FP + TN}}\). Derived from TPR and FPR, the three measures \(\varPhi \), \( \chi ^2\) and \( F_\alpha \) (detailed in Table 1) are frequently used. The complements of these measures enable a value close to 0 to be interpreted as a good segmentation.
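These rates can be sketched as follows. The \(F_\alpha \) form shown (weighted harmonic combination of precision and recall, with \(\alpha = 0.5\) giving the classical F1) is one common definition and may differ in detail from Table 1; it is complemented so that 0 is the best score.

```python
def rates_and_f(tp, fp, fn, tn, alpha=0.5):
    """TPR, FPR and complemented F_alpha from confusion counts."""
    tpr = tp / (tp + fn)              # recall / true positive rate
    fpr = fp / (fp + tn)              # false positive rate
    prec = tp / (tp + fp)             # precision
    # Weighted harmonic combination of precision and recall.
    f_alpha = (prec * tpr) / (alpha * tpr + (1 - alpha) * prec)
    return tpr, fpr, 1.0 - f_alpha    # complement: 0 is best

# With TP=2, FP=1, FN=1, TN=5 (the misplaced-pixel toy case):
tpr, fpr, comp = rates_and_f(2, 1, 1, 5)
```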
These measures evaluate the comparison of two edge images pixel per pixel, and thus tend to severely penalize an (even slightly) misplaced contour, as illustrated in Fig. 2. Consequently, some evaluations resulting from the confusion matrix incorporate spatial tolerance. However, tolerating a distance from the true contour and integrating several TPs for one detected contour can penalize efficient edge detection methods or, on the contrary, advantage poor ones (especially for corners or small objects). Thus, from this discussion, the assessment should penalize a misplaced edge point proportionally to the distance from its true location (some examples are given in [14], and shown in Fig. 2).
2.2 Assessment Involving Distances of Misplaced Pixels
A reference-based edge map quality measure requires that a displaced edge be penalized as a function not only of FPs and/or FNs but also of the distance from the position where it should be located. Table 2 reviews the most relevant measures involving distances. Thus, for a pixel p belonging to the detected contour \(D_c\), \(d_{G_t} (p)\) represents the minimal Euclidean distance between p and \(G_t\). If p belongs to the ground truth \(G_t\), \(d_{D_c} (p)\) is the minimal distance between p and \(D_c\). On the one hand, some distance measures are dedicated to the evaluation of over-segmentation (i.e. presence of FPs), like \(\Upsilon \), \(D^k\), \(\varTheta \) and \(\varGamma \). On the other hand, the \(\varOmega \) measure assesses an edge detection by computing only the under-segmentation (FNs). Other edge detection evaluation measures consider both distances of FPs and FNs [9]. A perfect segmentation for an over-segmentation measure could be an image including no edge points, and, for under-segmentation evaluations, an image containing mostly undesirable edge points (FPs) (see Fig. 3). Another limitation of pure over- and under-segmentation evaluations is that several binary images can produce the same result (Fig. 2). Therefore, as demonstrated in [9], a complete and optimal edge detection evaluation measure should combine assessments of both over- and under-segmentation.
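The two distance fields \(d_{G_t}\) and \(d_{D_c}\) can be obtained with a Euclidean distance transform; a sketch using SciPy (function and variable names are illustrative):

```python
import numpy as np
from scipy import ndimage

def misplacement_distances(g_t, d_c):
    """d_{G_t}(p) sampled at detected pixels (measures FPs), and
    d_{D_c}(p) sampled at ground-truth pixels (measures FNs)."""
    # distance_transform_edt gives the distance to the nearest zero,
    # so inverting the mask yields the distance to the nearest edge.
    d_gt = ndimage.distance_transform_edt(~g_t)   # 0 on G_t pixels
    d_dc = ndimage.distance_transform_edt(~d_c)   # 0 on D_c pixels
    return d_gt[d_c], d_dc[g_t]

# Toy example: GT edge in column 1, detection shifted to column 2,
# so every misplacement distance equals 1.
g_t = np.zeros((3, 3), dtype=bool); g_t[:, 1] = True
d_c = np.zeros((3, 3), dtype=bool); d_c[:, 2] = True
dist_fp, dist_fn = misplacement_distances(g_t, d_c)
```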
Among the distance measures between two contours, one of the most popular descriptors is the Figure of Merit (FoM). Nonetheless, for FoM, the distances of the FNs are not recorded, so FNs are strongly penalized, as with the statistical measures (see above). For example, in Fig. 3, \(FoM (G_t, C) > FoM (G_t, M)\), whereas M contains both FPs and FNs and C only FNs. Further, for the extreme cases:
- if \(FP = 0\): \( FoM \left( G_t, D_{c} \right) = 1 - TP / | G_t | = 1 - (| G_t | - FN) / | G_t |\),
- if \(FN = 0\): \( FoM \left( G_t, D_{c} \right) = 1 - {1 \over {\max \left( \left| G_t \right| , \left| D_{c} \right| \right) }}\cdot {{\sum _{p \in { D_{c}\cap \lnot G_t}} {1 \over 1 + \kappa \cdot d^2_{G_t}(p)}}}\).
When \(FN > 0\) and FP is constant, FoM behaves like the matrix-based error assessments (Fig. 2). Moreover, for \(FP > 0\), FoM penalizes over-detection very little compared to under-detection. On the contrary, the F measure computes the distances of FNs but not of FPs, so F behaves inversely to FoM. Also, the \(d_4\) measure depends particularly on TP, FP, FN and FoM, and penalizes FNs like the FoM measure. SFoM and MFoM take into account both distances of FNs and FPs, so they can compute a global evaluation of a contour image. However, MFoM does not consider FPs and FNs at the same time, contrary to SFoM. Another way to compute a global measure is presented in [28] with the edge map quality measure \(D_p\). Its right-hand term computes the distances of the FNs to the closest correctly detected edge pixels, i.e. \(G_t \cap D_c\). Finally, \(D_p\) is more sensitive to FNs than to FPs because of the coefficient \(1 \over {|I| - |G_t|}\).
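For reference, a complemented FoM consistent with the extreme cases above can be sketched by summing over all detected pixels (a sketch, not the authors' code); \(\kappa = 1/9\) is the value classically associated with Pratt's FoM:

```python
import numpy as np
from scipy import ndimage

def fom(g_t, d_c, kappa=1.0 / 9.0):
    """Complemented Figure of Merit: 0 for a perfect match,
    larger values for worse edge maps."""
    if not d_c.any():
        return 1.0                                # no detection at all
    d_gt = ndimage.distance_transform_edt(~g_t)   # distance to G_t
    # Each detected pixel contributes 1/(1 + kappa * d^2); TPs (d = 0)
    # contribute 1, FPs contribute less as they move away from G_t.
    score = np.sum(1.0 / (1.0 + kappa * d_gt[d_c] ** 2))
    return 1.0 - score / max(g_t.sum(), d_c.sum())
```

With this form, a detection shifted by one pixel everywhere is only mildly penalized, whereas the statistical measures above would count it entirely as FP + FN.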
A second measure widely used in matching techniques is the Hausdorff distance H, which measures the mismatch of two sets of points [24]. This max-min distance can be strongly deviated by a single pixel positioned sufficiently far from the pattern (Fig. 3). To improve the measure, one idea is to compute H with a proportion of the maximum distances; let us denote by \(H_{15\%}\) this measure for 15% of the values [24]. Nevertheless, as pointed out in [11], an average distance from the edge pixels in the candidate image to those in the ground truth is more appropriate, like \(S^k\) or \(\varPsi \). Eventually, the Delta Metric (\(\varDelta ^k\)) [27] intends to estimate the dissimilarity between each element of two binary images, but it is highly sensitive to the distances of misplaced points [8, 14].
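A sketch of H and its ranked variant follows; interpreting \(H_{15\%}\) as discarding the largest 15% of distances before taking the maximum is an assumption here (the exact convention of the partial Hausdorff distance may differ).

```python
import numpy as np
from scipy import ndimage

def hausdorff(g_t, d_c, discard_frac=0.0):
    """Hausdorff distance between two edge maps. With discard_frac > 0
    the largest fraction of distances is dropped before taking the max
    (e.g. discard_frac=0.15 for an H_15%-like partial Hausdorff)."""
    d_gt = ndimage.distance_transform_edt(~g_t)[d_c]  # D_c -> G_t
    d_dc = ndimage.distance_transform_edt(~d_c)[g_t]  # G_t -> D_c
    def ranked_max(d):
        d = np.sort(d)
        keep = max(1, int(np.ceil((1.0 - discard_frac) * d.size)))
        return d[keep - 1]                            # ranked maximum
    return max(ranked_max(d_gt), ranked_max(d_dc))
```

A single far-away spurious pixel dominates the plain max-min distance, while the ranked variant suppresses such outliers.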
A new objective edge detection assessment measure: In [14], a measure for edge detection assessment is developed: it is denoted \(\varPsi \) (Table 2) and improves the over-segmentation measure \(\varGamma \) by combining both \(d_{G_t}\) and \(d_{D_c}\), see Fig. 3. \(\varPsi \) gives the same weight to \(d_{G_t}\) and \(d_{D_c}\) in its assessment of errors. Thus, using \(\varPsi \), a missing edge is not penalized enough, whereas the distance of FPs may weigh too heavily. As another example, in Fig. 3, \(\varPsi (G_t, C) < \varPsi (G_t, T)\), whereas C should be penalized more because of FNs that do not allow the object to be identified (see also Fig. 5). The solution proposed here is to penalize the distances of the FNs more strongly, depending on the number of TPs, through a weight applied to the FN distances.
The term influencing the penalization of FN distances can be rewritten as \( {|G_t|^2 \over TP^2} =\left( {FN + TP \over TP} \right) ^2 =\left( 1 + {FN \over TP} \right) ^2 \geqslant 1 \), ensuring a stronger penalty for \(d^2_{D_c}\) than for \(d^2_{ G_t}\). When \(TP = 0\), the min function avoids a multiplication by infinity; moreover, the number of FNs is then large, corresponding to a strong penalty with the weight term \(|G_t|^2\) (see Fig. 4 left). When \(|G_t| = TP \), \(\lambda \) is equivalent to \(\varPsi \) and \(\varGamma \) (see Fig. 3, image T). Also, compared to \(\varPsi \), \(\lambda \) penalizes a \(D_c\) having FNs more than a \(D_c\) with only FPs, as illustrated in Fig. 3 (images C and T). Finally, the weight \(\frac{|G_t|^2}{ TP^2}\) tunes the \(\lambda \) measure by considering an edge map to be of better quality when FN points are localized close to the detected contours \(D_c\).
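The FN weight described in this paragraph can be sketched as follows; the min-capped form is inferred from the text (cap at \(|G_t|^2\) when \(TP = 0\)) and may differ in detail from the published formula:

```python
def fn_weight(card_gt, tp):
    """Weight applied to the FN distances d^2_{D_c}:
    (|G_t|/TP)^2 >= 1, capped by a min at |G_t|^2 so that
    TP = 0 does not produce an infinite weight.
    (Inferred from the text; illustrative only.)"""
    if tp == 0:
        return card_gt ** 2                     # strongest penalty
    return min(card_gt ** 2, (card_gt / tp) ** 2)
```

For instance, with \(|G_t| = 10\): a perfect detection (\(TP = 10\)) gives weight 1, half the edges found (\(TP = 5\)) gives weight 4, and no edge found gives the cap 100, so missing edges dominate the score.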
The next subsection details the way to evaluate an edge detector in an objective way. The results presented in this communication show the importance of penalizing false negative points more strongly than false positive points, because the desired objects are not always completely visible when using an ill-suited evaluation measure, and \(\lambda \) provides a reliable edge detection assessment.
2.3 Minimum of the Measure and Ground Truth Edge Image
Dissimilarity measures are used for an objective assessment using binary images. Instead of manually choosing a threshold to obtain a binary image (see Fig. 3 in [9]), the purpose is to compute the minimal value of a dissimilarity measure by varying the thresholds (double loop: a loop over \(\tau _L\) and a loop over \(\tau _H\)) of the thin edges (see Table in Fig. 1). Thus, compared to a ground truth contour map, the ideal edge map for a measure corresponds to the thresholded image at which the evaluation obtains the minimum score among all the thresholded (binary) images. Theoretically, this score corresponds to the thresholds at which the edge detection produces the best edge map, compared to the ground truth contour map [8, 12, 30]. Figure 4 right illustrates the choice of a contour map as a function of \(\tau _L\) and \(\tau _H\). Since small thresholds lead to heavy over-segmentation and strong thresholds may create numerous false negative pixels, the minimum score of an edge detection evaluation should be a compromise between under- and over-segmentation (detailed and illustrated in [8]).
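The double loop over the hysteresis thresholds can be sketched as follows; the inlined hysteresis routine and the placeholder measure (here a simple FP + FN count) are illustrative, not the authors' implementation:

```python
import numpy as np
from scipy import ndimage

def _hysteresis(mag, t_low, t_high):
    # Chains of pixels > t_low are kept if they touch a pixel > t_high.
    labels, n = ndimage.label(mag > t_low, structure=np.ones((3, 3)))
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(labels[mag > t_high])] = True
    keep[0] = False
    return keep[labels]

def best_edge_map(mag, g_t, measure, taus):
    """Sweep (tau_L, tau_H) with tau_H >= tau_L and return the binary
    map minimizing `measure(g_t, d_c)` (lower score = better map)."""
    best_score, best_map, best_taus = np.inf, None, None
    for t_low in taus:
        for t_high in taus:
            if t_high < t_low:
                continue
            d_c = _hysteresis(mag, t_low, t_high)
            s = measure(g_t, d_c)
            if s < best_score:
                best_score, best_map, best_taus = s, d_c, (t_low, t_high)
    return best_score, best_map, best_taus
```

On a toy magnitude image where the true edge is the only high-magnitude structure, the sweep recovers the ground truth column exactly with the FP + FN count as measure.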
As demonstrated in [8], the choice of the ground truth map significantly influences the dissimilarity evaluations. Indeed, if it is not reliable [31], an inaccurate ground truth contour map in terms of localization penalizes precise edge detectors and/or advantages rough algorithms, as for the edge maps presented in [9, 10]. For these reasons, the ground truth edge map for the real image in our experiments is built in the semi-automatic way detailed in [8].
3 Experimental Results
In these experiments, the importance of an assessment penalizing false negative points more strongly than false positive points is highlighted. In order to study the performance of the contour detection evaluation measures, the hysteresis thresholds vary and the minimum score of the studied measure corresponds to the best edge map. The thin edges of both synthetic and real noisy images are computed by five or six edge detectors: Sobel [2], Canny [3], Steerable Filters of order 1 (\(SF_1\)) [4] or 5 (\(SF_5\)) [5], Anisotropic Gaussian Kernels (AGK) [6] and Half Gaussian Kernels (H-K) [7]. Figure 5 presents the results for 14 measures with their associated scores (bars) according to the hysteresis parameters. On the one hand, the obtained edge map must be taken into account, and, on the other hand, the measure score. Generally, the optimal edge map for the FoM, SFoM, \(f_2d_6\), \(\varPsi \) and \(\lambda \) measures allows the majority of the desired edges to be distinguished for each contour detection operator (except Sobel), whereas for the other assessments, contours are too disturbed by undesirable points or distinguished only with great difficulty (especially for \(\varPsi \), which does not penalize FNs enough). Note that the SFoM measure does not classify the Sobel algorithm as the least efficient. Concerning the experiment with a real image in Fig. 6, 8 measures are compared together. For FoM, H, \(\varDelta ^k\) and \(S^{k}\), the ideal edge maps concerning the Sobel edge detector are highly corrupted by undesirable contours, and the main objects are not recognizable. The other segmentations are also disturbed by undesirable pixels for FoM, H and \(\varDelta ^k\). Moreover, the higher score for \(\varDelta ^k\) (AGK) does not correspond to the most disturbed map. Ultimately, using \(\lambda \), the essential structures are visible in the optimal contour map for each edge detector (objects are easily recognizable).
Moreover, contrary to the H, FoM, \(d_4\), \(\varDelta ^k\) and \(S^k\) measures, the scores of \(\lambda \) are coherent with the obtained segmentations (Sobel and H-K results).
4 Conclusion and Future Works
This study presents a new supervised edge detection assessment measure, \(\lambda \), which enables a contour map to be assessed in an objective way. Based on the theory of dissimilarity evaluation measures, this objective evaluation allows first-order edge detectors to be evaluated. Indeed, the segmentation obtaining the minimum score of a measure is considered the best one. Theory and experiments show that the minimum score of the new dissimilarity measure \(\lambda \) corresponds to the best edge map evaluations, i.e. edge maps closer to the ground truth than those obtained with the other methods. On the one hand, this new measure takes into account the distances of false positive points; on the other hand, it considers the distances of false negative points, tuned by a weight. This weight depends on the number of false negative points: the larger it is, the more the segmentation is penalized. Thus, for a reliable edge detector, this enables an edge map containing the main structures, similar to the ground truth, to be obtained objectively. Finally, the computation of the minimum score of a measure does not require tuning parameters, which represents a huge advantage. In a future study, we plan to compare in depth the robustness of several edge detection algorithms and to use the new measure in object recognition.
References
Arbelaez, P., Maire, M., Fowlkes, C., Malik, J.: Contour detection and hierarchical image segmentation. IEEE TPAMI 33(5), 898–916 (2011)
Sobel, I.E.: Camera models and machine perception. Ph.D. thesis, Stanford University (1970)
Canny, J.: A computational approach to edge detection. IEEE TPAMI PAMI–8(6), 679–698 (1986)
Freeman, W.T., Adelson, E.H.: The design and use of steerable filters. IEEE TPAMI 13, 891–906 (1991)
Jacob, M., Unser, M.: Design of steerable filters for feature detection using Canny-like criteria. IEEE TPAMI 26(8), 1007–1019 (2004)
Geusebroek, J.-M., Smeulders, A.W.M., van de Weijer, J.: Fast anisotropic Gauss filtering. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002. LNCS, vol. 2350, pp. 99–112. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-47969-4_7
Magnier, B., Montesinos, P., Diep, D.: Fast anisotropic edge detection using Gamma correction in color images. In: IEEE ISPA, pp. 212–217 (2011)
Abdulrahman, H., Magnier, B., Montesinos, P.: From contours to ground truth: how to evaluate edge detectors by filtering. In: WSCG (2017)
Martin, D., Fowlkes, C., Tal, D., Malik, J.: A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: IEEE ICCV, vol. 2, pp. 416–423. IEEE (2001)
Heath, M.D., Sarkar, S., Sanocki, T., Bowyer, K.W.: A robust visual method for assessing the relative performance of edge-detection algorithms. IEEE TPAMI 19(12), 1338–1359 (1997)
Dubuisson, M.-P., Jain, A.K.: A modified Hausdorff distance for object matching. In: IEEE ICPR, vol. 1, pp. 566–568 (1994)
Chabrier, S., Laurent, H., Rosenberger, C., Emile, B.: Comparative study of contour detection evaluation criteria based on dissimilarity measures. EURASIP J. Image Video Process. 2008, 2 (2008)
Lopez-Molina, C., De Baets, B., Bustince, H.: Quantitative error measures for edge detection. Pattern Recogn. 46(4), 1125–1139 (2013)
Abdulrahman, H., Magnier, B., Montesinos, P.: A new normalized supervised edge detection evaluation. In: Alexandre, L.A., Salvador Sánchez, J., Rodrigues, J.M.F. (eds.) IbPRIA 2017. LNCS, vol. 10255, pp. 203–213. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-58838-4_23
Grigorescu, C., Petkov, N., Westenberg, M.: Contour detection based on nonclassical receptive field inhibition. IEEE TIP 12(7), 729–739 (2003)
Venkatesh, S., Rosin, P.L.: Dynamic threshold determination by local and global edge evaluation. CVGIP 57(2), 146–160 (1995)
Yitzhaky, Y., Peli, E.: A method for objective edge detection evaluation and detector parameter selection. IEEE TPAMI 25(8), 1027–1033 (2003)
Martin, D.R., Fowlkes, C.C., Malik, J.: Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE TPAMI 26(5), 530–549 (2004)
Bowyer, K., Kranenburg, C., Dougherty, S.: Edge detector evaluation using empirical ROC curves. In: CVIU, pp. 77–103 (2001)
Abdou, I.E., Pratt, W.K.: Quantitative design and evaluation of enhancement/thresholding edge detectors. Proc. IEEE 67, 753–763 (1979)
Pinho, A.J., Almeida, L.B.: Edge detection filters based on artificial neural networks. In: Braccini, C., DeFloriani, L., Vernazza, G. (eds.) ICIAP 1995. LNCS, vol. 974, pp. 159–164. Springer, Heidelberg (1995). https://doi.org/10.1007/3-540-60298-4_252
Boaventura, A.G., Gonzaga, A.: Method to evaluate the performance of edge detector (2009)
Yasnoff, W.A., Galbraith, W., Bacus, J.W.: Error measures for objective assessment of scene segmentation algorithms. Anal. Quant. Cytol. 1(2), 107–121 (1978)
Huttenlocher, D.P., Rucklidge, W.J.: A multi-resolution technique for comparing images using the Hausdorff distance. In: IEEE CVPR, pp. 705–706 (1993)
Peli, T., Malah, D.: A study of edge detection algorithms. CGIP 20(1), 1–21 (1982)
Odet, C., Belaroussi, B., Benoit-Cattin, H.: Scalable discrepancy measures for segmentation evaluation. In: IEEE ICIP, vol. 1, pp. 785–788 (2002)
Baddeley, A.J.: An error metric for binary images. In: Robust Computer Vision: Quality of Vision Algorithms, Proceedings of the International Workshop on Robust Computer Vision, pp. 59–78. Bonn, Wichmann (1992)
Panetta, K., Gao, C., Agaian, S., Nercessian, S.: A new reference-based edge map quality measure. IEEE Trans. Syst. Man Cybern.: Syst. 46(11), 1505–1517 (2016)
Magnier, B., Le, A., Zogo, A.: A quantitative error measure for the evaluation of roof edge detectors. In: IEEE IST, pp. 429–434 (2016)
Fernández-Garca, N.L., Medina-Carnicer, R., Carmona-Poyato, A., Madrid-Cuevas, F.J., Prieto-Villegas, M.: Characterization of empirical discrepancy evaluation measures. Pattern Recogn. Lett. 25(1), 35–47 (2004)
Hou, X., Yuille, A., Koch, C.: Boundary detection benchmarking: beyond F-measures. In: IEEE CVPR, pp. 2123–2130 (2013)
Acknowledgements
The authors thank the Iraqi Ministry of Higher Education and Scientific Research for funding and supporting this work and reviewers for their remarks.
© 2017 Springer International Publishing AG

Abdulrahman, H., Magnier, B., Montesinos, P. (2017). A New Objective Supervised Edge Detection Assessment Using Hysteresis Thresholds. In: Battiato, S., Farinella, G., Leo, M., Gallo, G. (eds) New Trends in Image Analysis and Processing – ICIAP 2017. Lecture Notes in Computer Science, vol 10590. Springer, Cham. https://doi.org/10.1007/978-3-319-70742-6_1