Automated crack severity level detection and classification for ballastless track slab using deep convolutional neural network

https://doi.org/10.1016/j.autcon.2020.103484

Highlights

  • Implementing classification of cracks with three severity levels by DCNN.

  • The inefficient IPT-based calculation is completed in advance on the training set, so no post-processing is needed.

  • The accuracy and efficiency are significantly better than traditional MLAs.

  • Showing good robustness and adaptability to noise and light intensity.

  • Enabling automated maintenance decision making.

Abstract

Classifying and treating cracks of different severity levels based on width measurement is a critical consideration in the maintenance of ballastless track slab. Existing deep learning methods cannot quantify cracks directly; they must rely on image processing technologies to post-process the initial deep learning results, which inevitably leads to multiple steps and low efficiency. This paper proposes a novel quantitative classification method for cracks of different severity levels based on deep convolutional neural networks, using the orthogonal projection method to preprocess the training data and define the severity level. The method is validated and evaluated from four aspects: network structures, crack data, classification methods, and environmental conditions. Results show that the Inception-ResNet-v2 network can classify crack images into three severity levels without pixel segmentation or post-processing, with accuracy, precision, recall and F1 score all exceeding 93%, and with good robustness and adaptability to noise and light intensity.

Introduction

High-speed railway (HSR) ballastless track slab (BTS) deteriorates as service time increases. Distresses such as cracks not only reduce the strength of the track structure and shorten the service life of the BTS, but may also cause fasteners to fall off and rails to shift, which threatens the operational safety of HSR [1,2]. The severity level of cracks is an important decision-making factor for the maintenance and rehabilitation strategies of BTS. Because of the high operating frequency and short maintenance windows of HSR, it is difficult to obtain comprehensive, accurate, detailed, and timely crack information on BTS through manual visual inspection. Various machine vision methods have been developed to detect cracks automatically and replace conventional manual visual recognition.

Image processing technologies (IPTs) were the first to be widely used for crack detection. They calculate and analyze image features to distinguish cracks from the background based on heuristic rules, and can be grouped into edge detection, threshold segmentation, region growth and filters. Although commonly used edge detectors (Roberts, Prewitt, Sobel, Canny, etc.) can accurately segment fuzzy boundaries between surface cracks and the background based on differences in pixel gray value [[3], [4], [5]], they leave residual noise in the final binary output image; on noisy images in particular, detection is poor and crack edges tend to be discontinuous [[6], [7]]. Adu-Gyamfi et al. [8] proposed an image denoising and enhancement method that combines empirical mode decomposition (EMD) and weighted reconstruction techniques, which can extract more effective image features from noisy images than most edge detectors. Unlike edge detection, threshold segmentation processes all pixels of the image rather than only the boundary pixels of the cracks, and judges each pixel as crack or background by setting an appropriate threshold (global, local, or adaptive) [[9], [10], [11], [12], [13]]. Tang et al. [14] used fuzzy set theory and a boundary histogram to determine the optimal threshold for distinguishing crack pixels from background pixels by maximizing the fuzzy index entropy. However, threshold setting still carries great uncertainty, and the detection result tends to lose crack boundary information when the contrast between cracks and background is low. The region growth algorithm addresses poor detection accuracy under low contrast by setting seed pixels in the target area and expanding from them continuously [[15], [16], [17]]. Although region growth resists noise better than edge detection and threshold segmentation, it is still sensitive to noise, and the seed pixels must be chosen manually, which may lead to voids in the crack detection results. Zhang et al. [18] detected cracks by matching a pre-designed filter to crack characteristics such as shape, direction or intensity, which achieves higher detection accuracy and suppresses environmental noise better than edge detectors.
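To make the idea of global threshold segmentation concrete, the following is a minimal sketch that picks a single threshold with Otsu's method and marks darker pixels as crack. Otsu's method is a standard technique swapped in here for illustration; it is not the fuzzy-entropy method of Tang et al. [14]. The function name `otsu_threshold` and the toy image are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form the dark class, pixels > t the bright class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of gray values in the dark class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


# Toy grayscale image: a dark diagonal crack on a bright slab surface.
image = [
    [200, 198, 30, 201],
    [199, 28, 32, 202],
    [31, 29, 200, 197],
]
flat = [p for row in image for p in row]
t = otsu_threshold(flat)
crack_mask = [[1 if p <= t else 0 for p in row] for row in image]
```

On low-contrast images the two gray-level classes overlap, which is one way to see why a single global threshold loses crack boundary information.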

The heuristic rules of the above IPTs rely heavily on prior knowledge and engineering experience (e.g., a pavement distress matrix) and must be manually designed and adjusted for each crack detection requirement [19]. They target a single crack feature, so they are highly specific, hard to reuse and incomplete. Their performance also varies under different complex environmental conditions, which hinders fully automated crack recognition with IPTs. Applying machine learning algorithms (MLAs) is an effective way to automate the detection of complex and diverse cracks. The essence of machine learning is feature learning: extracted image features (entropy, texture, HOG, SIFT, LBP, GLCM, etc.) are used to train SVM, BPANN, NBC, KNN and other classifiers to learn the similarity between various crack images, so that the computer can grasp the recognition rules and detect cracks in unseen image data [[20], [21], [22], [23], [24], [25], [26]]. Cha et al. [27] used the Hough transform and other image processing methods to extract features such as the horizontal and vertical lengths of bolts, which were used to train a linear SVM to distinguish tight bolts from loose bolts. Xu et al. [28] extracted parameters representing crack characteristics from each sub-image and then manually selected sub-images with representative parameters to train an ANN. Shi et al. [29] extracted crack features at multiple levels and directions from labeled data to train the crack forest, which solved the problem of detecting noisy cracks with complex topology. However, the image features required to train existing machine learning classifiers rely on IPTs for pre-extraction and on manual labeling, and the shallow and scant features result in low detection accuracy, long processing time and a limited detection range [7]. In addition, unsupervised MLAs such as clustering (the K-means algorithm) [30], principal component analysis (PCA) [31] and the Gaussian mixture model [32] have also been applied to crack detection in roads, bridges and other infrastructure. Their advantage is that the training images do not need to be labeled manually in advance, which reduces manual intervention; their limitation is that they can only detect crack images with obvious textures and are generally less accurate than traditional supervised MLAs [33].

Moving from manually processing limited image features with IPTs and MLAs to automatically processing rich, arbitrary image features with deep learning is a great advancement. Deep learning methods built around the convolutional neural network (CNN) can automatically extract rich, deep abstract features from massive infrastructure surface crack data to master recognition rules. The CNN relies on its convolutional layers (each holding a large number of convolution kernels) to perform a convolution operation on a neighborhood of the input crack image; each kernel slides from the upper left to the lower right of the image with a certain stride and outputs a deep abstract feature map for rapid, large-scale detection of diverse cracks [34]. Various convolutional network models derived from the original CNN have been used to identify, locate and characterize cracks of different shapes and at different locations [[35], [36]]. Mandal et al. [37] used the YOLOv2 network to accurately identify and locate lateral cracks, longitudinal cracks and alligator cracks at different locations. Faster R-CNN was used to simultaneously identify and locate different types of structural damage, reaching an average precision (AP) of 94.7% for cracks while adapting well to various image sizes and lighting conditions [38]. The SDDNet proposed by Cha et al. [39] can efficiently segment crack features and remove complex backgrounds and crack-like features. Although existing convolutional network models have been shown to outperform IPTs and MLAs in detecting cracks in various structures (pavements, bridges, tunnels, buildings), with fast detection speed, high accuracy and good adaptability to complex environmental conditions [[40], [41], [42], [43], [44], [45], [46], [47], [48]], they still face challenges in quantifying cracks.
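The sliding-window operation of a convolutional layer can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function name `conv2d`, the hand-written vertical-edge kernel and the toy image are assumptions, whereas a real CNN learns its kernels during training and stacks many such layers.

```python
def conv2d(image, kernel, stride=1):
    """Slide a kernel over an image from upper left to lower right.

    No padding is applied. As in most deep learning frameworks, the kernel
    is not flipped, so strictly speaking this is cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[top + di][left + dj] * kernel[di][dj]
            row.append(acc)
        feature_map.append(row)
    return feature_map


# Toy image: bright surface (9) with a dark vertical crack (1) in column 2.
image = [
    [9, 9, 1, 9],
    [9, 9, 1, 9],
    [9, 9, 1, 9],
    [9, 9, 1, 9],
]
# Hand-written kernel that responds to a bright-to-dark change from left to right.
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]
feature_map = conv2d(image, kernel, stride=1)
strided_map = conv2d(image, kernel, stride=2)
```

With stride 1 the 4 × 4 input yields a 2 × 2 feature map; with stride 2 only one window fits, so the output shrinks to 1 × 1.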

Existing detection methods based on convolutional network models cannot quantify cracks directly. They must rely on one or more IPTs to post-process the initial deep learning results in order to calculate specific numerical indexes of the cracks and then define the severity level. Kang et al. [49] quantified cracks in three steps: first, Faster R-CNN was used to identify and locate the cracks; then the improved tubularity flow field (TUFF) algorithm was used to segment the specific features of the cracks; finally, the improved distance transform method (DTM) was used to calculate the length and width. Beckman et al. [50] used the RANSAC algorithm to further divide and calculate the volume of concrete spalling based on the positioning results of Faster R-CNN. Ni et al. [51] first used a dual-scale convolutional neural network to extract detailed morphological features of the cracks and then estimated the width of the detected cracks based on the Zernike moment operator. Yang et al. [52] skeletonized the crack characterization results of the FCN network using the medial axis algorithm and defined the average crack width as the ratio between pixel area and pixel length. In all of these processes, the crack inspection data must go through region location, pixel segmentation and IPT-based post-processing to calculate the numerical index values used for quantification, which inevitably leads to multiple steps, high computational cost and low automation. The IPT-based post-processing in particular greatly limits the ability of convolutional network models to process massive data.
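The area-to-length ratio attributed to Yang et al. [52] can be illustrated with a small sketch. It assumes that a binary crack mask and its skeleton are already available from earlier steps, and it approximates pixel length by the number of skeleton pixels; the function name `average_width` and the toy masks are hypothetical, and skeletonization itself is not implemented here.

```python
def average_width(crack_mask, skeleton_mask):
    """Average crack width in pixels: crack area divided by skeleton length.

    Both masks are lists of rows holding 0 or 1. Approximating the length by
    the skeleton pixel count is an assumption of this sketch (diagonal steps
    get no extra weight).
    """
    area = sum(sum(row) for row in crack_mask)
    length = sum(sum(row) for row in skeleton_mask)
    if length == 0:
        raise ValueError("skeleton is empty; there is no crack length to divide by")
    return area / length


# Toy crack: 3 pixels thick and 5 pixels long.
crack_mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
# Its skeleton: the single center line of the crack.
skeleton_mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
width_px = average_width(crack_mask, skeleton_mask)  # 15 crack pixels / 5 skeleton pixels
```

The calculation is cheap, but it presupposes a segmented mask and a skeleton for every inspected image, which is the multi-step cost described above.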

The severity level of a crack can be estimated based on other cracks of known severity at that location [53]. Therefore, this paper proposes a novel quantitative classification method based on IPT pre-processing and DCNN to address the challenges of existing crack quantification methods. The inefficient IPT-based calculation of the numerical index values of cracks is completed on the training set in advance, and the training set is then used to train the DCNN to directly classify cracks of different severity levels in BTS inspection data. The fully trained DCNN does not need to segment pixel features or calculate numerical indexes crack by crack; instead, it quantitatively classifies cracks at the image level according to the similarity of features between the training data and the inspection data, which overcomes the multiple steps, high computational cost and low automation of existing crack quantification methods. In addition, unlike the shallow convolutional network models (generally within 10 layers) used so far for crack detection [[54], [55], [56], [57]], the DCNN not only increases the structural depth but also widens each layer in a parallel manner, so that it can fully learn the input image and extract more complex, higher-level feature information to ensure the accuracy of crack quantification [[58], [59], [60], [61]].

The objective of this research is to establish an automated classifier that can quantify cracks according to their severity levels. The classifier needs adequate robustness and adaptability to adverse environments such as weak light and heavy background noise. This paper implements the classifier with a DCNN and evaluates and tests its performance. The content of this research is shown in Fig. 1. Section 2 introduces the structural composition and image processing procedure of the DCNN, using the Inception-ResNet-v2 network as an example. Section 3 first explains the source of the crack images; it then calculates the average width of each crack by the orthogonal projection method and uses the average width as the quantitative classification index to divide the collected crack images into three severity levels (label1, label2, label3); finally, it preprocesses the images to build a crack image database containing 15,000 images. Section 4 uses transfer learning to compare the recognition performance of six existing DCNNs on the database, and selects the best-performing network, Inception-ResNet-v2, for in-depth training and testing on the database. Section 5 validates the classification results of the network with different training and testing sets and compares the network with four traditional machine learning methods in terms of accuracy, F1 score, and test time. Section 6 further tests the consistency of the classification performance of the Inception-ResNet-v2 network in three adverse environments. Section 7 concludes the paper by summarizing its contributions and limitations.
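The labeling step, in which an average width is turned into one of three severity labels, might look like the sketch below. The width boundaries are placeholders invented for illustration, since this excerpt does not state the actual values, and the assumption that label1 to label3 are ordered by increasing width is also not confirmed here. The width is taken as a given input; the orthogonal projection method that produces it is not reproduced.

```python
from bisect import bisect_right

# Hypothetical width boundaries in millimetres (NOT taken from the paper).
WIDTH_BOUNDS_MM = (0.1, 0.2)
# Label names as used in the text; ordering by increasing width is assumed.
LABELS = ("label1", "label2", "label3")


def severity_label(avg_width_mm, bounds=WIDTH_BOUNDS_MM):
    """Map an average crack width to one of three severity labels."""
    if avg_width_mm < 0:
        raise ValueError("crack width cannot be negative")
    # bisect_right counts how many boundaries are <= the width,
    # which is exactly the index of the matching label.
    return LABELS[bisect_right(bounds, avg_width_mm)]


# Label a few training images from their precomputed average widths.
widths_mm = [0.05, 0.15, 0.30]
training_labels = [severity_label(w) for w in widths_mm]
```

Because this step runs once over the training set, the trained network never has to measure a width at inspection time.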

Section snippets

Overall architecture

This paper uses DCNN to detect and classify crack images with different severity levels, and employs the Inception-ResNet-v2 network as an example to introduce the structure composition and image processing process. The overall structure of the Inception-ResNet-v2 network mainly includes input layer, convolution layer, pooling layer, Stem module, Inception-resnet module, Reduction module and the final Dropout layer and Softmax layer (classification layer). The image processing process can be

Image acquisition

Building a large and comprehensive sample image database is a prerequisite for DCNN based image recognition. In this paper, the appearance of CRTS III BTS is scanned by high-resolution line-array cameras mounted on the track inspection vehicle to collect the original crack image data. The image acquisition equipment and region are shown in Fig. 3.

The original BTS images collected by the track inspection vehicle are further cropped to obtain 4650 crack images (400 × 400 pixels) to remove a

Performance comparison of six existing deep convolutional neural networks

Existing DCNNs (such as VGG16, Inception V3, and ResNet V1 50) can achieve a recognition accuracy of more than 95% in the ImageNet image database (including 1.2 million images with 1000 labels), which has been widely used [68]. As for the crack images of BTS, due to the large differences between the morphological characteristics of cracks and conventional image classification data sets, it is inconclusive as to which network can achieve excellent recognition results. Therefore, this paper uses

Comparison of different training and testing sets

When the Inception-ResNet-v2 network is used to detect and classify the database, only the features of the 12,000 training set images are learned, while the 3000 testing set images are used only to validate the learning effect of the network. Because the images in the training and testing sets are fixed, the generality and repeatability of the Inception-ResNet-v2 network cannot be proven, and the result may reflect a local minimum or maximum. This paper uses the cross-validation (k = 5)
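A 5-fold split of a 15,000-image database reproduces the 12,000/3000 proportion in every round while letting each image serve as test data exactly once. The sketch below shows only the index bookkeeping; the function name `k_fold_splits`, the fixed seed and the plain random shuffle (without stratifying by label) are assumptions, not details taken from the paper.

```python
import random


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_splits(15000, k=5))
for round_no, (train, test) in enumerate(splits, start=1):
    print(f"round {round_no}: {len(train)} training images, {len(test)} testing images")
```

Averaging the metrics over the five rounds gives a result that does not depend on one particular choice of testing set.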

Consistency of classification results in adverse environmental conditions

In order to explore the robustness and adaptability of the Inception-ResNet-v2 network, this paper selects 300 images (each label contains 100 images) from the original testing set to further test the classification effect of the network under adverse environmental conditions, including three types: interference from background noise, the influence of light intensity and image blur caused by human factors. Adding different degrees of salt and pepper noise to the original image to simulate the
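Two of the adverse conditions named above, salt and pepper noise and changed light intensity, are easy to simulate on a grayscale image. The sketch below is one plausible way to do so; the function names, the noise densities and the brightness factors are hypothetical and are not the settings used in the paper.

```python
import random


def add_salt_pepper(image, density, seed=0):
    """Replace roughly a `density` fraction of pixels with 0 (pepper) or 255 (salt)."""
    rng = random.Random(seed)
    noisy = []
    for row in image:
        new_row = []
        for p in row:
            r = rng.random()
            if r < density / 2:
                new_row.append(0)      # pepper
            elif r < density:
                new_row.append(255)    # salt
            else:
                new_row.append(p)      # pixel left unchanged
        noisy.append(new_row)
    return noisy


def scale_brightness(image, factor):
    """Multiply every pixel by `factor` and clip the result to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


# Toy 4 x 4 grayscale image with uniform gray value 200.
image = [[200] * 4 for _ in range(4)]
noisy = add_salt_pepper(image, density=0.2)
darker = scale_brightness(image, 0.5)    # weak light
brighter = scale_brightness(image, 2.0)  # strong light, clipped at 255
```

Feeding such degraded copies of the same test images to the trained network allows the classification results to be compared condition by condition.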

Conclusions

This paper proposes a novel quantitative classification method based on IPT pre-processing and DCNN. Unlike the existing crack quantification methods with multiple steps, high computational costs and low automation, the fully trained Inception-ResNet-v2 network neither needs to segment the pixel features of cracks nor to calculate detailed numerical indicators, which can accurately and efficiently detect and classify cracks with three severity levels from the image level based on the similarity

Declaration of Competing Interest

None.

Acknowledgements

The research is supported by the High-Speed Railway Infrastructure Joint Fund of the National Natural Science Foundation of China (U1734208); Science and Technology Support Plan of the Department of Science and Technology of Guizhou Province, China ([2018] 2154); and Key Project of China State Railway Group Co., Ltd. (N2019G024).

References (76)

  • D. Kang et al.

    Hybrid pixel-level concrete crack segmentation and quantification across complex backgrounds using deep learning

    Autom. Constr.

    (2020)
  • G.H. Beckman et al.

    Deep learning-based automatic volumetric damage quantification using depth camera

    Autom. Constr.

    (2019)
  • M. Kouzehgar et al.

    Self-reconfigurable facade-cleaning robot equipped with deep-learning-based crack detection based on convolutional neural networks

    Autom. Constr.

    (2019)
  • K. Gopalakrishnan et al.

    Deep convolutional neural networks with transfer learning for computer vision-based data-driven pavement distress detection

    Constr. Build. Mater.

    (2017)
  • I. Valavanis et al.

    Multiclass defect detection and classification in weld radiographic images using geometric and texture features

    Expert Syst. Appl.

    (2010)
  • Z.P. Zeng et al.

    Experimental study on evolution of mechanical properties of CRTS III ballastless slab track under fatigue load

    Constr. Build. Mater.

    (2019)
  • I. Abdel-Qader et al.

    Analysis of edge-detection techniques for crack identification in bridges

    J. Comput. Civ. Eng.

    (2003)
  • S.Y. Lee et al.

    Development of an inspection system for cracks in a concrete tunnel lining

    Can. J. Civ. Eng.

    (2007)
  • N.D. Hoang et al.

    Metaheuristic optimized edge detection for recognition of concrete wall cracks: a comparative study on the performances of Roberts, Prewitt, Canny, and Sobel algorithms

    Adv. Civil Eng.

    (2018)
  • Y.J. Cha et al.

    Deep learning-based crack damage detection using convolutional neural networks

    Comp. Aid. Civil Infrastruct. Eng.

    (2017)
  • Y.O. Adu-Gyamfi et al.

    Multiresolution information mining for pavement crack image analysis

    J. Comput. Civ. Eng.

    (2012)
  • W. Zhang et al.

    Automatic crack detection and classification method for subway tunnel safety monitoring

    Sensors (Basel)

    (2014)
  • H.D. Cheng et al.

    Real-time image thresholding based on sample space reduction and interpolation approach

    J. Comput. Civ. Eng.

    (2003)
  • Y.G. Tang et al.

    Application of a new image segmentation method to detection of defects in castings

    Int. J. Adv. Manuf. Technol.

    (2009)
  • H. Oliveira et al.

    Road Surface Crack Detection: Improved Segmentation with Pixel-based Refinement, 25th European Signal Processing Conference (EUSIPCO)

    (2017)
  • A. Zhang et al.

    Matched filtering algorithm for pavement cracking detection

    Transp. Res. Rec.

    (2013)
  • B. Steven et al.

    An Assessment of Automated Pavement Distress Identification Technologies in California (Technical Memorandum No. UCPRC-TM-2008-13)

    (2009)
  • G. Miguel et al.

    Adaptive road crack detection system by pavement classification

    Sensors

    (2011)
  • S.F. Wang et al.

    Cracking classification using minimum rectangular cover-based support vector machine

    J. Comput. Civ. Eng.

    (2017)
  • Y.F. Pan et al.

    Detection of asphalt pavement potholes and cracks based on the unmanned aerial vehicle multispectral imagery

    IEEE J. Select. Topics Appl. Earth Observ. Remote Sens.

    (2018)
  • N.D. Hoang et al.

    Automatic recognition of asphalt pavement cracks based on image processing and machine learning approaches: a comparative study on classifier performance

    Math. Probl. Eng.

    (2018)
  • N.D. Hoang et al.

    A novel method for asphalt pavement crack classification based on image processing and machine learning

    Eng. Comput.

    (2018)
  • F. Duan et al.

    Automatic welding defect detection of x-ray images by using cascade AdaBoost with penalty term

    IEEE Access

    (2019)
  • Q. Yang et al.

    Evaluation of cracking in asphalt pavement with stabilized base course based on statistical pattern recognition

    Int. J. Pavement Eng.

    (2017)
  • G.A. Xu et al.

    Automatic Recognition of Pavement Surface Crack Based on BP Neural Network, Proceedings of the 2008 International Conference on Computer and Electrical Engineering (ICCEE)

    (2008)
  • Y. Shi et al.

    Automatic road crack detection using random structured forests

    IEEE Trans. Intell. Transp. Syst.

    (2016)
  • H.Y. Ju et al.

    Illumination compensation model with k-means algorithm for detection of pavement surface cracks with shadow

    J. Comput. Civ. Eng.

    (2020)
  • L. Qiu et al.

    An enhanced dynamic Gaussian mixture model-based damage monitoring method of aircraft structures under environmental and operational conditions

    Struct. Health Monitor. Int. J.

    (2019)