Automated crack severity level detection and classification for ballastless track slab using deep convolutional neural network
Introduction
High-speed railway (HSR) ballastless track slab (BTS) deteriorates as service time increases. Distresses such as cracks not only reduce the strength of the track structure and shorten the service life of the BTS, but may also cause fasteners to fall off and rails to shift, which threatens the operational safety of HSR [1,2]. The severity level of cracks is an important decision-making factor for BTS maintenance and rehabilitation strategies. Because HSR runs at high frequency and has short maintenance windows, it is difficult to obtain comprehensive, accurate, detailed, and timely crack information on BTS through manual visual inspection. Various machine vision methods have therefore been developed to detect cracks automatically and replace conventional manual visual inspection.
Image processing technologies (IPTs) were the first to be widely used for crack detection. They calculate and analyze image features to distinguish cracks from the background based on heuristic rules, and can be grouped into edge detection, threshold segmentation, region growing and filtering. Although commonly used edge detectors (Roberts, Prewitt, Sobel, Canny, etc.) can accurately segment fuzzy boundaries between surface cracks and the background based on differences in pixel gray value [[3], [4], [5]], they leave residual noise in the output binary image; on noisy images in particular, detection is poor and crack edges tend to be discontinuous [[6], [7]]. Adu-Gyamfi et al. [8] proposed an image denoising and enhancement method that combines empirical mode decomposition (EMD) and weighted reconstruction techniques, which can extract more effective image features from noisy images than most edge detectors. Unlike edge detection, threshold segmentation operates on all pixels of the image rather than only the boundary pixels of cracks, classifying each pixel as crack or background against an appropriate threshold (global, local, or adaptive) [[9], [10], [11], [12], [13]]. Tang et al. [14] used fuzzy set theory and the boundary histogram to determine the optimal threshold for distinguishing crack pixels from background pixels by maximizing the fuzzy index entropy. However, threshold selection remains highly uncertain, and the detection result tends to lose crack boundary information when the contrast between cracks and background is low. The region growing algorithm addresses poor detection accuracy under low contrast by setting seed pixels in the target area and expanding from them continuously [[15], [16], [17]].
Although region growing is more robust to noise than edge detection and threshold segmentation, it is still noise-sensitive, and the seed pixels must be chosen manually, which may leave voids in the crack detection results. Zhang et al. [18] detected cracks by matching pre-designed filters to crack characteristics such as shape, direction or intensity, which achieves higher detection accuracy and suppresses environmental noise better than edge detectors.
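To make the difference between edge detection and threshold segmentation concrete, the following is a minimal sketch on a toy grayscale image with a one-pixel-wide dark crack. The image, the two threshold values and the function names are illustrative assumptions, not taken from any of the cited works.

```python
# Minimal sketch: Sobel edge detection versus global gray-value thresholding.
# The toy image, both thresholds and all names are illustrative assumptions.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Gradient magnitude for interior pixels; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    p = img[r - 1 + i][c - 1 + j]
                    gx += SOBEL_X[i][j] * p
                    gy += SOBEL_Y[i][j] * p
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


def binarize(img, keep):
    """Return a 0/1 mask marking the pixels for which keep(value) is true."""
    return [[1 if keep(v) else 0 for v in row] for row in img]


# Bright background (200) with a dark vertical crack (50) in column 3.
image = [[200, 200, 200, 50, 200, 200, 200] for _ in range(5)]

# Edge detection marks the two crack boundaries, not the crack interior.
edges = binarize(sobel_magnitude(image), lambda g: g > 300)

# Global thresholding classifies every pixel directly by its gray value.
crack = binarize(image, lambda v: v < 100)
```

In this toy case the edge map marks columns 2 and 4 of the interior rows (the two sides of the crack), while the thresholded mask marks column 3 (the crack itself), which is the distinction between boundary pixels and all pixels drawn above.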
The heuristic rules of the above IPTs rely heavily on prior knowledge and engineering experience (e.g., a pavement distress matrix) and must be designed and tuned by hand for each crack detection requirement [19]; they target a single crack feature, so they are highly specific, poorly repeatable and incomplete. Their performance varies across complex environmental conditions, which also hinders fully automated crack recognition with IPTs. Machine learning algorithms (MLAs) are an effective way to automate the detection of complex and diverse cracks. The essence of machine learning is feature learning: extracted image features (entropy, texture, HOG, SIFT, LBP, GLCM, etc.) are used to fully train classifiers such as SVM, BPANN, NBC and KNN to learn the similarity between crack images, so that the computer can grasp the recognition rules and detect cracks in unseen image data [[20], [21], [22], [23], [24], [25], [26]]. Cha et al. [27] used the Hough transform and other image processing methods to extract features such as the horizontal and vertical lengths of bolts, which were used to train a linear SVM to distinguish tight bolts from loose bolts. Xu et al. [28] extracted parameters representing crack characteristics from each sub-image, and then manually selected sub-images with representative parameters to train an ANN. Shi et al. [29] extracted crack features at multiple levels and directions from labeled data to train the crack forest, which addressed the detection of noisy cracks with complex topology. However, the image features needed to train existing machine learning classifiers depend on IPTs for pre-extraction and on manual labeling, so the shallow and scant features lead to low detection accuracy, long processing time and a limited detection range [7].
In addition, unsupervised MLAs such as the K-means clustering algorithm [30], principal component analysis (PCA) [31] and the Gaussian mixture model [32] have also been applied to crack detection in roads, bridges and other infrastructure. Their advantage is that the training images need no manual labeling in advance, which reduces manual intervention; their limitation is that they can only detect cracks in images with obvious textures and are generally less accurate than traditional supervised MLAs [33].
The move from manually processing limited image features with IPTs and MLAs to automatically processing rich, arbitrary image features with deep learning is a great advancement. Deep learning methods built around the convolutional neural network (CNN) can automatically extract rich and deep abstract features from massive infrastructure surface crack data to master recognition rules. A CNN relies on its convolutional layers (each containing a large number of convolution kernels) to convolve each kernel with a neighborhood of the input crack image; the kernel slides from the upper left to the lower right of the image with a fixed stride and outputs deep abstract feature maps for rapid detection of large-scale, diverse cracks [34]. Various convolutional network models derived from the original CNN have been used to identify, locate and characterize cracks of different shapes and at different locations [[35], [36]]. Mandal et al. [37] used the YOLOv2 network to accurately identify and locate lateral cracks, longitudinal cracks and alligator cracks at different locations. Faster R-CNN was used to simultaneously identify and locate different types of structural damage; the average precision (AP) for cracks reached 94.7%, and the method adapts well to various image sizes and lighting conditions [38]. The SDDNet proposed by Cha et al. [39] can efficiently segment crack features and remove complex backgrounds and crack-like features. Although existing convolutional network models have been shown to outperform IPTs and MLAs in detecting cracks in various structures (pavement, bridge, tunnel, building), with fast detection speed, high accuracy and good adaptability to complex environmental conditions [[40], [41], [42], [43], [44], [45], [46], [47], [48]], they still face challenges in quantifying cracks.
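The sliding-kernel operation described above can be sketched in a few lines of plain Python. This is a minimal illustration of a single kernel without padding, bias or activation; the toy image, the kernel values and the function name are assumptions made for the example. As in most CNN frameworks, the kernel is applied without flipping (cross-correlation).

```python
# Minimal sketch of one convolution kernel sliding over an image with a stride.
# No padding, bias or activation; the image and kernel values are illustrative.

def conv2d(image, kernel, stride=1):
    """Slide the kernel from the upper left to the lower right of the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            r0, c0 = i * stride, j * stride  # top-left corner of the window
            row.append(sum(kernel[a][b] * image[r0 + a][c0 + b]
                           for a in range(kh) for b in range(kw)))
        feature_map.append(row)
    return feature_map


# 4 x 4 image with a dark vertical line in column 2.
image = [[9, 9, 1, 9] for _ in range(4)]
# 2 x 2 kernel that responds to horizontal intensity changes.
kernel = [[1, -1], [1, -1]]

fmap_stride1 = conv2d(image, kernel, stride=1)  # 3 x 3 feature map
fmap_stride2 = conv2d(image, kernel, stride=2)  # 2 x 2 feature map
```

A larger stride visits fewer windows and so yields a smaller feature map; here the stride-2 map skips the window that straddles the left side of the dark line.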
Existing detection methods based on convolutional network models cannot quantify cracks directly; they must rely on one or more IPTs to post-process the initial deep learning results, calculate specific numerical indexes of the cracks and then define the severity level. Kang et al. [49] quantified cracks in three steps: Faster R-CNN was used to identify and locate the cracks, the improved tubularity flow field (TUFF) algorithm was then used to segment the specific crack features, and finally the improved distance transform method (DTM) was used to calculate length and width. Beckman et al. [50] used the RANSAC algorithm to further segment concrete spalling and calculate its volume based on the localization results of Faster R-CNN. Ni et al. [51] first used a dual-scale convolutional neural network to extract detailed morphological features of the cracks and then estimated the width of the detected cracks based on the Zernike moment operator. Yang et al. [52] skeletonized the crack characterization results of the FCN using the medial axis algorithm and defined the average crack width as the ratio of pixel area to pixel length. In the above process, all crack inspection data must pass through region localization, pixel segmentation and IPT-based post-processing before the numerical indexes used for quantification can be calculated, which inevitably leads to multiple steps, high computational cost and low automation; the IPT-based post-processing in particular greatly limits the ability of convolutional network models to process massive data.
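The area-to-length ratio attributed to Yang et al. [52] above can be illustrated with a short sketch. It assumes that a binary crack mask and its one-pixel-wide skeleton are already available from earlier segmentation and skeletonization steps; both masks below are hand-made toy data, and the function name is hypothetical.

```python
# Minimal sketch: average crack width as pixel area divided by skeleton length.
# Assumes the crack mask and its skeleton were produced by earlier steps;
# the toy masks and the function name are illustrative assumptions.

def average_width_px(crack_mask, skeleton_mask):
    """Average width in pixels = crack pixel count / skeleton pixel count."""
    area = sum(sum(row) for row in crack_mask)
    length = sum(sum(row) for row in skeleton_mask)
    if length == 0:
        raise ValueError("skeleton is empty, width is undefined")
    return area / length


# A straight horizontal crack, 3 pixels thick and 6 pixels long.
crack_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
# Its skeleton is the centre row of the crack.
skeleton_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

width = average_width_px(crack_mask, skeleton_mask)  # 18 / 6 = 3.0
```

Converting the result to physical units would additionally require the image scale (millimetres per pixel), which is not given in this excerpt.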
The severity level of a crack can be estimated from other cracks of known severity at that location [53]. This paper therefore proposes a novel quantitative classification method based on IPT pre-processing and DCNN to address the challenges of existing crack quantification methods. The inefficient IPT-based calculation of specific numerical index values is completed in advance on the training set, which is then used to fully train the DCNN to classify cracks of different severity levels directly from BTS inspection data. The fully trained DCNN does not need to segment pixel features or calculate numerical indexes crack by crack; instead it classifies cracks quantitatively at the image level according to the similarity of features between training data and inspection data, which overcomes the multiple steps, high computational cost and low automation of existing crack quantification methods. In addition, unlike the shallow convolutional network models (generally within 10 layers) currently used for crack detection [[54], [55], [56], [57]], the DCNN not only increases the structural depth but also widens each layer in a parallel manner, so that it can fully learn the input image and extract more complex and advanced feature information to ensure the accuracy of crack quantification [[58], [59], [60], [61]].
The objective of this research is to establish an automated classifier that can quantify cracks according to their severity levels. The classifier needs adequate robustness and adaptability to adverse environments such as weak light and heavy background noise. This paper implements the classifier with a DCNN and evaluates and tests its performance. The content of this research is shown in Fig. 1. Section 2 introduces the structure and image processing pipeline of the DCNN, using the Inception-ResNet-v2 network as an example. Section 3 first explains the source of the crack images; it then calculates the average width of each crack by the orthogonal projection method and uses the average width as a quantitative classification index to divide the collected crack images into three severity levels (label1, label2, label3); finally, it preprocesses the images to build a crack image database containing 15,000 images. Section 4 uses transfer learning to compare the recognition performance of six existing DCNNs on the database, and selects the best-performing Inception-ResNet-v2 network for in-depth learning, training, and testing on the database. Section 5 uses different training and testing sets to validate the classification results of the network, and compares it with four traditional machine learning methods in terms of accuracy, F1 score, and test time. Section 6 further tests the consistency of the classification performance of the Inception-ResNet-v2 network in three adverse environments. Section 7 concludes this paper by summarizing contributions and limitations.
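The labeling step outlined for Section 3, in which an average width is mapped to one of three severity labels, can be sketched as below. The width thresholds are placeholder values chosen only for illustration, since this excerpt does not state the actual class boundaries; the assumption that label1 is the narrowest class is likewise not confirmed by the text.

```python
# Minimal sketch: map an average crack width to one of three severity labels.
# ASSUMPTIONS: the thresholds (0.2 mm and 0.5 mm) are placeholders, not values
# from the paper, and label1 is assumed to be the narrowest class.

def severity_label(avg_width_mm, low_mm=0.2, high_mm=0.5):
    """Return 'label1', 'label2' or 'label3' for a non-negative width."""
    if avg_width_mm < 0:
        raise ValueError("width must be non-negative")
    if avg_width_mm < low_mm:
        return "label1"
    if avg_width_mm < high_mm:
        return "label2"
    return "label3"


# Hypothetical widths for three training images.
labels = [severity_label(w) for w in (0.1, 0.3, 0.8)]
```

Because this width calculation is done once, while building the training set, the trained classifier can later assign a label to a new image without repeating it.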
Section snippets
Overall architecture
This paper uses a DCNN to detect and classify crack images with different severity levels, and takes the Inception-ResNet-v2 network as an example to introduce its structure and image processing pipeline. The overall structure of the Inception-ResNet-v2 network mainly includes the input layer, convolution layers, pooling layers, the Stem module, Inception-ResNet modules, Reduction modules and the final Dropout layer and Softmax layer (classification layer). The image processing process can be
Image acquisition
Building a large and comprehensive sample image database is a prerequisite for DCNN-based image recognition. In this paper, the surface of the CRTS III BTS is scanned by high-resolution line-array cameras mounted on the track inspection vehicle to collect the original crack image data. The image acquisition equipment and region are shown in Fig. 3.
The original BTS images collected by the track inspection vehicle are further cropped to obtain 4650 crack images (400 × 400 pixels) to remove a
Performance comparison of six existing deep convolutional neural networks
Existing DCNNs (such as VGG16, Inception V3, and ResNet V1 50) can achieve a recognition accuracy of more than 95% on the ImageNet image database (including 1.2 million images with 1000 labels) and have been widely used [68]. For crack images of BTS, however, the morphological characteristics of cracks differ greatly from those of conventional image classification data sets, so it is unclear which network can achieve the best recognition results. Therefore, this paper uses
Comparison of different training and testing sets
When the Inception-ResNet-v2 network is used to detect and classify the database, only the features of the 12,000 training set images are fully learned, while the 3000 testing set images are used only to validate the learning effect of the network. Because the images in the training and testing sets are fixed, the generality and repeatability of the Inception-ResNet-v2 network cannot be proven, and the network may fall into local minimum or maximum values. This paper uses the cross-validation (k = 5)
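A 5-fold split of a 15,000-image database, which reproduces the 12,000/3000 proportion mentioned above in every round, can be sketched as follows. How the folds were actually formed (for example, whether they were stratified by label) is not shown in this excerpt, so the plain shuffled split, the seed and the function name are assumptions.

```python
# Minimal sketch: shuffled k-fold split of image indices (k = 5).
# ASSUMPTION: a plain, non-stratified split; the fold construction actually
# used is not shown in this excerpt.
import random


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    fold_size = n_samples // k
    for fold in range(k):
        start = fold * fold_size
        # The last fold absorbs the remainder if n_samples is not divisible by k.
        stop = n_samples if fold == k - 1 else start + fold_size
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test


splits = list(k_fold_splits(15000, k=5))
fold_sizes = [(len(train), len(test)) for train, test in splits]
```

Each image serves as test data exactly once across the five rounds, so the averaged result no longer depends on one fixed choice of testing set.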
Consistency of classification results in adverse environmental conditions
In order to explore the robustness and adaptability of the Inception-ResNet-v2 network, this paper selects 300 images (each label contains 100 images) from the original testing set to further test the classification effect of the network under adverse environmental conditions, including three types: interference from background noise, the influence of light intensity and image blur caused by human factors. Adding different degrees of salt and pepper noise to the original image to simulate the
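The salt-and-pepper corruption mentioned above can be sketched as follows. The noise densities used for the actual tests are not given in this excerpt, so the density value, the seed and the function name are illustrative assumptions; the image is a plain list of 8-bit gray values.

```python
# Minimal sketch: add salt-and-pepper noise to an 8-bit grayscale image.
# ASSUMPTION: the density and seed are illustrative; the noise levels used
# in the actual tests are not given in this excerpt.
import random


def add_salt_pepper(img, density, seed=0):
    """Replace each pixel with 0 or 255 (equal odds) with probability density."""
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must be between 0 and 1")
    rng = random.Random(seed)
    noisy = []
    for row in img:
        new_row = []
        for value in row:
            if rng.random() < density:
                new_row.append(255 if rng.random() < 0.5 else 0)
            else:
                new_row.append(value)
        noisy.append(new_row)
    return noisy


clean = [[128] * 10 for _ in range(10)]
noisy = add_salt_pepper(clean, density=0.3)
```

The function returns a new image and leaves the input untouched, so the same clean test image can be corrupted at several densities and fed to the classifier in turn.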
Conclusions
This paper proposes a novel quantitative classification method based on IPT pre-processing and DCNN. Unlike the existing crack quantification methods with multiple steps, high computational costs and low automation, the fully trained Inception-ResNet-v2 network neither needs to segment the pixel features of cracks nor to calculate detailed numerical indicators, which can accurately and efficiently detect and classify cracks with three severity levels from the image level based on the similarity
Declaration of Competing Interest
None.
Acknowledgements
The research is supported by the High-Speed Railway Infrastructure Joint Fund of the National Natural Science Foundation of China (U1734208); Science and Technology Support Plan of the Department of Science and Technology of Guizhou Province, China ([2018] 2154); and Key Project of China State Railway Group Co., Ltd. (N2019G024).
References (76)
- et al. Mechanical property and damage evolution of concrete interface of ballastless track in high-speed railway: experiment and simulation. Constr. Build. Mater. (2018)
- et al. Comparison of deep convolutional neural networks and edge detectors for image-based crack detection in concrete. Constr. Build. Mater. (2018)
- et al. Rahbin: a quadcopter unmanned aerial vehicle based on a systematic image processing approach toward an automated asphalt pavement inspection. Autom. Constr. (2016)
- et al. Automatic ridgelet image enhancement algorithm for road crack image based on fuzzy entropy and fuzzy divergence. Opt. Lasers Eng. (2009)
- et al. An efficient and reliable coarse-to-fine approach for asphalt pavement crack detection. Image Vis. Comput. (2017)
- et al. FoSA: F* seed-growing approach for crack-line detection from pavement images. Image Vis. Comput. (2011)
- et al. An efficient and reliable coarse-to-fine approach for asphalt pavement crack detection. Image Vis. Comput. (2017)
- et al. Vision-based detection of loosened bolts using the Hough transform and support vector machines. Autom. Constr. (2016)
- et al. PCA-based algorithm for unsupervised bridge crack detection. Adv. Eng. Softw. (2006)
- et al. Deep learning based image recognition for crack and leakage defects of metro shield tunnel. Tunn. Undergr. Space Technol. (2018)
- Hybrid pixel-level concrete crack segmentation and quantification across complex backgrounds using deep learning. Autom. Constr.
- Deep learning-based automatic volumetric damage quantification using depth camera. Autom. Constr.
- Self-reconfigurable facade-cleaning robot equipped with deep-learning-based crack detection based on convolutional neural networks. Autom. Constr.
- Deep convolutional neural networks with transfer learning for computer vision-based data-driven pavement distress detection. Constr. Build. Mater.
- Multiclass defect detection and classification in weld radiographic images using geometric and texture features. Expert Syst. Appl.
- Experimental study on evolution of mechanical properties of CRTS III ballastless slab track under fatigue load. Constr. Build. Mater.
- Analysis of edge-detection techniques for crack identification in bridges. J. Comput. Civ. Eng.
- Development of an inspection system for cracks in a concrete tunnel lining. Can. J. Civ. Eng.
- Metaheuristic optimized edge detection for recognition of concrete wall cracks: a comparative study on the performances of Roberts, Prewitt, Canny, and Sobel algorithms. Adv. Civil Eng.
- Deep learning-based crack damage detection using convolutional neural networks. Comp. Aid. Civil Infrastruct. Eng.
- Multiresolution information mining for pavement crack image analysis. J. Comput. Civ. Eng.
- Automatic crack detection and classification method for subway tunnel safety monitoring. Sensors (Basel)
- Real-time image thresholding based on sample space reduction and interpolation approach. J. Comput. Civ. Eng.
- Application of a new image segmentation method to detection of defects in castings. Int. J. Adv. Manuf. Technol.
- Road Surface Crack Detection: Improved Segmentation with Pixel-based Refinement, 25th European Signal Processing Conference (EUSIPCO)
- Matched filtering algorithm for pavement cracking detection. Transp. Res. Rec.
- An Assessment of Automated Pavement Distress Identification Technologies in California (Technical Memorandum No. UCPRC-TM-2008-13)
- Adaptive road crack detection system by pavement classification. Sensors
- Cracking classification using minimum rectangular cover-based support vector machine. J. Comput. Civ. Eng.
- Detection of asphalt pavement potholes and cracks based on the unmanned aerial vehicle multispectral imagery. IEEE J. Select. Topics Appl. Earth Observ. Remote Sens.
- Automatic recognition of asphalt pavement cracks based on image processing and machine learning approaches: a comparative study on classifier performance. Math. Probl. Eng.
- A novel method for asphalt pavement crack classification based on image processing and machine learning. Eng. Comput.
- Automatic welding defect detection of x-ray images by using cascade AdaBoost with penalty term. IEEE Access
- Evaluation of cracking in asphalt pavement with stabilized base course based on statistical pattern recognition. Int. J. Pavement Eng.
- Automatic Recognition of Pavement Surface Crack Based on BP Neural Network, Proceedings of the 2008 International Conference on Computer and Electrical Engineering (ICCEE)
- Automatic road crack detection using random structured forests. IEEE Trans. Intell. Transp. Syst.
- Illumination compensation model with k-means algorithm for detection of pavement surface cracks with shadow. J. Comput. Civ. Eng.
- An enhanced dynamic Gaussian mixture model-based damage monitoring method of aircraft structures under environmental and operational conditions. Struct. Health Monitor. Int. J.