1. Introduction
Metal structural materials are commonly used in critical areas such as buildings, bridges, machinery, power stations, and aerospace [1,2,3,4,5]. With industrial development and population growth, their fields of application are becoming increasingly broad and demanding, so these materials must be designed and produced with high strength, high rigidity, excellent durability, and corrosion resistance [6,7]. Under prolonged service and loading, structural materials are exposed to irradiation, heat, mechanical stress, moisture, and other factors. The resulting changes in the internal microstructure and chemical composition of the material [8] cause performance degradation or even failure and may lead to equipment downtime and major safety accidents. The aging grade is a crucial indicator for determining the level of structural material failure in the industrial safety evaluation system. The traditional method for evaluating the aging grade of structural materials is to cut out specified portions of the material and examine them off-line by metallographic microscopy [9]. This method requires complex sample pre-processing, takes a long time, and destroys the material. Therefore, a faster and less destructive way to classify structural materials and estimate their level of aging is essential for improving equipment maintenance.
Traditional research on material aging grades primarily focuses on characterizing material properties, for example using mechanical experiments to establish the relationship between force and deformation and thereby evaluate mechanical properties such as elasticity, plasticity, and hardness [10]. However, this method cannot account for factors such as the internal microstructure and surface condition of the material, and the sample is damaged and rendered unusable. In addition to mechanical experiments, metallographic and thermal analysis are used to evaluate the structural and thermal properties of materials [11]. However, this approach requires sample preparation, which can damage the sample and introduce small temperature fluctuations that alter the material’s properties, making the materials difficult to analyze and evaluate. Spark direct reading spectrometry (SDRS) supports the study of material aging by identifying and quantifying changes in the elemental composition of materials [12]. However, SDRS can only analyze elements present at relatively high content in metal materials, and samples must be polished repeatedly before testing so that their surface smoothness satisfies specific standards, which introduces errors and uncertainties.
Laser-induced breakdown spectroscopy (LIBS) is an atomic spectroscopy technique for elemental analysis of materials. Its principle is to focus a laser pulse on the sample’s surface to create a high-temperature plasma [13] and then to collect and analyze the spectrum emitted by the plasma. It differs from conventional analytical techniques in offering rapid analysis, non-destructive measurement, no requirement for sample preparation, high sensitivity to low-atomic-weight elements, and long-distance measurement capability [14]. Additionally, it can be used to detect elements in different states of matter, such as solids, liquids, gases, and aerosols [15,16]. Thus, LIBS has found widespread application in a variety of fields, such as environmental monitoring [17], biomedical applications [18], archaeological research [19], drug applications [20], extraterrestrial detection [21], hazardous material identification [22], the nuclear industry [23], and the characterization of geological materials [24]. Furthermore, LIBS has been used to identify insulation faults in power supplies for medium-voltage applications [25].
The range of applications of machine learning (ML) has grown quickly in recent years. ML has been extensively researched in combination with LIBS for material analysis, for example in measuring material composition [13,26,27]. Moreover, because materials differ in composition, their physical and chemical properties also differ (matrix effects). Earlier research has shown that these properties affect the laser ablation process and the plasma characteristics [28,29]. Therefore, LIBS can be used not only for quantitative elemental analysis but also for studying mechanical characteristics and microstructure. Qiu et al. [
30] used random forest regression to determine the content of elements in the sample. The results demonstrated that this approach reduced the detection limit, with a relative error of 0.02 wt.%. Shaik et al. [
31] established a crude oil pipeline life prediction method based on historical inspection data from oil and gas fields using a feedforward back propagation network (FFBPN). Their findings show that the FFBPN-based life prediction model has higher accuracy and better robustness than previously published models, as measured by the coefficient of determination (R²) and the mean squared error (MSE). To study various aging grades of T91 steel samples, Lu et al. [
32] combined LIBS with support vector machines (SVM). The results showed that using multiple spectral line intensities and average line intensity ratios as input variables considerably enhanced the model’s performance. However, this method does not take into account the crucial characteristics that differentiate between aging levels, and it assesses the aging level with only a limited number of measurement points. Bakthavatchalam et al. [
33] proposed an artificial neural network technique based on experimental datasets to predict the relevant thermophysical properties of the measured nanofluid. Temperature, concentration, size, and time were the model’s inputs, while the thermophysical properties were the model’s output. The results indicate that the R² value is close to 1.0. Sanjana et al. [34] classified seven different types of contaminated silicone rubber insulators using machine learning with LIBS assistance. According to the findings, the classification accuracy of LightGBM was 97.43%. Bellou et al. [
35] used principal component analysis (PCA) and LIBS to classify olive oil samples while investigating the effects of experimental conditions on plasma properties. According to their findings, classification performance with appropriate algorithms improves under better experimental conditions. Diaz et al. [36] used LIBS and PCA to classify gold ore prepared as pressed pellets from crushed bulk samples. The aforementioned research indicates that combining LIBS with machine learning has considerable potential in material classification, quantitative elemental analysis, and mechanical performance research. However, the use of machine learning to estimate the degree of material aging has received little attention. To achieve a thorough characterization of material service behavior, it is essential to investigate machine-learning-based methods for revealing the multidimensional properties and degree of aging of structural materials.
In this study, a time-frequency feature extraction method using STFT and a deep feature mining method based on a similarity measurement are proposed to address the limited capacity of traditional approaches to predict aging time, which stems from the similarity of the features extracted from aging materials. A multitask probabilistic neural network model optimized by bionic algorithms is developed, which can simultaneously perform material classification and aging time prediction; the best optimization algorithm is selected after comparison. The framework is as follows: (1) establish an experimental system to collect spectral data of samples, and pre-process the LIBS spectral data to enhance their accuracy and stability; (2) extract features from the pre-processed spectra using principal component analysis (PCA) and short-time Fourier transform (STFT), compare them to choose the best features, and explain the limitations of PCA for extracting features from aging materials; (3) carry out probabilistic neural network (PNN) analysis and parameter optimization to categorize structural materials and predict their aging levels based on the extracted LIBS spectral features.
3. Data Analysis
Three materials were examined using LIBS, each with subsamples at various aging levels. The valve stem had aging samples that had been operated for 0, 100, 300, 500, and 1000 h. The aging samples of the welding material had service times of 0, 2000, 5000, 10,000, 13,000, and 35,000 h. The base metal included aging samples with aging times of 0, 2000, 5000, 10,000, and 13,000 h. A total of 32 samples (the sample library) were tested, and 250 spectra were collected, each containing 20,480 pixels, so the experimental dataset was 250 × 20,480. The spectrum of an aging sample of the base metal is shown as an illustration in Figure 3a. During measurement, a number of variables, including the matrix effect, the self-absorption effect [41], the gate delay, and environmental conditions, affected the LIBS spectra. High-frequency noise caused by matrix effects, optical interference, and other factors can produce abrupt peaks or dips in the spectrum, which appear as rapid changes in light intensity and may reduce the precision and reliability of spectral analysis. Low-frequency noise caused by the self-absorption effect and spectrometer noise can produce continuous or smooth baseline shifts in the spectrum, which also affect the analysis. Both kinds of noise can be seen in Figure 3b, where the high-frequency noise is caused by a matrix effect and the low-frequency noise by a self-absorption effect. Therefore, it is necessary to pre-process the LIBS spectral data in order to increase the accuracy of the analysis [42], as described in Section 3.1. Because of the high dimensionality of the spectral input, direct application of machine learning models might cause convergence problems, making it difficult to improve the model’s accuracy. In Section 3.2, the pre-processed spectral data were used to extract features, and different feature extraction algorithms were compared. The extracted feature dataset was then divided, and probabilistic neural network algorithms were used to classify the different types of materials and the different degrees of aging within the same material.
3.1. Spectral Data Pre-Processing
Because of the non-uniformity of the sample surface, interference from the environment, and variations in laser energy, the spectra typically vary somewhat from pulse to pulse. These fluctuations can be reduced with appropriate data pre-processing. In this work, the pre-processing of the spectral data includes wavelet threshold noise reduction (Figure 4a), baseline correction based on the segmented feature extraction method (Figure 4b), and maximum-minimum normalization (Figure 4c).
For wavelet threshold denoising, the denoising effects of various wavelet bases and decomposition levels were tested, and the spectrum was ultimately decomposed into four levels using the db6 wavelet basis with a fixed threshold. A soft threshold function, which is smoother in denoising than a hard threshold function, was chosen for the threshold processing. The soft threshold function is described as follows in this article:
$$\hat{w} = \begin{cases} \operatorname{sgn}(w)\,(|w| - \lambda), & |w| \ge \lambda \\ 0, & |w| < \lambda \end{cases}$$
where $\operatorname{sgn}(\cdot)$ is the sign function, $w$ is the wavelet coefficient before threshold processing, $\hat{w}$ is the wavelet coefficient after threshold processing, and $\lambda$ represents the threshold. The commonly used threshold is:
$$\lambda = \sigma_{n}\sqrt{2 \ln X}$$
In the formula, $\sigma_{n}$ is the standard deviation of the noise and X is the number of wavelength points in the spectrum. The threshold selected for this study, λ, is 3.15.
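The soft threshold rule described above can be sketched in a few lines of NumPy. This is only an illustration of the thresholding step: the coefficient values are hypothetical, the function names are our own, and a full pipeline would apply the rule to the detail coefficients of the four-level db6 decomposition (for example with a wavelet library such as PyWavelets), which is not shown here. A hard threshold is included for comparison.

```python
import numpy as np

def soft_threshold(coeffs, lam):
    """Soft rule: zero coefficients below lam, shrink the rest toward zero by lam."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - lam, 0.0)

def hard_threshold(coeffs, lam):
    """Hard rule: zero coefficients below lam, keep the rest unchanged."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.where(np.abs(coeffs) >= lam, coeffs, 0.0)

# Hypothetical detail coefficients, for illustration only.
detail = np.array([-5.0, -3.15, -1.0, 0.5, 3.2, 6.0])
lam = 3.15  # threshold value reported in the text

shrunk = soft_threshold(detail, lam)  # continuous at |w| = lam
kept = hard_threshold(detail, lam)    # jumps from 0 to |w| at |w| = lam
```

The soft rule has no jump at the threshold, which is the smoothness advantage the text refers to; the price is that every surviving coefficient is reduced in magnitude by λ.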
The segmented feature value extraction method was used in this work as the baseline correction technique for the spectral data.
Step 1: Divide the LIBS spectrum equally into N groups of data points.
Step 2: Take the minimum spectral intensity of each group as the eigenvalue of the spectrum in that group.
Step 3: Subtract from each group its corresponding eigenvalue, and finally concatenate all groups to obtain the baseline-corrected spectrum.
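The three steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the toy spectrum, and the choice of N are assumptions, and `np.array_split` is used so that a spectrum whose length is not a multiple of N is still split into nearly equal groups.

```python
import numpy as np

def segmented_baseline_correction(spectrum, n_groups):
    """Subtract each group's minimum intensity, then rejoin the groups."""
    spectrum = np.asarray(spectrum, dtype=float)
    groups = np.array_split(spectrum, n_groups)    # Step 1: (near-)equal groups
    corrected = [g - g.min() for g in groups]      # Steps 2 and 3: remove group minimum
    return np.concatenate(corrected)               # Step 3: concatenate

# Toy spectrum with a baseline that steps from about 5 to about 7 (illustrative values).
spectrum = np.array([5.0, 6.0, 9.0, 5.5, 7.0, 7.5, 12.0, 7.2])
corrected = segmented_baseline_correction(spectrum, n_groups=2)
```

A larger N follows a curved baseline more closely but also risks subtracting part of a broad emission feature, so N is a tuning choice.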
The maximum-minimum approach, which scales the spectrum’s intensity values to [0, 1], was used in this study to normalize the spectral data according to the following formula:
$$y^{*} = \frac{y - y_{\min}}{y_{\max} - y_{\min}}$$
In the formula, $y^{*}$ is the normalized intensity, y stands for the intensity values of the group of spectra at different wavelengths, and $y_{\min}$ and $y_{\max}$ stand for the minimum and maximum intensity values of the group’s spectral data.
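As a small worked example of the maximum-minimum normalization, the sketch below rescales one spectrum to [0, 1]. The function name and input values are illustrative, and returning zeros for a completely flat spectrum is our own assumption to avoid division by zero.

```python
import numpy as np

def min_max_normalize(y):
    """Rescale intensities so the minimum maps to 0 and the maximum to 1."""
    y = np.asarray(y, dtype=float)
    span = y.max() - y.min()
    if span == 0:
        return np.zeros_like(y)  # assumption: a flat spectrum maps to all zeros
    return (y - y.min()) / span

normalized = min_max_normalize([2.0, 4.0, 6.0, 10.0])
```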
3.2. LIBS Spectral Feature Extraction and Similarity Metric
Directly applying machine learning models can cause difficulties with convergence and other issues due to the high dimensionality of the spectrum input. Feature extraction on spectral data should be carried out in order to enhance the model’s performance and interpretability [
43]. Principal component analysis (PCA), a statistical technique for dimensionality reduction for high-dimensional datasets, has been applied frequently in the analysis of LIBS spectral data [
44]. Its basic idea is to reduce high-dimensional data to a set of principal components (PCs) by projecting it downward into a low-dimensional subspace. The variance contained in each PC is used as the eigenvalues of the spectral dataset, which serve as inputs to the neural network. However, PCA is not without flaws, including the inability to handle nonlinear data, the disregard for non-variance information (such as correlation and outliers), the high processing cost, and perhaps the lack of interpretability of extracted features. Data time-frequency processing and analysis methods have drawn increasing amounts of attention in recent years and have developed into effective tools for time-varying non-stationary signals. A well-known technique for time-frequency analysis, the short-time Fourier transform (STFT), is frequently employed for feature extraction [
45]. The STFT overcomes the limitations of the Fourier transform, which include its poor performance on abrupt and non-stationary signals as well as its inability to characterize the local properties of signals in the time domain. STFT can be used to visualize data in the time-frequency (or time-scale) domain and to observe their time-frequency characteristics intuitively, whereas the principal components extracted by PCA may not be intuitively interpretable. In this study, the LIBS spectral features were extracted using STFT, and similarity tests were performed on the extracted spectral features [
46,
47] (Formulas (1) and (2)). In various material service behavior situations, the results revealed a single change, and the similarity measurement results of the material spectral feature were used as input for the multi-classification deep learning model. This study applies PCA to extract feature values (feature 1) and the multi-frequency spectral feature extraction based on STFT (feature 2). These feature values are then inputted into the same network for classification prediction and comparison.
$$X^{*} = \frac{X - \mu}{s}$$
$$d(a, b) = \sqrt{\sum_{k=1}^{n} \left(\frac{a_{k} - b_{k}}{s_{k}}\right)^{2}}$$
where X* is the normalized value, X is the value before normalization, μ is the mean of the components, s is the standard deviation of the components, a and b are the standard sample data and the measured sample data, k indexes their n components, and d(a, b) is the normalized Euclidean distance. The similarity measure based on Euclidean distance measures the distance between two vectors by calculating the square root of the sum of the squares of the differences between their respective dimensions. Measuring the similarity of the characteristics of aging materials allows the degree of aging, or the similarity between different materials, to be evaluated more accurately, which can help identify common patterns or related features in aging materials. This is crucial for predicting material properties, evaluating reliability, and identifying potential aging and defect mechanisms.
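The feature extraction and similarity steps described in this section can be sketched with NumPy alone. Several details here are assumptions rather than the authors' settings: the window length and hop size, the use of a Hann window, the reduction of the STFT magnitude to band-averaged features, and all function names. The random "spectra" are synthetic stand-ins, and the component standard deviations are assumed to be non-zero.

```python
import numpy as np

def stft_magnitude(signal, win_len=256, hop=128):
    """Magnitude STFT: Hann-windowed, overlapping frames followed by a real FFT."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # shape: (n_frames, win_len // 2 + 1)

def band_features(signal, n_bands=4, **stft_kwargs):
    """Assumed feature definition: mean STFT magnitude in n_bands frequency bands."""
    mag = stft_magnitude(signal, **stft_kwargs)
    return np.array([band.mean() for band in np.array_split(mag, n_bands, axis=1)])

def normalized_euclidean(a, b, s):
    """Euclidean distance after scaling each component by its standard deviation s."""
    a, b, s = (np.asarray(v, dtype=float) for v in (a, b, s))
    return float(np.sqrt(np.sum(((a - b) / s) ** 2)))

# Synthetic stand-ins for pre-processed spectra (not measured data).
rng = np.random.default_rng(0)
reference_set = rng.random((5, 2048))
measured = rng.random(2048)

ref_feats = np.array([band_features(spec) for spec in reference_set])
s = ref_feats.std(axis=0, ddof=1)  # per-component spread of the reference features
distance = normalized_euclidean(ref_feats[0], band_features(measured), s)
```

In this sketch the distance between the reference sample and a measured sample is the kind of scalar similarity value that would then be passed on to the classifier.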
3.3. PNN in LIBS
PNN is a form of feedforward network that combines density function estimation and Bayesian decision theory to classify samples based on radial basis function (RBF) networks [
48]. Its network structure is shown in
Figure 5. The PNN consists of four components: the input layer, the hidden layer, the summation layer, and the output layer. The input layer receives values from the training samples and transfers them to the hidden layer; its number of neurons is equal to the number of input variables. The hidden layer is a radial basis layer in which each neuron corresponds to a center; the distance between the input vector and the center is computed, and a scalar value is returned. The performance of the PNN is affected by the number of hidden layer neurons n, which should be configured for the particular application. The summation layer has M nodes, each of which corresponds to a class. The decision is made using the summation layer’s competitive transfer function. As a result, the output layer outputs the decision result, in which the class with the highest probability value is 1 and all other outputs are 0.
The activation function of each neuron in the hidden layer is given by the probability density function based on the Gaussian kernel, and the formula below describes the relationship between input and output determined by the jth neuron of class i:
$$\Phi_{ij}(x) = \frac{1}{(2\pi)^{d/2}\sigma^{d}} \exp\left(-\frac{(x - x_{ij})^{T}(x - x_{ij})}{2\sigma^{2}}\right)$$
where i = 1, 2, ⋯, M, M is the total number of classes in the training samples, $x_{ij}$ is the jth training sample belonging to the ith class of samples, d is the dimensionality of the sample vector, and σ is the smoothing parameter.
The summation layer takes the weighted average of the outputs of the hidden neurons belonging to the same class in the hidden layer:
$$v_{i}(x) = \frac{1}{L_{i}} \sum_{j=1}^{L_{i}} \Phi_{ij}(x)$$
where $v_{i}$ denotes the output of the ith category, $L_{i}$ is the total number of training samples of the ith category, and the number of neurons in the summation layer is the same as the number of categories M.
Removing the common factors, the discriminant function is defined as follows based on the input/output relationship between the hidden layer and the summation layer:
$$g_{i}(x) = \frac{1}{L_{i}} \sum_{j=1}^{L_{i}} \exp\left(-\frac{\lVert x - x_{ij} \rVert^{2}}{2\sigma^{2}}\right)$$
The greatest $g_{i}(x)$ in the summation layer is selected as the output category in the output layer:
$$y(x) = \arg\max_{i}\, g_{i}(x)$$
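The layer-by-layer description above corresponds to a very small amount of code. The sketch below is a minimal PNN in NumPy, assuming the textbook arrangement in which every training sample acts as one hidden-layer center; it therefore does not model a separately tuned number of hidden nodes. The class name, the toy data, and the value of σ are illustrative only.

```python
import numpy as np

class PNN:
    """Minimal probabilistic neural network with a single smoothing parameter."""

    def __init__(self, sigma=0.1):
        self.sigma = sigma

    def fit(self, X, y):
        # "Training" only stores the patterns: each sample becomes a hidden center.
        self.X = np.asarray(X, dtype=float)
        self.y = np.asarray(y)
        self.classes = np.unique(self.y)
        return self

    def scores(self, X):
        X = np.asarray(X, dtype=float)
        # Hidden layer: Gaussian kernel of the squared distance to every center.
        d2 = ((X[:, None, :] - self.X[None, :, :]) ** 2).sum(axis=2)
        k = np.exp(-d2 / (2.0 * self.sigma ** 2))
        # Summation layer: average kernel response over the centers of each class.
        return np.stack([k[:, self.y == c].mean(axis=1) for c in self.classes], axis=1)

    def predict(self, X):
        # Output layer: pick the class with the largest summed response.
        return self.classes[np.argmax(self.scores(X), axis=1)]

# Toy two-class problem (illustrative values, not LIBS features).
X_train = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
model = PNN(sigma=0.3).fit(X_train, y_train)
pred = model.predict(np.array([[0.05, 0.05], [1.0, 0.95]]))
```

The constant factor of the Gaussian density is omitted because it is the same for every class and does not change which class wins.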
PNN has excellent adaptive learning and fault tolerance capabilities. The choice of parameters [49], such as the smoothing parameter σ, the number of hidden layer nodes n, and the hidden center vector c (the center vector of each pattern category), affects how well the network performs. If σ is too small, each training sample is effectively isolated and the network responds only to the samples it was trained on; if σ is too large, the network cannot resolve details, and for categories with unclear boundaries the result approaches linear classification and the ideal classification may not be achieved. To improve the accuracy of the network and achieve the best classification results, this research uses bionic optimization algorithms (such as genetic algorithm (GA), particle swarm optimization (PSO), and dragonfly algorithm (FA)) to determine the smoothing parameter σ and the number of hidden nodes n. A classification model based on FA-PNN was ultimately chosen by comparing these iterative optimization algorithms.
In this study, LIBS spectral data and the PNN algorithm are combined for multi-material classification and aging time estimation. (1) Use the spectral feature sets to classify the three materials. (2) Based on the classification results from (1), extract the aging time feature dataset of the material and randomly split it into a 70% training set and a 30% testing set. The same PNN model as in step (1) is applied to classify material samples with different degrees of aging. (3) After using optimization algorithms to optimize the structural parameters of the PNN, build an FA-PNN model and classify the material aging time. The datasets of different aging degrees of the other two materials are used as prediction sets to verify the generalization ability of the constructed model.
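The split-and-tune part of this workflow can be illustrated with a short sketch. It is not the FA-PNN procedure itself: a plain search over a few candidate σ values stands in for the bionic optimizer, only σ is tuned, and the data are synthetic clusters rather than LIBS features. All function names and candidate values are assumptions.

```python
import numpy as np

def pnn_predict(X_train, y_train, X_test, sigma):
    """Compact PNN: Gaussian kernel, class-wise mean, then argmax."""
    classes = np.unique(y_train)
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    scores = np.stack([k[:, y_train == c].mean(axis=1) for c in classes], axis=1)
    return classes[np.argmax(scores, axis=1)]

def split_70_30(X, y, seed=0):
    """Random 70% / 30% split of samples into training and testing sets."""
    idx = np.random.default_rng(seed).permutation(len(X))
    cut = int(round(0.7 * len(X)))
    return X[idx[:cut]], y[idx[:cut]], X[idx[cut:]], y[idx[cut:]]

def search_sigma(X_tr, y_tr, X_te, y_te, candidates):
    """Stand-in for the optimizer: keep the first sigma with the best accuracy."""
    best_sigma, best_acc = None, -1.0
    for sigma in candidates:
        acc = float(np.mean(pnn_predict(X_tr, y_tr, X_te, sigma) == y_te))
        if acc > best_acc:
            best_sigma, best_acc = sigma, acc
    return best_sigma, best_acc

# Three synthetic, well-separated classes standing in for three aging levels.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [3.0, 3.0], [0.0, 3.0]])
X = np.vstack([c + 0.2 * rng.standard_normal((20, 2)) for c in centers])
y = np.repeat(np.arange(3), 20)

X_tr, y_tr, X_te, y_te = split_70_30(X, y)
best_sigma, best_acc = search_sigma(X_tr, y_tr, X_te, y_te, [0.05, 0.1, 0.5, 1.0])
```

Selecting σ on the same data used to report accuracy gives an optimistic estimate, so in practice a separate validation split or cross-validation would be preferable; a population-based optimizer would replace the loop in `search_sigma`.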
5. Summary and Prospective
In this paper, a new method based on a probabilistic neural network model combined with LIBS spectral data is proposed for the multi-classification of material samples and aging degree. The raw spectral data are first pre-processed (noise reduction, baseline correction, and normalization), and both the spectral data after PCA dimensionality reduction and the multi-band data from time-frequency analysis are used as spectral features. The ACCs of the neural network model with different numbers of hidden neurons and different iterative optimization algorithms are compared. Experimental results show that the number of hidden neurons that best fits the neural network model is six, the optimal optimization algorithm is FA, and the value of the optimal parameter σ under FA-PNN is determined. We also compared the test-set ACC of the models built with PCA features and with time-frequency features as input variables, and the results show that the model built with time-frequency features has better recognition ability than the model built with PCA features. The optimal test-set ACCs of the probabilistic neural network and other well-known models in spectral analysis (ANN, PLS-DA, KNN, and SIMCA) were also evaluated, and only the probabilistic neural network model achieved 100% ACC, which indicates the success of LIBS combined with probabilistic neural networks in identifying material types and different aging levels. By analyzing historical data and real-time monitoring data in practice, this approach can provide prediction and early warning of material aging time, offering a direction for material aging monitoring and evaluation. It is of practical significance for extending equipment life and carrying out preventive maintenance.
Meanwhile, material aging time prediction based on time-frequency characteristics and the FA-PNN model could guide simulation experiments for thermal aging and corrosion, for example by optimizing experimental parameters, determining experimental time, and choosing suitable aging conditions, thereby improving the efficacy of those experiments.
In our future research, LIBS will be used for quantitative analysis of material samples, and we will explore the use of emerging technologies like attention mechanisms or graph convolutional networks in PNN, as well as the application of multi-scale Gaussian kernels to solve the issue of significant spatial scale variations.