Abstract
The Coronavirus disease 2019, or COVID-19, has shifted the medical paradigm from face-to-face to telehealth. Telehealth has become a vital resource to contain the virus spread and ensure the continued care of patients. In terms of preventing cardiovascular diseases, automating electrocardiogram (ECG) classification is a promising telehealth intervention. The healthcare service ensures that patient care is appropriate, comfortable, and accessible. Convolutional neural networks (CNNs) have demonstrated promising results in ECG categorization, which require high accuracy and short training time to ensure healthcare quality. This study proposes a one-dimensional-CNN (1D-CNN) arrhythmia classification based on the differential evolution (DE) algorithm to optimize the accuracy of ECG classification and training time. The performance of 1D-CNNs of different activation functions are optimized based on the standard DE algorithm. Finally, based on MIT-BIH and SCDH arrhythmia databases, the performances of optimized and unoptimized 1D-CNN are compared and analysed. Results show that the 1D-CNN optimized by the DE has higher accuracy in heartbeats classification. The optimized 1D-CNN improves from 97.6% to 99.5% on MIT-BIH and from 80.2% to 88.5% on SCDH. Therefore, the optimized 1D-CNN shows improvements of 1.9% and 8.3% in the two datasets, respectively. In addition, compared with the unoptimized 1D-CNN based on the same parameter settings, the optimized 1D-CNN has less training time. Under the conditions of ReLU function and 10 epochs, the training takes 9.22 s on MIT-BIH and 10.35 s on SCDH, reducing training time by 67.2% and 64.2%, respectively.
Similar content being viewed by others
1 Introduction
An electrocardiogram (ECG) is a non-invasive detection method that is widely used to reflect a potential heart condition. The ECG information of the patient’s heart is directly obtained to reflect its heart rate status. ECG is appropriate for the diagnosis and treatment of various types of heart diseases [9], and its automatic classification can provide an objective diagnosis and reduce medical diagnosis time [6]. Therefore, the detection and classification of ECG are of great clinical significance, which is also helpful in promoting the clinical research of cardiovascular diseases [8, 15]. Various feature classifiers have been developed to automatically detect ECG signals [27]. However, most methods involve manually extracting features and then classifying heartbeats using traditional classifiers [18]. Thus, achieving high accuracy requires considerable time finding and calculating the best combination of features.
Deep learning has achieved considerable success in computer vision [30] with a robust feature extraction ability. In particular, the convolutional neural network (CNN) is the most widely used deep learning model [2, 13, 16, 19], attaining robust results in applications in medical imaging, gene recognition, speech processing, sleep apnea detection and other aspects [4, 28]. However, in the classification and detection of ECG, the existing methods still reflect three shortcomings, as follows: (1) The complexity of algorithms for QRS waves; (2) The complex changes of irregular heartbeats in rhythm or morphology leading to difficulties in the ECG feature recognition; (3) The need for large training samples and training time to achieve the ideal recognition accuracy [1, 22]. Given the problems in ECG automatic classification, this study proposes a DE-optimized, automated classification approach for arrhythmia based on a representative CNN.
The contributions of this study are as follows: (1) Applying deep learning technology to ECG signal detection classification, which solves the drawback of traditional machine learning methods that require manual feature extraction; (2) Using the DE algorithm to optimize the CNN parameters is helpful to improve the performance of CNN in terms of classification accuracy and training time; (3) Help to detect arrhythmia more accurately and automatically using deep learning.
The remainder of this paper is organized as follows. Section 2 describes the studies related to CNN in ECG recognition. Section 3 presents the methodology of our proposed one-dimensional CNN (1D-CNN) and its optimization method and Sect. 4 describes the experimental setup. Section 5 discusses the experimental results through analysis. Section 6 presents the conclusions and future work.
2 Background
Early detection is beneficial for diagnosing and treating various heart diseases to ensure a high survival rate of patients [5]. Examples of heart diseases are atrial premature beat, ventricular premature beat, left bundle branch block (LBBB) and right bundle branch block (RBBB). Many effective methods for automatically detecting cardiovascular diseases (CVDs) from ECG signals have been developed in the past few decades. For example, in the automatic detection of Atrial Fibrillation (AF), several detectors can identify cardiovascular diseases (CVDs) based on p-wave deletion or RR interval variation (R is a symbol for the beginning of ventricular depolarization). Dash et al. proposed an AF automatic detection algorithm based on the randomness, variability and complexity of heartbeat interval time series. Lian et al. presented an AF detection algorithm based on RR interval scatter plots with variations [11]. For the method in [11], the morphological features are extracted by a wavelet transform and independent component analysis. Wavelet features consist of fourth-order approximation coefficients and third and fourth-order detail coefficients. The features extracted by independent component analysis include a set of independent source signals recovered from the observed samples [14].
Although various feature classifiers have been developed, most of these involve manual extraction and traditional classification. These methods require considerable time in determining the best feature combination to increase their accuracy. Moreover, feature extraction in ECG signal processing requires high professional knowledge of digital signal processing. Therefore, feature extraction or selection presents a challenge to non-medical researchers and those in the field. To overcome this drawback, several researchers have begun using neural networks to automatically extract features such as heartbeat. Escalona-Morán et al. presented a neural network based on the convolution of the 2D heartbeat as a classification method [10], a series of three adjacent beats are converted into a 2D coupling matrix input, enabling the convolution filter to easily capture the continuous waveform of adjacent heartbeats and the correlation between beats. The above method achieved a final sensitivity of 76.8%, with positive predictive values of 74.0% for SupraVentricular Ectopic Beat (SVEB) and 93.8% for Ventricular Ectopic Beat (VEB).
Deep learning, which exhibits strong feature extraction capability, has achieved considerable success in computer vision in recent years. In particular, the CNN model is the most widely used deep learning model, demonstrating robustness in its applications in medical imaging, gene recognition, speech processing, sleep apnea detection and other aspects. Accordingly, the use of deep neural networks (DNN) to automatically detect ECG signals has gained research interest. Salem et al. developed an automatic AF detection method based on CNN [23], where AF features are automatically learned and applied to the classification module. This method simplifies the feature extraction without requiring expert feature engineering to determine the suitability and criticality of features.
Bhagyalakshmi et al. developed a genetic bat-assisted support vector neural network (GB-SVNN) to classify ECG arrhythmia [3], obtaining a final accuracy of 0.9696 and sensitivity of 0.99. Zhang et al. proposed a multi-scale CNN (MCNN) that performs the timescale transformation of input signals and AF detection based on the scale transformation input [31], achieving a depth that strongly correlated with detection performance.
Although the above methods are experimentally effective in addressing specific CVD detection problems, their good performance is typically based on carefully selected clean data or a small number of testers. Thus, their applicability may be limited. Therefore, achieving the generalization capability of models to detect CVD reliably from limited single-lead ECG records remains a considerable challenge. Such generalization capability depends on how a CNN model depends is trained and requires a longer training time. In the present study, we propose using DE to optimize the initial weights of 1D-CNN to ensure its generalization and reduce its training time.
3 Methodology
This section starts with (1) Data source and signal preprocessing, (2) Implementation of 1D-CNN and (3) Optimization of SDE. These three aspects introduce the implementation and optimization methods for ECG classification. The specific process is shown in Fig. 1.
Figure 1 shows the entire process of ECG classification with CNN, which starts from preprocessing the hexadecimal raw file of ECG. After preprocessing, we obtain the intuitive image and extract the ECG image to build the data set. With sufficient data extracted from the database, the data set is divided into 75% training set and 25% test set. We repeatedly train CNN and obtain the required classification results.
3.1 Data source and signal preprocessing
The experimental data for this study is the single-lead ECG signal from the MIT-BIH and Sudden Death Cardiac Holter (SCDH) arrhythmia databases [12] to evaluate the proposed method. MIT-BIH database covers ECG signal records with a sampling rate of 360 Hz and 30 min as the unit, and the SCDH contains records with a sampling rate of 250 Hz. Unlike the equal-length records in the MIT-BIH, the record lengths vary in the SCDH. For example, the No. 30 record is stored for approximately 24 h, and the No. 52 record is stored for approximately 7 h. This study needs data recording with heartbeats as the unit. Therefore, we extract the heartbeat from the records before the experiment. The process can be divided into signal preprocessing and heartbeat location and capture.
3.1.1 Signal preprocessing
The ECG signals are preprocessed to ensure that the experimental data only retain the frequency and characteristics related to arrhythmia. In addition, such preprocessing eliminates the interference of noise and other clutter and thus singles out the waveform mode. Therefore, preprocessing of ECG signals is required to accurately capture heartbeats.
The preprocessing is divided into four processes: band-pass filtering, ‘double slope’ preprocessing, waveform smoothing and window sliding. In signal filtering, the ECG signal is filtered by a 40-order FIR band-pass filter with a passband of 15–25 Hz, as suggested in literature [25]. After the band-pass filtering, the ‘double slope’ preprocessing is applied to increase the prominence of the waveform. Then, waveform smoothing is performed based on the low-pass filter with a cut-off frequency of 5 Hz. The last step is to implement window sliding to the waveform to increase the amplitude of and smoothen the waveform. The width of the sliding window is set to 17 sampling points.
Figure 2 shows the waveform comparison before and after ECG row data preprocessing. After preprocessing, the original ECG signal shows single-mode peaks, each corresponding to a QRS wave. Compared with the original signal, the preprocessed signal is easier to locate and detect, thereby the purpose of preprocessing is attained.
3.1.2 Heartbeat location and capture
Adaptive double threshold for QRS peak location
This study uses the double threshold method [26] to complete the detection logic of the QRS peak. The specific processing method works such that the detection starts when the wave peak is higher than the low threshold. The threshold is lowered when the wave peak is between the high and low thresholds, thenadjusted higher when the wave peak is higher than the high threshold. When the wave peak is lower than the low threshold, then noise is assumed.
The sensitivity (SE) and positive predictive rate (P +) are used to evaluate the QRS detection algorithm. The evaluation indicators are defined as follows [21]:
where \(TP\) represents the number of correct check beats, \(FN\) represents the number of missed beats and \(FP\) represents the number of check wrong beats.
Heartbeats capture
Based on the QRS peak location, 100 sampling points are extracted from the left, 150 sampling points are extracted from the right and a single heartbeat with a total length of 250 sampling points is intercepted.
The manual annotation type codes of various heartbeats are stored in the MIT-BIH and SCDH arrhythmia databases. The heartbeats consist of the normal beat (Normal), left bundle branch block beat (LBBB), right bundle branch block beat (RBBB), premature venture contract (PVC), aberrated atrial premature beat (Aab), left or right bundle branch block (BBB), R-on-T premature ventricular contraction (RONT), nodal (junctional) premature beat (NPC) and for premature or ectopic supraventricular beat (SVPB).
Tables 1 and 2 show the final heartbeat statistics. LBBB, RBBB, PVC and Normal are the four categories with the most significant number of 41 types in the MIT-BIH arrhythmia database. In the SCDH database, Normal, PVC, SVPB and BBB are the four categories with the largest sample sizes. For the accuracy evaluation, we select these two groups of heartbeats as the training and testing data of 1D-CNN.
3.2 Implementation of 1D-CNN
We use 1D-CNN for the processing because it commonly classifies electrodermal activity signals and works similarly to two-dimensional CNN (2D-CNN) [24]. 1D-CNN receives lower dimensionality of the elements, leading to its simpler architecture [24]. Therefore, this study adopts the 1D-CNN model as a classifier according to ECG signals' characteristics and data scales in the MIT-BIH and SCDH arrhythmia databases. Designed on the basis of the 2D-CNN and its backpropagation (BP) training, the 1D-CNN predicts and classifies the ECG signals of different patients. This section briefly reviews and summarizes the above process and explains the modifications needed to transform 2D-CNN to 1D-CNN.
The convolutional and the pooling layers are stacked alternately for design to simplify the CNN and its network parameters, as shown in Table 3.
3.2.1 Forward propagation
Forward propagation is divided into four situations, as follows:
-
(1)
Forward propagation from the input layer to the convolution layer
In the CNN, forward propagation of the input layer is the first step. Generally, the input layer corresponds to the convolution layer. The dimension of the convolution kernel can correspond to the input dimensionality. For example, if the input is a black-and-white image, then the corresponding convolution kernel is a 2D square matrix.
-
(2)
Forward propagation from the hidden layer to the convolution layer
The forward propagation of the hidden layer to the convolution layer is very similar to that of the input layer to the convolution layer. The only difference is that the input is from the hidden layer rather than the matrix formed by the original image samples.
-
(3)
Forward propagation from the hidden layer to the pooling layer
The processing logic of the pooling layer is relatively simple and its purpose is only to reduce and summarize the input matrix. For example, the input is an N × N square matrix and the pooled size is a k × k square matrix. Then the size of the output matrix is \(\frac{N}{k}\times \frac{N}{k}\). The average pooling logic is used in the pooling layer of the 1D-CNN, with sizes of 5 and 3, respectively.
-
(4)
Forward propagation from the hidden layer to the fully connected layer
The forward propagation of the hidden layer to the fully connected layer is used to put multiple outputs together to calculate the fully connected layer. For example, if the outputs of the hidden layers are 3N × N square matrixes, then the input size of the full connection layer is 3N × 3N. After calculating the full connection layer, the final output is obtained by applying the \(softmax\) activation function.
3.2.2 Backpropagation
BP uses the gradient descent algorithm to update the weights and offsets of CNN. With the need to propagate the output error step by step, each layer must keep the intermediate variables of the error delta and the derivative of the activation function.
3.2.3 Update the weights and offsets of each layer
The update is divided into two situations, for the hidden and fully connected layers. In the first situation, given that the pooling layer has no weights and biases, only the convolutional layer is updated.
3.2.4 Changes needed for 1D-CNN implementation
The most crucial difference between 1D-CNN and 2D-CNN is that they adopt weights, input and output elements as 1D vectors and a matrix, respectively. Therefore, the operation of the matrix must be changed accordingly. For example, convolution, pooling, full connection and error calculation need to be formatted based on 1D-CNN. At the same time, \(conv2D\) in the forward propagation and backpropagation must be changed to \(conv1D\) to show the difference in 2D-CNN.
3.3 Optimization of 1D-CNN based on SDE
The DE algorithm is another excellent optimization after genetic algorithms, particle swarm optimization and other evolutionary algorithms. The DE algorithm has a simple structure, few control parameters, easy real coding and fast convergence speed, which has been proven theoretically [20]. The parameter setting operations of DE can be divided into SDE and adaptive DE. This study selects the SDE algorithm with a relatively simple structure to optimize the initial parameters of 1D-CNN by considering the algorithm complexity and running time cost. In SDE, the vector containing D optimization variables is called individual and the \(i\mathrm{th}\) individual is expressed as:
where \(i=\mathrm{1,2},\cdot \cdot \cdot ,NP\), \(NP\) is the population size, \(G\) is the evolution generation and \(j\) is the \(j\mathrm{th}\) optimization variable.
Figure 3 shows the optimization flow chart of the SDE for 1D-CNN and its basic operations include initialization, mutation, crossover and selection. Individuals are randomly generated in the search space by initialization and new individuals are generated by mutation and crossover. The selection determines the individuals entering the next generation. This process is repeated until the termination condition is reached.
3.3.1 Mutation
When the population evolves to the \(G\) generation, the mutation operation is carried out on the parent individual \({\mathrm{X}}_{\mathrm{i},\mathrm{G}}\) to obtain the mutated individual:
The subscripts \(r1\) \(\mathrm{r}2\) and \(\mathrm{r}3\) are mutually different integers that are randomly selected between 1 and \(NP\) and different from \(i\). \({X}_{r1,G}\) is called the base vector, \(\left({X}_{r2,G}-{X}_{r3,G}\right)\) is called the difference vector and \(F\) is the mutation operator. If the parameter in the variant exceeds the boundary, the parameter value is replaced by the boundary value.
The mutation strategy used by the SDE is usually called DE/rand/1, where ‘rand’ means that the base vector is randomly selected in the population and ‘1’ means the number of difference vectors. DE/rand/1 strategy has good global convergence but also the disadvantage of slow convergence.
3.3.2 Crossover
The experimental individuals generated by crossover operation are as follows:
where \({r}_{j}\left[\mathrm{0,1}\right)\) is the random number calculated for the \(j\) th time and \(CR\) is the crossover operator. \(r(i)\) is a randomly selected integer between 1 and \(D\), which enables \({U}_{i,G+1}\) to obtain at least one variable from \({V}_{i,G+1}\).
3.3.3 Selection
For the minimization problem, the individual with a smaller objective function result is selected from the test individuals \({U}_{i,G+1}\) and the parent individual \({X}_{i,G}\) to enter the population of the next generation:
where \(F\left(X\right)\) represents the objective function. In the experiment below, we use the accuracy of heartbeat type prediction to evaluate the optimization effect of the model, and thus the positive prediction rate P + of 1D-CNN is used as the fitness function.
3.3.4 Control parameters
The control parameters of the SDE mainly include population size NP, mutation operator F and crossover operator CR, which remain unchanged in the evolution [7]. The control parameter selection of the SDE follows the rules below based on experience.
-
(1)
Population size. According to experience, the population size is commonly taken as \(NP=5-10D\) to ensure that the algorithm has enough different mutation vectors.
-
(2)
Mutation operator. The mutation operator \(F\in \left[\mathrm{0,2}\right]\) is a real constant factor, which determines the amplification ratio of the deviation vector. The size of \(F\) is negatively related to the convergence rate. If the population converges too early, then \(F\) should be increased. In this study, the initial value of \(F\) is set as 0.5.
-
(3)
Crossover operator. The crossover operator \(CR\in [\mathrm{0,1}]\) is a real constant factor. For \(CR\), the size is positively related to the convergence speed and its initial value is set as 0.9 to prevent overfitting.
-
(4)
Maximum evolutionary generation. The parameter’s initial value is 30 and the control experiments are carried out 30–50 times to ensure reliability and shorten the experimental time.
3.3.5 Conversion of weight into the chromosome representation
Matrix deformation operation splits the organizational structure of the corresponding weights in 1D-CNN, which are then transformed and mapped into the chromosome representation in DE. The structure mapping from chromosome to weight is shown in Fig. 4, where \({\widehat{\mathrm{X}}}_{\mathrm{s}*d}\) represents the mapping of chromosomes to weights, the subscript s represents the size of the weight and d represents the weight dimension.
4 Experimental setup
In this section, we introduce ECG detection and classification before and after using SDE to optimize the initial weights of 1D-CNN.
4.1 Experimental process before DE optimization
4.1.1 Simulation
The experiments are simulated on the MatlabR2017b platform. All data in the MIT-BIH and SCDH databases are two-lead ECG records. Still, our simulation experiment only needs the ECG signal of the first lead and the label marking the corresponding heartbeats.
The most significant advantage of CNN over traditional machine learning lies in feature extraction and selection, due to automatically acquire features from large data without manual design and selection. The data of the input layer takes the QRS peak as the midpoint and the left and right heartbeats are sampled to 250 points to learn the heartbeat waveform’s morphological structure.
In the MIT-BIH arrhythmia database, the 4 types of manually labeled beats of Normal, LBBB, RBBB and VPC account for over 88% of the total heartbeats in the global database. To ensure the training effect of CNN and improve the classification accuracy, we use these four heartbeats train CNN for ECG recognition as the first group of data. For the same reason, the four types of beats in the SCDH database, namely, Normal, PVC, SVPB and BBB, are used to train the 1D-CNN.
The experimental process is as follows:
-
(1)
Binary files in the MIT-BIH and SCDH databases are decoded and read based on records.
-
(2)
From the MIT-BIH arrhythmia and SCDH databases, 48 and 23 records, respectively, are scanned globally and manually labeled to extract Normal, LBBB, RBBB and PVC and Normal, PVC, SVPB and BBB, respectively.
-
(3)
From each of the four data types obtained in steps 1 and 2, 5000 records are taken from MIT-BIH and 1000 records from SCDH and then combined into a data set. At the same time, construct the corresponding tag set.
-
(4)
The data set processed in Step 3 is randomly divided into a 75% training set and a 25% test set.
-
(5)
According to the parameter settings in Table 2, a 7-layer CNN framework is designed. The initial learning rate is set to 0.01, the batch size is set to 16 and the epoch is set to 30.
-
(6)
CNN is established and initialized.
-
(7)
According to the batch size set in Step 5, this study constructs the input matrix, input CNN and start training.
4.1.2 Algorithm performance analysis
Evaluation index setting
-
(1)
Confusion matrix
The confusion matrix represents the classification results in the form of a matrix, where each column represents the prediction category and its total number represents the number of predicted data. Each row represents the real data category and its total number represents the number of actual data.
-
(2)
Average accuracy
Given that the training and test samples are randomly selected from Tables 1 and 2, each classification result may have contingecies. The average accuracy evaluates the classification performance of the algorithm to avoid the impact of contingency on the results analysis. Ten independent experiments are repeated on the classification algorithm and the epoch of each experiment is 30. The average classification accuracy over 10 times is calculated.
-
(3)
Loss function
Cross-entropy is used as a loss function to evaluate the model to evaluate the gap between predictions and labels after each training. The simplified formula of cross-entropy is:
where \({y}_{i}\) is the expected output and \({a}_{i}\) is the actual output of the neuron.
Performance evaluation
The following experimental results are the test data set from MIT-BIH under different activation functions according to the above process. Tables 4, 5 and 6 show the classification results of 1D-CNN based on Sect. 3.2. These results show the average of 10 independent experiments.
The results in Table 4 with sigm as the activation function show that the average classification accuracy is 96.0%, derived from the average accuracies are Normal 99.8%, LBBB 97.5%, RBBB 97.4% and PVC 89.5%.
The results in Table 5 with tanh as the activation function show that the average classification accuracy is 94.3%, derived from the average accuracies of Normal 99.5%, LBBB 93.3%, RBBB 97.3% and PVC 87.1%.
The results in Table 6 with ReLU as the activation function shows that the average classification accuracy is 98.9%, derived from the average accuracies of Normal 99.9%, LBBB 99.2%, RBBB 99.5% and PVC 96.9%.
The following experimental results are the test data set from the SCDH under different activation functions according to the above process. Tables 7, 8 and 9 show the classification results of 1D-CNN based on Sect. 3.2. These results show the average of 10 independent experiments.
The above results with two databases show that the 1D-CNN based on Sect. 3.2 has a good ECG recognition effect in the case of 30 epochs, except for using the sigm function in the SCDH database. Comparing the three activation functions, ReLU has the highest accuracy, reaching 98.9% and 89.3% for the two databases, respectively.
4.2 Experimental process after DE optimization
4.2.1 Simulation
The simulation experiment based on the SDE is implemented on MatlabR2017b. The optimization objective is set as the initial parameters of the 1D-CNN model built in Sect. 3.2. The record set used by 1D-CNN is still that of ECG signals from the MIT-BIH and SCDH databases with two-lead and corresponding heartbeat labels. Given the vast amount of calculations needed, the ECG data set is reduced in this section.
In both the MIT-BIH and SCDH arrhythmia databases, the specific method is to take 1000 for each type of signal, with 4000 total number of heartbeat samples and select 75% as the training set and 25% as the testing set [29].
The experimental process is as follows:
-
(1)
A data set with total heartbeats of 4000 is imported and built with the corresponding label set.
-
(2)
The data set is randomly divided into a 75% training set and a 25% test set.
-
(3)
According to the parameter design in Sect. 3.2, a 1D-CNN is established and initialized and the parameters of SDE are set. The population size is set to 30, the maximum evolution generation to 30 and 50, the initial value of the mutation operator is 0.5 and the initial value of the crossover operator is 0.9.
-
(4)
The total number of parameters to be optimized is calculated and the parameter vector—often called the chromosome—is generated.
-
(5)
According to the data in Step 5, we initialize the population elements to random numbers between (− 1,1), generate the population random matrix with the size of [30,732] and initialize the corresponding fitness.
-
(6)
The chromosome variables in DE are split and transformed into the organizational structure of corresponding weights in 1D-CNN and then assigned the weights.
-
(7)
The optimal fitness and the optimal parameter vector of the population in this iteration are calculated through mutation, crossover and selection. In addition, the positive prediction rate P + of 1D-CNN is used as the fitness function mentioned in Sect. 3.3.4.
-
(8)
Step 7 is repeated until the end of the iteration. The final fitness and parameter vector are recorded and the parameter vector is split and assigned to 1D-CNN.
4.2.2 Optimization analysis of standard differential evolution algorithm
Given that the fitness function is the positive prediction rate of 1D-CNN, it can be used to analyse the optimization performance of the algorithm. The other parameters are not adjusted except for mutation operator \(F\) and crossover operator \(CR\) to generalize the problem.
-
(1)
Optimization analysis based on 30 evolutionary generations
Thus far in literature, mutation operator less than 0.4 and greater than 1.2 are only occasionally valid [17]. Therefore, in the experiment, we choose F = 0.5, \(CR\) = 0.9 as the initial values and take F = 0.5, 0.8, 1.0, 1.2 and \(CR\) = 0.9, 0.8 as the parameter combination to test the optimization results step by step.
Although the final evaluation standard is the P + result of 1D-CNN, the average accuracy is still used to evaluate the classification performance of the algorithm. Ten independent experiments on the classification algorithm are carried out with different DE parameter combinations and the number of epochs for each experiment is set to 10. Table 10 shows the optimization results, where the average of P + based on the MIT-BIH arrhythmia database and 10 repeated experiments. Table 11 shows the optimization results based on SCDH, which are also the averages of 10 repeated experiments.
-
(2)
Optimization analysis based on 50 evolutionary generations
Consistent with the parameter setting of 30 evolutionary generation, F = 0.5 and CR = 0.9 are still selected as the initial values and the optimization results of the algorithm are tested step by step with the parameter combinations of F = 0.5, 0.8, 1.0 and 1.2 and CR = 0.9 and 0.8. Tables 12 and 13 shows the corresponding optimization results based on the MIT-BIH and SCDH arrhythmia databases.
As shown in the figures below, according to 30 generations, Fig. 5 shows the fitness lines of different configurations of SDE parameters under other activation functions based on the MIT-BIH database and Fig. 6 shows the fitness lines based on the SCDH database. Similarly, According to 50 generations, Fig. 7 shows the fitness lines of different DE parameter combinations under other activation functions based on the MIT-BIH database. Figure 8 shows the fitness lines based on the SCDH database.
The four parameter configurations are represented by four colors in the figures, respectively. In the figures below, further improvements in convergence are observed at later generations by comparing the fitness function lines measured by P + between 30 and 50 generations. For example, parameter 4 improves convergence at 48 generations whereas the remaining configurations show static convergence for sigm active function. Parameter 1 shows improvement in convergence for tanh, while parameter 2 shows improvement in convergence for ReLU at later generations. The optimal fitness associated with each active function is contributed by different configurations of parameters with different maximum generations.
The following figures show the fitness based on 50 generations with MIT-BIH and SCDH databases.
5 Results and comparative analysis of 1D-CNN before and after optimization
According to the statistics of SDE optimization results in Tables 10, 11, 12 and 13, the optimal configurations of SDE parameters associated with active functions for 30 and 50 generations are used to determine the initial weights for 1D-CNN. Then, the performances of the optimized and unoptimized 1D-CNNs at the selected epochs are recorded. Tables 14 and 15 show the accuracy of 1D-CNN across training times before and after optimization by SDE for 30 and 50 generations based on the MIT-BIH and SCDH databases, respectively. The better results between the optimized and unoptimized 1D-CNNs are bolded.
By referring to Tables 14 and 15, the comparisons between 30 epochs of training of unoptimized 1D-CNN and 10 epochs of training of optimized 1D-CNN show that the latter model is equally good or better regardless of generations. The reason is that the initial parameters of 1D-CNN after optimization have structural features, which when used in the data set training, the gradient descent algorithm in the BP increases the convergence speed of the loss function, resulting in fewer training times and higher accuracy. The results in Tables 14 and 15 demonstrate that the final optimization result of 50 generations is slightly better than that of 30 generations under the same parameters. Therefore, the initial weights produced by optimal parameters of 50 generations are used to analyse the processing time of the 1D-CNN optimized by SDE. Under the condition that the total number of samples is 4000, Tables 16 and 17 show the optimized and unoptimized 1D-CNN processing time at the selected epochs with the two different databases. The results are the average value after 10 independent experiments.
In the MIT-BIH and SCDH databases, the accuracies of the optimized 1D-CNN after 10 epochs equal or surpass the unoptimized 1D-CNN after 30 epochs. Therefore, we mainly compare the time consumed by these two accuracies. The shorter time between these two accuracies are bolded. Based on the corresponding columns in Tables 16 and 17, the time required significantly decreased. Moreover, this reduction increases sharply with the model complexity and the total number of samples. In other words, a higher model complexity and larger sample size can show apparent effects of DE optimization.
The results in Table 18 and 19 show that regardless of MIT-BIH or SCDH, the different combinations of variation factor F and crossover factor CR determine the convergence rate and optimization effect in the SDE stage. F value negatively correlates while the CR value positively correlates with the convergence rate. However, the best classification accuracy of 1D-CNN does not necessarily come from the best optimization results in the SDE stage. The reason is the difference of the two optimization strategies. DE uses random optimization based on the evolutionary process while 1D-CNN uses gradual optimization based on gradient descent. Therefore, when the classification accuracy of 1D-CNN is not very high, the actual development must avoid the blind pursuit of a more significant DE optimization effect that can only waste computing resources. Carrying out repeated experiments and cyclic training on the parameter combination of DE is reasonable to determine the better combinations to achieve better final classification accuracy and convergence effect.
6 Conclusions and future works
In this study, we propose a 1D-CNN method for ECG classification with initial weights optimized by SDE. Comparison of the experimental results before and after optimization shows the feasibility of using DE to optimize the initial parameters of 1D-CNN. The specific data are as follows:
Under the condition that the total number of samples is 4000, the accuracy of optimized 1D-CNN after10 epochs has reached or exceeded the accuracy of unoptimized 1D-CNN after 30 epochs of training. The 1D-CNN using the ReLU activation function shows the highest accuracy. After 10 epochs, the accuracy improves from 97.6% to 99.5% for the MIT-BIH arrhythmia and from 80.2% to 88.5% for the SCDH. Training time decreases from 28.12 s to 9.22 s and from 28.96 s to 10.35 s, respectively. Based on the optimized 1D-CNN performances in the two databases, the accuracy improves by 1.9% and 8.3% and the training time decreases by 67.2% and 64.2%, respectively.
Therefore, the original 1D-CNN has a faster convergence speed and less training time and the classification accuracy improves further through SDE optimization. One of the experimental findings in this study shows that different parameter configurations of SDE affect the 1D-CNN's accuracy differently with the active functions. Future research can then focus on adaptive differential optimization to determine the parameters corresponding to the active functions of 1D-CNN and improve the effectiveness and efficiency of ECG classification.
Data availability
The datasets generated during and/or analyzed during the current study are available in the PhysioNet repository, https://physionet.org/content/mitdb/1.0.0/ and https://physionet.org/content/sddb/1.0.0/.
References
Acharya UR et al (2017) A deep convolutional neural network model to classify heartbeats. Comput Biol Med 89:389–396. https://doi.org/10.1016/j.compbiomed.2017.08.022
Aghamaleki JA, AshkaniChenarlogh V (2019) Multi-stream CNN for facial expression recognition in limited training data. Multimed Tools Appl 78(16):22861–22882. https://doi.org/10.1007/s11042-019-7530-7
Bhagyalakshmi V, Pujeri RV, Devanagavi GD (2021) GB-SVNN: Genetic BAT assisted support vector neural network for arrhythmia classification using ECG signals. J King Saud Univ Inf Sci 33(1):54–67. https://doi.org/10.1016/j.jksuci.2018.02.005
Chandra BS, Sastry CS, Jana S (2019) Robust Heartbeat Detection from Multimodal Data via CNN-Based Generalizable Information Fusion. IEEE Trans Biomed Eng 66(3):710–717. https://doi.org/10.1109/TBME.2018.2854899
Degirmenci M, Ozdemir MA, Izci E, Akan A (2022) Arrhythmic heartbeat classification using 2d convolutional neural networks. Irbm 43(5):422–433
Diker A, Sönmez Y, Özyurt F, Avcı E, Avcı D (2021) Examination of the ECG signal classification technique DEA-ELM using deep convolutional neural network features. Multimed Tools Appl 80(16):24777–24800. https://doi.org/10.1007/s11042-021-10517-8
Ding XYYQF (2017) Research survey of differential evolution algorithms. CAAI Trans Intell Syst 12:431–442
El_Rahman SA (2019) Biometric human recognition system based on ECG. Multimed Tools Appl. 78(13):17555-17572, https://doi.org/10.1007/s11042-019-7152-0
Erdenebayar U, Kim H, Park JU, Kang D, Lee KJ (2019) Automatic prediction of atrial fibrillation based on convolutional neural network using a short-term normal electrocardiogram signal. J Korean Med Sci 34(7):1–10. https://doi.org/10.3346/jkms.2019.34.e64
Escalona-Morán MA, Soriano MC, Fischer I, Mirasso CR (2015) Electrocardiogram classification using reservoir computing with logistic regression. IEEE J Biomed Heal Inform 19(3):892–898. https://doi.org/10.1109/JBHI.2014.2332001
Fan X, Yao Q, Cai Y, Miao F, Sun F, Li Y (2018) Multiscaled Fusion of Deep Convolutional Neural Networks for Screening Atrial Fibrillation from Single Lead Short ECG Recordings. IEEE J Biomed Heal Inform 22(6):1744–1753. https://doi.org/10.1109/JBHI.2018.2858789
Goldberger AL et al. (2000) PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals, Circulation, 101(23), 10.1161/01.cir.101.23.e215
González-Lozoya SM, de la Calleja J, Pellegrin L, Escalante HJ, Medina MA, Benitez-Ruiz A (2020) Recognition of facial expressions based on CNN features. Multimed Tools Appl 79(19–20):13987–14007. https://doi.org/10.1007/s11042-020-08681-4
Hadi SJ, Tombul M, Salih SQ, Al-Ansari N, Yaseen ZM (2020) The Capacity of the Hybridizing Wavelet Transformation Approach with Data-Driven Models for Modeling Monthly-Scale Streamflow. IEEE Access 8:101993–102006. https://doi.org/10.1109/ACCESS.2020.2998437
Hammad M, Luo G, and Wang K (2019) Cancelable biometric authentication system based on ECG, 78(2). Multimedia Tools and Applications, https://doi.org/10.1007/s11042-021-10517-8
Khairandish MO, Sharma M, Jain V, Chatterjee JM, and Jhanjhi NZ (2021) A hybrid CNN-SVM threshold segmentation approach for tumor detection and classification of MRI brain images, IRBM, https://doi.org/10.1016/j.irbm.2021.06.003
Leon M and Xiong N (2016) Adapting differential evolution algorithms for continuous optimization via greedy adjustment of control parameters, J Artif Intell soft Comput Res, 6, https://doi.org/10.1515/jaiscr-2016-0009
Liu T, Si Y, Wen D, Zang M, Lang L (2016) Dictionary learning for VQ feature extraction in ECG beats classification. Expert Syst Appl 53:129–137. https://doi.org/10.1016/j.eswa.2016.01.031
Ma Y, Liu Y, Xie Q, Li L (2019) CNN-feature based automatic image annotation method. Multimed Tools Appl 78(3):3767–3780. https://doi.org/10.1007/s11042-018-6038-x
Pant M, Zaheer H, Garcia-Hernandez L, Abraham A (2020) Differential Evolution: A review of more than two decades of research. Eng Appl Artif Intell 90:103479. https://doi.org/10.1016/j.engappai.2020.103479
Sadrawi M et al (2017) Arrhythmia evaluation in wearable ECG devices. Sensors (Switzerland) 17(11):1–14. https://doi.org/10.3390/s17112445
Sahoo S, Kanungo B, Behera S, Sabut S (2017) Multiresolution wavelet transform based feature extraction and ECG classification to detect cardiac abnormalities. Meas J Int Meas Confed 108:55–66. https://doi.org/10.1016/j.measurement.2017.05.022
Salem M, Taheri S, Yuan JS (2018) "ECG Arrhythmia Classification Using Transfer Learning from 2- Dimensional Deep CNN Features, 2018 IEEE Biomed Circuits Syst Conf BioCAS 2018 - Proc., 1–4, https://doi.org/10.1109/BIOCAS.2018.8584808
Sánchez-Reolid R, de la Rosa FL, López MT, Fernández-Caballero A (2022) One-dimensional convolutional neural networks for low/high arousal classification from electrodermal activity. Biomed Signal Process Control 71:103203. https://doi.org/10.1016/j.bspc.2021.103203
Shepoval’nikov RA, Nemirko AP, Kalinichenko AN, Abramchenko VV (2006) Investigation of time, amplitude, and frequency parameters of a direct fetal ECG signal during labor and delivery. Pattern Recognit Image Anal 16(1):74–76. https://doi.org/10.1134/S1054661806010238
Wang Y, Deepu CJ, Lian Y (2011) A computationally efficient QRS detection algorithm for wearable ECG sensors, Proc Annu Int Conf IEEE Eng Med Biol Soc EMBS, 5641–5644, https://doi.org/10.1109/IEMBS.2011.6091365
Wang D, Meng Q, Chen D, Zhang H, Xu L (2020) Automatic detection of arrhythmia based on multi-resolution representation of ECG signal. Sensors 20(6):1579. https://doi.org/10.1007/s11042-021-10517-8
Xie Y et al (2021) Stroke prediction from electrocardiograms by deep neural network. Multimed Tools Appl 80(11):17291–17297. https://doi.org/10.1007/s11042-020-10043-z
Xu Y, Goodacre R (2018) On Splitting Training and Validation Set: A Comparative Study of Cross-Validation, Bootstrap and Systematic Sampling for Estimating the Generalization Performance of Supervised Learning. J Anal Test 2(3):249–262. https://doi.org/10.1007/s41664-018-0068-2
Zubair M, Kim J, Yoon C (2016) An automated ECG beat classification system using convolutional neural networks, IT Convergence and Security (ICITCS), 2016 6th International Conference on. IEEE. https://doi.org/10.1109/ICITCS.2016.7740310
Zhang Q, Zhou D, Zeng X (2017) HeartID: A Multiresolution Convolutional Neural Network for ECG-Based Biometric Human Identification in Smart Health Applications. IEEE Access 5(c):11805–11816. https://doi.org/10.1109/ACCESS.2017.2707460
Acknowledgements
The authors are grateful to the Ministry of Higher Education Malaysia for providing the Fundamental Research Grant Scheme (FRGS) with grant number FRGS/1/2022/ICT02/UPSI/02/1, which helped fund the research. The authors also thank Universiti Pendidikan Sultan Idris (UPSI) that helped manage the grant.
Author information
Authors and Affiliations
Contributions
Shan Wei Chen: Conceptualization, Methodology, Software, Data curation, Writing-Original draft preparation. Shir Li Wang: Supervision, Visualization, Writing-Reviewing and Editing. XiuZhi Qi: Formal Analysis. Theam Foo Ng: Writing-Reviewing and Editing. Haidi Ibrahim: Visualization, Writing-Reviewing and Editing. Shan Wei Chen wrote the first draft of the manuscript and all authors commented on previous versions. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interest
The authors declare that they have no known competing financial or personal relationships that influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chen, S.W., Wang, S.L., Qi, X. et al. Convolutional neural network optimized by differential evolution for electrocardiogram classification. Multimed Tools Appl 82, 45811–45837 (2023). https://doi.org/10.1007/s11042-023-15407-9
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11042-023-15407-9