Article

Evaluation of a Single-Channel EEG-Based Sleep Staging Algorithm

1 Centre for Sport and Exercise Sciences, Universiti Malaya, Kuala Lumpur 50603, Malaysia
2 Department of Psychology, Nanjing University, Nanjing 210093, China
3 Institute of Social Psychology, School of Humanities and Social Sciences, Xi’an Jiaotong University, Xi’an 710049, China
4 Department of the Psychology of Military Medicine, Air Force Medical University, Xi’an 710032, China
5 Xi’an Middle School of Shaanxi Province, Xi’an 710006, China
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Int. J. Environ. Res. Public Health 2022, 19(5), 2845; https://doi.org/10.3390/ijerph19052845
Submission received: 23 November 2021 / Revised: 6 February 2022 / Accepted: 22 February 2022 / Published: 1 March 2022

Abstract:
Sleep staging is the basis of sleep assessment and plays a crucial role in the early diagnosis and intervention of sleep disorders. Manual sleep staging by a specialist is time-consuming and is influenced by subjective factors. Moreover, some automatic sleep staging algorithms are complex and inaccurate. This paper proposes a single-channel EEG-based sleep staging method that provides reliable technical support for diagnosing sleep problems. In this study, 57 features were extracted from three aspects: time domain, frequency domain, and nonlinear indexes based on single-channel EEG data. Support vector machine, neural network, decision tree, and random forest classifiers were used to classify sleep stages automatically. The results reveal that the random forest classifier has the best sleep staging performance among the four algorithms. The recognition rate of the Wake phase was the highest, at 92.13%, and that of the N1 phase was the lowest, at 73.46%, with an average accuracy of 83.61%. The embedded method was adopted for feature filtering. The results of sleep staging of the 11-dimensional features after filtering show that the random forest model achieved 83.51% staging accuracy under the condition of reduced feature dimensions, and its results coincided with those obtained using all features at a rate of 94.85%. Our study confirms the robustness of the random forest model in sleep staging, which achieves high classification accuracy with an appropriate classifier algorithm, even using single-channel EEG data. This study provides a new direction for the portability of clinical EEG monitoring.

1. Introduction

Sleep is an extremely important physiological phenomenon for human beings, a process of restructuring the organism [1]. When people enter the sleep state, most of the physiological activities of the body are inert. At this time, the pituitary gland secretes more growth hormones and prohormones, promoting the adjustment and reorganization of cells and tissue repair, eliminating human fatigue, and preparing for human physiological activities when awake [2,3].
It is worth noting that sleep is not a single process and can be divided into different sleep periods depending on the depth of sleep [4,5]. Current research divides sleep into three major stages distinguished by specific brain waves and their ratios: wake (W), non-rapid eye movement (NREM), and rapid eye movement (REM) [6,7]. According to the Rechtschaffen and Kales (R&K) guidelines, the NREM stage is further subdivided into four stages, 1, 2, 3, and 4 (also referred to as S1, S2, S3, and S4) [6]. In general, the standard R&K scheme divides sleep into six stages, namely W, S1, S2, S3, S4, and REM [7]. In 2007, the American Academy of Sleep Medicine (AASM) divided NREM into three phases: NREM 1 (N1), NREM 2 (N2), and NREM 3 (N3). Therefore, according to the AASM standard, a sleep epoch can be assigned to one of five stages: W, N1, N2, N3, and REM. Accurate sleep staging is the foundation for understanding sleep mechanisms and for the clinical diagnosis and treatment of sleep disorders.
Traditional sleep staging requires manual labeling by a professional physician based on polysomnography (PSG) recorded during sleep. Although manual labeling by experts enables accurate sleep staging, the collection process is cumbersome and the labeling time-consuming [8,9,10]. In addition, patients must wear special equipment and complete the PSG acquisition in the laboratory throughout the night [11,12]. The patient’s sleep efficiency is also affected by the discomfort of sleeping in an unfamiliar environment [13]. Given these challenges, researchers have tried to develop scoring methods that analyze sleep stages automatically. In recent years, more and more studies have applied machine learning algorithms to sleep staging based on physiological signals such as electroencephalography (EEG), electrocardiography (ECG), electrooculography (EOG), electromyography (EMG), and respiration [14,15,16]. EEG signals are widely considered the most important and most commonly used signals in sleep staging analysis [17,18]. The authors of [19] used multiple EEG channels to classify sleep stages and obtained a high accuracy rate. However, equipment with multiple EEG channels limits the movement of participants and affects the portability and wearability of sleep quality assessment devices.
Automatic sleep staging based on single-channel EEG signals has become a research focus in this field. The authors of [20] extracted 39 features from the time domain, frequency domain, and nonlinear features of the EEG signal and obtained an accuracy of 85.7% using a support vector machine (SVM) algorithm for automatic classification of sleep. The authors of [21] performed sleep staging based on a random forest (RF) classifier, which achieved 87.82% accuracy when 136 features were selected. The accuracy of sleep staging depends largely on the type of classifier. Besides SVM and RF classifiers, K-nearest neighbors, linear discriminant analysis (LDA), and naive Bayes classifiers have also been used for EEG sleep stage classification [9,22]. In addition to choosing different classifiers, researchers also optimize feature set selection, since using more features demands more computational power and increases the complexity of the system. However, there is no uniform standard for feature optimization methods. Some studies directly applied feature selection methods, such as modified graph clustering ant colony optimization [21], to select the optimal feature set from the feature pool through correlation and redundancy analysis. Other studies selected the features with the highest weights as the optimal feature set based on the weight of each feature [17]. It is also worth noting that electrode selection is an essential factor affecting the accuracy of single-channel automated staging. Some studies have used the F4-M1 channel [23], while others have used the Pz-Oz or Fpz-Cz channels, or staging based on the prefrontal FP1 and FP2 channels [24,25,26,27]. Ghimatgar et al. revealed that sleep staging using Fpz-Cz EEG signals was more accurate than with other channels [21].
Additionally, most of the current tools based on a single-channel design use the Fpz-Cz channel [8,21]. In the present study, we also performed automatic staging of sleep based on EEG signals from the Fpz-Cz channel.
The difference from previous studies is that the current study used four classifiers, namely SVM, RF, backpropagation neural network (BPNN), and decision tree (DT), which have been applied in previous studies, to stage sleep. The optimal classifier was identified by comparing their classification accuracies on the same dataset. Due to the nonlinear and non-stationary character of EEG signals, extracting features from only one dimension cannot fully reflect the signal characteristics, resulting in poor classification. Therefore, we used three types of parameters in this study: time domain, frequency domain, and nonlinear features, providing the classifier with optimal input. In addition, the optimization of the feature set was also a focus of this study. While retaining the original multi-dimensional features, we used the embedded method to filter features for the sleep staging model. The embedded method is a feature filtering approach that uses machine learning models to obtain a weight coefficient for each feature and then selects features in descending order of their coefficients [28]. If the feature set filtered by the embedded method achieves the same staging accuracy, the computational cost of sleep staging will be reduced in practical applications.
In conclusion, this study aimed to find an optimal feature set that can perform automatic sleep staging based on single-channel EEG signals by optimizing classifier algorithms, feature extraction, and feature filtering, which provide a theoretical reference for the design of clinical portable devices.

2. Materials and Methods

2.1. Material

The sleep EEG data used in this study came from the Expanded Sleep-EDF (ES-EDF) database [29]. We selected 24 h EEG recordings (marked as SC) from 12 healthy subjects aged 21 to 34 years. The sample consisted of five males and seven females. The Fpz-Cz single-channel EEG signals were used in this study, and the sampling rate was 100 Hz. The 30 s EEG data (3000-point data) were defined as a sample. The distribution of the selected sleep samples is shown in Table 1. The staging results were manually labeled by experts according to AASM standards, and the accuracy of the proposed method was evaluated against these expert labels.

2.2. Feature Extraction

As EEG signals have strong variability and are easily disturbed by other physiological signals and the external environment, the original data must be preprocessed to eliminate noise. This study used a finite impulse response (FIR) bandpass filter in the range of 0.5–45 Hz to denoise the original EEG data. To achieve accurate sleep staging, 57 features were extracted covering three aspects: time domain, frequency domain, and nonlinear indexes. Table 2 describes each feature. The features are described below.

2.2.1. Time Domain Feature

The first to fourth moments (i.e., mean, variance, skewness, and kurtosis) are commonly used as statistical features of EEG signals. They are calculated as follows:
$$X_{\text{mean}} = \bar{S} = \frac{1}{N}\sum_{n=1}^{N} S(n)$$
$$s^2 = \frac{1}{N}\sum_{n=1}^{N} X(n)^2$$
$$\text{skewness} = \frac{1}{N s^3}\sum_{n=1}^{N} X(n)^3$$
$$\text{kurtosis} = \frac{1}{N s^4}\sum_{n=1}^{N} X(n)^4 - 3$$
where $S(n)$ is the raw epoch and $X(n) = S(n) - X_{\text{mean}}$ is the mean-centred signal.
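The moment formulas above can be sketched in a few lines of NumPy (a minimal illustration, not the authors' code; the function and variable names are our own):

```python
import numpy as np

def moment_features(s):
    """First- to fourth-moment features of one 30 s EEG epoch.

    Follows the formulas above: the epoch is mean-centred
    (X(n) = S(n) - X_mean) before variance, skewness, and kurtosis.
    """
    s = np.asarray(s, dtype=float)
    n = s.size
    mean = s.mean()
    x = s - mean                                   # X(n) = S(n) - X_mean
    var = np.sum(x ** 2) / n                       # biased variance, as in the formula
    sd = np.sqrt(var)
    skew = np.sum(x ** 3) / (n * sd ** 3)
    kurt = np.sum(x ** 4) / (n * sd ** 4) - 3.0    # excess kurtosis
    return mean, var, skew, kurt
```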

Zero Crossing Rate

The zero-crossing method counts the points at which the waveform crosses its horizontal midline [30].
Computing $X(i) \cdot X(i+1)$ for $i = 1, 2, \ldots, N-1$ and letting $N_Z$ be the number of indices $i$ satisfying $X(i) \cdot X(i+1) < 0$, the zero-crossing rate is defined as follows:
$$\mathrm{ZCR} = \frac{N_Z}{N-1}$$
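As a quick illustration of the definition (our own sketch, not the paper's implementation):

```python
import numpy as np

def zero_crossing_rate(x):
    """ZCR = NZ / (N - 1): fraction of adjacent sample pairs whose
    product is negative, i.e. sign changes across the epoch."""
    x = np.asarray(x, dtype=float)
    nz = np.sum(x[:-1] * x[1:] < 0)    # count of X(i) * X(i+1) < 0
    return nz / (x.size - 1)
```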

First-Order and Second-Order Difference and Its Normalization

Let $X_1(n)$ be the first-order difference of $X(n)$ and $X_2(n)$ its second-order difference; then the following equations can be obtained [31,32,33]:
$$X_1(n) = X(n+1) - X(n), \quad n = 1, 2, \ldots, N-1$$
$$X_2(n) = X(n+2) - X(n), \quad n = 1, 2, \ldots, N-2$$
The mean value of the absolute value of the first-order difference:
$$\delta_X = \frac{1}{N-1}\sum_{n=1}^{N-1} \left| X_1(n) \right|$$
The mean value of the absolute value of the normalized first-order difference:
$$\tilde{\delta}_X = \frac{\delta_X}{s}$$
The mean value of the absolute value of the second-order difference:
$$\gamma_X = \frac{1}{N-2}\sum_{n=1}^{N-2} \left| X_2(n) \right|$$
The mean value of the absolute value of the normalized second-order difference:
$$\tilde{\gamma}_X = \frac{\gamma_X}{s}$$
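These four difference features can be computed together (a minimal NumPy sketch with our own naming; the standard deviation here is the population estimate):

```python
import numpy as np

def difference_features(x):
    """Mean absolute first/second-order differences (delta, gamma)
    and their normalisations by the epoch standard deviation s."""
    x = np.asarray(x, dtype=float)
    s = x.std()                         # population std of the epoch
    d1 = x[1:] - x[:-1]                 # X1(n) = X(n+1) - X(n)
    d2 = x[2:] - x[:-2]                 # X2(n) = X(n+2) - X(n)
    delta = np.mean(np.abs(d1))
    gamma = np.mean(np.abs(d2))
    return delta, delta / s, gamma, gamma / s
```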

Hjorth

The time domain Hjorth parameter, also known as the normalized slope descriptor, is a statistical function that can describe the instantaneous characteristics of EEG signals in both the time domain and frequency domain [34]. The Hjorth parameter consists of three descriptors: activity, mobility, and complexity. The activity represents the average power of the EEG signal, which is the variance. Mobility is used to measure the average frequency of EEG signals. Complexity is used to measure the bandwidth of an EEG signal.
Let X1(n) be a first-order difference of X(n) and X2(n) be a second-order difference of X(n); then, the following equations can be obtained:
$$X_1(n) = X(n+1) - X(n), \quad n = 1, 2, \ldots, N-1$$
$$X_2(n) = X(n+2) - X(n), \quad n = 1, 2, \ldots, N-2$$
Note that these are the same differences defined in the previous section. The mean values of the first- and second-order differences of $X(n)$ are denoted $\mu_d$ and $\mu_{dd}$, respectively, which satisfy:
$$\mu_d = \frac{1}{N-1}\sum_{n=1}^{N-1} X_1(n) = \frac{X(N) - X(1)}{N-1}$$
$$\mu_{dd} = \frac{1}{N-2}\sum_{n=1}^{N-2} X_2(n) = \frac{X(N) + X(N-1) - X(1) - X(2)}{N-2}$$
Their variances, denoted $s_d^2$ and $s_{dd}^2$, satisfy:
$$s_d^2 = \frac{1}{N-1}\sum_{n=1}^{N-1} \left( X_1(n) - \mu_d \right)^2$$
$$s_{dd}^2 = \frac{1}{N-2}\sum_{n=1}^{N-2} \left( X_2(n) - \mu_{dd} \right)^2$$
On this basis, the activity, mobility, and complexity formulas are as follows:
$$\text{Activity} = s^2$$
$$\text{Mobility} = \sqrt{\frac{s_d^2}{s^2}} = \frac{s_d}{s}$$
$$\text{Complexity} = \frac{s_{dd}/s_d}{s_d/s} = \frac{s \cdot s_{dd}}{s_d^2}$$
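The three Hjorth descriptors can be sketched as follows (our own minimal version; it uses the paper's difference definitions and population standard deviations rather than the 1/(N−1) and 1/(N−2) normalisations):

```python
import numpy as np

def hjorth(x):
    """Hjorth activity, mobility, and complexity of one epoch,
    using X1(n) = X(n+1) - X(n) and X2(n) = X(n+2) - X(n)."""
    x = np.asarray(x, dtype=float)
    d1 = x[1:] - x[:-1]                  # first-order difference
    d2 = x[2:] - x[:-2]                  # second-order difference (paper's X2)
    s, sd, sdd = x.std(), d1.std(), d2.std()
    activity = s ** 2                    # variance of the epoch
    mobility = sd / s
    complexity = (sdd / sd) / mobility   # = s * sdd / sd**2
    return activity, mobility, complexity
```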

2.2.2. Frequency Domain Feature

Since EEG presents different rhythm distributions in different sleep stages, the filtered EEG signal was divided into seven frequency bands: low δ: 0.5–2 Hz; high δ: 2–4 Hz; θ: 4–8 Hz; α: 8–13 Hz; low β: 13–20 Hz; high β: 20–30 Hz; and γ: 30–45 Hz.
The energy can be obtained according to different frequency ranges. The specific calculation method is as follows:
1. Total frequency band power:
$$P = \sum_{n=1}^{N_{\text{FFT}}} \left( \frac{F(n)}{N_{\text{FFT}}} \right)^2$$
where $F(n)$ is the Fourier transform of the signal $X(n)$ at frequency bin $n$ and $N_{\text{FFT}}$ is the number of FFT points.
2. δ band power:
$$P_\delta = \sum_{n=1}^{N_{\text{FFT}}} \left( \frac{F_{\text{low-}\delta}(n) + F_{\text{high-}\delta}(n)}{N_{\text{FFT}}} \right)^2$$
3. θ band power:
$$P_\theta = \sum_{n=1}^{N_{\text{FFT}}} \left( \frac{F_\theta(n)}{N_{\text{FFT}}} \right)^2$$
4. α band power:
$$P_\alpha = \sum_{n=1}^{N_{\text{FFT}}} \left( \frac{F_\alpha(n)}{N_{\text{FFT}}} \right)^2$$
5. β band power:
$$P_\beta = \sum_{n=1}^{N_{\text{FFT}}} \left( \frac{F_{\text{low-}\beta}(n) + F_{\text{high-}\beta}(n)}{N_{\text{FFT}}} \right)^2$$
6. γ band power:
$$P_\gamma = \sum_{n=1}^{N_{\text{FFT}}} \left( \frac{F_{\text{low-}\gamma}(n) + F_{\text{high-}\gamma}(n)}{N_{\text{FFT}}} \right)^2$$
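A band power computation of this kind can be sketched with NumPy's real FFT (an illustrative version with our own names; for brevity the low/high sub-bands are merged into the conventional δ and β ranges):

```python
import numpy as np

FS = 100                                  # sampling rate of the recordings (Hz)
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def band_powers(x, fs=FS):
    """Per-band power of one epoch: each |F(n)/N_FFT|^2 term is
    summed over the FFT bins that fall inside the band."""
    x = np.asarray(x, dtype=float)
    n_fft = x.size
    spec = np.abs(np.fft.rfft(x)) / n_fft            # |F(n)| / N_FFT
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)       # bin centre frequencies
    powers = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        powers[name] = np.sum(spec[mask] ** 2)
    return powers
```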

2.2.3. Nonlinear Features

Fractal Dimension

The fractal dimension (FD) represents the complexity of a time domain signal. The Higuchi algorithm was used to calculate the FD of X(n), as described in [35].
The calculation formula is as follows:
$$H_m(k) = \frac{N-1}{\left\lfloor \frac{N-m}{k} \right\rfloor k^2} \sum_{n=1}^{\left\lfloor \frac{N-m}{k} \right\rfloor} \left| X(m+nk) - X(m+(n-1)k) \right|$$
where $\lfloor x \rfloor$ represents the greatest integer not exceeding $x$. The average value $\bar{H}(k)$ of $H_m(k)$ is calculated as follows:
$$\bar{H}(k) = \frac{1}{k}\sum_{m=1}^{k} H_m(k)$$
$\bar{H}(k)$ differs for different values of $k$, but $-\ln k$ is linearly related to $\ln \bar{H}(k)$. The least-squares method is used to fit the line, whose slope is the fractal dimension (FD).
Let $k_{\min} = 1$ and $k_{\max} = \lfloor N/20 \rfloor$, calculate $\bar{H}(k)$ for all positive integers $k$ ($k_{\min} \le k \le k_{\max}$), and further calculate the following:
$$\mu_k = \frac{1}{k_{\max}-k_{\min}+1}\sum_{k=k_{\min}}^{k_{\max}} \ln k, \qquad \mu_H = \frac{1}{k_{\max}-k_{\min}+1}\sum_{k=k_{\min}}^{k_{\max}} \ln \bar{H}(k)$$
Then, the fractal dimension FD can be calculated by the following formula:
$$\mathrm{FD} = \frac{\sum_{k=k_{\min}}^{k_{\max}} \left(\mu_H - \ln \bar{H}(k)\right)\left(\ln k - \mu_k\right)}{\sum_{k=k_{\min}}^{k_{\max}} \left(\ln k - \mu_k\right)^2}$$
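The Higuchi procedure can be sketched directly from the formulas above (our own minimal implementation; `np.polyfit` performs the least-squares slope fit of $\ln \bar{H}(k)$ against $-\ln k$):

```python
import numpy as np

def higuchi_fd(x, k_max=None):
    """Higuchi fractal dimension: slope of ln(mean curve length)
    versus -ln k, fitted by least squares."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if k_max is None:
        k_max = max(2, n // 20)                       # k_max = floor(N / 20)
    log_k, log_h = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):
            idx = np.arange(m - 1, n, k)              # samples m, m+k, m+2k, ...
            if idx.size < 2:
                continue
            dist = np.sum(np.abs(np.diff(x[idx])))    # sum |X(m+nk) - X(m+(n-1)k)|
            norm = (n - 1) / ((idx.size - 1) * k * k)
            lengths.append(dist * norm)
        log_k.append(np.log(k))
        log_h.append(np.log(np.mean(lengths)))        # ln H_bar(k)
    # least-squares slope of ln H_bar(k) versus -ln k
    return np.polyfit(-np.array(log_k), np.array(log_h), 1)[0]
```

For a straight line the FD is 1, and for white noise it approaches 2, which gives a quick sanity check.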

Non-Stationary Index

The non-stationary index (NSI) measures the variation of the local mean over time. The signal is divided into m segments, the mean of each segment is calculated, and the NSI is defined as the standard deviation of these m means. A larger NSI indicates a larger oscillation of the local mean [36].
Based on extensive experimental data, using minimum variance and mean-square error as the criteria and a ninth-order polynomial fit, the stable value of NSI was found to be best obtained with m = ⌊0.15 × N⌋. Let N = mq + r, with q a positive integer and 0 ≤ r < m; X(n) is then divided into m segments as follows:
If r = 0:
$$X_k = \{X_{q(k-1)+1}, \ldots, X_{qk}\}, \quad k = 1, \ldots, m$$
If r > 0:
$$X_k = \{X_{(q+1)(k-1)+1}, \ldots, X_{(q+1)k}\}, \quad k = 1, \ldots, r$$
$$X_k = \{X_{(q+1)r + q(k-r-1)+1}, \ldots, X_{qk+r}\}, \quad k = r+1, \ldots, m$$
Let $\bar{X}_k$ be the average of segment $X_k$ and $\mu = \frac{1}{m}\sum_{k=1}^{m} \bar{X}_k$; the NSI can then be calculated according to the following equation [37]:
$$\mathrm{NSI} = \sqrt{\frac{1}{m}\sum_{k=1}^{m} \left(\bar{X}_k - \mu\right)^2}$$
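The NSI computation is short in practice (a sketch with our own names; `np.array_split` distributes the remainder over near-equal segments, approximating the q/(q+1) partition above):

```python
import numpy as np

def nsi(x, frac=0.15):
    """Non-stationary index: standard deviation of the local segment
    means, with m = floor(0.15 * N) segments as chosen in the paper."""
    x = np.asarray(x, dtype=float)
    m = max(2, int(frac * x.size))
    segments = np.array_split(x, m)               # near-equal-length segments
    means = np.array([seg.mean() for seg in segments])
    return means.std()
```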

Sample Entropy

Sample entropy compares the self-similarity of a sequence by examining how the number of matching equal-length subsequences changes as the subsequence length grows [38]. Its calculation does not depend strongly on the data length and shows good consistency.
For the signal X(n), the calculation method of sample entropy is as follows:
Expand X(n) into N − m + 1 subsequences of length m, denoted X<sub>m,1</sub>, X<sub>m,2</sub>, …, X<sub>m,N−m+1</sub>, where X<sub>m,i</sub> = {X(i), X(i + 1), …, X(i + m − 1)} for 1 ≤ i ≤ N − m + 1.
Define the distance d between X<sub>m,i</sub> and X<sub>m,j</sub> as the maximum absolute difference between their corresponding elements:
$$d(X_{m,i}, X_{m,j}) = \max_{k=0,\ldots,m-1} \left| X(i+k) - X(j+k) \right|$$
For a given X<sub>m,i</sub>, count the number B<sub>i</sub> of j (1 ≤ j ≤ N − m, j ≠ i) for which the distance between X<sub>m,i</sub> and X<sub>m,j</sub> does not exceed r. For 1 ≤ i ≤ N − m, define:
$$B_i^m(r) = \frac{1}{N-m-1} B_i$$
Define $B^m(r)$ as:
$$B^m(r) = \frac{1}{N-m}\sum_{i=1}^{N-m} B_i^m(r)$$
Increase the dimension to m + 1, form X<sub>m+1,i</sub>, and count the number A<sub>i</sub> of j (1 ≤ j ≤ N − m, j ≠ i) for which the distance between X<sub>m+1,i</sub> and X<sub>m+1,j</sub> is not more than r, defining:
$$A_i^m(r) = \frac{1}{N-m-1} A_i$$
Define $A^m(r)$ as:
$$A^m(r) = \frac{1}{N-m}\sum_{i=1}^{N-m} A_i^m(r)$$
Thus, $B^m(r)$ is the probability that two sequences match for m points under the similarity tolerance r, while $A^m(r)$ is the probability that two sequences match for m + 1 points. Sample entropy is defined as follows:
$$\mathrm{SampEn}(m, r) = \lim_{N \to \infty} \left\{ -\ln \left[ \frac{A^m(r)}{B^m(r)} \right] \right\}$$
When N is finite, it can be estimated by the following formula:
$$\mathrm{SampEn}(m, r, N) = -\ln \left[ \frac{A^m(r)}{B^m(r)} \right]$$
Usually m = 2 or m = 3 and r = 0.2s are chosen, where s is the standard deviation of X(n) [39].
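A brute-force sample entropy sketch follows (our own illustrative version; for large epochs a vectorised or KD-tree implementation would be preferable):

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn(m, r, N) = -ln(A / B) with r = 0.2 * std(x).
    B counts template pairs matching at length m, A at length m + 1
    (Chebyshev distance, self-matches excluded)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    r = r_factor * x.std()

    def count_matches(mm):
        templates = np.array([x[i:i + mm] for i in range(n - mm + 1)])
        count = 0
        for i in range(len(templates)):
            d = np.max(np.abs(templates - templates[i]), axis=1)
            count += np.sum(d <= r) - 1          # exclude the self-match
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    return -np.log(a / b)
```

A regular signal (low complexity) should yield a smaller value than noise.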

2.3. Rank-Based Feature Selection Method

To simplify the computation and improve the portability of the algorithm, we performed feature screening on the features extracted in Section 2.2. The embedded method uses machine learning models to obtain a weight coefficient for each feature and selects features in descending order of their coefficients. Accordingly, this study used a tree-model-based feature selection method to filter the features; Table 3 shows the weight coefficient of each feature. Features with weight coefficients greater than 0.02 were selected as the final set of classification features, so a total of 11 features were selected: T6, T7, F2, F5, F6, F8, F9, F12, F19, F22, and N2.
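The thresholding step of the embedded method can be sketched as follows (our own illustration; the feature names and weight values below are hypothetical, standing in for importances such as a trained random forest's `feature_importances_`):

```python
import numpy as np

def select_by_importance(names, weights, threshold=0.02):
    """Embedded-method filtering step: keep features whose tree-model
    weight coefficient exceeds the threshold, ordered largest first."""
    order = np.argsort(weights)[::-1]                  # descending weight
    return [names[i] for i in order if weights[i] > threshold]
```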

2.4. Classification Models

In this study, four algorithms, namely the support vector machine (SVM), backpropagation neural network (BPNN), random forest (RF), and decision tree (DT) algorithms, were chosen to classify the extracted features, and the classification accuracy was obtained.
SVM is a robust classifier widely used in supervised classification problems [40]. Before SVM classification, all features were standardized to zero mean and unit variance using the z-score method. A linear kernel function was selected, and the hyperparameters were tuned by grid search.
The BP neural network is the most widely used neural network machine learning algorithm; it consists of an input layer, one or more hidden layers, and an output layer, with the layers interconnected through neural nodes [41]. Before classification with the BP neural network, all features were normalized to the range [0, 1] using min–max normalization. Since this study divided sleep into five stages, the number of output nodes was set to five, the number of hidden-layer nodes to 20, the number of input nodes according to the number of features in each sample set, and the learning rate to 0.1.
RF is an ensemble algorithm consisting of multiple decision trees and is one of the most common classification algorithms [42]. The decision trees are independent of each other and process the input sample set separately; their individual classification results are combined to obtain the final result. The Gini index measures the purity of the sample set: the smaller its value, the lower the probability of misclassifying a sample.
The DT algorithm is an inductive learning algorithm that derives classification rules from a set of unordered instances [43]. Classification with a decision tree involves two steps: first, a decision tree model is built from the training set; second, the model is used to classify samples of unknown class.
The C4.5 decision tree algorithm was applied in this study, and the splitting index was the information gain rate.

2.5. Validation of Classification Models

After classifier design, a fair evaluation requires estimating performance over a large number of samples for the selected feature set and classifier. In this study, 20% of the samples (1940 samples) were randomly selected from the dataset as the test set, and the remaining samples were used for training. The model was trained on the training set using five-fold cross-validation, with 80% of the samples in each round used as the training subset and the remaining 20% as the validation subset.
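The five-fold split described above can be sketched with index shuffling (a minimal version of our own; the seed and function name are illustrative):

```python
import numpy as np

def five_fold_indices(n_samples, seed=0):
    """Shuffle sample indices and yield (train, validation) index
    pairs for five-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, 5)                   # five near-equal folds
    for k in range(5):
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        yield train, val
```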
After training, four outcome categories are examined when evaluating the model's predictions: true positives (predicted positive, actually positive); false positives (predicted positive, actually negative); true negatives (predicted negative, actually negative); and false negatives (predicted negative, actually positive). Four metrics, namely accuracy, precision, recall, and f1-score, were used to evaluate the classifiers [44].
(1) Accuracy is the simplest index: the number of correctly predicted observations divided by the total number of observations:
$$\text{accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$$
(2) Precision describes the proportion of true positives among the samples predicted positive:
$$\text{precision} = \frac{TP}{TP + FP}$$
(3) Recall is the proportion of all actually positive samples that are predicted positive:
$$\text{recall} = \frac{TP}{TP + FN}$$
(4) The f1-score balances precision and recall:
$$f1 = \frac{TP}{TP + \frac{FN + FP}{2}}$$
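The four metrics can be computed directly from the confusion counts (a minimal sketch of our own):

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and f1 from confusion counts,
    matching the four formulas above."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = tp / (tp + (fn + fp) / 2)       # equivalent to 2PR / (P + R)
    return accuracy, precision, recall, f1
```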

3. Results

3.1. SVM Model: Results and Evaluation

Automatic staging of sleep EEG data was carried out using the SVM model. All 57-dimensional features were selected. After the model parameters were adjusted, the model with “C = 1.3, γ = 0.03” was selected for testing. The results show that the recognition rate of phase W was the highest, and that of phase N1 was the lowest, with an average accuracy of 81.86%, as shown in Table 4. The corresponding confusion matrix is shown in Figure 1. It can be seen from Figure 1 that the REM and N1 stages were most likely to be confused. The wrong predictions of the N3 stage are mainly concentrated in the N2 stage; the wrong predictions of the N2 stage are scattered in the N3, N1, and REM stages; and the wrong predictions of the W stage are mainly concentrated in the N1 stage.
The expert manual staging results are visualized together with the SVM staging results in Figure 2. The real label represents the result of manual staging by experts, while the prediction label is the result of the SVM model. As the figure shows, the sleep staging labeled by experts is highly consistent with that obtained by the SVM model: in only 12 of the 100 visualized samples did the predicted label differ from the real one.

3.2. BPNN Model: Results and Evaluation

A BPNN model was used for automatic staging of sleep EEG. All 57-dimensional features were selected. After model parameters were adjusted, two hidden layers with 18 neurons in each layer were selected for testing. As shown in Table 5, the average recognition rate of stage W was the highest at 90%, followed by 84% of stage N2. The recognition rate of the N3 and REM stages was close to 75%, and the lowest recognition rate of the N1 stage was 66%, with an average accuracy of 78.33%. The corresponding confusion matrix is shown in Figure 3. It can be seen that the two are most easily confused in the REM period and N1 period; the wrong prediction of the N3 period is mainly concentrated in the N2 period; the wrong prediction of the N2 period, N1 period, and REM period is more scattered, indicating that these three periods are easily confused with other periods, and the wrong prediction of the W period is mainly concentrated in the N1 period.
The expert manual staging results were visualized with the BP neural network model staging results, as shown in Figure 4. Due to the large number of samples in the test set, only 100 samples are selected for visualization. According to the results in the figure, the sleep staging labeled by experts was highly consistent with that obtained by the BP neural network model, in which for only 12 of the 100 samples the predicted labels did not match the real ones.

3.3. DT Model: Results and Evaluation

The DT model was used for automatic staging of sleep EEG, and all 57-dimensional features were selected. After adjusting the model parameters, a tree model with a depth of 11 and a minimum of 11 samples per leaf node was selected for testing. The results revealed the highest recognition rate of 88% for the W stage, followed by 87% for the N3 stage, and the lowest recognition rate of 62% for the N1 stage, with an average accuracy of 76.25% (Table 6). The corresponding confusion matrix is shown in Figure 5. The REM and N1 stages were the two most easily confused, but both were well distinguished from the N3 stage, so this model performed well in separating deep sleep from light sleep; the false predictions of the N3 stage were mainly concentrated in the N2 stage; the false predictions of the N2 stage were scattered across the other four stages; and the false predictions of the W stage were mainly concentrated in the N1 stage.
The expert manual staging results were visualized with the DT model staging results, as shown in Figure 6. According to the results in the figure, the sleep staging labeled by experts was highly consistent with that obtained by the DT model, in which for only 13 of the 100 samples the predicted labels did not match the real ones.

3.4. RF Model: Results and Evaluation

The RF model was used for automatic staging of sleep EEG, and all 57-dimensional features were selected. After the model parameters were adjusted, a random forest of 100 trees, with a depth of 22 per tree and a minimum of 5 samples per leaf node, was selected for testing. As can be seen from Table 7, the recognition rate of the W stage was the highest at 92%, followed by 91% for N3. The recognition rates of N2 and REM were about 80%. The N1 stage had the lowest recognition rate at 73%, and the average accuracy over the five sleep stages was 83.61%. The corresponding confusion matrix is shown in Figure 7. The REM and N1 stages were the two most easily confused; the erroneous predictions of the N3 stage were mainly concentrated in the N2 stage, with a small number predicted as the W stage, likely because the low-frequency, high-amplitude waveform of N3 resembles eye-movement (EOG-like) activity and was thus misread as the W stage; the wrong predictions of the N2 stage were scattered over the N3, N1, and REM stages; and the wrong predictions of the W stage were mainly concentrated in the N1 stage.
The expert manual staging results are visualized with the RF model staging results, as shown in Figure 8. According to the results in the figure, the sleep staging labeled by experts is highly consistent with that obtained by the RF model, in which for only 12 of the 100 samples the predicted labels did not match the real ones.

3.5. Comparison of the Results of Four Models before and after Feature Screening

The 11-dimensional features obtained after feature selection were input into the four machine learning models, and the resulting accuracies were compared with those obtained using all features, as shown in Table 8. The results indicate that the RF model performed better at sleep staging than the other three models, with the highest recognition rate of 92.13% for stage W and the lowest of 73.46% for stage N1, and an average accuracy of 83.56%. The sleep staging results using the 11-dimensional features agreed with those using all features at a rate of 94.85%.

4. Discussion

In this study, based on EEG signals from the Fpz-Cz channel, a total of 57 features were extracted from three dimensions: time domain, frequency domain, and nonlinear parameters. Four classifiers, namely SVM, BPNN, DT, and RF, were then used for automatic sleep staging. The four classifiers yielded consistent results: the highest recognition rate for the W stage and the lowest for the N1 stage. The RF model exhibited the highest recognition accuracy among the four classifiers, followed by SVM, BPNN, and DT.
We reviewed previous sleep staging studies with respect to feature number, classifier, single-channel choice, accuracy, and kappa coefficient. Our study has three advantages over previous work. First, we used EEG data from the Fpz-Cz channel, which gives the best sleep staging performance [21]. Second, in terms of feature number, we extracted 57 features from the time domain, frequency domain, and nonlinear parameters of the sleep EEG signal for machine learning, and we additionally used the embedded method to optimize the features down to 11 dimensions to explore their classification accuracy. Finally, although we did not use every possible classifier, we selected several classifiers that performed well in previous studies. Our results show that, compared with other classifiers (Table 9), RF achieves higher accuracy and maintains robust classification both with multidimensional features (57) and with the optimized feature set (11), consistent with the results of other studies [9,45,46].
The performance of classifiers also relies heavily on the associated features. In this study, the embedded method was used to select features with weight coefficients greater than 0.02 as the final set of classification features. Among the 11 selected features, two come from the time domain, eight from the frequency domain, and only one from the nonlinear domain. These findings indicate that frequency domain features account for the greatest share of automatic sleep staging, followed by time domain features, possibly because different sleep stages exhibit different frequency and energy characteristics. Studies have shown that δ and θ rhythms mainly exist in the N2 and N3 stages [49], while α and β rhythms are detected mostly in the REM, wake, and N1 stages [47]. Since frequency domain features dominate the optimal feature set, future studies may explore automatic staging accuracy by screening frequency domain features alone.
In our study, regardless of the classifier used, classification accuracy was very high for stage W but markedly lower for stage N1. The characteristics of the stages themselves may explain this. In stage W, the individual is still fully conscious, and the EEG is dominated by a pronounced mixture of alpha and beta waves. Stage N1 is the transition from wakefulness to sleep, during which the alpha-wave share gradually decreases and theta waves appear and progressively replace them, so the EEG signal changes substantially during this period [50]. The W stage, with its stable features, is therefore easier to identify than the N1 stage, whose EEG signal is more variable.
It should be noted that previous studies have shown that class imbalance during staging affects the final accuracy: when instances of one class far outnumber those of the others in the training set, the classifier tends to assign data to the larger class [51]. In this study, however, the W, N1, and N2 stages each contributed 2029 samples, and the differences in sample counts across classes were small (Table 1), which effectively avoids the problems caused by skewed data distributions.
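One common way to obtain the near-equal class sizes described above is to undersample each stage to the minority-class count; a minimal sketch, assuming integer stage labels (0=W, 1=N1, 2=N2, 3=N3, 4=REM) and using the sample counts from Table 1 purely as an example:

```python
import numpy as np

def undersample(y, rng):
    """Return indices that give every class the minority-class count."""
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    return np.sort(idx)

rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2, 3, 4], [2029, 2029, 2029, 1671, 1938])
idx = undersample(y, rng)
balanced = y[idx]
print(np.unique(balanced, return_counts=True))
```

After this step every stage contributes the same number of epochs, so no class can dominate the classifier's decision boundary simply by volume.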
A previous study of single-channel EEG classification used a boosting method and showed that, after extracting signal features with ensemble empirical mode decomposition (EEMD), classification accuracy for the wake, REM, DS, and LS4 states could reach 92.66% [52]. In this study, performing stage discrimination with 57 time domain, frequency domain, and nonlinear features, the RF and SVM models both attained more than 80% classification accuracy, and RF exceeded 90% accuracy for both the W and N3 stages. In addition, one respect in which this study goes beyond previous work is that we used the embedded method to reduce the feature dimensions to 11 and still obtained good classification results with the RF model. Feature screening reduced the amount of data and improved both the computation speed and the portability of the algorithm. These results further confirm that single-channel EEG is a viable monitoring technology, pointing toward a new direction for portable clinical EEG monitoring.
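The overall staging pipeline (split, train RF, evaluate) can be sketched on synthetic data standing in for the 11-dimensional optimized feature set. The five-class labels and their separability are fabricated here, so the printed score says nothing about the study's reported accuracies:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n, d = 2000, 11                                # epochs x optimized features
X = rng.standard_normal((n, d))
y = rng.integers(0, 5, size=n)                 # 5 stages: W, N1, N2, N3, REM
X += np.eye(5)[y] @ (rng.standard_normal((5, d)) * 2.0)  # class-dependent shift

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
acc = accuracy_score(y_te, rf.predict(X_te))
print(f"test accuracy: {acc:.3f}")
```

With only 11 inputs per epoch, both training and inference are cheap, which is the computational advantage the feature screening buys for portable deployment.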
This study also has some limitations, mainly reflected in the sleep staging results. First, the recognition rates of the REM and N1 stages were lower. Second, misclassifications of the W stage were concentrated mainly in the N1 stage. The main reasons are as follows. The EEG of the REM and N1 stages consists mainly of low-voltage, mixed-frequency waves, and because this study extracted features from EEG alone, these two stages were not easily distinguished. As for the second point, slow eye movements occur in both the closed-eye W stage and the N1 stage, and during the transition from W to N1 the experts' interpretation is more subjective, making it difficult to guarantee the accuracy of the reference staging. Improving the recognition rate of the REM and N1 stages therefore remains an important direction for sleep staging research.

Author Contributions

Data curation, S.Z.; methodology, S.Z., B.W. and F.L.; formal analysis, B.W. and X.N.; writing—original draft, S.Z. and F.L.; writing—review & editing, X.W. and H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The sleep EEG data used in this study came from the Expanded Sleep-EDF (ES-EDF) database.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Vanini, G.; Torterolo, P. Sleep-Wake Neurobiology. Adv. Exp. Med. Biol. 2021, 1297, 65–82.
2. Matricciani, L.; Bin, Y.S.; Lallukka, T.; Kronholm, E.; Wake, M.; Paquet, C.; Dumuid, D.; Olds, T. Rethinking the sleep-health link. Sleep Health 2018, 4, 339–348.
3. Emsellem, H. The reimagining of sleep and health. Sleep Health 2019, 5, 2.
4. Ackermann, S.; Rasch, B. Differential effects of non-REM and REM sleep on memory consolidation? Curr. Neurol. Neurosci. Rep. 2014, 14, 430.
5. Munoz, J.P.; Rivera, L.A. Towards Improving Sleep Quality Using Automatic Sleep Stage Classification and Binaural Beats. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2020, 2020, 4982–4985.
6. Aserinsky, E.; Kleitman, N. Regularly occurring periods of eye motility, and concomitant phenomena, during sleep. Science 1953, 118, 273–274.
7. Hori, T.; Sugita, Y.; Koga, E.; Shirakawa, S.; Inoue, K.; Uchida, S.; Kuwahara, H.; Kousaka, M.; Kobayashi, T.; Tsuji, Y.; et al. Proposed supplements and amendments to ‘A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects’, the Rechtschaffen & Kales (1968) standard. Psychiatry Clin. Neurosci. 2001, 55, 305–310.
8. Aboalayon, K.A.I.; Faezipour, M.; Almuhammadi, W.S.; Moslehpour, S. Sleep Stage Classification Using EEG Signal Analysis: A Comprehensive Survey and New Investigation. Entropy 2016, 18, 272.
9. Hassan, A.R.; Hassan Bhuiyan, M.I. Automatic sleep scoring using statistical features in the EMD domain and ensemble methods. Biocybern. Biomed. Eng. 2016, 36, 248–255.
10. Hassan, A.R.; Bhuiyan, M.I.H. Computer-aided sleep staging using Complete Ensemble Empirical Mode Decomposition with Adaptive Noise and bootstrap aggregating. Biomed. Signal Process. Control 2016, 24, 1–10.
11. Lan, K.C.; Chang, D.W.; Kuo, C.E.; Wei, M.Z.; Li, Y.H.; Shaw, F.Z.; Liang, S.F. Using off-the-shelf lossy compression for wireless home sleep staging. J. Neurosci. Methods 2015, 246, 142–152.
12. Zoubek, L.; Charbonnier, S.; Lesecq, S.; Buguet, A.; Chapotot, F. Feature selection for sleep/wake stages classification using data driven methods. Biomed. Signal Process. Control 2007, 2, 171–179.
13. Hertenstein, E.; Gabryelska, A.; Spiegelhalder, K.; Nissen, C.; Johann, A.F.; Umarova, R.; Riemann, D.; Baglioni, C.; Feige, B. Reference Data for Polysomnography-Measured and Subjective Sleep in Healthy Adults. J. Clin. Sleep Med. 2018, 14, 523–532.
14. Zhang, Y.; Zhang, X.; Sun, H.; Fan, Z.; Zhong, X. Portable brain-computer interface based on novel convolutional neural network. Comput. Biol. Med. 2019, 107, 248–256.
15. Sun, H.; Ganglberger, W.; Panneerselvam, E.; Leone, M.J.; Quadri, S.A.; Goparaju, B.; Tesh, R.A.; Akeju, O.; Thomas, R.J.; Westover, M.B. Sleep staging from electrocardiography and respiration with deep learning. Sleep 2020, 43, zsz306.
16. van Gilst, M.M.; Wulterkens, B.M.; Fonseca, P.; Radha, M.; Ross, M.; Moreau, A.; Cerny, A.; Anderer, P.; Long, X.; van Dijk, J.P.; et al. Direct application of an ECG-based sleep staging algorithm on reflective photoplethysmography data decreases performance. BMC Res. Notes 2020, 13, 513.
17. Kayikcioglu, T.; Maleki, M.; Eroglu, K. Fast and accurate PLS-based classification of EEG sleep using single channel data. Expert Syst. Appl. 2015, 42, 7825–7830.
18. Phan, H.; Do, Q.; Do, T.L.; Vu, D.L. Metric learning for automatic sleep stage classification. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2013, 2013, 5025–5028.
19. Hsu, Y.-L.; Yang, Y.-T.; Wang, J.-S.; Hsu, C.-Y. Automatic sleep stage recurrent neural classifier using energy features of EEG signals. Neurocomputing 2013, 104, 105–114.
20. Koley, B.; Dey, D. An ensemble system for automatic sleep stage classification using single channel EEG signal. Comput. Biol. Med. 2012, 42, 1186–1195.
21. Ghimatgar, H.; Kazemi, K.; Helfroush, M.S.; Aarabi, A. An automatic single-channel EEG-based sleep stage scoring method based on hidden Markov Model. J. Neurosci. Methods 2019, 324, 108320.
22. Rahman, M.M.; Bhuiyan, M.I.H.; Hassan, A.R. Sleep stage classification using single-channel EOG. Comput. Biol. Med. 2018, 102, 211–220.
23. Lee, P.L.; Huang, Y.H.; Lin, P.C.; Chiao, Y.A.; Hou, J.W.; Liu, H.W.; Huang, Y.L.; Liu, Y.T.; Chiueh, T.D. Automatic Sleep Staging in Patients with Obstructive Sleep Apnea Using Single-Channel Frontal EEG. J. Clin. Sleep Med. 2019, 15, 1411–1420.
24. Jo, H.G.; Park, J.Y.; Lee, C.K.; An, S.K.; Yoo, S.K. Genetic fuzzy classifier for sleep stage identification. Comput. Biol. Med. 2010, 40, 629–634.
25. Wang, Y.; Loparo, K.A.; Kelly, M.R.; Kaplan, R.F. Evaluation of an automated single-channel sleep staging algorithm. Nat. Sci. Sleep 2015, 7, 101–111.
26. Saastamoinen, A.; Huupponen, E.; Värri, A.; Hasan, J.; Himanen, S.L. Computer program for automated sleep depth estimation. Comput. Methods Programs Biomed. 2006, 82, 58–66.
27. Fu, M.; Wang, Y.; Chen, Z.; Li, J.; Xu, F.; Liu, X.; Hou, F. Deep Learning in Automatic Sleep Staging with a Single Channel Electroencephalography. Front. Physiol. 2021, 12, 628502.
28. Albahr, A.; Albahar, M.; Thanoon, M.; Binsawad, M. Computational Learning Model for Prediction of Heart Disease Using Machine Learning Based on a New Regularizer. Comput. Intell. Neurosci. 2021, 11, 8628335.
29. Imtiaz, S.A.; Rodriguez-Villegas, E. Recommendations for performance assessment of automatic sleep staging algorithms. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2014, 2014, 5044–5047.
30. Wu, C.H.; Chang, H.C.; Lee, P.L.; Li, K.S.; Sie, J.J.; Sun, C.W.; Yang, C.Y.; Li, P.H.; Deng, H.T.; Shyu, K.K. Frequency recognition in an SSVEP-based brain computer interface using empirical mode decomposition and refined generalized zero-crossing. J. Neurosci. Methods 2011, 196, 170–181.
31. Alazrai, R.; Homoud, R.; Alwanni, H.; Daoud, M.I. EEG-Based Emotion Recognition Using Quadratic Time-Frequency Distribution. Sensors 2018, 18, 2739.
32. Hatamikia, S.; Maghooli, K.; Nasrabadi, A.M. The emotion recognition system based on autoregressive model and sequential forward feature selection of electroencephalogram signals. J. Med. Signals Sens. 2014, 4, 194–201.
33. Hadjidimitriou, S.K.; Hadjileontiadis, L.J. Toward an EEG-based recognition of music liking using time-frequency analysis. IEEE Trans. Biomed. Eng. 2012, 59, 3498–3510.
34. Hjorth, B. EEG analysis based on time domain properties. Electroencephalogr. Clin. Neurophysiol. 1970, 29, 306–310.
35. Zhang, Y.; Ji, X.; Zhang, S. An approach to EEG-based emotion recognition using combined feature extraction method. Neurosci. Lett. 2016, 633, 152–157.
36. Kroupi, E.; Yazdani, A.; Ebrahimi, T. EEG Correlates of Different Emotional States Elicited during Watching Music Videos. In Proceedings of the Fourth International Conference, ACII 2011, Memphis, TN, USA, 9–12 October 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 457–466.
37. Hausdorff, J.M.; Lertratanakul, A.; Cudkowicz, M.E.; Peterson, A.L.; Kaliton, D.; Goldberger, A.L. Dynamic markers of altered gait rhythm in amyotrophic lateral sclerosis. J. Appl. Physiol. 2000, 88, 2045–2053.
38. Richman, J.S.; Moorman, J.R. Physiological time-series analysis using approximate entropy and sample entropy. Am. J. Physiol. Heart Circ. Physiol. 2000, 278, H2039–H2049.
39. García-Martínez, B.; Martínez-Rodrigo, A.; Zangróniz Cantabrana, R.; Pastor García, J.M.; Alcaraz, R. Application of Entropy-Based Metrics to Identify Emotional Distress from Electroencephalographic Recordings. Entropy 2016, 18, 221.
40. Huang, S.; Cai, N.; Pacheco, P.P.; Narrandes, S.; Wang, Y.; Xu, W. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics. Cancer Genom. Proteom. 2018, 15, 41–51.
41. Tian, S.K.; Dai, N.; Li, L.L.; Li, W.W.; Sun, Y.C.; Cheng, X.S. Three-dimensional mandibular motion trajectory-tracking system based on BP neural network. Math. Biosci. Eng. 2020, 17, 5709–5726.
42. Yang, L.; Wu, H.; Jin, X.; Zheng, P.; Hu, S.; Xu, X.; Yu, W.; Yan, J. Study of cardiovascular disease prediction model based on random forest in eastern China. Sci. Rep. 2020, 10, 5245.
43. Che, D.; Liu, Q.; Rasheed, K.; Tao, X. Decision tree and ensemble learning algorithms with their applications in bioinformatics. Adv. Exp. Med. Biol. 2011, 696, 191–199.
44. Mumtaz, W.; Ali, S.S.A.; Yasin, M.A.M.; Malik, A.S. A machine learning framework involving EEG-based functional connectivity to diagnose major depressive disorder (MDD). Med. Biol. Eng. Comput. 2018, 56, 233–246.
45. Sharma, R.; Pachori, R.B.; Upadhyay, A. Automatic sleep stages classification based on iterative filtering of electroencephalogram signals. Neural Comput. Appl. 2017, 28, 2959–2978.
46. Hassan, A.R.; Bhuiyan, M.I.H. A decision support system for automatic sleep staging from EEG signals using tunable Q-factor wavelet transform and spectral features. J. Neurosci. Methods 2016, 271, 107–118.
47. Liang, S.F.; Kuo, C.E.; Hu, Y.H.; Pan, Y.H.; Wang, Y.H. Automatic Stage Scoring of Single-Channel Sleep EEG by Using Multiscale Entropy and Autoregressive Models. IEEE Trans. Instrum. Meas. 2012, 61, 1649–1657.
48. Zhu, G.; Li, Y.; Wen, P.P. Analysis and classification of sleep stages based on difference visibility graphs from a single-channel EEG signal. IEEE J. Biomed. Health Inf. 2014, 18, 1813–1821.
49. Danker-Hopfe, H.; Anderer, P.; Zeitlhofer, J.; Boeck, M.; Dorn, H.; Gruber, G.; Heller, E.; Loretz, E.; Moser, D.; Parapatics, S.; et al. Interrater reliability for sleep scoring according to the Rechtschaffen & Kales and the new AASM standard. J. Sleep Res. 2009, 18, 74–84.
50. García-Martínez, B.; Fernández-Caballero, A.; Alcaraz, R.; Martínez-Rodrigo, A. Assessment of dispersion patterns for negative stress detection from electroencephalographic signals. Pattern Recognit. 2021, 119, 108094.
51. Seiffert, C.; Khoshgoftaar, T.M.; Hulse, J.V.; Napolitano, A. RUSBoost: A Hybrid Approach to Alleviating Class Imbalance. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2010, 40, 185–197.
52. Hassan, A.R.; Bhuiyan, M.I.H. Automated identification of sleep states from EEG signals by means of ensemble empirical mode decomposition and random under sampling boosting. Comput. Methods Programs Biomed. 2017, 140, 201–210.
Figure 1. Confusion matrix of SVM model staging results.
Figure 2. Comparison of expert manual staging results with SVM model staging results. The real label represents the result of manual staging by experts, while the prediction label is the result of the SVM model.
Figure 3. Confusion matrix of BPNN model staging results.
Figure 4. Comparison of expert manual staging results and BPNN model staging results. The real label represents the result of manual staging by experts, while the prediction label is the result of the BPNN model.
Figure 5. Confusion matrix of DT model staging results.
Figure 6. Comparison of expert manual staging results with DT model staging results. The real label represents the result of manual staging by experts, while the prediction label is the result of the DT model.
Figure 7. Confusion matrix of RF model staging results.
Figure 8. Comparison of expert manual staging results with RF model staging results. The real label represents the result of manual staging by experts, while the prediction label is the result of the RF model.
Table 1. Sample distribution by stage.
Sleep Stages | Sample Number
W stage | 2029
N1 stage | 2029
N2 stage | 2029
N3 stage | 1671
REM stage | 1938
Total | 9696
Table 2. EEG sleep staging characteristics.
Feature Symbol | Computational Method | Feature Symbol | Computational Method | Feature Symbol | Computational Method
T1 | Amplitude | F5 | E6 + E7 | F24 | (E2 + E3)/E4
T2 | Mean Value | F6 | E8 | F25 | (E2 + E6)/E9
T3 | Variance | F7 | E2/E1 | F26 | MPF
T4 | SD | F8 | E3/E1 | F27 | MPF-low-δ
T5 | Median | F9 | E4/E1 | F28 | MPF-high-δ
T6 | Skewness | F10 | E5/E1 | F29 | MPF-θ
T7 | Kurtosis | F11 | (E6 + E7)/E1 | F30 | MPF-α
T8 | Maximum | F12 | E8/E1 | F31 | MPF-β
T9 | Minimum | F13 | (E4 + E5)/E1 | F32 | MPF-γ
T10 | ZCR | F14 | E5/(E6 + E7) | F33 | FV
T11 | AFDN | F15 | (E4 + E5)/(E5 + E6 + E7) | F34 | FV-low-δ
T12 | ASDN | F16 | E4/(E6 + E7) | F35 | FV-high-δ
T13 | Activity | F17 | E3/(E4 + E5) | F36 | FV-θ
T14 | Mobility | F18 | E4/(E3 + E5) | F37 | FV-α
T15 | Complexity | F19 | E5/(E3 + E4) | F38 | FV-β
F1 | E1 | F20 | E2/(E3 + E9) | F39 | FV-γ
F2 | E2 + E3 | F21 | E5/E9 | N1 | FD
F3 | E4 | F22 | (E6 + E7)/E9 | N2 | NSI
F4 | E5 | F23 | E5/E4 | N3 | SE
T, time domain features; F, frequency domain features; N, nonlinear features. FD, fractal dimension; SE, sample entropy; ZCR, zero crossing rate; SD, standard deviation; E1, the total band power; E2, the low-frequency δ-band (0.5–2 Hz) power; E3, the high-frequency δ-band (2–4 Hz) power; E4, the θ-band (4–8 Hz) power; E5, the α-band (8–13 Hz) power; E6, the low β-band (13–20 Hz) power; E7, the high-frequency β-band (20–30 Hz) power; E8, the low-frequency γ-band (30–45 Hz) power; E9, the δ + θ + α + β + γ + δ band power.
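The Hjorth parameters listed as T13–T15 (Activity, Mobility, Complexity) can be computed directly from first and second differences of the signal [34]. A minimal sketch, using a white-noise epoch as a stand-in for real EEG:

```python
import numpy as np

def hjorth(x):
    """Hjorth Activity, Mobility, and Complexity of a 1-D signal."""
    dx = np.diff(x)                # first difference (discrete derivative)
    ddx = np.diff(dx)              # second difference
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / activity)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

rng = np.random.default_rng(0)
x = rng.standard_normal(3000)      # e.g. one 30-s epoch at 100 Hz (assumed)
a, m, c = hjorth(x)
print(f"activity={a:.3f} mobility={m:.3f} complexity={c:.3f}")
```

For white noise, Mobility and Complexity approach √2 and √3/√2 respectively, which makes this a convenient sanity check before applying the function to real epochs.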
Table 3. Weight coefficients of EEG sleep staging features.
Feature Symbol | Weight Coefficient | Feature Symbol | Weight Coefficient | Feature Symbol | Weight Coefficient
T1 | 0.0030 | F5 | 0.0394 | F24 | 0.0012
T2 | 0.0006 | F6 | 0.1459 | F25 | 0.0053
T3 | 0.0051 | F7 | 0.0038 | F26 | 0.0011
T4 | 0.0122 | F8 | 0.0281 | F27 | 0.0028
T5 | 0.0111 | F9 | 0.0457 | F28 | 0.0007
T6 | 0.0513 | F10 | 0.0038 | F29 | 0.0011
T7 | 0.0201 | F11 | 0.0106 | F30 | 0.0013
T8 | 0.0015 | F12 | 0.2043 | F31 | 0.0007
T9 | 0.0066 | F13 | 0.0024 | F32 | 0.0006
T10 | 0.0086 | F14 | 0.0142 | F33 | 0.0007
T11 | 0.0013 | F15 | 0.0124 | F34 | 0.0037
T12 | 0.0049 | F16 | 0.0110 | F35 | 0.0014
T13 | 0.0101 | F17 | 0.0105 | F36 | 0.0006
T14 | 0.0048 | F18 | 0.0104 | F37 | 0.0010
T15 | 0.0050 | F19 | 0.0540 | F38 | 0.0005
F1 | 0.0148 | F20 | 0.0008 | F39 | 0.0006
F2 | 0.1049 | F21 | 0.0031 | N1 | 0.0045
F3 | 0.0056 | F22 | 0.0301 | N2 | 0.0470
F4 | 0.0125 | F23 | 0.0077 | N3 | 0.0032
Table 4. Comparison of SVM model staging results for all features with expert manual staging results.
Sleep Stages | Training Samples | Test Samples | Correct Samples | Precision | Recall | f1-Score
N3 | 1326 | 345 | 321 | 0.8992 | 0.9304 | 0.9145
N2 | 1601 | 428 | 336 | 0.8276 | 0.7850 | 0.8058
N1 | 1639 | 390 | 255 | 0.7143 | 0.6538 | 0.6827
REM | 1547 | 391 | 316 | 0.7215 | 0.8082 | 0.7624
W | 1643 | 386 | 360 | 0.9424 | 0.9326 | 0.9375
Accuracy = 81.86%.
Table 5. Comparison of BP model staging results for all features with expert manual staging results.
Sleep Stages | Training Samples | Test Samples | Correct Samples | Precision | Recall | f1-Score
N3 | 1326 | 345 | 321 | 0.7685 | 0.9623 | 0.8546
N2 | 1601 | 428 | 336 | 0.8397 | 0.6729 | 0.7471
N1 | 1639 | 390 | 255 | 0.6555 | 0.6538 | 0.6547
REM | 1547 | 391 | 316 | 0.7521 | 0.6982 | 0.7241
W | 1643 | 386 | 360 | 0.9007 | 0.9637 | 0.9312
Accuracy = 78.35%.
Table 6. Comparison of decision tree model staging results for all features with expert manual staging results.
Sleep Stages | Training Samples | Test Samples | Correct Samples | Precision | Recall | f1-Score
N3 | 1326 | 345 | 321 | 0.8773 | 0.8609 | 0.8787
N2 | 1601 | 428 | 336 | 0.7301 | 0.7079 | 0.7189
N1 | 1639 | 390 | 255 | 0.6247 | 0.6103 | 0.6174
REM | 1547 | 391 | 316 | 0.6804 | 0.7187 | 0.6990
W | 1643 | 386 | 360 | 0.8800 | 0.9119 | 0.8957
Accuracy = 75.82%.
Table 7. Comparison of random forest model staging results for all features with expert manual staging results.
Sleep Stages | Training Samples | Test Samples | Correct Samples | Precision | Recall | f1-Score
N3 | 1326 | 345 | 321 | 0.9171 | 0.9304 | 0.9237
N2 | 1601 | 428 | 336 | 0.8199 | 0.8294 | 0.8246
N1 | 1639 | 390 | 255 | 0.7346 | 0.7032 | 0.7032
REM | 1547 | 391 | 316 | 0.7877 | 0.8015 | 0.8015
W | 1643 | 386 | 360 | 0.9213 | 0.9404 | 0.9308
Accuracy = 83.56%.
Table 8. Comparison of sleep staging accuracy of different models before and after feature screening (%).
Sleep Stages | SVM (57) | SVM (11) | BP (57) | BP (11) | DT (57) | DT (11) | RF (57) | RF (11)
N3 | 89.92 | 88.67 | 76.85 | 89.89 | 89.73 | 88.48 | 91.71 | 92.11
N2 | 82.76 | 76.58 | 83.97 | 80.05 | 73.01 | 70.91 | 81.99 | 80.61
N1 | 71.43 | 70.50 | 65.55 | 63.27 | 62.47 | 61.42 | 73.46 | 72.85
REM | 72.15 | 64.44 | 75.21 | 72.95 | 68.04 | 69.57 | 78.77 | 76.09
W | 94.24 | 91.75 | 90.07 | 95.79 | 88.00 | 90.84 | 92.13 | 91.90
Total accuracy | 81.86 | 77.99 | 78.35 | 79.59 | 75.82 | 75.88 | 83.56 | 82.53
Table 9. The accuracy of sleep staging using single-channel EEG information.
Author/Year | Sleep Stages | Number of Features | Classifier | Channel of EEG | ACC (%) or KC
[20] | Five stages | 39 | SVM | C3-A2 | ACC = 85.7%
[47] | Five stages | Multiscale entropy and autoregressive models | LDA | C3-A2 | KC = 0.81
[48] | Five stages | 9 | SVM | Pz-Oz | ACC = 87.5%, KC = 0.81
[9] | Five stages | IMFs factor set to 7 to obtain the optimal number of features | LDA, BPNN, SVM, k-NN, LS-SVM, Bagging, AdaBoost and Naïve Bayes | Pz-Oz | ACC = 44.80–88.62%; AdaBoost highest at 88.62%
[45] | Five stages | 10 | k-NN, DT, RF, multilayer perceptron and Naïve Bayes | Fpz-Cz (highest ACC), Cz-A1, C3-A2, Pz-Cz | ACC = 71.80–89.74%; RF highest at 89.74%
[21] | Five stages | 136 | RF | Fpz-Cz (highest ACC), Cz-A1, C3-A2, Pz-Cz | ACC = 87.82%
The present study | Five stages | 57 and 11 (embedded-method feature optimization) | SVM, DT, RF and BPNN | Fpz-Cz | ACC of 57 features: 75.82–83.56%; ACC of 11 features: 75.88–82.53%; RF highest in both
Note: KC, kappa coefficient; ACC, accuracy; SVM, support vector machine; LS-SVM, Least Squares-support vector machine; DT, decision trees; RF, random forest; LDA, linear discriminant analysis; BPNN, backpropagation neural network; k-NN, k-nearest neighbor.

Share and Cite

MDPI and ACS Style

Zhao, S.; Long, F.; Wei, X.; Ni, X.; Wang, H.; Wei, B. Evaluation of a Single-Channel EEG-Based Sleep Staging Algorithm. Int. J. Environ. Res. Public Health 2022, 19, 2845. https://doi.org/10.3390/ijerph19052845
