Study of Auditory Brain Cognition Laws-Based Recognition Method of Automobile Sound Quality

Xie, Liping; Lu, Chihua; Liu, Zhien; Yan, Lirong; Xu, Tao

doi:10.3389/fnhum.2021.663049

ORIGINAL RESEARCH article

Front. Hum. Neurosci., 08 October 2021
Sec. Brain-Computer Interfaces
Volume 15 - 2021 | https://doi.org/10.3389/fnhum.2021.663049

Liping Xie^1,2

Chihua Lu^1,2

Zhien Liu^1,2^*

Lirong Yan^1,2

Tao Xu^1,2

¹Hubei Key Laboratory of Advanced Technology for Automotive Components, Wuhan University of Technology, Wuhan, China
²Foshan Xianhu Laboratory of the Advanced Energy Science and Technology Guangdong Laboratory, Foshan, China

Research shows that people's subjective feelings, such as emotions and fatigue, can be objectively reflected by electroencephalography (EEG) physiological signals. Thus, an EEG-based evaluation method, which is used to explore auditory brain cognition laws, is introduced in this study. The brain cognition laws are summarized by analyzing the EEG power topographic maps under the stimulation of three kinds of automobile sound quality, namely, comfort, powerfulness, and acceleration. Then, the EEG features of the subjects are classified by a machine learning algorithm, by which the recognition of diversified automobile sounds is realized. In addition, Kalman smoothing and the minimal redundancy maximal relevance (mRMR) algorithm are used to improve the recognition accuracy. The results show that there are differences in the neural characteristics of diversified automobile sound quality, with a positive correlation between EEG energy and sound intensity. Furthermore, by using Kalman smoothing and the mRMR algorithm, recognition accuracy is improved, and the amount of calculation is reduced. This study provides a novel idea and method for exploring the cognitive laws of automobile sound quality from the field of brain-computer interface technology.

Introduction

Methods that are applied to evaluate automobile sound quality mainly rely on the psychological feelings of people and cannot guarantee the universality of evaluation results (Tan and Tan, 2012). Ranking, semantic differentiation (Guo et al., 2017), grade scoring, and paired comparison (Parizet, 2002; Ellermeier et al., 2004) are commonly used for subjective evaluation. However, when sound qualities with similar semantics (such as “comfort,” “powerfulness,” and “acceleration”) are designed under forward sound design, it is difficult for traditional subjective evaluation methods to reflect the true feelings of the evaluator. In addition to the inherent physical parameters of sounds, an evaluator's assessment of a sound is also related to their cognition, experience, and emotional state (Genuit, 2004). Therefore, it is necessary to introduce a new automobile sound quality evaluation method for evaluating diversified automobile sounds.

Related Works

In recent years, with the research on physiological signals in emotional computing, it has become possible to use physiological signals to evaluate automobile sound. EEG signals with high time and spatial resolution are widely used (Lin et al., 2010; Bhatti et al., 2016; Geethanjali et al., 2018).

The analysis of EEG signals is challenging, and in the field of emotion recognition it relies on data pre-processing, feature extraction (Tsang et al., 2010; Kai et al., 2016; Poikonen et al., 2016), and feature classification. Feature extraction is crucial to ensure recognition performance. The required recognition performance can be achieved only by selecting EEG features closely related to the purpose of the research (Nishimura and Mitsukura, 2013; Sheykhivand et al., 2020). Some studies indicated that the rhythm characteristics of EEG, namely δ (1–4 Hz), θ (4–8 Hz), α (8–12 Hz), β (12–30 Hz), and γ (>30 Hz), can reflect human brain activities (Knyazev, 2012; Zheng and Lu, 2015). Chen et al. (2021) proposed an EEG physiological acoustic index to evaluate subjective annoyance by comparing EEG rhythm characteristics with the trend of subjective annoyance index data. Li et al. (2014) used white noise and pure tones as stimulus sources to study the relationship between EEG characteristic signals and subjective annoyance and found that the average power of θ waves has two peaks in each brain area during steady stimulation. Ali et al. (2013) studied EEG signals under different sound pressure levels and stimulation intervals and found that the θ wave voltage increased significantly under high sound pressure level stimulation. Di and Wu (2015) showed that the average α wave power in the left frontal lobe was significantly lower than that in the right frontal lobe under the stimulation of pleasant sounds.

In the study of automotive sound quality and EEG signals, Lee and Lee (2014) introduced a new method to study human sound perception by means of EEGs, where EEG analysis and measurement were performed to demonstrate human cerebral response to car acceleration sounds and concluded that the α-wave power could serve as an objective evaluation index of automobile acceleration sounds. Lee et al. (2013) selected the α-wave to calculate the correlation between subjective evaluations of passenger car sounds and their results indicate that the intensity of the correlation between the cerebral α-wave and subjective evaluations can be determined based on the size of the correlation. Nishimura and Mitsukura (2013) put forward a group method of data handling (GMDH) to analyze the sound quality of EERs utilizing neural networks. Compared with the result efficiency of the principal component analysis (PCA), the GMDH neural network resulted in a higher recognition of the target sound quality. The above studies showed that the distinct physiological response of the human brain to sound stimuli authentically exists.

Contribution

It is difficult to distinguish automobile sounds with similar semantics by means of traditional subjective evaluations. In contrast to the application of EEG signals for emotion recognition, the study of automobile sound quality based on EEG is in its infancy: the relationship between EEG feature signals and automobile sound quality is still unclear, and there is little relevant literature. However, there are related studies on actively playing music based on EEG to improve the subjective emotions of people (Bajaj and Pachori, 2015; Kalaganis et al., 2016). Therefore, a method that maps EEGs to diversified sound quality in order to decode automobile sounds is proposed to show the feasibility of using EEG signals for automobile sound quality evaluation, which avoids the need for language description. The study of decoding automobile sound types can lay a neuroscientific foundation for realizing active playback of automobile sounds based on EEGs in the future.

The auditory brain cognition laws refer to the rhythmic activities of the brain under the stimulation of automobile sounds. At present, there are no unified standards for the selection of EEG features, and relevant guidance is required. Thus, the changing laws of EEG under the stimulation of automobile sounds are studied here to guide the selection of EEG features. Three subjective evaluation indices of automobile sound quality are defined (namely, comfort, powerfulness, and acceleration), and sounds that match these three indices are collected. The EEGs of the subjects are measured under the stimulation of each of the three types of automobile sounds in a quiet environment at a suitable temperature, and the EEG data are analyzed to explore the cognition laws of the brain. The differential asymmetry (DASM) and rational asymmetry (RASM) features of the subjects are extracted based on the cognition laws, and classification models are used to identify differences in automobile sounds. The flow chart is shown in Figure 1.

FIGURE 1

Figure 1. Flow chart for the new evaluation method of car sound quality based on brain signals.

Study Outline

The layout of this study is as follows: the design of the experiment is introduced in section Experiment Design. Section Methodology systematically describes the analysis methods of brain signal feature extraction, selection, and classifier. The results of data analysis are shown in section Experiment Result, including the cognitive laws of the brain under three types of automobile sounds, the use of classification models to compare the recognition accuracy differences of different features, and the optimization of model accuracy using the Kalman smoothing and mRMR algorithm. Section Discussion discusses the results of Section Experiment Result and describes the research significance of this study. Section Conclusions shows the summary and prospects of this study.

Experiment Design

Three types of automobile acceleration sounds (namely, comfort, powerfulness, and acceleration) are selected as inducing materials for the EEG tests. These sounds, which cause strong subjective and physiological changes in the subjects, are mainly obtained by means of vehicle measurements and online research (such as collecting acceleration sound samples of high-end automobiles from websites or from car game software). Table 1 lists the three types of automobile sounds used in the experiment. It should be emphasized that these automobile sounds were divided into the three types, namely, comfort, powerfulness, and acceleration, by 39 engineers with experience in sound quality analysis, and the characteristic distribution of the 39 evaluators is shown in Table 2. The aim of this study is to identify the three types of automobile sounds based on EEG signals. Comfort, powerfulness, and acceleration are labeled −1, 0, and 1, respectively, and these labels are used when training the classifier.

TABLE 1

Table 1. Details of the sound clips used in the EEG experiment.

TABLE 2

Table 2. Characteristic distribution of evaluators.

Based on the experimental design and selection of subjects by Zheng and Lu (2015), a total of 15 healthy subjects, who are different from the 39 engineers, are recruited. The subjects comprise 11 males and four females (aged: 22.4 ± 2.53 years) who are professors or graduate students from the Wuhan University of Technology. They all have experience in automobile sound quality evaluation and are in good mental health.

Before the experiment started, the test operation procedures and specifications were explained to all the subjects, and they were instructed to wear high-fidelity headphones properly and to press buttons according to the interface prompts. It is important that the subjects concentrate on listening to the sounds and avoid obvious limb movements during the experiment. A 64-channel AgCl electrode cap is used to collect EEG at a sampling rate of 1,000 Hz. The EEG lead distribution and electrode cap test are shown in Figure 2.

FIGURE 2

Figure 2. EEG test setup: (A) distribution of EEG leads for 62 channels and (B) electrode cap test.

The three automobile sounds in each type are played in random order, and each sound is played 27 times. There is a 5 s start prompt before each sound is played and a 10 s rest and feedback period after playing. During this 10 s period, a type selection item pops up on the computer interface as a questionnaire, and the subjects judge which type the sound belongs to (namely, comfort, powerfulness, or acceleration). The playback process is shown in Figure 3.

FIGURE 3

Figure 3. The protocol used in the experiment for sound quality evaluation.

Methodology

Feature Extraction

Drawing on features that have proved effective in the field of emotion recognition, the power spectral density (PSD) (Thammasan et al., 2016), Hjorth parameters (Hjorth, 1970), and differential entropy (DE) (García-Martínez et al., 2016) are extracted as the basic EEG features in this study.

The Welch algorithm, with a 1-s rectangular window and an overlap rate of 50%, is used to obtain the PSD corresponding to different frequency bands. The Hjorth parameters, namely, activity, mobility, and complexity (Vidaurre et al., 2009; Kaboli et al., 2015), are defined as

\begin{array}{ll} \mathrm{Activity} = \mathrm{var}(X(t)) & (1) \end{array}

\begin{array}{ll} \mathrm{Mobility} = \sqrt{\frac{\mathrm{var}\left(\frac{dX(t)}{dt}\right)}{\mathrm{var}(X(t))}} & (2) \end{array}

\begin{array}{ll} \mathrm{Complexity} = \frac{\mathrm{Mobility}\left(\frac{dX(t)}{dt}\right)}{\mathrm{Mobility}(X(t))} & (3) \end{array}

where var denotes the variance of the calculated X(t) signal.
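As an illustration of Equations (1)–(3), the following minimal sketch computes the three Hjorth parameters with NumPy. It assumes a uniformly sampled one-dimensional signal and approximates the time derivative by finite differences; the function name `hjorth_parameters` and the 10 Hz test sine are hypothetical and are not taken from the study.

```python
import numpy as np

def hjorth_parameters(x, fs):
    """Return (activity, mobility, complexity) of a 1-D signal sampled at fs Hz."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x) * fs      # first derivative, finite-difference approximation
    ddx = np.diff(dx) * fs    # second derivative
    activity = np.var(x)                                  # Eq. (1)
    mobility = np.sqrt(np.var(dx) / activity)             # Eq. (2)
    mobility_of_derivative = np.sqrt(np.var(ddx) / np.var(dx))
    complexity = mobility_of_derivative / mobility        # Eq. (3)
    return activity, mobility, complexity

# Illustrative input: 1 s of a 10 Hz sine sampled at 200 Hz (assumed values).
fs = 200
t = np.arange(fs) / fs
sine = np.sin(2 * np.pi * 10 * t)
activity, mobility, complexity = hjorth_parameters(sine, fs)
```

For a pure sine, the activity is close to half the squared amplitude, the mobility is close to the angular frequency, and the complexity is close to 1, which gives a quick sanity check of an implementation.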

For a signal that follows a Gaussian distribution, the DE is defined as (García-Martínez et al., 2016)

\begin{array}{ll} H(X) = -\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right) \log\left[\frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)\right] dx = \frac{1}{2}\log\left(2\pi e\sigma^{2}\right) & (4) \end{array}

where X is a continuous source that follows the Gaussian distribution N(μ, σ²), and π and e are constants.
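The closed form in Equation (4) depends only on the variance of the signal, so the DE of a band-filtered EEG segment can be estimated from its sample variance. The sketch below is an illustration under the Gaussian assumption, using the natural logarithm; the function name and the synthetic sample are hypothetical.

```python
import numpy as np

def differential_entropy(x):
    """DE of a signal assumed Gaussian: 0.5 * ln(2 * pi * e * sigma^2), in nats."""
    variance = np.var(np.asarray(x, dtype=float))
    return 0.5 * np.log(2 * np.pi * np.e * variance)

# Synthetic Gaussian sample with sigma = 2 (illustrative values only).
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=2.0, size=100_000)
de_estimate = differential_entropy(sample)
de_exact = 0.5 * np.log(2 * np.pi * np.e * 2.0 ** 2)
```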

There is also evidence that asymmetry features can represent the cognitive laws of the human brain well (Zheng et al., 2017). In this study, the DASM and RASM of 26 pairs of asymmetric electrodes are calculated, giving six types of features, which are expressed as

\begin{array}{ll} \mathrm{DASM\_PSD} = \mathrm{PSD}(X_{left}) - \mathrm{PSD}(X_{right}) & (5) \end{array}

\begin{array}{ll} \mathrm{DASM\_Hjorth} = \mathrm{Hjorth}(X_{left}) - \mathrm{Hjorth}(X_{right}) & (6) \end{array}

\begin{array}{ll} \mathrm{DASM\_DE} = \mathrm{DE}(X_{left}) - \mathrm{DE}(X_{right}) & (7) \end{array}

and

\begin{array}{ll} \mathrm{RASM\_PSD} = \mathrm{PSD}(X_{left}) / \mathrm{PSD}(X_{right}) & (8) \end{array}

\begin{array}{ll} \mathrm{RASM\_Hjorth} = \mathrm{Hjorth}(X_{left}) / \mathrm{Hjorth}(X_{right}) & (9) \end{array}

\begin{array}{ll} \mathrm{RASM\_DE} = \mathrm{DE}(X_{left}) / \mathrm{DE}(X_{right}) & (10) \end{array}

The frequency range is divided into five segments based on the EEG rhythms, as shown in Figure 4. The dimensions of DASM_PSD, DASM_Hjorth, DASM_DE, RASM_PSD, RASM_Hjorth, and RASM_DE are 130 (26 electrode pairs × 5 rhythms), 390 (26 electrode pairs × 5 rhythms × 3 Hjorth parameters), 130 (26 electrode pairs × 5 rhythms), 130 (26 electrode pairs × 5 rhythms), 390 (26 electrode pairs × 5 rhythms × 3 Hjorth parameters), and 130 (26 electrode pairs × 5 rhythms), respectively.
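The asymmetry features of Equations (5)–(10) and the resulting feature dimension can be sketched as follows. The sketch assumes that a base feature (here DE) has already been computed per electrode and per rhythm and arranged so that row i of `left_de` and row i of `right_de` belong to the same symmetric electrode pair; the array names and the random values are hypothetical placeholders, not data from the experiment.

```python
import numpy as np

def asymmetry_features(left, right):
    """left, right: arrays of shape (n_pairs, n_bands) holding one base feature
    for the left- and right-hemisphere electrode of each symmetric pair."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    dasm = (left - right).ravel()   # differential asymmetry, Eqs. (5)-(7)
    rasm = (left / right).ravel()   # rational asymmetry, Eqs. (8)-(10)
    return dasm, rasm

# Placeholder values standing in for per-electrode DE features.
rng = np.random.default_rng(0)
n_pairs, n_bands = 26, 5
left_de = rng.uniform(1.0, 2.0, size=(n_pairs, n_bands))
right_de = rng.uniform(1.0, 2.0, size=(n_pairs, n_bands))
dasm_de, rasm_de = asymmetry_features(left_de, right_de)
```

With 26 pairs and 5 rhythms, each of the two vectors has 26 × 5 = 130 entries, matching the dimension stated for the DE and PSD features.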

FIGURE 4

Figure 4. Distribution map of EEG rhythm frequency band.
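To make the band-wise PSD computation concrete, the sketch below implements a Welch-style estimate with a 1-s rectangular window and 50% overlap using only NumPy, and then integrates the PSD over each rhythm. The band edges follow the ranges quoted in the Related Works section; the upper edge of the γ band (100 Hz) and all function names are assumptions made for this illustration.

```python
import numpy as np

# Rhythm bands in Hz; the gamma upper edge is an assumption for this sketch.
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 12),
         "beta": (12, 30), "gamma": (30, 100)}

def welch_psd(x, fs, win_sec=1.0, overlap=0.5):
    """Average the periodograms of rectangular-window segments.
    Assumes an even number of samples per window."""
    x = np.asarray(x, dtype=float)
    n = int(win_sec * fs)
    step = int(n * (1 - overlap))
    segments = [x[i:i + n] for i in range(0, len(x) - n + 1, step)]
    spectra = [np.abs(np.fft.rfft(seg)) ** 2 / (fs * n) for seg in segments]
    psd = np.mean(spectra, axis=0)
    psd[1:-1] *= 2   # one-sided spectrum: double all bins except DC and Nyquist
    freqs = np.fft.rfftfreq(n, d=1 / fs)
    return freqs, psd

def band_power(freqs, psd, low, high):
    """Integrate the PSD over the half-open band [low, high)."""
    mask = (freqs >= low) & (freqs < high)
    return psd[mask].sum() * (freqs[1] - freqs[0])

# Illustrative input: 5 s of a 10 Hz sine at 200 Hz, which lies in the alpha band.
fs = 200
t = np.arange(5 * fs) / fs
signal = np.sin(2 * np.pi * 10 * t)
freqs, psd = welch_psd(signal, fs)
powers = {name: band_power(freqs, psd, lo, hi) for name, (lo, hi) in BANDS.items()}
```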

Feature Selection

Herein, the Kalman smoothing algorithm is used to filter out EEG components that are not associated with sounds. The purpose of Kalman smoothing is to calculate the smoothed value of the system state X_k at moment k after obtaining all observations up to time T (Cheng Y and, 2018). The smoothing formula is expressed as

\begin{array}{ll} p(X_{k} | y_{1:T}) = N(X_{k} | m_{k}^{s}, P_{k}^{s}) & (11) \end{array}

where T > k, y_{1:T} denotes all observations in the period from 1 to T, and N(X | μ, σ) denotes a Gaussian distribution of the random variable X with mean μ and variance σ. The Kalman smoothing process consists of T forward recursion steps from the initial time 1 to time T, followed by backward recursion steps starting from time T. The forward recursion is Kalman filtering, and the state estimate m_T and covariance matrix P_T obtained at the last time T serve as the initial state estimate $m_{T}^{s}$ and covariance matrix $P_{T}^{s}$ of the backward recursion, namely, $m_{T}^{s} = m_{T}, P_{T}^{s} = P_{T}$.
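The forward and backward recursions can be sketched for the simplest case of a scalar random-walk state observed in noise. The state model, the noise variances `q` and `r`, and the function name are assumptions made for this illustration, since the study does not specify them; the backward pass is the standard Rauch-Tung-Striebel form.

```python
import numpy as np

def kalman_smooth(y, q=1e-3, r=1.0, m0=0.0, p0=1.0):
    """Smooth a 1-D feature sequence under the assumed model
    x_k = x_{k-1} + w_k, w_k ~ N(0, q);  y_k = x_k + v_k, v_k ~ N(0, r)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    m_pred, p_pred = np.empty(T), np.empty(T)
    m_filt, p_filt = np.empty(T), np.empty(T)
    m, p = m0, p0
    for k in range(T):                        # forward recursion: Kalman filter
        m_pred[k], p_pred[k] = m, p + q       # prediction step
        gain = p_pred[k] / (p_pred[k] + r)    # Kalman gain
        m = m_pred[k] + gain * (y[k] - m_pred[k])
        p = (1 - gain) * p_pred[k]
        m_filt[k], p_filt[k] = m, p
    m_s, p_s = m_filt.copy(), p_filt.copy()   # initialization at time T
    for k in range(T - 2, -1, -1):            # backward recursion: smoother
        g = p_filt[k] / p_pred[k + 1]
        m_s[k] = m_filt[k] + g * (m_s[k + 1] - m_pred[k + 1])
        p_s[k] = p_filt[k] + g ** 2 * (p_s[k + 1] - p_pred[k + 1])
    return m_s, p_s

# Synthetic example: a slowly varying feature corrupted by noise.
rng = np.random.default_rng(0)
truth = np.sin(np.linspace(0, 2 * np.pi, 200))
noisy = truth + rng.normal(0, 0.5, size=200)
smoothed, smoothed_var = kalman_smooth(noisy, q=0.01, r=0.25)
```

In practice such a smoother would be applied along the time axis of each feature dimension separately, and `q` and `r` would need to be tuned or estimated from the data.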

In addition, the “curse of dimensionality,” a common problem in pattern recognition, leads to a rapid increase in computation as the feature dimension increases (Zheng et al., 2017). After smoothing the EEG data, it is therefore necessary to select EEG features in order to avoid feature redundancy, and the principal component analysis (PCA) and minimal redundancy maximal relevance (mRMR) algorithms are compared in this study.

PCA cannot preserve the original domain information (Nakanishi et al., 2011). Hence, the mRMR algorithm is introduced here to select a feature subset from the EEG data. The mRMR algorithm finds a set of features in the original feature set that is strongly correlated with the final output (Max-Relevance) while having the smallest correlation among the features themselves (Min-Redundancy) (Peng et al., 2005). “Max-Relevance” and “Min-Redundancy” are defined as

\begin{array}{l} max D (S, c), D = \frac{1}{| S |} \sum_{x_{i} \in S} I (x_{i}; c) & (12) \end{array}

\begin{array}{l} min R (S), R = \frac{1}{{| S |}^{2}} \sum_{x_{i}, x_{j} \in S} I (x_{i}, x_{j}) & (13) \end{array}

Combining “Max-Relevance” D with “Min-Redundancy” R, we define Φ(D, R) as

\begin{array}{l} max Φ (D, R), Φ = D - R & (14) \end{array}

The approximate optimal solution can be obtained by the incremental search method, and the feature is selected by maximizing Φ(D, R).
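The incremental search can be sketched as a greedy loop that, at each step, adds the feature with the largest relevance minus mean redundancy with respect to the features already chosen. In this illustration the mutual information is estimated from quantile-binned features, class labels are coded as non-negative integers (0, 1, 2), and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def mutual_information(a, b):
    """Mutual information (nats) between two non-negative integer-coded arrays."""
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1)
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])))

def discretize(x, n_bins=5):
    """Code a continuous feature into n_bins quantile bins (0 .. n_bins - 1)."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(x, edges)

def mrmr_select(X, labels, n_select, n_bins=5):
    """Greedy mRMR: maximize relevance minus mean redundancy at each step."""
    Xd = np.column_stack([discretize(X[:, j], n_bins) for j in range(X.shape[1])])
    relevance = np.array([mutual_information(Xd[:, j], labels)
                          for j in range(Xd.shape[1])])
    selected = [int(np.argmax(relevance))]
    while len(selected) < n_select:
        best_j, best_score = None, -np.inf
        for j in range(Xd.shape[1]):
            if j in selected:
                continue
            redundancy = np.mean([mutual_information(Xd[:, j], Xd[:, s])
                                  for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Synthetic data: an informative feature, a near-duplicate of it, and pure noise.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=600)
informative = labels + rng.normal(0, 0.3, size=600)
duplicate = informative + rng.normal(0, 0.01, size=600)
noise = rng.normal(0, 1, size=600)
X = np.column_stack([informative, duplicate, noise])
selected = mrmr_select(X, labels, n_select=2)
```

In this toy case the near-duplicate is passed over in the second step because its redundancy with the first selected feature outweighs its relevance, which is the behavior that distinguishes mRMR from ranking by relevance alone.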

Classifier

The reasonable design of the classifier affects the final result (Ackermann et al., 2016; Jenke et al., 2017; Hernández et al., 2018), and the linear discriminant analysis (LDA) and support vector machine (SVM) are the most common and effective classifiers. Thus, the performance differences between the LDA and SVM models are compared in this study.

The basic idea of LDA classification is to assume that the sample data of every class follow a Gaussian distribution. When a new sample arrives, it is projected, its projected features are substituted into the Gaussian probability density function of each class, and the class with the highest probability is taken as its category.
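A minimal sketch of this idea is a Gaussian classifier with one mean per class and a covariance matrix shared by all classes, for which the class with the highest posterior probability is given by a linear score. The class name `GaussianLDA` and the synthetic two-dimensional data are hypothetical; the labels −1, 0, and 1 follow the coding described in section Experiment Design.

```python
import numpy as np

class GaussianLDA:
    """LDA sketch: class-wise Gaussian densities with a shared covariance."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.priors_ = np.array([np.mean(y == c) for c in self.classes_])
        centered = X - self.means_[np.searchsorted(self.classes_, y)]
        cov = centered.T @ centered / (len(X) - len(self.classes_))
        self.cov_inv_ = np.linalg.pinv(cov)
        return self

    def predict(self, X):
        # Log posterior up to a constant: x' S^-1 mu_c - 0.5 mu_c' S^-1 mu_c + log prior_c
        linear = X @ self.cov_inv_ @ self.means_.T
        offset = 0.5 * np.sum(self.means_ @ self.cov_inv_ * self.means_, axis=1)
        scores = linear - offset + np.log(self.priors_)
        return self.classes_[np.argmax(scores, axis=1)]

# Synthetic three-class data (illustrative, not EEG features).
rng = np.random.default_rng(0)
centers = {-1: (-3.0, 0.0), 0: (0.0, 3.0), 1: (3.0, 0.0)}
X = np.vstack([rng.normal(c, 1.0, size=(100, 2)) for c in centers.values()])
y = np.repeat(list(centers.keys()), 100)
accuracy = np.mean(GaussianLDA().fit(X, y).predict(X) == y)
```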

The core idea of SVM is to find an optimal hyperplane to achieve the classification effect, and the corresponding decision function is

\begin{array}{l} f (x) = sgn(\sum_{i = 1}^{m} α_{i} y_{i} K (x_{i}, x) + b) & (15) \end{array}

where x_i represents the features of the i-th sample, y_i represents the category of the i-th sample, and α_i and b are the parameters calculated in the SVM optimization process. The most commonly used kernel function for EEG signals is the radial basis function (RBF) with width parameter σ, and the formula is as follows:

\begin{array}{l} K (x_{i}, x) = e^{- \frac{{| | x - x_{i} | |}^{2}}{2 σ^{2}}} & (16) \end{array}

A “one-to-one” (one-vs-one) method is used to solve the multi-classification problem, in which n types of training data are combined in pairs to construct n(n−1)/2 SVMs. In this study, the recognition of three types of automobile sound quality is thus transformed into three binary classification problems. The two important SVM parameters [namely, penalty coefficient (C) and gamma] are tuned by simulation to obtain the optimal SVM model. The three sets of decision function values are output, and the category with the largest judgment value is taken as the output category of the sound, namely, majority voting (Ang et al., 2012). The entire classification process is shown in Figure 5.
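The sketch below illustrates how the decision function of Equation (15), the RBF kernel of Equation (16), and the pairwise voting fit together. It is not a trained SVM: the support vectors, coefficients, and bias of each pairwise model are set by hand from hypothetical class centroids purely to make the voting step visible, and all names and values are assumptions for this illustration.

```python
import numpy as np
from itertools import combinations

def rbf_kernel(xi, x, sigma=1.0):
    """RBF kernel of Eq. (16) with width parameter sigma."""
    return np.exp(-np.sum((x - xi) ** 2) / (2 * sigma ** 2))

def decision(x, support_vectors, alphas, ys, b, sigma=1.0):
    """Sign of sum_i alpha_i * y_i * K(x_i, x) + b, as in Eq. (15)."""
    total = sum(a * y * rbf_kernel(sv, x, sigma)
                for sv, a, y in zip(support_vectors, alphas, ys))
    return 1 if total + b >= 0 else -1

def one_vs_one_predict(x, pairwise_models):
    """Each pairwise model casts one vote; the class with most votes wins."""
    votes = {}
    for (class_a, class_b), model in pairwise_models.items():
        winner = class_a if decision(x, **model) == 1 else class_b
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)

# Hand-set stand-ins for trained models: one "support vector" per class.
centroids = {"comfort": np.array([0.0, 0.0]),
             "powerfulness": np.array([4.0, 0.0]),
             "acceleration": np.array([0.0, 4.0])}
pairwise_models = {
    (a, b): dict(support_vectors=[centroids[a], centroids[b]],
                 alphas=[1.0, 1.0], ys=[1, -1], b=0.0)
    for a, b in combinations(centroids, 2)
}
predicted = one_vs_one_predict(np.array([3.5, 0.4]), pairwise_models)
```

For n = 3 classes the dictionary holds n(n−1)/2 = 3 pairwise models. In a real pipeline the coefficients would come from SVM training with tuned C and gamma, where gamma corresponds to 1/(2σ²) in the kernel above.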

FIGURE 5

Figure 5. Classification flowchart of SVM.

Experiment Result

Since the signal-to-noise ratio of EEG is low, the raw data contain a large amount of external interference noise and artifacts and must be preprocessed. Clean EEG data are therefore extracted with the EEGLAB toolbox. The main steps include retaining the EEG signals (0.1–100 Hz) with a band-pass filter, eliminating the 50 Hz interference with a notch filter, resampling the data to 200 Hz, and removing artifacts by independent component analysis (ICA).

The data set input to the classification model is N × 26, where 26 refers to the number of channel pairs, and N is the number of samples. There are a total of 27 × 9 × 5 = 1,215 samples (duration: 1 s) for each subject. After removal of some abnormal data, the number of EEG samples may be fewer than 1,215.

Cognitive Laws Induced by Automobile Sound

The EEG power topographic map shows the spatial distribution of power of the five frequency rhythms, thereby turning complex brain function changes into easy-to-follow graphs. The power topographic maps of the five frequency rhythms (δ, θ, α, β, and γ) of the 15 subjects are drawn, as shown in Figure 6.

FIGURE 6

Figure 6. Power topographic maps for the three types of car sounds in five rhythms.

First, the spectral power of the five bands under the stimulation of powerfulness and acceleration sounds is higher than that under comfort sounds. Under the stimulation of powerful automobile sounds, the energy of the δ rhythm is mainly concentrated in the top and occipital areas of the bottom-right, and the energy is also prominent in the frontal area of the upper left corner. The θ rhythm is similar to the δ rhythm but lower in energy. The energy of the α rhythm is mainly concentrated in the top area of the lower left and the frontal area of the upper left, the β rhythm is mainly concentrated in the frontal area of the upper left, and the γ rhythm is symmetrically distributed around the frontal area.

Under the stimulation of acceleration automobile sounds, the δ rhythm energy of the entire brain is more prominent. The energy of θ and α rhythm is symmetrical in the left and right frontal regions, but the energy of θ in the central region is obvious. The energy of α in the left lower occipital region is prominent. The energy distribution of the β and γ rhythms shows a symmetrical distribution in the left and right frontal areas. As for the comfort sounds, the energy of the five frequency rhythms is obvious in the upper left frontal area.

In general, there are clear differences in the frequency band characteristics of EEG rhythm under different quality of sound stimulation.

Feature Selection

The frequency band energy of the symmetric electrodes differs significantly under the stimulation of diversified automobile sound quality; thus, the symmetrical EEG features are used as the input of the classifiers in this study. LDA and SVM are used as classifiers to recognize the three types of automobile sounds, a 5-fold cross-validation scheme is adopted, and the classification accuracy is used as the evaluation index of classifier performance.

Table 3 shows the mean accuracy of LDA and SVM for the symmetrical EEG features (namely, DASM_PSD, DASM_Hjorth, DASM_DE, RASM_PSD, RASM_Hjorth, and RASM_DE) obtained from the five rhythms (δ, θ, α, β, and γ) and the total frequency bands. The LDA average accuracies (%) are 75.01, 84.83, 81.47, 68.50, 83.67, and 80.63 for the six features from the total frequency bands, respectively. For SVM, the average accuracies (%) are 74.83, 86.26, 81.02, 69.11, 85.49, and 81.92. In the total frequency band, the best and worst accuracies (%) of the LDA classifier are 84.83 and 68.50, respectively, and those of the SVM classifier are 86.26 and 69.11, respectively. In the total frequency band, the best and worst accuracy results appear in DASM_DE and RASM_PSD, respectively.

TABLE 3

Table 3. The mean accuracy rates (%) of LDA and SVM classifiers for different features obtained from separate and total frequency bands.

Further, among the classification results of the five rhythms, the LDA classifier has the lowest accuracy (40.1%) in the δ rhythm with RASM_Hjorth as the feature. The SVM classifier has the highest accuracy (87.6%) in the γ rhythm with RASM_DE as the feature.

One-factor analysis of variance is used to study the statistical significance of the data. The results of DE and Hjorth are better than those of PSD, and the difference in classifier performance between LDA and SVM is not apparent (p > 0.05). There is a significant difference in classification accuracy (p < 0.05) among the rhythms, and the accuracies of the β and γ bands are significantly better than those of the other three rhythms. The classification accuracy of δ, θ, and α is not totally different (p = 0.04462).

Powerfulness and acceleration are semantically similar, so it is difficult to distinguish them based on subjective feelings during the experiment, which easily leads to confusion. Figure 7 shows the results of identifying these two types of automobile sounds using SVM with DASM_DE as the feature and using the questionnaire answers of the test subjects. It reveals that EEG-based recognition of semantically similar automobile sounds performs better than the subjective questionnaire: the accuracy of the questionnaire is lower than that of machine learning recognition, and the average accuracy of SVM is about 11% higher than that of the questionnaire. It is worth noting that the subjective recognition rates of the two other pairwise comparisons (comfort vs. powerfulness and comfort vs. acceleration) are both high, with an average accuracy of about 90%, which makes it difficult to reflect the advantages of EEG recognition.

FIGURE 7

Figure 7. The results of identifying the two types of automobile sounds (namely, powerfulness and acceleration) using SVM with DASM_DE as the feature and using the questionnaire answers of the test subjects.

Optimization of Classifier Accuracy

First, the Kalman smoothing algorithm introduced in section Feature Selection is used to remove noise that is not related to the desired signal, with the RASM_PSD features of 120 dimensions as inputs and SVM as the classifier. Second, PCA and mRMR are compared, with the RASM_Hjorth features of 360 dimensions as inputs and SVM as the classifier.

Table 4 compares the accuracy obtained with Kalman smoothing and without any smoothing algorithm in different rhythms. The accuracies (%) without smoothing and with Kalman smoothing in the five rhythms and the total frequency band are 49.02/68.8, 39.11/60.12, 41.78/62.1, 63.79/84.33, 64.27/85.67, and 69.11/90.36. The accuracy with Kalman smoothing is significantly better than that without smoothing (p < 0.05); it is improved by 19.78 percentage points in the δ rhythm and by 21.4 percentage points in the γ rhythm. These results show that feature smoothing can effectively improve the recognition accuracy.

TABLE 4

Table 4. The accuracies (%) without smoothing and with Kalman smoothing, with RASM_PSD features of 120 dimensions as inputs and SVM as the classifier, from the total frequency bands.

Figure 8 compares the impact of dimension reduction using the PCA and mRMR algorithms on model accuracy, in which the dimension of the model is reduced from 350 to 50 in steps of 50. The PCA algorithm reduces the dimensionality but does not significantly improve the accuracy. The accuracy drops from 64.8 to 49.8% when the dimensionality is reduced to 50, and it reaches 62.5% at 250 dimensions, which is 1.7% lower than at the original 360 dimensions. In contrast, the mRMR algorithm not only reduces the dimensionality but also improves the accuracy of the classifier: the accuracy using the mRMR algorithm reaches a local maximum (72.00%) at 50 dimensions, which is 7.2% higher than at the original 360 dimensions. Moreover, the accuracy improves significantly when the dimension is 50, 100, or 150, and the effect of dimensionality reduction is not obvious when the dimension is >150.

FIGURE 8

Figure 8. The average accuracies of SVM using RASM_Hjorth obtained from the total frequency bands based on PCA and mRMR for subject 12.

Discussion

This study demonstrates the feasibility of EEG-based recognition of the diversified sound quality of the automobile. Several important issues are explored.

Some studies showed that the brain waves in a certain rhythm band are indeed aroused under the stimulation of automobile sounds (Lee et al., 2013; Lee and Lee, 2014). As shown in Figure 6, there are frequency band differences in brain cognition under the stimulation of different sounds, which is specifically reflected in the positive correlation between EEG energy and sound energy intensity. The recognition of automobile sound quality is improved based on frequency band characteristics, which can well reflect the laws of brain cognition. Some literature has shown that the frontal area is closely related to human brain cognition (Saxe, 2006; Shamay-Tsoory and Aharon-Peretz, 2007), and there is a large proportion of energy in the frontal area under musical stimulation (Sammler et al., 2010; Di and Wu, 2015). Therefore, the results shown in Figure 6 provide further evidence that the cognition laws in the frontal portion of the human brain can indeed be aroused by automobile sounds, which can guide the selection of EEG features.

The DASM has better classification accuracy than RASM, which is consistent with the conclusion of the literature (Lin et al., 2010). Among the three basic features (PSD, Hjorth, DE), DE has the best classification performance, and it is most suitable for the recognition of automobile sounds. Although the classification accuracies of DASM_DE and DASM_Hjorth are close, the dimension of DASM_DE is 1/3 of that of DASM_Hjorth. Among the five rhythms, the classification accuracy of the β and γ rhythms is better than that of the other three rhythms, which indicates that the correlation between sound quality and brain waves differs across rhythms. The classification accuracy of the SVM model is slightly better than that of LDA, and SVM has the advantages of requiring a small number of training samples, fast training speed, and high accuracy. The best accuracy of emotion classification (82.29% ± 3.06%) was also obtained by SVM in the literature (Lin et al., 2010; Hadjidimitriou and Hadjileontiadis, 2012), which is similar to our study.

The comfortable sound is light and natural, and its sound pressure level is small. In contrast, the other two types are powerful, booming, and exciting, and the topographic map corresponding to comfort, as shown in Figure 6, differs significantly from those of the other two types. For experienced automotive engineers, it is easy to distinguish the sound characteristic difference between comfort and powerfulness (or acceleration), but it is difficult to distinguish the difference between the powerfulness and acceleration sounds. As shown in Figure 7, the classification model based on EEG characteristics achieves higher recognition accuracy than recognition based on subjective feelings. The literature (Nakanishi et al., 2011) verified the difference in EEG between three kinds of acoustic quality by using PCA and FDA in a way similar to this study, and its results showed that EEG can provide information that cannot be obtained from questionnaires. It is possible that a change of subjective emotion is provoked by the stimulation of the automobile sounds. However, it is not yet clear which emotion is involved, and this is the next step in the research.

As discussed in section Feature Selection, the Kalman smoothing algorithm can effectively improve the recognition accuracy, confirming that feature smoothing plays an important role in EEG-based recognition. In Figure 8, it is obvious that the mRMR algorithm is an effective method to optimize the accuracy of recognition, which retains the original information, such as electrode channels and frequency bands, while reducing the complexity of calculations. In the literature (Zheng et al., 2017), the mRMR algorithm was also used to achieve dimensionality reduction for emotion recognition, improving the accuracy by 14.41%.

The main contributions of this study to sound quality recognition from EEG can be summarized as follows: (1) an EEG signal acquisition test paradigm is designed based on automobile sounds, which provides experimental guidance for studying the correlation between automobile sounds and EEG signals; (2) the processing of EEG data is systematically described from three aspects, namely, feature extraction, feature selection, and pattern recognition, and it is shown that the selection of EEG features, the smoothing and dimensionality reduction of data, and the reasonable design of the classifier are crucial for the recognition of sounds; (3) this study confirms that the neural characteristics of the three types of automobile sounds do exist, and the SVM can effectively identify the three types of automobile sounds with the DASM_DE of the γ rhythm as input; and (4) this research takes brain–computer interface technology as the breakthrough point and introduces the physiological features of EEG to recognize automobile sound quality in an innovative way.

Conclusions

The objective of this research is to investigate the laws of brain cognition under the stimulation of diverse automobile sounds and to propose an effective method to identify diversified automobile sounds. The results show that the frequency band features reflect the laws of brain cognition well, and that automobile sound quality can be recognized effectively by constructing asymmetric EEG feature indices and using machine learning models. With the DASM_DE of the γ rhythm as the input, the recognition accuracy for automobile sounds reached up to 86.26% with SVM. The results also show that the Kalman smoothing and mRMR algorithms not only improve the recognition accuracy but also reduce the amount of model calculation. In summary, this study proposes a new method of automobile sound quality recognition from the field of brain–computer interface technology.
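As an illustration of the asymmetric feature index mentioned above, the following sketch computes a DASM_DE-style feature. It makes two assumptions: the differential entropy (DE) of a band-limited signal is evaluated under a Gaussian assumption as 0.5·ln(2πeσ²), and DASM is the difference in DE between symmetric left and right electrode pairs, as commonly defined in the EEG emotion recognition literature. The formulation actually used in this study is the one given in section Feature Extraction. The function names are hypothetical, and the input signals are assumed to be already band-pass filtered to the γ band.

```python
import math
from statistics import pvariance


def differential_entropy(samples):
    """DE of a band-limited signal under a Gaussian assumption (sketch)."""
    variance = pvariance(samples)
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def dasm_de(left_channels, right_channels):
    """Differential asymmetry: DE(left) - DE(right) for each electrode pair.

    left_channels[i] and right_channels[i] are assumed to be the
    gamma-band signals of the i-th symmetric electrode pair.
    """
    if len(left_channels) != len(right_channels):
        raise ValueError("left and right channel lists must be paired")
    return [
        differential_entropy(left) - differential_entropy(right)
        for left, right in zip(left_channels, right_channels)
    ]


# Toy example: the left signal has four times the variance of the right one,
# so the asymmetry equals 0.5 * ln(4) = ln(2).
left = [[0.0, 2.0, 0.0, 2.0]]
right = [[0.0, 1.0, 0.0, 1.0]]
features = dasm_de(left, right)
```

The resulting vector, with one value per electrode pair, is the kind of feature that would be passed to a classifier such as an SVM.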

Future studies will include further evaluation of the specific relationship between EEG signals and the inherent characteristics of automobile sounds, the proposal of indices that can quantify automobile sound quality, and the use of deep learning algorithms that automatically extract the potential features of EEG signals.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics Statement

The studies involving human participants were reviewed and approved by the Ethical Review Committee of Wuhan University of Technology. The participants provided their written informed consent to participate in the study.

Author Contributions

ZL and LX designed the data processing schema and wrote the manuscript. CL and TX designed the experiment and were involved in the data collection. LY made a great contribution to the content of the manuscript during the revision process. All authors contributed to the article and approved the submitted version.

Funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 52175111 and 61876137) and the Foshan Xianhu Laboratory of the Advanced Energy Science and Technology Guangdong Laboratory (No. XHD2020-003).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Ackermann, P., Kohlschein, C., Bitsch, J. A., Wehrle, K., and Jeschke, S. (2016). “EEG-based automatic emotion recognition: feature extraction, selection and classification methods,” in IEEE International Conference on E-health Networking (IEEE).

Ali, A., Luktuke, A. S., Ramachandran, E., and Karanth, N. V. (2013). Sound quality based benchmarking methodology for vehicle interior noise. SAE Tech. Pap. 12. doi: 10.4271/2013-01-2853

Ang, K. K., Chin, Z. Y., Wang, C., Guan, C., and Zhang, H. (2012). Filter bank common spatial pattern algorithm on BCI competition IV datasets 2a and 2b. Front. Neurosci. 6:39. doi: 10.3389/fnins.2012.00039

Bajaj, V., and Pachori, R. B. (2015). Detection of Human Emotions Using Features Based on the Multiwavelet Transform of EEG Signals. Springer International Publishing.

Bhatti, A. M., Majid, M., Anwar, S. M., and Khan, B. (2016). Human emotion recognition and analysis in response to audio music using brain signals. Comput. Human Behav. 65, 267–275. doi: 10.1016/j.chb.2016.08.029

Chen, X., Lin, J., Jin, H., Huang, Y., and Liu, Z. (2021). The psychoacoustics annoyance research based on EEG rhythms for passengers in high-speed railway. Appl. Acoustics 171:107575. doi: 10.1016/j.apacoust.2020.107575

Cheng, Y., Wu, X., Han, H., Xie, Y., Deng, Y., and Song, T. (2018). GNSS common-view time comparison algorithm based on Kalman filtering and Rauch-Tung-Striebel smoothing. J. Harbin Eng. Univ. 39, 120–125. doi: 10.11990/jheu.201702002

Di, G. Q., and Wu, S. X. (2015). Emotion recognition from sound stimuli based on back-propagation neural networks and electroencephalograms. J. Acoust. Soc. Am. 138:994. doi: 10.1121/1.4927693

Ellermeier, W., Mader, M., and Daniel, P. (2004). Scaling the unpleasantness of sounds according to the BTL model: ratio-scale representation and psychoacoustical analysis. Acta Acust. United Acust. 90, 101–107.

García-Martínez, B., Martínez-Rodrigo, A., Zangroniz Cantabrana, R., Pastor Garcia, J. M., and Alcaraz, R. (2016). Application of entropy-based metrics to identify emotional distress from electroencephalographic recordings. Entropy 18:221. doi: 10.3390/e18060221

Geethanjali, B., Adalarasu, K., Jagannath, M., and Seshadri, N. G. (2018). Music induced brain functional connectivity using EEG sensors: a study on Indian Music. IEEE Sens. J. 19:1499. doi: 10.1109/JSEN.2018.2873402

Genuit, K. (2004). The sound quality of vehicle interior noise: a challenge for the NVH engineers. Vehicle Noise Vibrat. 1, 158–168. doi: 10.1504/IJVNV.2004.004079

Guo, D., Shi, X. H., Hu, W. Q., Li, W. L., and Yi, P. (2017). Sound quality evaluation of driveline system of certain automobile based on semantic differential method. China J. Highway Transport 30, 307–314. doi: 10.19721/j.cnki.1001-7372.2017.06.014

Hadjidimitriou, S. K., and Hadjileontiadis, L. J. (2012). Toward an EEG-Based Recognition of Music Liking Using Time-Frequency Analysis. IEEE Trans. Biomed. Eng. 59, 3498–3510. doi: 10.1109/TBME.2012.2217495

Hernández, D. E., Trujillo, L., Z.-Flores, E., Villanueva, O. M., and Romo-Fewell, O. (2018). “Detecting epilepsy in EEG signals using time, frequency and time-frequency domain features,” in Computer Science and Engineering Theory and Applications (Springer), 167–182.

Jenke, R., Peer, A., and Buss, M. (2017). Feature extraction and selection for emotion recognition from EEG. IEEE Transac. Affect. Comput. 5, 327–339. doi: 10.1109/TAFFC.2014.2339834

Hjorth, B. (1970). EEG analysis based on time domain properties. Electroencephalogr. Clin. Neurophysiol. 29, 306–310. doi: 10.1016/0013-4694(70)90143-4

Kaboli, M., Walker, R., and Cheng, G. (2015). “In-hand object recognition via texture properties with robotic hands, artificial skin, and novel tactile descriptors,” in IEEE-RAS International Conference on Humanoid Robots (IEEE).

Kai, S., Fujinaga, I., and Mcadams, S. (2016). A comparison of approaches to timbre descriptors in music information retrieval and music psychology. J. New Music Res. 45, 1–15. doi: 10.1080/09298215.2015.1132737

Kalaganis, F., Adamos, D. A., and Laskaris, N. (2016). A Consumer BCI for Automated Music Evaluation Within A Popular on-Demand Music Streaming Service “Taking Listener's Brainwaves to Extremes”. Cham: Springer.

Knyazev, G. G. (2012). EEG delta oscillations as a correlate of basic homeostatic and motivational processes. Neurosci. Biobehav. Rev. 36, 677–695. doi: 10.1016/j.neubiorev.2011.10.002

Lee, S. M., and Lee, S. K. (2014). Objective evaluation of human perception of automotive sound based on physiological signal of human brain. Int. J. Automot. Technol. 15, 273–282. doi: 10.1007/s12239-014-0028-4

Lee, Y. J., Shin, T. J., and Lee, S. K. (2013). Sound quality analysis of a passenger car based on electroencephalography. J. Mech. Sci. Technol. 27, 319–325. doi: 10.1007/s12206-012-1248-z

Li, Z. G., Di, G. Q., and Jia, L. (2014). Relationship between electroencephalogram variation and subjective annoyance under noise exposure. Appl. Acoustics 75, 37–42. doi: 10.1016/j.apacoust.2013.06.011

Lin, Y. P., Wang, C. H., Jung, T. P., Wu, T. L., Jeng, S. K., Duann, J. R., et al. (2010). EEG-based emotion recognition in music listening. IEEE Transac. Biomed. Eng. 57, 1798–1806. doi: 10.1109/TBME.2010.2048568

Nakanishi, M., Mitsukura, Y., and Hara, A. (2011). EEG Analysis for Acoustic Quality Evaluation Using PCA and FDA. Roman. IEEE.

Nishimura, K., and Mitsukura, Y. (2013). “Sound quality indicating system using EEG and GMDH-type neural network,” in 2013 Asia-Pacific Signal And Information Processing Association Annual Summit And Conference (APSIPA).

Parizet, E. (2002). Paired comparison listening tests and circular error rates. Acta Acust. United Acust. 88, 594–598.

Peng, H., Long, F., and Ding, C. (2005). Feature Selection Based on Mutual Information: Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy. IEEE Computer Society.

Poikonen, H., Alluri, V., Brattico, E., Lartillot, O., Tervaniemi, M., and Huotilainen, M. (2016). Event-related brain responses while listening to entire pieces of music. Neuroscience 312, 58–73. doi: 10.1016/j.neuroscience.2015.10.061

Sammler, D., Grigutsch, M., Fritz, T., and Koelsch, S. (2010). Music and emotion: electrophysiological correlates of the processing of pleasant and unpleasant music. Psychophysiology 44, 293–304. doi: 10.1111/j.1469-8986.2007.00497.x

Saxe, R. (2006). Uniquely human social cognition. Curr. Opin. Neurobiol. 16, 235–239. doi: 10.1016/j.conb.2006.03.001

Shamay-Tsoory, S. G., and Aharon-Peretz, J. (2007). Dissociable prefrontal networks for cognitive and affective theory of mind: a lesion study. Neuropsychologia 45, 3054–3067. doi: 10.1016/j.neuropsychologia.2007.05.021

Sheykhivand, S., Mousavi, Z., Rezaii, T. Y., and Farzamnia, A. (2020). Recognizing emotions evoked by music using CNN-LSTM networks on EEG signals. IEEE Access 8, 139332–139345. doi: 10.1109/ACCESS.2020.3011882

Tan, Z. F., and Tan, G. P. (2012). Evaluation method of vehicle sound quality. Appl. Mech. Mater. 178–181, 2829–2833. doi: 10.4028/www.scientific.net/AMM.178-181.2829

Thammasan, N., Fukui, K. I., and Numao, M. (2016). “Application of deep belief networks in eeg-based dynamic music-emotion recognition,” in The International Joint Conference on Neural Networks (IJCNN 2016) (IEEE).

Tsang, C. D., Trainor, L. J., Santesso, D. L., Tasker, S. L., and Schmidt, L. A. (2010). Frontal EEG responses as a function of affective musical features. Ann. N. Y. Acad. Sci. 930, 439–442. doi: 10.1111/j.1749-6632.2001.tb05764.x

Vidaurre, C., Krämer, N., Blankertz, B., and Schlögl, A. (2009). Time domain parameters as a feature for EEG-based brain-computer interfaces. Neural Netw. 22, 1313–1319. doi: 10.1016/j.neunet.2009.07.020

Zheng, W. L., and Lu, B. L. (2015). “Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks,” in IEEE Transactions on Autonomous Mental Development.

Zheng, W. L., Zhu, J. Y., and Lu, B. L. (2017). “Identifying stable patterns over time for emotion recognition from EEG,” in IEEE Transactions on Affective Computing.

Keywords: automobile sound quality, EEG, brain cognition laws, Kalman smoothing, mRMR

Citation: Xie L, Lu C, Liu Z, Yan L and Xu T (2021) Study of Auditory Brain Cognition Laws-Based Recognition Method of Automobile Sound Quality. Front. Hum. Neurosci. 15:663049. doi: 10.3389/fnhum.2021.663049

Received: 03 February 2021; Accepted: 04 June 2021;
Published: 08 October 2021.

Edited by:

Jose Luis Contreras-Vidal, University of Houston, United States

Reviewed by:

Bradley Jay Edelman, Max Planck Institute of Neurobiology (MPIN), Germany
Seong-Eun Kim, Seoul National University of Science and Technology, South Korea

Copyright © 2021 Xie, Lu, Liu, Yan and Xu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Zhien Liu, lzen@whut.edu.cn
