Article

Improved Adversarial Transfer Network for Bearing Fault Diagnosis under Variable Working Conditions

1
Department of Mechanical Engineering, Xi’an Jiaotong University, Xi’an 710049, China
2
Department of Electronic and Electrical Engineering, Brunel University London, London UB8 3PH, UK
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(6), 2253; https://doi.org/10.3390/app14062253
Submission received: 25 January 2024 / Revised: 29 February 2024 / Accepted: 2 March 2024 / Published: 7 March 2024

Abstract

Bearings are critical components of rotating machinery, and their failure can have catastrophic consequences. Previous studies have therefore proposed a variety of intelligent diagnosis methods. Most existing bearing fault diagnosis methods implicitly assume that the training and test sets are drawn from the same distribution. In real scenarios, however, bearings operate for long periods in complex and changing environments, so the data collected in service and the data used for model training do not satisfy this assumption. This paper proposes an improved adversarial transfer network for fault diagnosis under variable working conditions. Specifically, this paper combines an adversarial transfer network with a short-time Fourier transform to obtain satisfactory results with a lighter network. Then, this paper employs a channel attention module to enhance feature fusion. Moreover, this paper designs a novel domain discrepancy hybrid metric loss to improve transfer learning performance. Finally, this paper verifies the method’s effectiveness on three datasets: a dual-rotor dataset, the Case Western Reserve University dataset and the Ottawa dataset. The proposed method achieves a higher average accuracy than the compared methods and shows better domain alignment capabilities.

1. Introduction

Bearings are critical rotating machinery components and one of these machines’ weakest links [1,2]. Their performance directly affects the system’s stable operation and production efficiency [3,4]. Taking an aerospace engine as an example, as shown in Figure 1, engines use rolling bearings as main bearings because rolling bearings are easy to start at low temperatures and have small friction losses, wide operating ranges and strong resistance to oil cutoffs [5,6]. The main bearings support the engine rotor and transfer rotor loads. They are divided into thrust bearings and traveling bearings. Thrust bearings carry axial loads and control the axial clearance between the rotor and stator. Traveling bearings carry radial loads and control the radial clearance. When any bearing fails, the aircraft may be forced to land; in the worst case, the aircraft may be destroyed and lives may be lost [7,8]. Therefore, developing efficient and accurate bearing fault diagnosis methods has become necessary to ensure equipment reliability and reduce maintenance costs.
Over the past decades, studies have proposed various methods for intelligent bearing fault diagnosis [9,10], which can be divided into two categories: model-based methods and data-driven methods [11]. Model-based methods extract features from raw data according to prior knowledge and construct mathematical models to evaluate the health state of bearings [12,13]. However, model-based methods rely heavily on prior knowledge, which limits their robustness. Data-driven methods establish mappings from fault space to feature space by using a large amount of training data, without requiring prior knowledge or experience [14]. With the recent development of sensing and intelligent computing technology, it has become easier to collect massive amounts of data, and data-driven intelligent diagnosis methods have attracted increased attention [15].
Data-driven methods have been widely adopted in intelligent bearing fault diagnosis due to their end-to-end diagnosis mode [16]. For instance, Ahmed et al. [17] used a sparse autoencoder to learn over-complete sparse representations of datasets compressed by compressed sampling. Linshan et al. [18] proposed a novel convolutional neural network (CNN) model named the Gramian Time-Frequency Enhancement Network (GTFE-Net) for bearing fault diagnosis. Kaicheng et al. [19] combined Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) and a CNN to enhance recognition of the state of rolling bearings. Diwang et al. [20] improved the performance of a CNN by guiding its design with the physical characteristics of bearing acceleration signals. Wang et al. [21] presented an improved spiking neural network for inter-shaft bearing fault diagnosis. Sinitsin et al. [22] designed a bearing fault diagnosis method based on a novel hybrid CNN-MLP model. He et al. [23] proposed a new framework based on small labeled infrared thermal images and an enhanced convolutional neural network transferred from a convolutional autoencoder.
Although data-driven methods have been applied to bearing fault diagnosis, most assume that the source dataset, which is used to train the model, and the target dataset, which is used to test the model, are from the same distribution [24,25]. However, bearings usually work under complex and changing conditions in actual scenarios, so this assumption does not hold [26,27]. Transfer learning (TL) emerged to address this problem [28,29]. Transfer learning improves performance in the target domain by transferring knowledge from different but related source domains [30,31]. In this way, the reliance on copious amounts of target domain data for building learners can be reduced [32,33]. A feasible solution is to find common latent features through feature transformation and use them as a bridge for knowledge transfer, converting each original feature into a new feature representation [34,35]. Wen et al. [36] reduced the distribution discrepancy across the two domains by minimizing the maximum mean discrepancy (MMD). Sinno et al. [37] designed a novel metric to measure the discrepancy between source and target datasets. Chunran et al. [38] proposed a class-level matching TL network to match source and target domain data. Wang et al. [39] used domain-adversarial training based on the Wasserstein distance to learn domain-invariant features from the raw signal. Cui et al. [40] combined distance-based domain adaptation and adversarial-based domain adaptation to propose a multi-adversarial joint distribution adaptation network.
This paper proposes an improved adversarial transfer network (IATN) for bearing fault diagnosis under variable working conditions, inspired by the effectiveness of distance-based and adversarial-based domain adaptation. Specifically, this paper combines an adversarial transfer network with a short-time Fourier transform (STFT) [41] to obtain satisfactory results with lighter networks. Then, this paper employs an attention module to enhance feature fusion. Moreover, this paper designs a novel domain discrepancy hybrid metric loss to improve model transfer learning performance. Finally, this paper verifies the method’s effectiveness on a private dual-rotor dataset, a Case Western Reserve University (CWRU) dataset [42] and the Ottawa dataset [43]. The main contributions are as follows:
(1)
This paper combines adversarial TL with the traditional signal processing method, STFT, which allows us to obtain satisfactory results with a much lighter network.
(2)
This paper employs a channel attention module to enhance feature fusion.
(3)
This paper designs a novel domain discrepancy hybrid metric to measure the discrepancy between the features of the source dataset and those of the target dataset.
(4)
This paper evaluates the proposed approach on three test benches’ single-source tasks and multi-source tasks. The results demonstrate that the proposed method performs better than other methods.
The structure of this paper is as follows: In Section 2, related knowledge is introduced. Section 3 presents the proposed method in detail. Section 4 provides information about the experiment and analysis of the results. Section 5 provides further discussion, and Section 6 provides conclusions.

2. Preliminary Information

2.1. Related Works

In industrial applications, bearings are key rotating machinery components, and monitoring and fault diagnosis of their condition is crucial to ensuring the stable operation of equipment. Especially under changing working conditions, traditional fault diagnosis methods are often difficult to adapt to different operating environments, resulting in a decline in diagnostic performance. In recent years, deep learning, as a powerful end-to-end strategy, has been widely used to diagnose bearing faults intelligently. This section reviews significant recent research progress in the intelligent diagnosis of bearing faults under variable operating conditions.
In recent years, multi-scale feature learning has shown strong potential in bearing fault diagnosis. For example, Chen et al. [44] proposed a method based on multi-scale convolutional neural networks to effectively extract bearing fault features for fault diagnosis. Hu et al. [45] proposed a novel fault diagnosis method based on enhanced multi-scale sample entropies and balanced adaptation regularization for fault diagnosis needs under different working conditions. To solve the problem of inconsistent training and test data distributions, Guo et al. [46] explored the application of transfer learning technology in bearing fault diagnosis and demonstrated its effectiveness in cross-domain fault diagnosis. Saha et al. [47] employed transfer learning and random forest classification to exemplify the potential of combining deep learning techniques with traditional algorithms in bearing fault diagnosis. Moreover, researchers have also widely discussed data enhancement methods based on generative adversarial networks (GANs). Wang et al. [48] developed a data augmentation strategy based on GANs to improve the model’s generalization ability under varying working conditions. Shi et al. [49] encapsulated a shared feature extractor, a label predictor and a series of domain discriminators to propose an adversarial multi-source data subdomain adaptation (AMDSA) model. Zhu et al. [50] presented a deep subdomain adaptation network (DSAN) that aligns the relevant subdomain distributions of domain-specific layer activations across domains based on a local maximum mean discrepancy. Meng et al. [51] explored the application of graph neural networks in bearing fault diagnosis. In addition, ensemble learning methods, which improve diagnostic accuracy by combining the prediction results of multiple learners, have also received increasing attention. Tong et al. [52] proposed a multi-sensor information fusion method based on ensemble learning to automatically learn fault-related information from multi-sensor signals and provide accurate diagnosis results. Chen et al. [53] proposed an explainable learning framework to further improve the accuracy and robustness of bearing fault diagnosis under variable working conditions.
Through these studies, one can see the rapid development of intelligent diagnostic technology in bearing fault detection. These techniques improve the accuracy of fault diagnosis and enhance the model’s adaptability in the face of changing operating conditions.

2.2. Domain-Adversarial Adaptation Network

Domain-adversarial training (DAT) maps two different domains into a common subspace to eliminate the differences between the domains [54]. As Figure 2 shows, a domain-adversarial adaptation network consists of three parts: a feature extractor $\theta_f$, a classifier $\theta_c$ and a domain discriminator $\theta_D$ [55].
Given a labeled source dataset $D_S$ and an unlabeled target dataset $D_T$, both are input into $\theta_f$. The features of $D_S$ are then input into $\theta_c$ to obtain the predicted class, and the class loss $L_C$ is calculated from the predicted label and the true label. Moreover, the features of $D_S$ and $D_T$ are input into $\theta_D$ to obtain the predicted domain, from which the domain loss $L_D$ is calculated.
In the training process of domain-adversarial adaptation networks, all the parameters are updated by Equation (1):
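$$
\begin{aligned}
\theta_f &\leftarrow \theta_f - \alpha\left(\frac{\partial L_C}{\partial \theta_f} - \beta\frac{\partial L_D}{\partial \theta_f}\right) \\
\theta_c &\leftarrow \theta_c - \alpha\frac{\partial L_C}{\partial \theta_c} \\
\theta_D &\leftarrow \theta_D - \alpha\frac{\partial L_D}{\partial \theta_D}
\end{aligned}
\tag{1}
$$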
where $\alpha$ and $\beta$ are positive constants. During the update, the gradient of $L_D$ with respect to $\theta_f$ is multiplied by the negative constant $-\beta$, which prevents $\theta_f$ from extracting features that are useful for domain discrimination.
In general, a domain-adversarial adaptation network can be thought of as a two-player game in which the players are the domain discriminator $\theta_D$ and the feature extractor $\theta_f$ [56]. The goal of $\theta_D$ is to identify the domain of each sample, whereas $\theta_f$ strives to deceive $\theta_D$. After sufficient adversarial training, $\theta_f$ extracts domain-invariant features.
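To make the update rule in Equation (1) concrete, the following minimal sketch applies one step to scalar parameters. It assumes the reversed-gradient form in which the feature extractor descends on $L_C$ and ascends on $L_D$; the function name `dann_step` and the gradient dictionary keys are hypothetical and are not taken from the paper's implementation.

```python
def dann_step(theta_f, theta_c, theta_d, grads, alpha=0.01, beta=0.1):
    """Apply one update of Equation (1) to scalar parameters.

    grads is a dict of partial derivatives with hypothetical keys:
    dLc_df, dLd_df (w.r.t. the feature extractor),
    dLc_dc (w.r.t. the classifier), dLd_dd (w.r.t. the discriminator).
    """
    # Feature extractor: descend on L_C, ascend on L_D (gradient scaled by -beta).
    new_f = theta_f - alpha * (grads["dLc_df"] - beta * grads["dLd_df"])
    # Classifier: ordinary gradient descent on the class loss.
    new_c = theta_c - alpha * grads["dLc_dc"]
    # Domain discriminator: ordinary gradient descent on the domain loss.
    new_d = theta_d - alpha * grads["dLd_dd"]
    return new_f, new_c, new_d
```

In a deep learning framework, the same effect is usually obtained with a gradient reversal layer placed between the feature extractor and the discriminator, so that a single backward pass produces all three updates.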

2.3. Maximum Mean Discrepancy (MMD)

The purpose of TL is to apply the knowledge learned in the source domain to a different but related target domain. Essentially, the task is to find a loss function that minimizes the distance between source domain features and target domain features. MMD is therefore used to measure the difference in data distribution between the two domains [57]. Given two distributions, $X$ and $Y$, MMD is defined as follows:
$$
\mathrm{MMD}(X, Y) = \left\| \frac{1}{n}\sum_{i=1}^{n} \phi(x_i) - \frac{1}{m}\sum_{j=1}^{m} \phi(y_j) \right\|_{\mathcal{H}}^{2} \tag{2}
$$
where $\mathcal{H}$ indicates that the distance is measured after mapping the data into a reproducing kernel Hilbert space using the kernel mapping $\phi(x)$, and $n$ and $m$ are the numbers of samples drawn from $X$ and $Y$, respectively.
In short, MMD maps the variables into a high-dimensional space through a kernel function and then measures the difference between the expectations of the two mapped random variables.
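As an illustration of Equation (2), the squared norm can be expanded with the kernel trick into three averages of kernel values, so $\phi$ never has to be computed explicitly. The sketch below is a biased empirical estimate in plain Python; the Gaussian kernel, the bandwidth `sigma` and the function names are assumptions made for the example.

```python
import math

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel between two feature vectors given as lists of floats."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mmd2(xs, ys, sigma=1.0):
    """Biased estimate of the squared MMD between sample sets xs and ys."""
    n, m = len(xs), len(ys)
    # ||mean phi(x) - mean phi(y)||^2 = E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]
    k_xx = sum(gaussian_kernel(a, b, sigma) for a in xs for b in xs) / (n * n)
    k_yy = sum(gaussian_kernel(a, b, sigma) for a in ys for b in ys) / (m * m)
    k_xy = sum(gaussian_kernel(a, b, sigma) for a in xs for b in ys) / (n * m)
    return k_xx + k_yy - 2.0 * k_xy
```

Identical sample sets give a value of zero, and the value grows as the two sets move apart, which is the behavior a transfer loss needs.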

3. IATN for Bearing Fault Diagnosis under Variable Working Conditions

Like domain-adversarial adaptation networks, the proposed IATN consists of three parts: a feature extractor $\theta_f$, a classifier $\theta_c$ and a domain discriminator $\theta_D$. The proposed approach also introduces several improvements, described below.

3.1. Preprocess

Compared with the raw time-domain signal, the time-frequency image has both time and frequency domain information and contains more fault information. In addition, the features of time-frequency images are more prominent and require lower feature extraction capabilities of the network. In fault diagnosis research, relevant studies have proved that time-frequency domain input is better than time-domain input [58]. By converting the raw 1D time-domain signal into a 2D time-frequency diagram, the features of the signal are enhanced, allowing us to use lighter downstream networks for classification.
Therefore, this paper performs STFT on the collected raw signals. STFT selects a time-frequency localized window function, assumes that the signal is stationary (pseudo-stationary) within a brief time interval and moves the window function to calculate the frequency spectrum at different moments [59]. STFT is defined as follows:
$$
\mathrm{STFT}_x(t, f) = \int_{-\infty}^{+\infty} x(\tau)\, g^{*}(\tau - t)\, e^{-j 2\pi f \tau}\, d\tau \tag{3}
$$
where $x(t)$ is the raw signal, and $g^{*}(\tau - t)$ is the window function.
Ideally, STFT can provide time and frequency information at the same time so that the frequency distribution of the signal over time can be observed, which is crucial for fault diagnosis. By converting a signal into its time-frequency representation, STFT can reveal structural features that may not be apparent in the original time-domain signal. This conversion can improve the separability between different categories, such as inner rings, outer rings, rolling elements and different degrees, such as standard, minor and major faults, so shallow neural networks can achieve high-accuracy classification.
However, in actual scenarios, various factors often interfere with bearing signals, including background noise, structural resonance and amplitude modulation. In such cases, the classification of time-frequency diagrams may involve highly nonlinear and complex data structures, especially when there are multiple fault types and changing working conditions. Shallow neural networks cannot model such complexity and nonlinearity. In contrast, deep neural networks are better at extracting high-level features from complex data. Through multi-layer nonlinear transformations, they extract abstract features that are difficult to observe directly and thereby achieve high-precision classification of time-frequency images.
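A discrete counterpart of the STFT in (3) can be sketched in a few lines: slide a window along the signal and take the discrete Fourier transform of each windowed segment. The example below uses only the Python standard library so that every step is visible. The Hann window, the window length and the hop size are assumptions, since the paper does not state them, and a real pipeline would normally call an optimized FFT routine instead of this direct sum.

```python
import cmath
import math

def stft(signal, win_len=8, hop=4):
    """Return a list of frames; each frame is a list of complex DFT bins."""
    # Periodic Hann window (an assumption; the window type is not given in the paper).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / win_len) for n in range(win_len)]
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(win_len)]
        # Keep non-negative frequencies only; a real input has a conjugate-symmetric spectrum.
        bins = [
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win_len) for n in range(win_len))
            for k in range(win_len // 2 + 1)
        ]
        frames.append(bins)
    return frames

def spectrogram(signal, win_len=8, hop=4):
    """Magnitude of the STFT: the 2D time-frequency image fed to the network."""
    return [[abs(b) for b in frame] for frame in stft(signal, win_len, hop)]
```

Each row of the returned spectrogram is one time step and each column one frequency bin, which is the 2D layout a convolutional feature extractor expects.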

3.2. Feature Extractor

As shown in Figure 3, unlike previous studies, the input of the IATN is a 2D time-frequency diagram rather than a 1D vibration signal. Therefore, the feature extractor consists of a 2D CNN. Since this paper performs STFT on the raw signal, the feature extractor can use a lighter network. The structure of the feature extractor of the model is shown in Table 1.
Moreover, the hyperparameters of IATN for all tasks are shown in Table 2.
In addition, since convolution only operates in a local space, it is difficult for it to gather enough information to capture the relationship between channels. Therefore, this paper introduces the channel attention mechanism to determine the importance of each feature channel, thereby enhancing the weight of essential features and reducing the weight of irrelevant information [60].
The channel attention mechanism includes three steps: squeezing, excitation and scaling. As shown in Figure 3, the feature $X_2$ that is input to the channel attention layer has $C_2$ channels, each of size $H_2 \times W_2$. The squeeze operation applies global average pooling to convert the data into $Z$, with a shape of $1 \times 1 \times C_2$. This process expands the receptive field and encodes the entire spatial feature on a channel into a global feature, as shown in (4):
$$
z_c = F_{sq}(u_c) = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} u_c(i, j) \tag{4}
$$
where $z_c$ denotes the $c$-th element of $Z$, $u_c$ denotes the $c$-th channel of $X_2$, and $H$ and $W$ are the height and width of that channel.
Subsequently, excitation obtains the feature weight of each channel through two fully connected layers, as shown in (5):
$$
S = F_{ex}(Z, W) = \sigma(g(Z, W)) = \sigma\left(W_2\, \delta\left(W_1 Z\right)\right) \tag{5}
$$
where $\sigma(x)$ denotes the sigmoid function, $\delta(x)$ denotes the ReLU function, and $W_1$ and $W_2$ are the weights of the two fully connected layers.
Finally, scaling multiplies the obtained feature weights $S$ with the input $X_2$ channel by channel, as shown in (6):
$$
X_3^{c} = F_{scale}(u_c, s_c) = s_c\, u_c \tag{6}
$$
where $X_3^{c}$ denotes the $c$-th channel of $X_3$.
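The squeeze, excitation and scaling steps in (4) to (6) can be followed on a small example. The sketch below represents a feature map as a nested list of channels and omits bias terms, as the equations do; the function name and the weight shapes (`w1` reduces the channel dimension, `w2` restores it) are assumptions for illustration rather than the layer sizes used in the paper.

```python
import math

def channel_attention(x, w1, w2):
    """Squeeze-excitation-scale over a feature map.

    x:  list of C channels, each an H x W nested list of floats.
    w1: weight matrix of shape (C_r x C) for the first fully connected layer.
    w2: weight matrix of shape (C x C_r) for the second fully connected layer.
    Returns the reweighted feature map and the per-channel weights.
    """
    # Squeeze, Equation (4): global average pooling of each channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, Equation (5): FC -> ReLU -> FC -> sigmoid.
    hidden = [max(0.0, sum(w * zc for w, zc in zip(row, z))) for row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2]
    # Scaling, Equation (6): multiply every channel by its scalar weight.
    out = [[[s_c * v for v in row] for row in ch] for s_c, ch in zip(s, x)]
    return out, s
```

Because the sigmoid keeps every weight between 0 and 1, the layer can only attenuate channels relative to one another; which channels are kept is learned through `w1` and `w2`.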
During training, samples from both the source datasets and the target dataset are input into the feature extractor, which outputs the source features $F_S$ and the target features $F_T$.

3.3. Transfer Loss

As mentioned in Section 2, MMD is currently the most widely used transfer loss function. One of the most important concepts of MMD is the kernel function $\phi(x)$, which is fixed in MMD. The Gaussian kernel function is usually chosen because it can map data onto an infinite-dimensional space:
$$
k(x, x') = \exp\left(-\frac{\left\| x - x' \right\|^{2}}{2\sigma^{2}}\right) \tag{7}
$$
Theoretically, any choice of kernel function determines an MMD between two different distributions. In some extreme cases, however, a single fixed kernel can make the MMD between two different distributions exceedingly small. To avoid this situation, multi-kernel MMD (MK-MMD) was proposed, in which the kernel is a convex combination of $m$ base kernels:
$$
\mathcal{K} \triangleq \left\{ k = \sum_{u=1}^{m} \beta_u k_u : \sum_{u=1}^{m} \beta_u = 1,\ \beta_u \geq 0,\ \forall u \right\} \tag{8}
$$
By using multi-kernel functions, this paper can better represent the differences in data distribution in high-dimensional space and improve the representation ability of model features.
$$
\mathrm{MK\text{-}MMD}(X, Y) = \left\| \frac{1}{n}\sum_{i=1}^{n} K(x_i) - \frac{1}{m}\sum_{j=1}^{m} K(y_j) \right\|_{\mathcal{H}}^{2} \tag{9}
$$
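The multi-kernel construction can be sketched by replacing the single Gaussian kernel with a convex combination of Gaussian kernels of different bandwidths. The bandwidths and the uniform weights used below are assumptions chosen for the example; the paper does not list the values it uses.

```python
import math

def multi_kernel(a, b, sigmas, betas):
    """Convex combination of Gaussian kernels evaluated on vectors a and b."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return sum(w * math.exp(-sq_dist / (2.0 * s ** 2)) for w, s in zip(betas, sigmas))

def mk_mmd2(xs, ys, sigmas=(0.5, 1.0, 2.0), betas=None):
    """Biased estimate of the squared MK-MMD between sample sets xs and ys."""
    if betas is None:
        betas = [1.0 / len(sigmas)] * len(sigmas)  # uniform weights (assumption)
    # The weights must be non-negative and sum to one.
    if any(w < 0 for w in betas) or abs(sum(betas) - 1.0) > 1e-9:
        raise ValueError("betas must be non-negative and sum to 1")

    def mean_k(us, vs):
        total = sum(multi_kernel(u, v, sigmas, betas) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)
```

Mixing several bandwidths means that at least one kernel is likely to be sensitive to the scale at which the two distributions differ, which is the failure mode of a single fixed kernel described above.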
Moreover, to further enhance the domain adaptation performance of the transfer loss, this paper designs a novel domain difference measure combining MK-MMD and feature center distance (FCD). FCD is the distance between the target domain feature clustering center and the source domain feature clustering center. Assuming that there are $m$ samples $X_k = [X_1, X_2, \ldots, X_m]$ of the $k$-th category in the source domain, and $n$ samples $Y_k = [Y_1, Y_2, \ldots, Y_n]$ in the target domain that the model judges to be of the $k$-th category, the FCD loss, denoted $L_{FCM}$, is defined in (10):
$$
L_{FCM} = \frac{1}{K}\sum_{k=1}^{K} \dot{s}_k = \frac{1}{K}\sum_{k=1}^{K} \left\| \bar{X}_k - \bar{Y}_k \right\| \tag{10}
$$
where K is the number of categories. By combining (9) and (10), the novel transfer loss is defined as (11):
$$
L_T = L_{MK\text{-}MMD} + L_{FCM} \tag{11}
$$
The combination of MMD and FCD enables the model to use different feature alignment strategies for feature transfer. FCD promotes local feature alignment, and MMD is used for global feature alignment. The integration of these two loss functions combines local and global feature alignment, thereby enhancing domain adaptation performance.
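The feature center distance can be sketched as follows: group the source features by their true labels and the target features by the labels the model predicts for them, average each group, and measure the distance between matching centers. The Euclidean distance and the handling of classes that are absent from a batch are assumptions, as are all function names.

```python
import math

def center(vectors):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def fcd_loss(src_feats, src_labels, tgt_feats, tgt_pseudo_labels, num_classes):
    """Average distance between per-class source and target feature centers."""
    total = 0.0
    for k in range(num_classes):
        src_k = [f for f, y in zip(src_feats, src_labels) if y == k]
        tgt_k = [f for f, y in zip(tgt_feats, tgt_pseudo_labels) if y == k]
        if not src_k or not tgt_k:
            continue  # assumption: a class missing from either domain adds nothing
        total += math.dist(center(src_k), center(tgt_k))
    return total / num_classes

def transfer_loss(mk_mmd_value, fcd_value):
    """Hybrid transfer loss of Equation (11): global term plus class-wise term."""
    return mk_mmd_value + fcd_value
```

Since the target labels are predictions rather than ground truth, the class-wise term is only as reliable as the current classifier, which is one reason to pair it with the label-free MK-MMD term.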

3.4. Training Process

As Figure 3 shows, the total loss function consists of three parts: the transfer loss $L_T$, the class loss $L_C$ and the domain loss $L_D$. They are defined as follows:
$$
\begin{aligned}
L_C &= \mathrm{MSE}\left(Y_C, Y_C^{label}\right) \\
L_D &= \mathrm{MSE}\left(Y_D, Y_D^{label}\right) \\
L_{total} &= \gamma L_T + L_C + L_D
\end{aligned}
\tag{12}
$$
where $\mathrm{MSE}$ is the mean squared error, $Y_C$ and $Y_D$ are the outputs of the classifier and the domain discriminator, and $Y_C^{label}$ and $Y_D^{label}$ are the class label and the domain label.
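Because the gradient of the domain loss $L_D$ is reversed during backpropagation, the model parameters are updated as shown in (13):
$$
\begin{aligned}
\theta_f &\leftarrow \theta_f - \alpha\left(\frac{\partial L_C}{\partial \theta_f} - \beta\frac{\partial L_D}{\partial \theta_f} + \gamma\frac{\partial L_T}{\partial \theta_f}\right) \\
\theta_c &\leftarrow \theta_c - \alpha\frac{\partial L_C}{\partial \theta_c} \\
\theta_D &\leftarrow \theta_D - \alpha\frac{\partial L_D}{\partial \theta_D}
\end{aligned}
\tag{13}
$$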
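As a small worked example of how the three terms combine, the sketch below computes the class and domain losses as mean squared errors and adds the weighted transfer loss. The batch layout (one row per sample) and the function names are assumptions; the default value of `gamma` is the one selected in Section 5.4.

```python
def mse(pred, target):
    """Mean squared error over all entries of two equally shaped nested lists."""
    flat_p = [v for row in pred for v in row]
    flat_t = [v for row in target for v in row]
    return sum((p - t) ** 2 for p, t in zip(flat_p, flat_t)) / len(flat_p)

def total_loss(y_c, y_c_label, y_d, y_d_label, l_t, gamma=0.1):
    """Return (L_total, L_C, L_D) with L_total = gamma * L_T + L_C + L_D."""
    l_c = mse(y_c, y_c_label)  # class loss on labeled source samples
    l_d = mse(y_d, y_d_label)  # domain loss on source and target samples
    return gamma * l_t + l_c + l_d, l_c, l_d
```

For instance, with a class loss of 0.5, a domain loss of 0.25 and a transfer loss of 2.0, the total is 0.1 × 2.0 + 0.5 + 0.25 = 0.95.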

3.5. Dataset

To evaluate the performance of the proposed method more comprehensively, this paper verifies its effectiveness on three datasets.
The first dataset is private. As Figure 4 shows, it was collected on a dual-rotor test bench that simulates dual-rotor engine operation. The low-pressure rotor (LR) and the high-pressure rotor (HR) are driven by different motors and rotate in the same direction at different speeds. The LR is supported by the No. 1 and No. 4 bearings, and the HR is supported by the No. 2 and No. 3 bearings. The No. 3 bearing is the inter-shaft bearing; it is the experimental bearing and sits between the two rotors. The outer ring of the inter-shaft bearing rotates with the HR at a higher speed, and the inner ring rotates with the LR at a lower speed.
This paper collects vibration data under four working conditions (WDs), as shown in Table 3. Each working condition includes bearing data in nine different health states: the normal state, three degrees of inner-ring failure, three degrees of rolling-element failure and two degrees of outer-ring failure. The sampling frequency is 20.48 kHz. In addition, this paper uses a sliding window with a length of 1024 to sample the raw signal without overlap and collects 100 samples for each health state under each operating condition.
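The segmentation described above amounts to cutting each recording into consecutive, non-overlapping windows. A minimal sketch, with a hypothetical function name and the window length and sample count taken from the text:

```python
def segment(signal, win_len=1024, num_samples=100):
    """Split a 1D signal into non-overlapping windows of win_len points."""
    needed = win_len * num_samples
    if len(signal) < needed:
        raise ValueError(f"signal has {len(signal)} points, {needed} are required")
    # Window i covers points [i * win_len, (i + 1) * win_len); trailing points are dropped.
    return [signal[i * win_len:(i + 1) * win_len] for i in range(num_samples)]
```

At a sampling frequency of 20.48 kHz, each 1024-point window spans 50 ms, so 100 windows require 5 s of signal per health state and working condition.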
The second dataset is the public Case Western Reserve University dataset (CWRU). The CWRU dataset has four working conditions: 1797 rpm (R0), 1772 rpm (R1), 1750 rpm (R2) and 1730 rpm (R3). Under all working conditions, this paper selects normal data and three different degrees of fault data of the inner ring, outer ring and rolling elements collected at the driving end as experimental data. The sampling frequency is 12,000 Hz.
The third dataset is the public Ottawa dataset. The Ottawa dataset has four working conditions: increasing speed (O0), decreasing speed (O1), increasing then decreasing speed (O2) and decreasing then increasing speed (O3). Under all working conditions, this paper selects normal data and fault data of the inner ring and outer ring. The sampling frequency is 200,000 Hz.
All models are trained for 300 epochs using the Adam optimizer with a learning rate of 0.001. The code is written with the PyTorch 2.2 framework and runs on a GTX 1650 GPU.

4. Results

This paper compares the performance of the IATN with the CNN [61], DANN [62], AMDSA, DSAN, ISAE-CSDF [63] and CDGATLN [64] models on the three datasets, and the results are shown in Table 4, Table 5 and Table 6. Each experiment is repeated 20 times to reduce the influence of randomness.
Three general observations can be made: (1) the IATN has the highest accuracy on most tasks; (2) the IATN has the smallest variance; (3) the more similar the working conditions of the source and target domain data are, the better the transfer learning effect is.
In Table 4, the diagnosis accuracy of the IATN is significantly higher than that of the other six methods on most tasks. All the methods have their lowest accuracy on the D0 → D3 task: the accuracies of the CNN, DANN, AMDSA, DSAN, ISAE-CSDF and SA-SN-DCGAN are 58.75%, 61.82%, 65.28%, 60.32%, 65.64% and 66.82%, respectively, whereas the accuracy of the IATN is 70.62%. On the D0, D1, D3 → D2 task, the accuracies of the same six methods are 95.12%, 96.16%, 95.44%, 93.66%, 96.08% and 96.06%, respectively, whereas the accuracy of the IATN is 98.44%. The last row gives the average accuracy of each method over all tasks; the average accuracy of the IATN is 91.89%, which is 6.82%, 4.77%, 1.06%, 2.52%, 2.58% and 1.23% higher than those of the six compared methods, in the same order. As for variance, the average variance of the IATN is 1.10%, which is lower than the averages of the other six methods, which are 2.29%, 1.72%, 1.52%, 1.91%, 1.35% and 1.49%, respectively.
Similar observations hold in Table 5 and Table 6. The average accuracy of the IATN on all CWRU tasks is 97.74%, which is 4.38%, 3.22%, 1.48%, 1.95%, 2.19% and 1.93% higher than the CNN, DANN, AMDSA, DSAN, ISAE-CSDF and SA-SN-DCGAN, respectively. The average variance of the IATN on all CWRU tasks is 0.47%, which is lower than those of the other six methods, which are 0.52%, 0.75%, 0.96%, 1.07%, 0.64% and 0.72%, respectively. The average accuracy of the IATN on all Ottawa dataset tasks is 81.11%, which is 6.77%, 4.23%, 1.42%, 3.78%, 3.28% and 0.84% higher than the CNN, DANN, AMDSA, DSAN, ISAE-CSDF and SA-SN-DCGAN, respectively. The average variance of the IATN on all Ottawa dataset tasks is 1.24%, which is lower than those of the other six methods, which are 1.56%, 1.97%, 2.10%, 2.38%, 1.28% and 1.34%, respectively.
The comparison results demonstrate that the IATN improves bearing fault diagnosis under variable WDs. Moreover, when the WDs of the source and target domains are similar, the test accuracy obtained is also higher. For example, the accuracy of all methods on the D2 → D3 task is higher than that on the D0 → D3 task. This suggests that data similarity may be higher when WDs are similar, making it easier to transfer fault diagnosis knowledge.

5. Discussion

5.1. Running Process Comparison

To evaluate the performance of the model more comprehensively, this paper compares the running process of different methods over 300 epochs on the D0 → D1 task, the D0, D1 → D2 task and the D0, D1, D2 → D3 task. The results are shown in Table 7 and Figure 5.
According to Table 7, the running time of the IATN is longer than those of the CNN and DANN but shorter than those of the ISAE-CSDF and CDGATLN. Comparing the running time of the IATN with those of the CNN and DANN shows that the transfer process is much more time-consuming than the adversarial process.
Moreover, the training curves of the IATN, CNN and DANN are presented in the three subfigures of Figure 5. Figure 5a presents the training curves of the D0 → D1 task; Figure 5b presents the training curves of the D0, D1 → D2 task; and Figure 5c presents the training curves of the D0, D1, D2 → D3 task.
In Figure 5, the blue lines represent the accuracy of the CNN, the green lines represent the accuracy of the DANN, and the red lines represent the accuracy of the IATN. Figure 5 shows that although the calculation process of the IATN is more complex, this does not affect its convergence. On the three tasks with different numbers of source domains, the convergence speed of the IATN is similar to those of the CNN and DANN.

5.2. Feature Visualization

According to Table 4, Table 5 and Table 6, the IATN performs better than the other methods in extracting domain-invariant features. To demonstrate this more intuitively, this paper performs feature visualization on the D2 → D3 task. Specifically, the source domain features and target domain features extracted by each model are reduced to two dimensions for visualization with t-distributed stochastic neighbor embedding (t-SNE) [65]. The results are shown in Figure 6.
Figure 6 contains five subfigures. Figure 6a is the feature visualization of the CNN, Figure 6b is the feature visualization of the DANN, Figure 6c is the feature visualization of the ISAE-CSDF, Figure 6d is the feature visualization of the CDGATLN and Figure 6e is the feature visualization of the IATN. Each subfigure contains 18 clusters: nine belong to D2 and nine belong to D3. There are two criteria for judging model performance. One is the degree of separation of the nine fault clusters in each WD, and the other is the degree of overlap of the same fault clusters in the two working conditions.
According to Figure 6, there is significant overlap between different fault clusters of the source domain features and target domain features extracted by the CNN, DANN, ISAE-CSDF and CDGATLN; even where their clusters are largely separate, aliasing remains at the boundaries of some categories. In Figure 6e, all the clusters of each WD are entirely separated, and the same fault clusters of D2 and D3 largely overlap, which indicates that the IATN has strong domain-invariant feature extraction capabilities.

5.3. Data Analysis

Based on the observations in Table 4 and Table 5, the closer the working conditions are, the more similar the distribution of the collected data will be. This paper calculates the MK-MMD and MMD between the data collected under different working conditions to verify the hypothesis. The results are shown in Table 8.
According to Table 8, it is evident that the more similar the working conditions of the source domain data and the target domain data are, the lower the MK-MMD and MMD between them are. To express the connection between the distribution difference between source and target domain data and the model performance more intuitively, we perform correlation analysis on them according to (14):
ρ x , y = Cov ( X , Y ) D ( X ) D ( Y ) = E ( X Y ) E ( X ) E ( Y ) E X 2 E 2 ( X ) E Y 2 E 2 ( Y )
Let ρ₁ and ρ₂ denote the correlation coefficients between the transfer learning accuracy of the dual-rotor tasks and their MK-MMD and MMD, respectively; ρ₃ and ρ₄ the corresponding coefficients for the CWRU tasks; and ρ₅ and ρ₆ the corresponding coefficients for the Ottawa dataset tasks. Applying the correlation formula above to the values in Table 8 gives ρ₁ = −0.9523, ρ₂ = −0.9044, ρ₃ = −0.8456, ρ₄ = −0.5045, ρ₅ = −0.9574 and ρ₆ = −0.9549.
All six coefficients are negative, and all except ρ₄ exceed 0.84 in magnitude, which indicates a strong negative correlation. Therefore, it can be summarized that the more similar the working conditions of the source domain data and the target domain data are, the lower the MK-MMD and MMD between them are, and the better the transfer learning results will be.
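The correlation analysis can be reproduced from Table 8 with a few lines of code. The sketch below implements the correlation coefficient in its expectation form, E(XY) − E(X)E(Y) divided by the product of the two standard deviations, and applies it to the Table 8 columns. The function name and the dictionary layout are illustrative choices, not part of the paper.

```python
import math


def pearson(xs, ys):
    """Correlation coefficient in the expectation form used in the text."""
    n = len(xs)
    e_x = sum(xs) / n
    e_y = sum(ys) / n
    e_xy = sum(x * y for x, y in zip(xs, ys)) / n
    e_x2 = sum(x * x for x in xs) / n
    e_y2 = sum(y * y for y in ys) / n
    cov = e_xy - e_x * e_y
    return cov / (math.sqrt(e_x2 - e_x ** 2) * math.sqrt(e_y2 - e_y ** 2))


# Values copied from Table 8 (MK-MMD, MMD and mean accuracy per task pair).
TABLE8 = {
    "dual-rotor": {
        "mk_mmd": [0.438, 1.085, 1.782, 0.316, 1.175, 0.321],
        "mmd": [0.122, 0.312, 0.511, 0.087, 0.253, 0.084],
        "acc": [91.79, 90.03, 75.74, 97.01, 82.15, 96.23],
    },
    "CWRU": {
        "mk_mmd": [0.061, 0.072, 0.071, 0.052, 0.065, 0.054],
        "mmd": [0.011, 0.010, 0.014, 0.008, 0.013, 0.012],
        "acc": [98.58, 96.23, 93.16, 99.80, 98.69, 100.0],
    },
    "Ottawa": {
        "mk_mmd": [0.854, 0.927, 0.833, 0.207, 0.029, 0.021],
        "mmd": [0.198, 0.209, 0.194, 0.045, 0.006, 0.004],
        "acc": [68.03, 68.01, 76.62, 87.58, 89.61, 96.82],
    },
}

# One coefficient per (dataset, metric) pair, in the order rho_1 ... rho_6.
rhos = {
    (name, metric): pearson(cols[metric], cols["acc"])
    for name, cols in TABLE8.items()
    for metric in ("mk_mmd", "mmd")
}
for key, rho in rhos.items():
    print(key, round(rho, 4))
```

Run on the Table 8 values, every coefficient comes out negative; for example, the dual-rotor MK-MMD column gives a coefficient of about −0.952, in line with the reported ρ₁.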

5.4. Hyperparameter Optimization

The IATN has two main hyperparameters: the inverse gradient coefficient β and the transfer learning coefficient γ. To study the impact of their values on model performance, this paper sets β and γ to different values on the D1→D2 and D2→D3 tasks and conducts 10 repeated experiments for each setting to obtain the average accuracy. The results are shown in Figure 7, where the abscissa of each graph represents the value of the hyperparameter, the ordinate represents the accuracy, the blue line represents the experimental results of D1→D2 and the red line represents the experimental results of D2→D3.
Figure 7a shows that the optimal parameters differ between tasks: the best β is 0.03 for the D1→D2 task and 0.04 for the D2→D3 task. In Figure 7b, the optimal γ for both tasks is 0.1.
In summary, this paper chooses β = 0.04 and γ = 0.1 as the experimental parameters.
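The sweep procedure described above (several candidate values, two tasks, 10 repeated runs averaged per setting) can be sketched as follows. This is only an outline under stated assumptions: `run_once` is a hypothetical callable that would train and evaluate the model once, and `toy_run` is a stand-in that fabricates an accuracy curve purely so the example runs; its peaks are set by construction and are not experimental results.

```python
from statistics import mean


def sweep(values, tasks, run_once, repeats=10):
    """Return {task: {value: mean accuracy over `repeats` runs}}."""
    return {
        task: {
            v: mean([run_once(task, v, seed) for seed in range(repeats)])
            for v in values
        }
        for task in tasks
    }


def best_per_task(results):
    """Pick, for each task, the value with the highest mean accuracy."""
    return {task: max(scores, key=scores.get) for task, scores in results.items()}


def toy_run(task, beta, seed):
    """Hypothetical stand-in for one training run (not the real model)."""
    peak = {"D1→D2": 0.03, "D2→D3": 0.04}[task]
    # Accuracy falls off linearly away from the assumed peak; the seed adds
    # a small deterministic perturbation to mimic run-to-run variation.
    return 95.0 - 1000.0 * abs(beta - peak) + 0.01 * (seed % 3)


betas = [0.01, 0.02, 0.03, 0.04, 0.05]
results = sweep(betas, ["D1→D2", "D2→D3"], toy_run)
print(best_per_task(results))
```

When the per-task optima differ, as they do for β here, a single value still has to be chosen for the experiments; the sketch leaves that final choice to the reader because the text does not state the selection rule.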

6. Conclusions

This paper proposes an IATN for bearing fault diagnosis under variable WDs. Specifically, this paper combines adversarial transfer networks with an STFT to obtain satisfactory results with lighter networks. Then, this paper employs a channel attention module to enhance feature fusion, allowing the model to selectively emphasize useful features and suppress less useful ones. Moreover, this paper designs a novel domain discrepancy hybrid metric loss by combining MK-MMD and FCD: FCD promotes local feature alignment, MK-MMD is used for global feature alignment, and integrating the two loss functions combines local and global feature alignment. Finally, this paper verifies the method’s effectiveness on three datasets: a private dual-rotor dataset, the CWRU dataset and the Ottawa dataset. The IATN achieves the highest average accuracy on all three datasets (91.89%, 97.74% and 81.11% on the dual-rotor, CWRU and Ottawa datasets, respectively), exceeding the compared methods, including CNN, DANN, AMDSA, DSAN, ISAE-CSDF and SA-SN-DCGAN. The results indicate that the IATN has strong domain-invariant feature extraction capabilities.
In the future, we will pay more attention to the interpretability of the transfer process and study what knowledge is transferred during the transfer learning process.

Author Contributions

J.W. contributed to conceptualization, formal analysis, investigation, methodology and writing—original draft; H.A. contributed to methodology, validation and writing—review and editing; X.C. contributed to funding acquisition, project administration and writing—review and editing; R.Y. contributed to funding acquisition, project administration and data curation; and A.K.N. contributed to supervision, validation and writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Natural Science Foundation of China (No. 52175116), Major Research Programs of the Natural Science Foundation of China (No. 92060302), the Research Foundation of the Higher Educational Key Laboratory for Flexible Manufacturing Equipment Integration of Fujian Province, the Xiamen Institute of Technology, the National Key Science and Technology Infrastructure Opening Project Fund for Research and Evaluation facilities for Service Safety of Major Engineering Materials and the Aeronautical Science Foundation (No. 2019ZB070001). Also, this work was supported in part by the Royal Society award (number IEC\NSFC\223294) to Asoke K. Nandi. Jun Wang acknowledges the financial support from the Innovative Leading Talents Scholarship and Brunel University London.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ahmed, H.; Nandi, A.K. Condition Monitoring with Vibration Signals: Compressive Sampling and Learning Algorithms for Rotating Machines; John Wiley & Sons: Chichester, UK, 2020; ISBN 978-1-119-54462-3. [Google Scholar]
  2. Desnica, E.; Ašonja, A.; Radovanović, L.; Palinkaš, I.; Kiss, I. Selection, Dimensioning and Maintenance of Roller Bearings. In Proceedings of the 31st International Conference on Organization and Technology of Maintenance (OTO 2022), Osijek, Croatia, 12 December 2022; Blažević, D., Ademović, N., Barić, T., Cumin, J., Desnica, E., Eds.; Lecture Notes in Networks and Systems. Springer: Cham, Switzerland, 2022; Volume 592, pp. 210–227. [Google Scholar] [CrossRef]
  3. Liu, S.; Chen, S.; Chen, Z.; Gong, Y. Fault Diagnosis Strategy Based on BOA-ResNet18 Method for Motor Bearing Signals with Simulated Hydrogen Refueling Station Operating Noise. Appl. Sci. 2024, 14, 157. [Google Scholar] [CrossRef]
  4. Patil, A.A.; Desai, S.S.; Patil, L.N.; Patil, S.A. Adopting Artificial Neural Network for Wear Investigation of Ball Bearing Materials Under Pure Sliding Condition. Appl. Eng. Lett. 2022, 7, 81–88. [Google Scholar] [CrossRef]
  5. Xu, Z.; Ji, F.; Ding, S.; Zhao, Y.; Zhou, Y.; Zhang, Q.; Du, F. Digital Twin-Driven Optimization of Gas Exchange System of 2-Stroke Heavy Fuel Aircraft Engine. J. Manuf. Syst. 2021, 58, 132–145. [Google Scholar] [CrossRef]
  6. Mu, X.; Wang, Y.; Yuan, B.; Sun, W.; Liu, C.; Sun, Q. A New Assembly Precision Prediction Method of Aeroengine High-Pressure Rotor System Considering Manufacturing Error and Deformation of Parts. J. Manuf. Syst. 2021, 61, 112–124. [Google Scholar] [CrossRef]
  7. He, Y.; Hu, M.; Feng, K.; Jiang, Z. An Intelligent Fault Diagnosis Scheme Using Transferred Samples for Intershaft Bearings under Variable Working Conditions. IEEE Access 2020, 8, 203058–203069. [Google Scholar] [CrossRef]
  8. Berghout, T.; Benbouzid, M. Diagnosis and Prognosis of Faults in High-Speed Aeronautical Bearings with a Collaborative Selection Incremental Deep Transfer Learning Approach. Appl. Sci. 2023, 13, 10916. [Google Scholar] [CrossRef]
  9. Ahmed, H.O.A.; Nandi, A.K. Connected Components-Based Colour Image Representations of Vibrations for a Two-Stage Fault Diagnosis of Roller Bearings Using Convolutional Neural Networks. Chin. J. Mech. Eng. 2021, 34, 37. [Google Scholar] [CrossRef]
  10. Ahmed, H.; Nandi, A.K. Three-Stage Hybrid Fault Diagnosis for Rolling Bearings with Compressively Sampled Data and Subspace Learning Techniques. IEEE Trans. Ind. Electron. 2019, 66, 5516–5524. [Google Scholar] [CrossRef]
  11. Zuo, L.; Zhang, L.; Zhang, Z.-H.; Luo, X.-L.; Liu, Y. A Spiking Neural Network-Based Approach to Bearing Fault Diagnosis. J. Manuf. Syst. 2021, 61, 714–724. [Google Scholar] [CrossRef]
  12. Li, Y.; Wang, S.; Deng, Z. Intelligent Fault Identification of Rotary Machinery Using Refined Composite Multi-Scale Lempel-Ziv Complexity. J. Manuf. Syst. 2020, 61, 725–735. [Google Scholar] [CrossRef]
  13. Ahmed, H.; Nandi, A.K. Compressive Sampling and Feature Ranking Framework for Bearing Fault Classification with Vibration Signals. IEEE Access 2018, 6, 44731–44746. [Google Scholar] [CrossRef]
  14. Wang, Z.; Liang, B.; Guo, J.; Wang, L.; Tan, Y.; Li, X. Fault Diagnosis Based on Residual–Knowledge–Data Jointly Driven Method for Chillers. Eng. Appl. Artif. Intell. 2023, 125, 106768. [Google Scholar] [CrossRef]
  15. Pan, J.; Zi, Y.; Chen, J.; Zhou, Z.; Wang, B. LiftingNet: A Novel Deep Learning Network with Layerwise Feature Learning from Noisy Mechanical Data for Fault Classification. IEEE Trans. Ind. Electron. 2018, 65, 4973–4982. [Google Scholar] [CrossRef]
  16. Wen, L.; Li, X.; Gao, L.; Zhang, Y. A New Convolutional Neural Network-Based Data-Driven Fault Diagnosis Method. IEEE Trans. Ind. Electron. 2018, 65, 5990–5998. [Google Scholar] [CrossRef]
  17. Ahmed, H.O.A.; Wong, M.L.D.; Nandi, A.K. Intelligent Condition Monitoring Method for Bearing Faults from Highly Compressed Measurements Using Sparse Over-Complete Features. Mech. Syst. Signal Process. 2018, 99, 459–477. [Google Scholar] [CrossRef]
  18. Jia, L.; Chow, T.W.S.; Yuan, Y. GTFE-Net: A Gramian Time Frequency Enhancement CNN for Bearing Fault Diagnosis. Eng. Appl. Artif. Intell. 2023, 119, 105794. [Google Scholar] [CrossRef]
  19. Zhao, K.; Xiao, J.; Li, C.; Xu, Z.; Yue, M. Fault Diagnosis of Rolling Bearing Using CNN and PCA Fractal-Based Feature Extraction. Measurement 2023, 223, 113754. [Google Scholar] [CrossRef]
  20. Ruan, D.; Wang, J.; Yan, J.; Gühmann, C. CNN Parameter Design Based on Fault Signal Analysis and Its Application in Bearing Fault Diagnosis. Adv. Eng. Inform. 2023, 55, 101877. [Google Scholar] [CrossRef]
  21. Wang, J.; Li, T.; Sun, C.; Yan, R.; Chen, X. Improved Spiking Neural Network for Intershaft Bearing Fault Diagnosis. J. Manuf. Syst. 2022, 65, 208–219. [Google Scholar] [CrossRef]
  22. Sinitsin, V.; Ibryaeva, O.; Sakovskaya, V.; Eremeeva, V. Intelligent Bearing Fault Diagnosis Method Combining Mixed Input and Hybrid CNN-MLP Model. Mech. Syst. Signal Process. 2022, 180, 109454. [Google Scholar] [CrossRef]
  23. Zhiyi, H.; Haidong, S.; Xiang, Z.; Yu, Y.; Junsheng, C. An Intelligent Fault Diagnosis Method for Rotor-Bearing System Using Small Labeled Infrared Thermal Images and Enhanced CNN Transferred from CAE. Adv. Eng. Inform. 2020, 46, 101150. [Google Scholar] [CrossRef]
  24. Huo, C.; Jiang, Q.; Shen, Y.; Qian, C.; Zhang, Q. New Transfer Learning Fault Diagnosis Method of Rolling Bearing Based on ADC-CNN and LATL under Variable Conditions. Measurement 2022, 188, 110587. [Google Scholar] [CrossRef]
  25. Zheng, B.; Huang, J.; Ma, X.; Zhang, X.; Zhang, Q. An Unsupervised Transfer Learning Method Based on SOCNN and FBNN and Its Application on Bearing Fault Diagnosis. Mech. Syst. Signal Process. 2024, 208, 111047. [Google Scholar] [CrossRef]
  26. Li, C.; Zhang, S.; Qin, Y.; Estupinan, E. A Systematic Review of Deep Transfer Learning for Machinery Fault Diagnosis. Neurocomputing 2020, 407, 121–135. [Google Scholar] [CrossRef]
  27. Liu, Z.; Su, S.; Cao, J.; Yu, P. Transfer Learning for Bearing Fault Diagnosis Based on Improved Residual Network. In Proceedings of the 2023 IEEE 2nd International Conference on Electrical Engineering, Big Data and Algorithms (EEBDA), Changchun, China, 24–26 February 2023; pp. 1214–1216. [Google Scholar] [CrossRef]
  28. Liu, Y.; Wang, Y.; Chow, T.W.S.; Li, B. Deep Adversarial Subdomain Adaptation Network for Intelligent Fault Diagnosis. IEEE Trans. Ind. Inform. 2022, 18, 6038–6046. [Google Scholar] [CrossRef]
  29. Öztürk, C.; Taşyürek, M.; Türkdamar, M.U. Transfer Learning and Fine-Tuned Transfer Learning Methods’ Effectiveness Analyse in the CNN-Based Deep Learning Models. Concurr. Comput. Pract. Exp. 2023, 35, e7542. [Google Scholar] [CrossRef]
  30. Nguyen, C.T.; Van Huynh, N.; Chu, N.H.; Saputra, Y.M.; Hoang, D.T.; Nguyen, D.N.; Pham, Q.-V.; Niyato, D.; Dutkiewicz, E.; Hwang, W.-J. Transfer Learning for Wireless Networks: A Comprehensive Survey. Proc. IEEE 2022, 110, 1073–1115. [Google Scholar] [CrossRef]
  31. Chen, K.-F.; Lee, M.-C.; Lin, C.-H.; Yeh, W.-C.; Lee, T.-S. Multi-Fault and Severity Diagnosis for Self-Organizing Networks Using Deep Supervised Learning and Unsupervised Transfer Learning. IEEE Trans. Wirel. Commun. 2024, 23, 141–157. [Google Scholar] [CrossRef]
  32. Azari, M.S.; Flammini, F.; Santini, S.; Caporuscio, M. A Systematic Literature Review on Transfer Learning for Predictive Maintenance in Industry 4.0. IEEE Access 2023, 11, 12887–12910. [Google Scholar] [CrossRef]
  33. Bu, K.; He, Y.; Jing, X.; Han, J. Adversarial Transfer Learning for Deep Learning Based Automatic Modulation Classification. IEEE Signal Process. Lett. 2020, 27, 880–884. [Google Scholar] [CrossRef]
  34. Zhuang, F.; Qi, Z.; Duan, K.; Xi, D.; Zhu, Y.; Zhu, H.; Xiong, H.; He, Q. A Comprehensive Survey on Transfer Learning. Proc. IEEE 2021, 109, 43–76. [Google Scholar] [CrossRef]
  35. Liu, D.; Zhang, J.; Wu, H.; Liu, S.; Long, J. Multi-Source Transfer Learning for EEG Classification Based on Domain Adversarial Neural Network. IEEE Trans. Neural Syst. Rehabil. Eng. 2023, 31, 218–228. [Google Scholar] [CrossRef]
  36. Wen, L.; Gao, L.; Li, X. A New Deep Transfer Learning Based on Sparse Auto-Encoder for Fault Diagnosis. IEEE Trans. Syst. Man Cybern. Syst. 2019, 49, 136–144. [Google Scholar] [CrossRef]
  37. Pan, S.J.; Kwok, J.T.; Yang, Q. Transfer Learning via Dimensionality Reduction. In Proceedings of the 23rd National Conference on Artificial Intelligence—Volume 2, Chicago, IL, USA, 13–17 July 2008; AAAI Press: Washington, DC, USA, 2008; pp. 677–682. [Google Scholar]
  38. Huo, C.; Jiang, Q.; Shen, Y.; Lin, X.; Zhu, Q.; Zhang, Q. A Class-Level Matching Unsupervised Transfer Learning Network for Rolling Bearing Fault Diagnosis under Various Working Conditions. Appl. Soft Comput. 2023, 146, 110739. [Google Scholar] [CrossRef]
  39. Wang, Y.; Sun, X.; Li, J.; Yang, Y. Intelligent Fault Diagnosis with Deep Adversarial Domain Adaptation. IEEE Trans. Instrum. Meas. 2021, 70, 1–9. [Google Scholar] [CrossRef]
  40. Cui, Z.; Cao, H.; Ai, Z.; Wang, J. A Multi-Adversarial Joint Distribution Adaptation Method for Bearing Fault Diagnosis under Variable Working Conditions. Appl. Sci. 2023, 13, 10606. [Google Scholar] [CrossRef]
  41. Guo, Y.; Zhou, X.; Li, J.; Ba, R.; Xu, Z.; Tu, S.; Chai, L. A Novel and Optimized Sine–Cosine Transform Wavelet Threshold Denoising Method Based on the sym4 Basis Function and Adaptive Threshold Related to Noise Intensity. Appl. Sci. 2023, 13, 10789. [Google Scholar] [CrossRef]
  42. Case Western Reserve University. Bearing Data Center. Available online: https://engineering.case.edu/bearingdatacenter/apparatus-and-procedures (accessed on 25 January 2024).
  43. Huang, H.; Baddour, N. Bearing Vibration Data Collected under Time-Varying Rotational Speed Conditions. Data Brief 2018, 21, 1745–1749. [Google Scholar] [CrossRef] [PubMed]
  44. Chen, J.; Huang, R.; Zhao, K.; Wang, W.; Liu, L.; Li, W. Multiscale Convolutional Neural Network With Feature Alignment for Bearing Fault Diagnosis. IEEE Trans. Instrum. Meas. 2021, 70, 1–10. [Google Scholar] [CrossRef]
  45. Hu, Q.; Si, X.; Qin, A.; Lv, Y.; Liu, M. Balanced Adaptation Regularization Based Transfer Learning for Unsupervised Cross-Domain Fault Diagnosis. IEEE Sens. J. 2022, 22, 12139–12151. [Google Scholar] [CrossRef]
  46. Guo, L.; Lei, Y.; Xing, S.; Yan, T.; Li, N. Deep Convolutional Transfer Learning Network: A New Method for Intelligent Fault Diagnosis of Machines with Unlabeled Data. IEEE Trans. Ind. Electron. 2019, 66, 7316–7325. [Google Scholar] [CrossRef]
  47. Saha, D.; Hoque, M.E.; Chowdhury, M.E.H. Enhancing Bearing Fault Diagnosis Using Transfer Learning and Random Forest Classification: A Comparative Study on Variable Working Conditions. IEEE Access 2024, 12, 5986–6000. [Google Scholar] [CrossRef]
  48. Wang, H.; Li, P.; Lang, X.; Tao, D.; Ma, J.; Li, X. FTGAN: A Novel GAN-Based Data Augmentation Method Coupled Time–Frequency Domain for Imbalanced Bearing Fault Diagnosis. IEEE Trans. Instrum. Meas. 2023, 72, 1–14. [Google Scholar] [CrossRef]
  49. Shi, J.; Wang, X.; Lu, S.; Zheng, J.; Dong, H.; Zhang, J. An Adversarial Multisource Data Subdomain Adaptation Model: A Promising Tool for Fault Diagnosis of Induction Motor Under Cross-Operating Conditions. IEEE Trans. Instrum. Meas. 2023, 72, 1–14. [Google Scholar] [CrossRef]
  50. Zhu, Y.; Zhuang, F.; Wang, J.; Ke, G.; Chen, J.; Bian, J.; Xiong, H.; He, Q. Deep Subdomain Adaptation Network for Image Classification. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 1713–1722. [Google Scholar] [CrossRef] [PubMed]
  51. Meng, Z.; Zhu, J.; Cao, S.; Li, P.; Xu, C. Bearing Fault Diagnosis Under Multisensor Fusion Based on Modal Analysis and Graph Attention Network. IEEE Trans. Instrum. Meas. 2023, 72, 1–10. [Google Scholar] [CrossRef]
  52. Tong, J.; Liu, C.; Bao, J.; Pan, H.; Zheng, J. A Novel Ensemble Learning-Based Multisensor Information Fusion Method for Rolling Bearing Fault Diagnosis. IEEE Trans. Instrum. Meas. 2023, 72, 1–12. [Google Scholar] [CrossRef]
  53. Chen, Z.; Qin, W.; He, G.; Li, J.; Huang, R.; Jin, G.; Li, W. Explainable Deep Ensemble Model for Bearing Fault Diagnosis under Variable Conditions. IEEE Sens. J. 2023, 23, 17737–17750. [Google Scholar] [CrossRef]
  54. Sun, S.; Yeh, C.-F.; Hwang, M.-Y.; Ostendorf, M.; Xie, L. Domain Adversarial Training for Accented Speech Recognition. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 4854–4858. [Google Scholar] [CrossRef]
  55. Li, Y.; Wu, H.; Kang, Y.; Guo, Y.; Cui, Z.; Xing, J.; Wang, Q.; Meng, J. Channel Discrepancies Adaptive Modulation Recognition Using Domain Adversarial Training. In Proceedings of the 2021 Asia-Pacific International Symposium on Electromagnetic Compatibility (APEMC), Nusa Dua, Bali, Indonesia, 27–30 September 2021; pp. 1–4. [Google Scholar] [CrossRef]
  56. Ma, X.; Mou, X.; Wang, J.; Liu, X.; Geng, J.; Wang, H. Cross-Dataset Hyperspectral Image Classification Based on Adversarial Domain Adaptation. IEEE Trans. Geosci. Remote Sens. 2021, 59, 4179–4190. [Google Scholar] [CrossRef]
  57. Qin, R.; Lu, C. Research on Measurement Methods of Transferability between Different Domains in Transfer Learning. In Proceedings of the 2019 CAA Symposium on Fault Detection, Supervision and Safety for Technical Processes (SAFEPROCESS), Xiamen, China, 5–7 July 2019; pp. 926–931. [Google Scholar] [CrossRef]
  58. Zhang, Q.; Deng, L. An Intelligent Fault Diagnosis Method of Rolling Bearings Based on Short-Time Fourier Transform and Convolutional Neural Network. J. Fail. Anal. Prev. 2023, 23, 795–811. [Google Scholar] [CrossRef]
  59. Wang, X.; Ying, T.; Tian, W. Spectrum Representation Based on STFT. In Proceedings of the 2020 13th International Congress on Image and Signal Processing, BioMedical Engineering, and Informatics (CISP-BMEI), Chengdu, China, 17–19 October 2020; pp. 435–438. [Google Scholar] [CrossRef]
  60. Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Wu, E. Squeeze-and-Excitation Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2011–2023. [Google Scholar] [CrossRef] [PubMed]
  61. Zhang, W.; Li, C.; Peng, G.; Chen, Y.; Zhang, Z. A Deep Convolutional Neural Network with New Training Methods for Bearing Fault Diagnosis under Noisy Environment and Different Working Load. Mech. Syst. Signal Process. 2018, 100, 439–453. [Google Scholar] [CrossRef]
  62. Di, Y.; Yang, R.; Huang, M. Fault Diagnosis of Rotating Machinery Based on Domain Adversarial Training of Neural Networks. In Proceedings of the 2021 IEEE 30th International Symposium on Industrial Electronics (ISIE), Kyoto, Japan, 20–23 June 2021; pp. 1–6. [Google Scholar] [CrossRef]
  63. Luo, S.; Huang, X.; Wang, Y.; Luo, R.; Zhou, Q. Transfer Learning Based on Improved Stacked Autoencoder for Bearing Fault Diagnosis. Knowl.-Based Syst. 2022, 256, 109846. [Google Scholar] [CrossRef]
  64. Wu, Z.; Jiang, H.; Liu, S.; Liu, Y.; Yang, W. Conditional Distribution-Guided Adversarial Transfer Learning Network with Multi-Source Domains for Rolling Bearing Fault Diagnosis. Adv. Eng. Inform. 2023, 56, 101993. [Google Scholar] [CrossRef]
  65. Senanayake, D.A.; Wang, W.; Naik, S.H.; Halgamuge, S. Selforganizing Nebulous Growths for Robust and Incremental Data Visualization. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 4588–4602. [Google Scholar] [CrossRef]
Figure 1. Two kinds of main bearings in aeroengine.
Figure 2. The structure of a domain-adversarial adaptation network. The lighter the output color, the closer the value is to 0; the darker the output color, the closer the value is to 1.
Figure 3. IATN for bearing fault diagnosis under variable working conditions.
Figure 4. Dual-rotor test bench. (a) The test rig; (b) engineering sketch; (c) bearing failure mode.
Figure 5. Training curves. (a) D0→D1; (b) D0, D1→D2; (c) D0, D1, D2→D3.
Figure 6. t-SNE feature visualization. (a) CNN. (b) DANN. (c) ISAE-CSDF. (d) CDGATLN. (e) IATN. D2-0 indicates data collected for category “0” fault bearings in D2 working conditions.
Figure 7. The testing accuracy of β and γ on D1→D2 and D2→D3 tasks. (a) The impact of different values of β on D1→D2 and D2→D3 tasks; (b) the impact of different values of γ on D1→D2 and D2→D3 tasks.
Table 1. The structure of the feature extractor.

| Layer | Kernel | Channel |
|---|---|---|
| Conv2D | 3 × 3 | 16 |
| BatchNorm | / | 16 |
| MaxPool | / | 16 |
| ReLU | / | 16 |
| Conv2D | 3 × 3 | 32 |
| BatchNorm | / | 32 |
| MaxPool | / | 32 |
| ReLU | / | 32 |

Table 2. Hyperparameters used in IATN.

| Hyperparameter | First Appeared (Equation) | Value |
|---|---|---|
| d | (4) | 32 |
| w | (4) | 64 |
| β | (14) | 0.04 |
| γ | (13) | 0.1 |

Table 3. The working conditions of the dual-rotor test bench.

| Number | LR Speed (rpm) | HR Speed (rpm) |
|---|---|---|
| D0 | 2000 | 4000 |
| D1 | 3000 | 6000 |
| D2 | 4000 | 8000 |
| D3 | 5500 | 10,000 |

Table 4. Performance comparison on the dual-rotor test bench. Model performance (%).

| Task | CNN | DANN | AMDSA | DSAN | ISAE-CSDF | CDGATLN | IATN |
|---|---|---|---|---|---|---|---|
| D1→D0 | 84.22 ± 1.21 | 85.22 ± 1.08 | 89.24 ± 0.98 | 89.20 ± 1.04 | 89.78 ± 1.22 | 89.92 ± 1.14 | 90.76 ± 1.06 |
| D2→D0 | 68.80 ± 2.42 | 72.80 ± 3.42 | 79.26 ± 1.82 | 76.84 ± 2.42 | 78.24 ± 1.79 | 79.94 ± 1.62 | 82.82 ± 1.58 |
| D3→D0 | 68.67 ± 1.46 | 72.76 ± 3.18 | 79.32 ± 1.74 | 78.75 ± 2.36 | 76.58 ± 1.58 | 78.75 ± 2.36 | 80.76 ± 1.38 |
| D1, D2→D0 | 84.11 ± 1.82 | 87.14 ± 1.69 | 91.42 ± 1.58 | 91.26 ± 1.78 | 90.68 ± 1.58 | 91.36 ± 1.68 | 92.44 ± 1.28 |
| D1, D3→D0 | 86.44 ± 2.05 | 89.84 ± 2.16 | 90.72 ± 1.46 | 91.13 ± 1.68 | 89.94 ± 1.02 | 91.24 ± 1.08 | 92.31 ± 0.76 |
| D2, D3→D0 | 78.89 ± 2.28 | 81.86 ± 1.06 | 89.92 ± 1.88 | 88.98 ± 2.56 | 86.73 ± 1.42 | 90.28 ± 1.52 | 91.89 ± 1.26 |
| D1, D2, D3→D0 | 87.33 ± 3.79 | 89.67 ± 0.96 | 92.56 ± 1.24 | 93.12 ± 1.36 | 90.54 ± 1.06 | 92.02 ± 1.12 | 93.42 ± 0.88 |
| D0→D1 | 88.89 ± 1.67 | 89.16 ± 1.56 | 92.12 ± 1.72 | 92.02 ± 1.92 | 89.49 ± 1.82 | 92.12 ± 1.26 | 92.82 ± 1.08 |
| D2→D1 | 88.26 ± 1.88 | 89.06 ± 1.68 | 95.86 ± 1.94 | 95.46 ± 2.12 | 94.62 ± 1.58 | 95.46 ± 2.12 | 97.24 ± 1.32 |
| D3→D1 | 80.33 ± 1.24 | 81.33 ± 1.12 | 81.48 ± 1.68 | 81.86 ± 1.94 | 81.28 ± 1.26 | 81.86 ± 1.94 | 82.62 ± 1.12 |
| D0, D2→D1 | 97.49 ± 1.40 | 97.89 ± 0.62 | 98.02 ± 1.72 | 98.08 ± 1.64 | 97.77 ± 0.98 | 97.08 ± 1.24 | 98.10 ± 0.86 |
| D0, D3→D1 | 90.67 ± 1.38 | 92.04 ± 1.28 | 94.32 ± 1.72 | 96.28 ± 1.98 | 94.76 ± 1.44 | 96.58 ± 1.78 | 97.52 ± 1.28 |
| D2, D3→D1 | 90.72 ± 2.38 | 92.06 ± 1.42 | 92.58 ± 1.20 | 96.38 ± 1.04 | 94.82 ± 0.94 | 96.28 ± 1.02 | 97.13 ± 0.80 |
| D0, D2, D3→D1 | 97.67 ± 1.24 | 98.06 ± 0.82 | 98.00 ± 0.52 | 98.02 ± 0.66 | 97.78 ± 0.58 | 98.02 ± 0.76 | 98.12 ± 0.46 |
| D0→D2 | 73.42 ± 3.42 | 75.72 ± 3.12 | 98.24 ± 2.08 | 79.36 ± 2.36 | 78.24 ± 1.96 | 79.06 ± 2.46 | 79.26 ± 1.82 |
| D1→D2 | 91.11 ± 2.14 | 92.48 ± 2.08 | 95.62 ± 0.75 | 95.46 ± 0.92 | 95.26 ± 0.68 | 94.62 ± 0.90 | 96.78 ± 0.76 |
| D3→D2 | 87.18 ± 1.04 | 87.78 ± 0.94 | 92.12 ± 1.52 | 90.16 ± 2.02 | 91.20 ± 1.16 | 92.76 ± 1.42 | 94.58 ± 1.06 |
| D0, D1→D2 | 92.24 ± 1.58 | 93.92 ± 1.76 | 96.88 ± 1.06 | 96.84 ± 1.38 | 95.58 ± 0.92 | 96.82 ± 0.98 | 97.12 ± 0.76 |
| D0, D3→D2 | 89.94 ± 1.88 | 91.72 ± 1.06 | 94.52 ± 1.74 | 93.24 ± 2.12 | 93.56 ± 1.42 | 94.28 ± 1.52 | 96.13 ± 1.08 |
| D1, D3→D2 | 95.12 ± 0.88 | 96.16 ± 1.42 | 95.44 ± 0.98 | 93.66 ± 1.38 | 96.02 ± 0.88 | 97.68 ± 0.68 | 98.44 ± 0.44 |
| D0, D1, D3→D2 | 96.76 ± 1.02 | 96.72 ± 1.42 | 96.06 ± 1.42 | 96.16 ± 1.98 | 96.08 ± 0.82 | 96.06 ± 1.58 | 97.36 ± 1.02 |
| D0→D3 | 58.78 ± 5.68 | 61.82 ± 3.28 | 65.28 ± 2.92 | 60.32 ± 3.88 | 65.64 ± 2.96 | 66.82 ± 3.28 | 70.62 ± 2.18 |
| D1→D3 | 75.12 ± 2.66 | 77.24 ± 3.08 | 77.42 ± 2.40 | 74.78 ± 3.18 | 79.54 ± 2.62 | 80.18 ± 2.78 | 81.68 ± 1.88 |
| D2→D3 | 86.08 ± 2.14 | 90.06 ± 2.02 | 93.64 ± 1.48 | 93.82 ± 2.26 | 91.54 ± 1.28 | 94.42 ± 0.76 | 97.89 ± 0.48 |
| D0, D1→D3 | 70.24 ± 5.98 | 75.24 ± 2.46 | 84.28 ± 1.62 | 76.17 ± 2.14 | 84.21 ± 1.56 | 88.12 ± 1.64 | 90.16 ± 1.18 |
| D0, D2→D3 | 92.12 ± 3.56 | 94.66 ± 1.06 | 97.16 ± 1.04 | 97.26 ± 1.24 | 93.53 ± 0.96 | 95.16 ± 1.04 | 97.62 ± 0.64 |
| D1, D2→D3 | 89.33 ± 2.16 | 91.53 ± 1.48 | 95.68 ± 0.88 | 95.92 ± 1.88 | 93.52 ± 0.92 | 94.02 ± 1.08 | 96.74 ± 0.74 |
| D0, D1, D2→D3 | 92.24 ± 3.68 | 95.48 ± 1.09 | 96.12 ± 1.68 | 96.18 ± 2.26 | 93.88 ± 1.42 | 96.78 ± 1.16 | 97.33 ± 1.08 |
| Average | 85.07 ± 2.29 | 87.12 ± 1.72 | 90.83 ± 1.52 | 89.37 ± 1.91 | 89.31 ± 1.35 | 90.66 ± 1.49 | 91.89 ± 1.10 |

Table 5. Performance comparison on CWRU. Model performance (%).

| Task | CNN | DANN | AMDSA | DSAN | ISAE-CSDF | SA-SN-DCGAN | IATN |
|---|---|---|---|---|---|---|---|
| R1→R0 | 93.10 ± 0.38 | 94.35 ± 0.98 | 96.02 ± 1.20 | 95.62 ± 1.32 | 94.68 ± 0.52 | 95.22 ± 0.72 | 97.76 ± 0.36 |
| R2→R0 | 86.30 ± 0.46 | 88.82 ± 0.82 | 90.82 ± 1.52 | 90.42 ± 1.88 | 90.26 ± 0.89 | 91.22 ± 0.88 | 94.82 ± 0.58 |
| R3→R0 | 83.76 ± 1.22 | 86.86 ± 1.68 | 88.72 ± 1.98 | 84.26 ± 2.14 | 90.13 ± 1.76 | 87.06 ± 1.54 | 93.76 ± 1.28 |
| R0→R1 | 97.25 ± 0.36 | 97.66 ± 0.62 | 98.40 ± 0.92 | 97.88 ± 1.04 | 98.26 ± 0.72 | 97.28 ± 0.32 | 99.40 ± 0.26 |
| R2→R1 | 97.22 ± 0.48 | 99.02 ± 0.46 | 99.30 ± 0.65 | 99.02 ± 0.78 | 98.52 ± 0.48 | 99.22 ± 0.78 | 99.60 ± 0.18 |
| R3→R1 | 94.76 ± 0.82 | 94.96 ± 0.58 | 98.24 ± 0.95 | 97.76 ± 1.14 | 96.28 ± 0.78 | 97.06 ± 0.72 | 97.86 ± 0.62 |
| R0→R2 | 91.22 ± 0.66 | 91.76 ± 1.66 | 96.28 ± 1.65 | 95.64 ± 1.68 | 93.58 ± 0.92 | 94.64 ± 1.14 | 97.64 ± 0.84 |
| R1→R2 | 99.96 ± 0.04 | 99.98 ± 0.02 | 99.90 ± 0.04 | 99.96 ± 0.04 | 99.98 ± 0.02 | 99.96 ± 0.04 | 100.00 ± 0.00 |
| R3→R2 | 99.94 ± 0.06 | 99.96 ± 0.04 | 99.95 ± 0.04 | 99.92 ± 0.06 | 99.98 ± 0.02 | 99.92 ± 0.08 | 100.00 ± 0.00 |
| R0→R3 | 81.56 ± 1.24 | 86.26 ± 1.36 | 89.24 ± 1.53 | 90.26 ± 1.62 | 87.38 ± 0.92 | 90.06 ± 1.42 | 92.56 ± 1.06 |
| R1→R3 | 95.32 ± 0.48 | 94.82 ± 0.68 | 98.38 ± 0.92 | 98.82 ± 1.08 | 97.66 ± 0.68 | 98.22 ± 0.88 | 99.52 ± 0.48 |
| R2→R3 | 99.90 ± 0.08 | 99.90 ± 0.10 | 99.90 ± 0.10 | 99.92 ± 0.08 | 99.95 ± 0.05 | 99.92 ± 0.08 | 100.00 ± 0.00 |
| Average | 93.36 ± 0.52 | 94.52 ± 0.75 | 96.26 ± 0.96 | 95.79 ± 1.07 | 95.55 ± 0.64 | 95.81 ± 0.72 | 97.74 ± 0.47 |

Table 6. Performance comparison on the Ottawa dataset. Model performance (%).

| Task | CNN | DANN | AMDSA | DSAN | ISAE-CSDF | SA-SN-DCGAN | IATN |
|---|---|---|---|---|---|---|---|
| O1→O0 | 64.20 ± 1.82 | 65.38 ± 2.48 | 66.81 ± 2.57 | 65.58 ± 2.95 | 65.76 ± 1.86 | 66.64 ± 1.72 | 67.53 ± 1.42 |
| O2→O0 | 58.30 ± 1.86 | 62.82 ± 2.52 | 67.84 ± 2.62 | 72.26 ± 3.01 | 64.86 ± 1.42 | 68.52 ± 1.78 | 70.36 ± 1.46 |
| O3→O0 | 70.26 ± 1.52 | 71.86 ± 1.68 | 73.79 ± 1.86 | 63.82 ± 2.04 | 72.58 ± 1.26 | 73.36 ± 1.14 | 74.76 ± 1.08 |
| O0→O1 | 60.25 ± 1.76 | 63.76 ± 2.62 | 66.94 ± 2.68 | 64.04 ± 3.14 | 64.36 ± 1.62 | 65.78 ± 1.68 | 68.53 ± 1.56 |
| O2→O1 | 84.32 ± 1.28 | 85.08 ± 1.46 | 87.98 ± 1.56 | 85.82 ± 1.71 | 86.92 ± 0.78 | 87.12 ± 1.08 | 88.33 ± 0.76 |
| O3→O1 | 85.26 ± 1.22 | 86.42 ± 1.56 | 87.24 ± 1.72 | 86.72 ± 1.98 | 87.08 ± 1.18 | 97.76 ± 1.14 | 88.76 ± 1.28 |
| O0→O2 | 48.76 ± 2.56 | 54.36 ± 2.68 | 61.90 ± 3.04 | 55.75 ± 3.29 | 57.18 ± 1.62 | 62.54 ± 1.88 | 65.67 ± 1.84 |
| O1→O2 | 76.26 ± 1.24 | 80.88 ± 1.82 | 84.84 ± 1.84 | 81.33 ± 2.14 | 81.88 ± 1.02 | 83.76 ± 0.94 | 86.83 ± 0.98 |
| O3→O2 | 93.94 ± 1.28 | 94.98 ± 1.54 | 95.91 ± 1.64 | 95.25 ± 1.82 | 95.58 ± 0.92 | 95.92 ± 0.88 | 96.38 ± 0.86 |
| O0→O3 | 70.56 ± 1.84 | 73.56 ± 1.98 | 76.84 ± 2.32 | 73.92 ± 2.57 | 74.38 ± 1.82 | 76.26 ± 1.82 | 78.48 ± 1.78 |
| O1→O3 | 85.32 ± 1.28 | 87.72 ± 1.88 | 89.54 ± 1.94 | 87.64 ± 2.27 | 87.56 ± 1.29 | 88.88 ± 1.28 | 90.46 ± 1.18 |
| O2→O3 | 94.70 ± 1.12 | 95.76 ± 1.46 | 96.76 ± 1.52 | 95.87 ± 1.72 | 95.85 ± 0.68 | 96.42 ± 0.82 | 97.26 ± 0.78 |
| Average | 74.34 ± 1.56 | 76.88 ± 1.97 | 79.69 ± 2.10 | 77.33 ± 2.38 | 77.83 ± 1.28 | 80.27 ± 1.34 | 81.11 ± 1.24 |

Table 7. Comparison of running times on the dual-rotor dataset. Running time (s).

| Task | CNN | DANN | ISAE-CSDF | CDGATLN | IATN |
|---|---|---|---|---|---|
| D0→D1 | 276 | 280 | 508 | 526 | 410 |
| D0, D1→D2 | 284 | 291 | 564 | 602 | 417 |
| D0, D1, D2→D3 | 287 | 295 | 765 | 826 | 601 |

Table 8. The distribution connections of different domains.

| Task | MK-MMD | MMD | Mean Acc (%) |
|---|---|---|---|
| D0 D1 | 0.438 | 0.122 | 91.79 |
| D0 D2 | 1.085 | 0.312 | 90.03 |
| D0 D3 | 1.782 | 0.511 | 75.74 |
| D1 D2 | 0.316 | 0.087 | 97.01 |
| D1 D3 | 1.175 | 0.253 | 82.15 |
| D2 D3 | 0.321 | 0.084 | 96.23 |
| R0 R1 | 0.061 | 0.011 | 98.58 |
| R0 R2 | 0.072 | 0.010 | 96.23 |
| R0 R3 | 0.071 | 0.014 | 93.16 |
| R1 R2 | 0.052 | 0.008 | 99.80 |
| R1 R3 | 0.065 | 0.013 | 98.69 |
| R2 R3 | 0.054 | 0.012 | 100 |
| O0 O1 | 0.854 | 0.198 | 68.03 |
| O0 O2 | 0.927 | 0.209 | 68.01 |
| O0 O3 | 0.833 | 0.194 | 76.62 |
| O1 O2 | 0.207 | 0.045 | 87.58 |
| O1 O3 | 0.029 | 0.006 | 89.61 |
| O2 O3 | 0.021 | 0.004 | 96.82 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, J.; Ahmed, H.; Chen, X.; Yan, R.; Nandi, A.K. Improved Adversarial Transfer Network for Bearing Fault Diagnosis under Variable Working Conditions. Appl. Sci. 2024, 14, 2253. https://doi.org/10.3390/app14062253
