1 Introduction

COVID-19 is a respiratory illness caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Trouvain & Truong, 2015). In many countries the infection rate is estimated to lie between 1 and 10%, and a large share of cases has not been officially reported (James, 2015). Figure 1 shows the evolution of COVID-19 cases and deaths up to August 2020. This trend began on January 4, 2020, and has forced many nations to adopt strict control measures, including country-wide lockdowns and the scaling-up of isolation facilities in hospitals (Sakai, 2015; Schuller et al., 2014). Lockdowns are valuable because they provide the time and scope to test a maximum number of patients. Reverse transcription polymerase chain reaction (RT-PCR) is one of the most reliable methods for analyzing and detecting COVID-19, with results available within 48 h (Ghosh et al., 2015, 2016a, 2016b; Usman, 2017).

Fig. 1 Ratio of COVID-19 cases up to August 2020. Source: https://arxiv.org/pdf/2005.10548.pdf

The testing process has several drawbacks: (i) it requires close contact, which increases the chances of spreading the infection; (ii) chemical reagents and equipment are expensive; (iii) testing time is long; and (iv) large-scale deployment is difficult. Efforts to identify a greater number of COVID-19 cases have led to productive recommendations on innovative solutions for medical services (Botha et al., 2018; McKeown et al., 2012; Porter et al., 2019; Windmon et al., 2018). In particular, progress is needed toward simpler, less expensive, and more accurate diagnostic approaches (Breathing sounds for COVID-19, 2020; Indian Institute of Science, 2020; Menni et al., 2020). Several countries have adjusted essential policymaking and restructured the economics of their medical services. Attention is also focused on diagnostic tools and technological solutions that can be deployed quickly for pre-screening, and on exploring options less expensive than the RT-PCR test that overcome the drawbacks of chemical testing.

COVID-19 identification and testing development are being carried out in laboratories around the world. The WHO and the CDC have identified loss of speech as one of the main symptoms of this infectious illness, which can present alongside severe coughing, a dry cough, and chest pain up to 14 days after exposure to the virus. Speech breathing models are clinical constructs that capture structural and physiological changes (Huber & Stathopoulos, 2015) in the respiratory system. Based on these observations, we believe that speech signals may carry the changes needed for COVID-19 detection.

Bringing together a large dataset of breathing sounds and the respiratory-disease expertise of clinical experts makes it possible to evaluate the potential of using breath sounds to recognize COVID-19 indications with deep learning methods (Thorpe et al., 2001). The primary purpose of this work is to complement existing chemical testing methods with a low-cost, fast, and highly accurate alternative. This paper presents our efforts in this direction.

1.1 Dataset

The first step is to collect sound samples from healthy and unhealthy subjects, including samples for COVID-19 identification. The collected samples are analyzed using the proposed generative adversarial network method, which builds on assistive mathematical models that identify biomarkers from the sound data. Creating task-specific data at this stage is essential for further progress.

1.2 Literature survey

In recent years, several studies have proposed sound features for detecting the symptoms of respiratory diseases from vocal signals.

As research attention on COVID-19 has expanded, recent works have begun investigating the use of deep neural networks to classify illness from cough sounds. Venkata Srikanth and Strik (2019) use Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) architectures for breath-event detection as a possible indicator of COVID-19. More recently, Basheer et al. (2020) used a CNN architecture to perform direct COVID-19 diagnostic classification based on cough sounds. The work in Chon et al. (2012) applies a deep learning technique to an analysis similar to ours, reporting an F1 score of 0.929, although its approach differs from the methods discussed in this article.

More recently, microphones in devices such as cell phones and wearables have been exploited for voice analysis. In Rachuri et al. (2010), microphone audio is used to understand the user's current surroundings; the data is aggregated to characterize the environmental conditions at different places around a city. In the context of COVID-19 recognition (Nandakumar et al., 2015), a sensing approach recognizes users' emotions through the phone's microphone using Gaussian mixture models. In Oletic and Bilas (2016), Pramono et al. (2017), and Praveen Sundar et al. (2020), the authors detected COVID-19-related conditions from sound samples using different machine learning methods.

2 Proposed COVID-19 detection using speech signal

The speech-signal-based COVID-19 detection system using a generative adversarial network is shown in Fig. 2. The proposed system consists of two stages: pre-processing and classification. In the pre-processing step, a Least Mean Square filter removes artifacts and noise from the input speech signal. After pre-processing, the GAN classifier analyses the filtered signal to classify it as COVID-19 or non-COVID-19.

Fig. 2 Block diagram of COVID-19 detection
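The two-stage structure described above can be organized as a small pipeline skeleton. The sketch below is purely illustrative: the class and method names (CovidSpeechPipeline, denoise, classify) are ours, not from the paper, and the two stages are left as placeholders to be filled with the LMS filter of Sect. 2.1 and the GAN classifier of Sect. 2.2.

```python
import numpy as np

class CovidSpeechPipeline:
    """Illustrative two-stage pipeline: LMS-style denoising, then GAN-based classification."""

    def __init__(self, classifier=None):
        # `classifier` is expected to map a cleaned 1-D signal to a label
        # ("COVID-19" / "non-COVID-19"); a trained GAN-based model would be plugged in here.
        self.classifier = classifier

    def denoise(self, noisy_speech: np.ndarray, noise_reference: np.ndarray) -> np.ndarray:
        """Placeholder for the LMS pre-processing stage (see Sect. 2.1)."""
        raise NotImplementedError("plug in an LMS/NLMS adaptive filter here")

    def classify(self, clean_speech: np.ndarray) -> str:
        """Placeholder for the GAN classification stage (see Sect. 2.2)."""
        if self.classifier is None:
            raise NotImplementedError("plug in a trained GAN classifier here")
        return self.classifier(clean_speech)

    def run(self, noisy_speech: np.ndarray, noise_reference: np.ndarray) -> str:
        cleaned = self.denoise(noisy_speech, noise_reference)
        return self.classify(cleaned)
```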

2.1 Noise reduction using LMS

Typically, all biomedical signals contain noise or artifacts. Hence, before classifying the signals, the noise or artifacts must be removed to obtain accurate results. In this research work, the Least-Mean-Square (LMS) filtering method is used to remove the noise. Compared with other filters, the LMS filter decreases the variance of the tap weights and stabilizes the signal using a Lagrangian approach. The Lagrangian formulation constrains the weight update and, by differentiating with respect to the weights, solves the optimization problem underlying the LMS algorithm. The LMS pre-processing steps are discussed below.

2.1.1 LMS algorithm

figure a (LMS algorithm steps)

The optimization problem is solved using the method of Lagrange multipliers. The Lagrangian is given in Eq. (3)

$$L\left( {w\left( {n + 1} \right)} \right) = \left\| {\delta w\left( {n + 1} \right)} \right\|^{2} + {\text{Re}}\left[ {\lambda^{*} e\left( {n + 1|n} \right)} \right]$$
(3)

where w(n + 1) is the tap-weight vector and δw(n + 1) = w(n + 1) − w(n) is the change in the tap-weight vector w(n + 1) with respect to its old value w(n).

Here λ* is the Lagrange multiplier. Solving Eq. (3) yields the well-known normalized update rule, with the normalized step size given by \(\mu = \hat{\mu }{/}\left\| {x\left( n \right)} \right\|^{2}\). This constraint is unnecessarily restrictive in many practical applications, and a more flexible solution is obtained when it is relaxed.
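The following is a minimal sketch of a normalized LMS (NLMS) adaptive filter of the kind described above, written in NumPy. It assumes a standard noise-cancellation setup in which `d` is the noisy speech and `x` is a noise reference; the function name, tap count, and step-size defaults are our own illustrative choices, not values from the paper.

```python
import numpy as np

def nlms_filter(x, d, num_taps=32, mu_hat=0.5, eps=1e-8):
    """Normalized LMS adaptive filter (noise-cancellation form).

    x: noise reference input, d: noisy speech (desired signal).
    Returns (y, e, w): filter output, error signal, and final tap weights.
    """
    n_samples = len(x)
    w = np.zeros(num_taps)
    y = np.zeros(n_samples)
    e = np.zeros(n_samples)
    for n in range(num_taps - 1, n_samples):
        x_n = x[n - num_taps + 1:n + 1][::-1]   # most recent samples first
        y[n] = np.dot(w, x_n)                   # estimate of the noise component
        e[n] = d[n] - y[n]                      # error = denoised speech estimate
        # normalized step size mu = mu_hat / ||x(n)||^2, as in the text above
        w = w + (mu_hat / (eps + np.dot(x_n, x_n))) * e[n] * x_n
    return y, e, w
```

With `d` set to the recorded noisy speech and `x` to a correlated noise reference, the error sequence `e` serves as the cleaned signal passed on to the classifier.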

2.2 GAN classifier

This section discusses how the Generative Adversarial Network method works for COVID-19 detection from the speech signal. The optimal threshold value for COVID-19 is above 1.2 Hz, and for non-COVID-19 it is below 0.60 Hz. The unsupervised learning part of the model follows the Deep Convolutional Generative Adversarial Network (DCGAN) design. A DCGAN contains two main blocks, the generator and the discriminator, which are trained in a min–max arrangement. The generator receives samples drawn from a random distribution and maps them to output samples. The discriminator takes samples either from the generator's output or from the actual speech samples in the dataset. During training, the discriminator uses the cross-entropy loss function to maximize the number of correctly classified real and generated samples, while the generator tries to maximize the number of its samples classified as real. The cross-entropy between the real (y) and predicted (\(\hat{y}\)) values is defined in Eq. (4).

$$L\left( w \right) = - \frac{1}{N}\mathop \sum \limits_{n = 1}^{N} [y_{n} \log \hat{y}_{n} + \left( {1 - y_{n} } \right)\log (1 - \hat{y}_{n} )]$$
(4)

where w denotes the learned weight vector and N the number of samples.
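As an illustration of the generator/discriminator pair described above, the sketch below defines a minimal DCGAN-style model in PyTorch for 1-D speech segments. The layer sizes, segment length, and latent dimension are illustrative assumptions on our part; the paper does not specify the exact architecture, and a full DCGAN would typically use a deeper stack of (transposed) convolutions.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a latent noise vector z to a fixed-length 1-D speech-like segment."""
    def __init__(self, latent_dim: int = 100, out_len: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 512), nn.ReLU(),
            nn.Linear(512, out_len), nn.Tanh(),   # output scaled to [-1, 1]
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)                        # shape: (batch, out_len)

class Discriminator(nn.Module):
    """Scores a 1-D segment as real (close to 1) or generated (close to 0)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=16, stride=4, padding=6), nn.LeakyReLU(0.2),
            nn.Conv1d(16, 32, kernel_size=16, stride=4, padding=6), nn.LeakyReLU(0.2),
            nn.Flatten(),
            nn.LazyLinear(1),                     # infers the flattened size on first call
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)                        # x: (batch, 1, length) -> (batch, 1)
```

As a quick check, `Discriminator()(Generator()(torch.randn(8, 100)).unsqueeze(1))` returns a tensor of shape `(8, 1)` with values in (0, 1).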

In Eq. (4), the label 1 represents a real sample and 0 represents a generated sample. The discriminator's loss on real samples, whose predictions are denoted \(\hat{y}_{r}\), is computed using Eq. (5).

$$L_{r} \left( W \right) = - \frac{1}{N}\mathop \sum \limits_{n = 1}^{N} \log \hat{y}_{r,n}$$
(5)

Here all samples are real, so the second term of Eq. (4) vanishes. Similarly, \(\hat{y}_{g}\) denotes the discriminator's prediction for generated samples, whose label is zero; the cross-entropy therefore simplifies to Eq. (6)

$$L_{f} \left( W \right) = - \frac{1}{N}\mathop \sum \limits_{n = 1}^{N} \log (1 - \hat{y}_{g,n} )$$
(6)

The generator also uses a cross-entropy loss, which is interpreted in terms of how often the generator's outputs are classified as real samples. The cross-entropy loss of the generator is computed using Eq. (7).

$$L_{g} \left( W \right) = - \frac{1}{N}\mathop \sum \limits_{n = 1}^{N} \log \hat{y}_{g,n}$$
(7)

A low generator loss therefore indicates that the generated samples are being judged as real by the discriminator.
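Putting Eqs. (4)–(7) together, one adversarial training step can be sketched as follows in PyTorch, reusing the Generator and Discriminator classes sketched earlier. `nn.BCELoss` implements the cross-entropy of Eq. (4); the optimizer choice and latent dimension are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()  # cross-entropy of Eq. (4)

def gan_train_step(gen, disc, opt_g, opt_d, real_batch, latent_dim=100):
    """One min-max update: discriminator losses L_r + L_f (Eqs. 5-6), then generator loss L_g (Eq. 7)."""
    batch_size = real_batch.size(0)               # real_batch: (batch, 1, segment_length)
    real_labels = torch.ones(batch_size, 1)
    fake_labels = torch.zeros(batch_size, 1)

    # Discriminator update: score real samples as 1 and generated samples as 0.
    z = torch.randn(batch_size, latent_dim)
    fake_batch = gen(z).unsqueeze(1).detach()     # detach: no gradient flows into the generator
    d_loss = bce(disc(real_batch), real_labels) + bce(disc(fake_batch), fake_labels)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator update: try to make the discriminator score generated samples as real.
    z = torch.randn(batch_size, latent_dim)
    g_loss = bce(disc(gen(z).unsqueeze(1)), real_labels)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

    return d_loss.item(), g_loss.item()
```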

After sufficient training iterations, this process leads the generator to produce outputs that look like actual samples, as shown in Fig. 3. Both the valence and the activation classifiers use the cross-entropy loss function of Eq. (7) to reduce their loss. These classifier networks share layers with the discriminator, so the model learns common characteristics, and the shared convolutional filters are used effectively both for the valence classification task and for distinguishing between actual and generated speech samples.

Fig. 3 The architecture of GAN classifier

Figure 4 shows the overall process of the proposed Deep Convolutional Generative Adversarial Network: recording the cough-breath sound, extracting audio features, splitting the data into training and testing sets, and validating performance. The training-to-testing ratio is 80:20. The classification performance of the proposed COVID-19 detection system is validated using precision, recall, and accuracy. Unlike other deep learning methods, a GAN does not require labeled data; it can be trained on unlabeled data to learn internal representations of the data, which improves performance.

Fig. 4 Overall process of proposed method
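The 80:20 split described above could be set up as in the short sketch below with scikit-learn; the feature matrix `X` and label vector `y` are random stand-ins for whatever the feature-extraction stage actually produces.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# X: feature matrix (one row per recording), y: labels (1 = COVID-19, 0 = non-COVID-19).
# Both are dummy placeholders, for illustration only.
X = np.random.randn(200, 40)
y = np.random.randint(0, 2, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0, stratify=y  # 80:20 train/test split
)
print(X_train.shape, X_test.shape)  # e.g. (160, 40) (40, 40)
```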

Precision: It is the fraction of relevant speech samples among the retrieved speech samples. The mathematical formula of precision is shown in Eq. (8).

$${\text{Precision}}\;\left( P \right) = \frac{{T_{p} }}{{T_{p} + F_{p} }}$$
(8)

Recall: It is the fraction of retrieved relevant speech samples among all relevant speech samples. The mathematical formula of recall is shown in Eq. (9).

$${\text{Recall}}\;\left( {\text{R}} \right) = \frac{{T_{p} }}{{T_{p} + F_{n} }}$$
(9)

Accuracy: It is the ratio of correctly classified COVID-19 samples to the total number of samples. Equation (10) is used to compute the accuracy.

$${\text{Accuracy}} = \frac{{T_{p} + T_{n} }}{{\left( {T_{p} + T_{n} + F_{p} + F_{n} } \right)}}$$
(10)

where Tp = true positive, Tn = true negative, Fp = false positive, Fn = false negative.
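For completeness, Eqs. (8)–(10) translate directly into a few lines of NumPy; the confusion-matrix counts are computed from 0/1 label vectors, and the function and variable names are ours.

```python
import numpy as np

def classification_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Precision, recall and accuracy (Eqs. 8-10) from binary label vectors."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    precision = tp / (tp + fp)                  # Eq. (8)
    recall = tp / (tp + fn)                     # Eq. (9)
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # Eq. (10)
    return {"precision": precision, "recall": recall, "accuracy": accuracy}
```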

3 Simulation results and discussion

Simulation results and performance analysis of the proposed COVID-19 detection system are discussed in this section. This work aims to classify speech samples from healthy and unhealthy subjects, including the identification of COVID-19 patients.

The input speech signal of the proposed COVID-19 detection system is depicted in Fig. 5. The input signal's frequency range is 8 kHz.

Fig. 5 Noisy signal

The time-domain representation of the desired signal in the proposed Generative Adversarial Network-based COVID-19 detection system is shown in Fig. 6.

Fig. 6 Time domain representation of the desired signal

The time-domain representation of the noise signal in the proposed Generative Adversarial Network-based COVID-19 detection system is shown in Fig. 7.

Fig. 7 Time domain representation of noise signal

The time and frequency response of the filtered signal in the proposed Generative Adversarial Network-based COVID-19 detection system is shown in Fig. 8.

Fig. 8 Time and frequency response of a filtered signal

Figure 9 shows the spectrogram of the pre-processed speech signal. The spectrogram is computed by splitting the signal into overlapping windowed segments.

Fig. 9 Spectrogram of a speech signal
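A spectrogram of this kind can be computed with overlapping windows as sketched below using SciPy. The 8 kHz sampling rate follows the input signal described earlier, while the window length, overlap, and synthetic test tone are illustrative choices of ours.

```python
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

fs = 8000                                   # sampling rate of the input speech (8 kHz)
t = np.arange(0, 2.0, 1.0 / fs)
speech = np.sin(2 * np.pi * 440 * t)        # stand-in for a pre-processed speech signal

# 32 ms Hann windows with 50% overlap between consecutive segments.
f, tt, Sxx = signal.spectrogram(speech, fs=fs, window="hann",
                                nperseg=256, noverlap=128)

plt.pcolormesh(tt, f, 10 * np.log10(Sxx + 1e-12), shading="auto")
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.title("Spectrogram of the pre-processed speech signal")
plt.show()
```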

Figure 10 shows the simulation results for validation accuracy and loss during training. The proposed COVID-19 detection system reduces the validation loss and increases the validation accuracy, indicating that the model learns with a low mean squared error.

Fig. 10 Validation accuracy and loss during the training
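Curves like those in Fig. 10 can be produced by recording the loss and accuracy at each epoch and plotting them; the sketch below assumes the per-epoch values are collected in plain Python lists during training, and the example numbers are made up for illustration only.

```python
import matplotlib.pyplot as plt

def plot_history(val_loss, val_acc):
    """Plot per-epoch validation loss and accuracy side by side."""
    epochs = range(1, len(val_loss) + 1)
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(epochs, val_loss, marker="o")
    ax1.set_xlabel("Epoch")
    ax1.set_ylabel("Validation loss")
    ax2.plot(epochs, val_acc, marker="o")
    ax2.set_xlabel("Epoch")
    ax2.set_ylabel("Validation accuracy")
    fig.tight_layout()
    plt.show()

# Dummy values, for illustration only:
plot_history([0.92, 0.61, 0.45, 0.38], [0.71, 0.84, 0.91, 0.95])
```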

Figure 11 and Table 1 present the performance analysis of the proposed COVID-19 classification system against existing methods. Compared with the existing methods, the proposed GAN method achieves better results: the precision, recall, accuracy, and F-measure are 96.54%, 96.15%, 98.56%, and 0.96, respectively.

Fig. 11 Performance analysis of classification ratio

Table 1 Performance evaluation of classification ratio

4 Conclusion

This research work introduces a Generative Adversarial Network for the detection of COVID-19 symptoms from speech signals. Speech signals intrinsically carry information about the physiological as well as emotional condition of the speaker. Accurate measurement of such physiological parameters from speech enables real-time, remote monitoring of infected or symptomatic individuals and early detection of COVID-19 symptoms, which helps contain the spread of the infection. The reverse transcription polymerase chain reaction (RT-PCR) test is currently used to detect the COVID-19 virus, but RT-PCR is expensive, time-consuming, and requires close contact that conflicts with social distancing rules. Therefore, this research work introduces a Generative Adversarial Network (GAN) based deep learning method to detect COVID-19 from speech signals quickly. Compared with existing methods, the proposed GAN method achieves better results: the precision, recall, accuracy, and F-measure are 96.54%, 96.15%, 98.56%, and 0.96, respectively.