1 Introduction

Covid-19, which began in Wuhan, China [1], has affected people worldwide since December 2019 [2, 3]. As of July 15, 2022, the WHO had received reports of 557,917,904 confirmed Covid-19 cases, including 6,358,899 deaths. As of July 11, 2022, 12,130,881,147 vaccine doses had been administered globally [4, 5]. According to the Ministry of Health and Family Welfare [6], India had recorded 43,730,071 confirmed cases [5], 140,760 active cases, and 525,660 deaths, and had administered 1,997,161,438 vaccine doses by July 16, 2022.

SARS-CoV-2 belongs to the family Coronaviridae and the order Nidovirales [7]. The subfamily Coronavirinae is divided into four genera: human coronaviruses belong to the alphacoronaviruses; SARS-CoV, other human coronaviruses, and MERS-CoV belong to the betacoronaviruses [8]; viruses of whales and poultry belong to the gammacoronaviruses; and viruses of pigs and birds belong to the deltacoronaviruses. SARS-CoV-2 [9] is a betacoronavirus, the genus that also contains the two highly pathogenic viruses SARS-CoV [10] and MERS-CoV [11]; it is the human-infecting betacoronavirus responsible for Covid-19. Phylogenetic analysis of the SARS-CoV-2 genome [12] shows that the virus is closely related to two bat-derived SARS-like coronaviruses collected in eastern China in 2018 and is genetically distinct from SARS-CoV and MERS-CoV [13]. Using the genome sequences of SARS-CoV-2 and SARS-CoV, a further study [14] found that the virus is most closely related to a bat coronavirus previously sampled from Rhinolophus bats in the Yunnan area, with 96.2% sequence identity.

In an RT-PCR study [15], the virus was detected in upper respiratory tract (URT) and lower respiratory tract (LRT) samples two to three days after symptom onset, and the viral load in LRT samples increased from day three to day five. The Rapid Diagnostic Test (RDT) is an antigen-detection assay that can yield results within thirty minutes [16]. The sensitivity of early RT-PCR tests [17] was poor (60–70%) but improved significantly over time [18]. RT-PCR also faces several challenges: false positives, low sensitivity, high cost, and the need for trained specialists. As the number of Covid-19 patients increases, there is an urgent need for a rapid testing technique that is both accurate and inexpensive [19].

The coronavirus spreads through airborne droplets [20] expelled from the nose or mouth when an infected person coughs, sneezes, or talks [21], and these droplets can reach the lungs of nearby people. Once inhaled, the virus travels through the respiratory tract and begins destroying lung cells. Covid-19 damages the lungs and produces patchy white opacities in the affected regions [22]. Recent studies also show that infected individuals with no symptoms [4] contribute to transmission, which makes it necessary to detect the coronavirus-affected portion of the lungs [23].

Detecting Covid-19 with AI methods, and DL in particular, will support disease identification in the coming days and increase patients' chances of a speedy recovery worldwide [24]. As a result, the workload of healthcare systems will be relieved globally [25]. Several papers have shown that CT can be a sensitive diagnostic examination for identifying Covid-19 pneumonia, more sensitive than early PCR testing for detecting Covid-19 pneumonia and for assessing its severity. According to a WHO report, affected regions used chest CT to screen Covid-19-positive individuals [26]. However, other studies suggest that chest X-ray holds high potential to ease the management of patients with Covid-19 [27]. This article reviews DL-based Covid-19 detection systems and makes the following key contributions:

(i) Collection and classification of recent research literature based on X-ray, CT, and multimodal radiography images.

(ii) A taxonomy of the examined research based on pre-trained and custom models with respect to image modality.

(iii) A discussion of the challenging aspects of current developments in DL-based Covid-19 diagnostic systems.

(iv) Directions for future research toward a more effective and reliable Covid-19 detection system.

(v) A detailed experimental and performance analysis of 64 DL-based Covid-19 detection systems.

The remainder of this review is organized as follows. Section 2 presents a taxonomy of the analyzed systems based on neural networks, classification tasks, and datasets, allowing a quantitative analysis. Covid-19 diagnosis using X-ray, CT, and multimodal imaging modalities with pre-trained, hybrid, and custom architectures and DTL is discussed in Sects. 3, 4 and 5. Section 6 discusses challenges and potential future trends. Finally, concluding remarks are given in Sect. 7. Performance metrics are provided in Appendix A, and Appendix B lists the abbreviations used throughout the paper.

2 Classification of Deep Learning-Based Covid-19 Diagnosis Systems

This literature review focuses on Covid-19 prediction and classification from medical imaging modalities using DL techniques. ML and DL methods can solve complex problems by building insight from simple representations. The key property behind the widespread adoption of neural network approaches is their ability to learn precise representations. Accordingly, numerous layers are stacked in a CNN, and the benefit of this depth is maximized for optimal model performance [28]. DL systems are widely used in medical diagnosis [29], including biomedicine, smart healthcare, and medical image analysis [30]. The taxonomy of the selected systems groups them by CNN architecture, class-based task formulation, and three radiography modalities (X-ray, CT, and multimodal), each linked to pre-trained and custom DL models.

The first task was to conduct a literature search for relevant published work. A literature review is a set of principles and methods for answering the investigator's research questions. The PRISMA [31] framework was used for this review. The article screening criteria included title, keywords, aims, eligibility requirements, dataset sources, search strategy, measurement, synthesis of results, summary of evidence, challenges, and conclusions. The methodology (Fig. 1) was designed to provide clear steps covering DL, Covid-19, and classification. A set of inclusion criteria was defined, and query patterns were run on Elsevier, IEEEXplore, ScienceDirect, LWW, tandfonline, Nature, RSNA, MDPI, SpringerLink, ArXiv, MedRxiv, ResearchGate, and other sources up to July 2021. The most frequent keywords, such as “Deep Learning”, “Covid-19”, “Diagnosis”, and “Radiological Imaging”, were used to retrieve articles. Figure 2 shows the total number of articles on DL techniques for Covid-19 published/indexed in several databases.

Fig. 1 PRISMA methodology used for the research review

Fig. 2 Selected published papers on Covid-19 diagnosis using DL

2.1 CNN Architecture

DL is widely used in the biomedical domain for tasks such as abnormality detection, object detection, and classification [32], using either pre-trained or custom models [33]. Building a new CNN model from scratch is a challenging and slow process, whereas pre-trained models are quicker to deploy and offer more functionality than a programmer-defined custom model. Pre-trained networks are generally employed for feature extraction, transfer learning, and classification [10].

Several pre-trained models used in transfer learning (TL) are built on CNN [34], 2D-CNN [35], GAN [36], CNN-RNN, and CNN-LSTM [37] architectures, as well as hybrid networks such as FCONet [38], COVID-CheXNet [39], CovidCTNet [40], COVIDetection-Net [41], COVIDX-Net [42], and DRE-Net [43]. Other custom networks, such as EDL-COVID [44], CVDNet [45], COVID-SDNet [46], and LDC-Net [47], have been developed for automatic Covid-19 detection.

Alongside large pre-trained networks, lightweight networks such as CoVNet-19 [48], MKs-ELM-DNN [49], UNet++ [1], and LDC-Net [47] also play a critical role in Covid-19 detection and can easily be deployed on resource-constrained platforms. ResNet [50], GoogleNet [51], Inception [52], DenseNet [53], NasNet [54], Xception [55], AlexNet [28], SqueezeNet [56], VGG [57], etc., are the pre-trained models applied to Covid-19 classification/detection from radiological images in the studied articles. Fully automated systems [27] were used to minimize processing time on blurry samples and to reject samples with misclassified lung regions. Figure 3 presents a quantitative analysis of the DL models used in the Covid-19 detection systems: among the pre-trained models, ResNet was used by 35 reviewed systems, Inception by 21, and VGG by 19, whereas 32 systems used hybrid or custom networks. Figure 4 shows that Keras and TensorFlow are the most commonly used tools in the experimental setups (analysis from Table 4).

Fig. 3 Deep learning models used in the studied articles

Fig. 4 DL tools used for Covid-19 detection

Fig. 5 DL-based Covid-19 diagnosis flow

Deep transfer learning [58] first trains a CNN for a particular task on a large dataset such as ImageNet. The dataset should have at least 5500 samples per class, i.e., data availability is the key aspect of this initial training, allowing the network to extract the essential features and obtain good parameters. Afterward, the initially trained CNN is ready to handle new data and extract features, as it builds on the knowledge gained during initial training [20]. Figure 5 shows the general DL-based flow used in Covid-19 detection systems. In a DCNN, TL can be achieved in two ways.

(1) Feature extraction with TL. The original CNN model [59] serves as a feature extractor, and a new classifier is trained on top of it. The pre-trained network retains its original structure and its learned features, and these learned features are fed to a new classifier trained for the specific task at hand.
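As a minimal sketch of this first approach (not taken from any of the reviewed systems), the following Keras code freezes an ImageNet-pretrained DenseNet121 and trains only a small new classification head on top of the frozen features; the image size, class count, and data pipeline names are illustrative assumptions.

```python
import tensorflow as tf

IMG_SIZE = (224, 224)      # assumed input resolution
NUM_CLASSES = 3            # e.g., Covid-19 / pneumonia / normal (assumption)

# Pre-trained backbone used purely as a frozen feature extractor.
base = tf.keras.applications.DenseNet121(
    weights="imagenet", include_top=False, input_shape=IMG_SIZE + (3,))
base.trainable = False     # keep the learned convolutional filters fixed

# New classifier trained on top of the extracted features.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# 'train_ds' / 'val_ds' are assumed tf.data pipelines of (image, one-hot label).
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```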

(2) Fine-tuning, i.e., structural modification of already-trained models to improve results [60]. Typically, several units of such models are replaced with newly adapted layers tuned for the particular task. First, the fully connected (FC) layers [47, 61] of the pre-trained network are replaced with new FC layers whose weights are initialized randomly. The convolutional layers [61] are frozen to preserve the selective filters they have already learned, which means that backpropagation only updates the fully connected layers, since only their weights are random [29]. This technique allows the FC layers to learn structures and shapes from the highly discriminative, feature-rich convolutional layers [62]. Once the FC layers have learned the structures and shapes of the new image set, the complete model can be trained with a much smaller learning rate to attain sufficient accuracy on the new images.
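A minimal sketch of this second approach, again purely illustrative: the new FC head is first trained with the convolutional base frozen, and the whole model is then re-trained with a much smaller learning rate. The class count, epoch counts, and learning rates are assumptions.

```python
import tensorflow as tf

# Rebuild the same frozen-base model as in the previous sketch (illustrative).
base = tf.keras.applications.DenseNet121(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(3, activation="softmax"),  # assumed 3 classes
])

# Stage 1: train only the randomly initialized FC head (conv base frozen).
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)

# Stage 2: unfreeze the convolutional base and fine-tune end-to-end with a
# much smaller learning rate, so the pre-trained filters are only gently
# adapted to the new radiography images.
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```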

2.2 Class-Based Task Formulation

We have categorized the studied DL-based Covid-19 detection articles by classification task, from binary to multi-class, since each label corresponds to one or more class types. DL networks are used to build models that diagnose 4-class (Normal, BacterialPneumonia, ViralPneumonia, Covid19) [41, 63], 3-class (Healthy, Covid19, Pneumonia) [48, 64], (CovidPneumonia, nonCovidPneumonia, Normal) [65], (BacterialViralPneumonia, Covid19, Normal) [66], and 2-class (Normal, Covid-19) [67, 68], (Covid19, nonCovid) [69] classification problems (a minimal sketch of how the output layer and loss change with the task follows Fig. 6). Figure 6 shows the radiography image type, the number of systems, and the proportion of X-ray, CT, and multimodal datasets in the studied papers. As indicated in Fig. 6, 36 examined techniques (56%) used the X-ray modality, 20 (31%) used CT scans, and the remaining eight systems (13%) used multimodal data sources.

Fig. 6 Distribution of the three radiography imaging modalities
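The binary, 3-class, and 4-class formulations above differ mainly in the size of the output layer and the loss function. The small illustrative Keras head below shows this; the backbone, input size, and class count are assumptions, not the configuration of any specific reviewed system.

```python
import tensorflow as tf

def classification_head(features, num_classes):
    """Attach a task-appropriate output layer to extracted features."""
    x = tf.keras.layers.GlobalAveragePooling2D()(features)
    if num_classes == 2:
        # 2-class (e.g., Covid-19 vs. normal): single sigmoid unit,
        # trained with binary cross-entropy.
        return tf.keras.layers.Dense(1, activation="sigmoid")(x)
    # 3-class or 4-class (e.g., Covid-19 / pneumonia / normal):
    # softmax over all labels, trained with categorical cross-entropy.
    return tf.keras.layers.Dense(num_classes, activation="softmax")(x)

inputs = tf.keras.Input(shape=(224, 224, 3))
backbone = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False)(inputs)
outputs = classification_head(backbone, num_classes=4)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```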

2.3 Datasets

A total of 35 different datasets were recovered from the studied papers; an outline of these datasets is given in Table 1. Some authors generated a new, clean Covid-19 dataset [70] from various institutes, hospitals, and research centers and shared it publicly. The public online platforms include Kaggle [71,72,73,74], GitHub [75, 76], SIRM [77], Mendeley [78, 79], Radiopaedia [80], IEEE-DataPort [81], NIH [82], Eurorad [83], the Cancer Imaging Archive wiki [84, 85], Harvard [86], social media (Twitter [87] and Instagram [88]), PhysioNet [89], stanfordmlgroup [90], etc. Others did not disclose their datasets publicly, to comply with the privacy policies of hospitals, research institutes, and patients, and used them only for experiments [1, 36, 39, 43, 91, 92]. Each row of Table 1 states the data source name, image resolution, imaging mode, type, images per class, URL (address to access the dataset), reference number, and the authors using the same dataset. According to our findings, the Covid-19 X-ray dataset by Cohen [72] is the most frequently used, appearing in 23 studies.

Table 1 Covid-19 datasets and repositories used in the studied articles

3 Diagnosis Based on X-Ray Images

3.1 Pre-trained Deep Learning Models

Hemdan et al. [42], in Mar. 2020, created a system to detect Covid-19 from chest X-ray images (CXRI) using pre-trained CNN models. The proposed system, named COVIDX-Net, uses seven pre-trained DCNN variants (InceptionV3, ResNetV2, Xception, MobileNetV2, VGG19, InceptionResNetV2, and DenseNet201). They collected a dataset from a public repository [72] consisting of 25 samples for each class (healthy, Covid-19). All images were resized from 1112 × 624 and 2170 × 1953 to 224 × 224 pixels, and the dataset was partitioned 80%:20% for training and testing. The experimental results showed that VGG19 and DenseNet achieved the best performance, with F1-scores of 90% and 91%, respectively, whereas InceptionV3 obtained the lowest accuracy (0.00%). Mangal et al. [63], in Apr. 2020, proposed the CovidAID system, which uses the pre-trained CheXNet model to categorize a frontal-view CXR image into normal, bacterialPneumonia, viralPneumonia, and Covid19 classes. CovidAID builds upon CheXNet, which detects pneumonia from CXRI; CheXNet is a 121-layer DenseNet-based architecture trained on the large ChestX-ray14 dataset. The dataset was assembled from four public online repositories [71, 72, 78, 121] and separated into training (70%), testing (20%), and validation (10%) sets. It contains 6011 samples, including 155 Covid-19, 1493 viral pneumonia, 2780 bacterial pneumonia, and 1583 normal images. The experiments obtained an overall accuracy of 87.2% for 4-class classification and 90.5% for 3-class classification. For the Covid-19 class, the accuracy, sensitivity, and PPV were 99.97%, 100%, and 96.80% for binary classification and 99.94%, 100%, and 93.80% for 4-class classification. To visualize the regions of interest, they generated saliency maps for CovidAID predictions using RISE (Randomized Input Sampling for Explanation).
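Systems such as COVIDX-Net compare several pre-trained backbones on the same CXR split. A hedged sketch of that kind of benchmark loop is shown below; it is not the authors' code, and the data pipeline, split, and epoch count are assumptions.

```python
import tensorflow as tf

BACKBONES = {
    "VGG19": tf.keras.applications.VGG19,
    "DenseNet201": tf.keras.applications.DenseNet201,
    "InceptionV3": tf.keras.applications.InceptionV3,
    "MobileNetV2": tf.keras.applications.MobileNetV2,
}

def build(backbone_fn, num_classes=2, img_size=(224, 224)):
    """Frozen ImageNet backbone plus a small softmax head."""
    base = backbone_fn(weights="imagenet", include_top=False,
                       input_shape=img_size + (3,))
    base.trainable = False
    return tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

# 'train_ds' and 'test_ds' are assumed tf.data pipelines built from an
# 80:20 split of the chest X-ray dataset.
# for name, fn in BACKBONES.items():
#     model = build(fn)
#     model.compile(optimizer="adam", loss="categorical_crossentropy",
#                   metrics=["accuracy"])
#     model.fit(train_ds, epochs=10, verbose=0)
#     loss, acc = model.evaluate(test_ds, verbose=0)
#     print(f"{name}: test accuracy = {acc:.3f}")
```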

Alazab et al. [67], in May 2020, proposed a Covid-19 identification model using VGG16 on CXRI. The original dataset was collected from a single source [117] and contains 128 images, including 28 healthy and 70 Covid19 images. After data augmentation, the total reached 1000 images (500 healthy and 500 Covid-19). The dataset was partitioned in an 80%:20% ratio for training and testing. The proposed system improved the F-measure from 0.95 to 0.99 on the original and augmented datasets, respectively. Apostolopoulos et al. [66], in Jun. 2020, modeled an architecture for the automatic identification of Covid-19 patients using DTL with five different pre-trained CNNs (Inception, Xception, InceptionResNetV2, VGG19, MobileNetV2). They assembled the dataset from three public online platforms [72, 74, 79] in two forms, Dataset_1 and Dataset_2. Dataset_1 contains 1427 CXRI comprising Covid-19 (224), common bacterial pneumonia (700), and healthy (504) cases. Dataset_2 contains 224 Covid-19, 714 bacterial and viral pneumonia, and 504 healthy images. The datasets were evaluated using ten-fold cross-validation. The proposed model with MobileNetV2 attained the highest performance on Dataset_2, with 96.78% accuracy, 98.66% sensitivity, and 96.46% specificity for binary classification and 94.72% accuracy for 3-class classification. Haghanifar et al. [65], in Jul. 2020, proposed a framework to identify Covid-19 using TL; the CheXNet model was used to develop COVID-CXNet. The dataset was assembled from nine different public online platforms [77, 78, 80, 82, 83, 87, 88, 109, 119] and consists of 780 X-ray samples of Covid19 pneumonia, 3500 of nonCovid pneumonia, and 3500 of the normal class. Images were downsized to 320 × 320 resolution. CheXNet is built on the DenseNet architecture and trained on frontal CXR images. The proposed COVID-CXNet, with 431 layers and 7 million parameters, was fine-tuned on the Covid-19 CXR dataset. The system includes a lung segmentation unit to improve the model's localization of lung abnormalities. The Covid-19 pneumonia class achieved the highest performance, with 98.68% accuracy and a 94% F1-score for the base model, and 99.04% accuracy and a 96% F1-score for COVID-CXNet_v1. For hierarchical multiclass classification, COVID-CXNet attained 87.21% accuracy and a 92% F-score. COVID-CXNet used Grad-CAM for visualizing the results.

Sethi et al. [104] created a Covid-19 recommender system using DL on CXRI in Jul. 2020. The proposed structure uses four deep CNN architectures: InceptionV3, ResNet50, MobileNet, and Xception. The dataset, collected from a single repository [72], contains 320 ‘Covid+ve’ and 5928 non-Covid19 samples and was split 75:25 for training and testing; 30% of the training set was used for validation. Among the four models, MobileNet achieved the highest performance, with 98.6% accuracy, 99.3% specificity, 87.8% precision, 87.8% sensitivity, and an 87.8% F1-score. Ucar et al. [103], in Jul. 2020, proposed a network structure for rapid Covid-19 investigation built on a deep Bayes-SqueezeNet (COVIDiagnosis-Net), which is based on the pre-trained SqueezeNet CNN. The dataset was collected from two public repositories [72, 78]. Because of the low number of Covid-19 CXRI (70 samples), data augmentation was employed, raising the Covid sample size to 1536 images and the total to 4602 images. The data were split into training (60%), testing (20%), and validation (20%) sets and labeled with 1536 images for each of the Covid, normal, and pneumonia classes. The experimental results showed that deep Bayes-SqueezeNet achieved an overall accuracy of 98.26%, specificity of 99.13%, and F1-measure of 98.25%. Mertyuz et al. [51], in Oct. 2020, proposed Covid-19 prognosis from CXRI using three DCNN variants (VGG-16, ResNet, GoogleNet). The dataset, collected from a public platform [73], contains Covid-19 positive (219), normal (1341), and viral pneumonia (1345) images. The proposed system achieved accuracy, sensitivity, and specificity of 95.87%, 97.73%, and 99.63% with VGG-16; 96.90%, 95.45%, and 100% with ResNet; and 95.18%, 86.36%, and 100% with GoogleNet. Sharma et al. [68], in Oct. 2020, proposed Covid-19 screening with a residual attention network using X-ray samples. The system uses ten CNN variants (VGG16, InceptionResNet, Xception, VGG19, MobileNet, MobileNetV2, DenseNet121, DenseNet201, NASNet, and a Vanilla Residual Attention Network (RAN)). They assembled a dataset from two online public platforms [71, 72] containing 120 Covid and 119 normal images, divided into 70% training (167 images), 20% testing (50 images), and 30% validation (22 images) sets. For non-linear dimensionality reduction of the images, the UMAP (Uniform Manifold Approximation and Projection) technique was applied. The RAN-based system achieved 98% accuracy and 100% sensitivity, specificity, precision, and recall on the validation set.
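Several of the systems above (e.g., the deep Bayes-SqueezeNet work) offset the small number of Covid-19 CXRs with data augmentation. A minimal illustrative Keras augmentation pipeline is sketched below; the specific transforms and parameter values are assumptions, not those used in the reviewed papers.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Simple geometric/photometric augmentation to multiply a small set
# of Covid-19 chest X-rays (directory layout is an assumption).
augmenter = ImageDataGenerator(
    rotation_range=10,          # small rotations
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    brightness_range=(0.9, 1.1),
    horizontal_flip=False,      # flipping mirrors anatomy; often avoided
    rescale=1.0 / 255.0,
)

# train_gen = augmenter.flow_from_directory(
#     "data/train", target_size=(224, 224),
#     batch_size=32, class_mode="categorical")
# model.fit(train_gen, epochs=20)
```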

The Domain Extension Transfer Learning (DETL) algorithm, developed by Basu et al. [28] in Dec. 2020, is based on TL for screening Covid-19 from CXR. DETL used three pre-trained TL backbones (AlexNet, VGGNet, and ResNet50). The datasets were collected from four online repositories [71, 77, 87, 126]. Data-A was formed based on [82] to classify normal and diseased classes, and Data-B classified pneumonia, normal, other-disease, and Covid-19 classes; the other-disease class contains “Atelectasis, Cardiomegaly, Infiltration, Effusion, Nodule, and Mass” images. The DETL model uses AlexNet with eight layers, VGGNet with 16 layers, and ResNet with 50 layers. The accuracies attained with fivefold cross-validation on Data-B for AlexNet, VGGNet, and ResNet were 82.98%, 90.13%, and 85.98%, respectively. On Data-A, VGGNet classified correctly with the best validation accuracy of 99% for Covid and 100% for normal. Sarker et al. [69], in Jan. 2021, proposed COVID-DenseNet, a DTL-based framework for distinguishing Covid-19 from healthy and pneumonia individuals. The dataset, collected from three public platforms [72, 75, 110], was divided in an 80%:10%:10% ratio into train, test, and validation sets for binary classification (Covid19, nonCovid19) and 3-class classification (Covid19, pneumonia, normal). The originally assembled dataset contains normal (8851), pneumonia (6045), and Covid-19 (238) images; data augmentation was employed to increase the number of Covid samples (11,416 images). The system achieved 96% accuracy for binary and 94% for 3-class classification. Patient-wise tenfold cross-validation attained an average accuracy of 92.91% for 3-class classification, validating the consistency of the proposed model. Grad-CAM was used to highlight the image regions that drive the predictions. Kedia et al. [48], in Feb. 2021, proposed a stacked-ensemble (SE) model, CoVNet-19, to detect Covid-19-positive patients from CXRI. CoVNet-19 is a two-level stacked-ensemble machine learning structure that learns the same data in different ways. In phase 1, CoVNet-19 combines two pre-trained DCNNs (VGG19 and DenseNet121), each trained independently on the assembled dataset, with CXR inputs of size 224 × 224. In phase 2, an SE of SVM classifiers was trained on the features extracted in phase 1 to perform binary and 3-class classification. The dataset was assembled from five public repositories [72,73,74,75, 79] and initially contains Covid19 (798), normal (2241), and pneumonia (2345) frames; after data augmentation of the Covid images, the Covid sample size increased to 1628. The model hyperparameters were tuned separately for 3-class (Covid19, normal, pneumonia) and binary (Covid19, nonCovid19) classification. Experimental results showed 99.71% accuracy for binary and 98.28% for 3-class classification.

Ismael et al. [98], in Feb. 2021, proposed a DCNN approach to distinguish coronavirus from healthy CXR images. They used five pre-trained CNN variants (VGG16, ResNet101, VGG19, ResNet18, and ResNet50) to extract features, and an SVM with quadratic, cubic, linear, and Gaussian kernel functions to classify the deep features. The dataset was collected from three online public repositories [71, 72, 113] and contains 380 frames, comprising 180 Covid19 and 200 normal/healthy images, partitioned into train and test sets in a 75%:25% ratio. All CXRs were rescaled to 224 × 224 pixels. Deep features extracted by ResNet50 and classified with a linear-kernel SVM gave 94.7% accuracy, 91% sensitivity, 98.89% specificity, a 94.79% F1-score, and a 99.90% AUC, the best among all results; VGG16 achieved the lowest accuracy (85.26%). Jain et al. [64], in Mar. 2021, proposed DL-based detection and analysis of Covid-19 on CXR images. The system used three pre-trained variants: InceptionV3, Xception, and ResNeXt. The dataset, collected from the Kaggle repository [125], contains 6432 PA-view CXRI, including training (5467) and validation (965) splits, with healthy (1583), Covid (576), and pneumonia (4263) classes. Experimental analysis showed that Xception offered the highest accuracy of 97.97% for detecting Covid-19 among the three networks. Elkorany and Elsharkawy [41], in Apr. 2021, developed a tailored Covid-19 detection system called COVIDetection-Net from CXRI. COVIDetection-Net is built on ShuffleNet and SqueezeNet for feature extraction and a multi-class SVM (MSVM) for recognition and classification; the combined features are fed to the MSVM. The dataset contains 1200 CXRs composed from two public online repositories [71, 72], divided into 300 CXRI each for the Covid, normal, bacterialPneumonia, and viralPneumonia classes and partitioned into training (80%) and testing (20%). The highest detection accuracy of 100% was attained for binary classification, 99.72% for 3-class classification (Covid19, normal, pneumonia), and 94.44% for 4-class classification (the 3-class task with pneumonia split into bacterialPneumonia and viralPneumonia). For 4-class classification, the system achieved recall, specificity, precision, and F1-score of 94.45%, 98.15%, 94.42%, and 94.4%, respectively.
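Ismael et al. and Elkorany and Elsharkawy both classify deep CNN features with an SVM. A hedged sketch of that general pipeline (a frozen pre-trained ResNet50 as feature extractor, then SVMs with several kernels) is given below; the kernels shown and the data loading are assumptions for illustration, not the authors' exact setup.

```python
import numpy as np
import tensorflow as tf
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Frozen pre-trained ResNet50 as a deep-feature extractor.
extractor = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, pooling="avg",
    input_shape=(224, 224, 3))

def deep_features(images):
    """images: float array of shape (N, 224, 224, 3), already preprocessed."""
    return extractor.predict(images, verbose=0)

# X_train/X_test: image arrays, y_train/y_test: labels (assumed to exist).
# f_train, f_test = deep_features(X_train), deep_features(X_test)
# for kernel in ["linear", "rbf", "poly"]:   # poly covers quadratic/cubic
#     svm = SVC(kernel=kernel)
#     svm.fit(f_train, y_train)
#     acc = accuracy_score(y_test, svm.predict(f_test))
#     print(kernel, acc)
```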

3.2 Hybrid and Custom Deep Learning Models

Padma et al. [35], in Sept. 2020, proposed a DL-based diagnostic tool for detecting Covid-19 using a 2D-CNN, which was used both for feature extraction and classification. The proposed technique was fine-tuned on optical features such as texture and sharpness to detect Covid-19 from CXRI. The dataset, collected from GitHub [72], comprises 60 images (30 normal and 30 Covid-positive) and was divided into training (80%) and testing (20%). On the training set, the experiments achieved 99.2% accuracy, 98.3% validation accuracy, 0.3% loss, 99.1% sensitivity, 98.8% specificity, and 100% precision in spotting Covid-19 images. Ouchicha et al. [45], in Nov. 2020, suggested CVDNet, a novel DCNN model for detecting Covid-19. The CVDNet architecture consists of two parallel branches with 9 convolutional, 9 max-pooling, 1 concatenation, 1 flatten, and 3 FC layers with Softmax and ReLU activations. The dataset was assembled from three online public repositories [71, 72, 77] and consists of 2905 CXRI, containing 219 Covid-19, 1345 viral-pneumonia, and 1341 normal images, partitioned using fivefold cross-validation (training: 70%, validation: 10%, testing: 20%). Overall, CVDNet achieved average accuracies of 97.20%, 96.73%, and 96.58% for the three classes Covid-19, normal, and viral pneumonia, respectively, and fivefold cross-validation results of 96.72% precision, 96.69% accuracy, 96.84% recall, and a 96.68% F1-score. Al-Waisy et al. [39], in Nov. 2020, presented COVID-CheXNet, a hybrid DL framework for Covid-19 detection from CXR. First, the CXRI were enhanced and the noise level reduced using adaptive histogram equalization for local contrast and a Butterworth band-pass filter. The framework then fuses the results of two different DL models, based on ResNet34 and a high-resolution network, trained on a large dataset. The COVID-CheXNet system achieved 99.99% accuracy, 99.98% sensitivity, 100% specificity, 100% precision, and a 99.99% F1-score, with an MSE of 0.011% and an RMSE of 0.012% at the score level using the weighted sum rule. Tabik et al. [46], in Dec. 2020, developed the CNN-based COVID-SDNet method, which chains data augmentation, segmentation, and transformations together with a suitable CNN for inference. They collected the dataset, named COVIDGR1.0, from five different online public repositories [75, 82, 89, 110, 122]; it comprises 426 Covid-19-positive and 426 Covid-19-negative PA views. Because of large variation between runs, fivefold cross-validation was used in all trials, with 80% of COVIDGR1.0 for training and 20% for testing. In the initial phase, the COVIDNet-A, COVIDNet-B, and COVIDNet-C models were trained on the COVIDx dataset; in the second phase, COVID-SDNet was retrained on COVIDGR. The experiments obtained 72.59% sensitivity, 78.67% specificity, a 75.71% F1-score, and 76.18% accuracy.

To identify Covid-19, Wang et al. [70], in Dec. 2020, proposed the COVID-Net DCNN structure, whose construction uses a projection-expansion-projection-extension (PEPX) design. They used a human–machine collaborative design strategy: a human-driven principled network design prototype in the first step, combined with machine-driven design exploration in the second step. They assembled a dataset from four public online repositories [72, 73, 75, 110] and released it as COVIDx [120] with open access, containing 13,975 CXR images from 13,870 patients. The CXR dataset is categorized into Covid-19 (358 images), normal (8066 images), and nonCovid19-pneumonia (5538 images). The COVID-Net model achieved 93.3% test accuracy, with sensitivities of 73.9%, 93.1%, 81.9%, and 100.0% and precisions of 95.1%, 87.1%, 67%, and 80.0% for the four classes (normal, bacterial, nonCovid19 viral, and Covid19 viral), respectively.

SARS-Net, a combination of a graph neural network (GNN) and a CNN (i.e., CNN + 2L-GCN), was proposed by Kumar et al. [127] in 2021 for detecting Covid-19 abnormalities. The COVIDx [120] dataset was used for training (90%), validation (10%), and testing (10%). SARS-Net achieved an accuracy of 97.60% and a sensitivity of 92.90%. An agent-based simulation for vaccine administration was proposed by Chopra et al. [128] in 2021. Their GNN-based DeepABM-COVID framework utilizes agent, interaction, infection, and progression modules. They presented results on delaying the second dose of the mRNA vaccine and recommendations on when this strategy could be usefully adopted; the authors argue that DeepABM is scalable and efficient. A lightweight, IoT feature-vector-based custom DCNN, deployed on a Raspberry Pi as a Covid-19 detection CAD GUI tool, was proposed by Bhosale et al. [47] in May 2022. LDC-Net was trained on five different datasets [82], with a 76%:12%:12% split of CXR samples into train, test, and validation sets. LDC-Net classifies not only Covid-19 but also eight other lung diseases from X-ray radiography images. The IoT-deployed application was trained on a large X-ray set (10,800 samples) and attained the highest accuracy of 99.28% with a minimum error rate of 4.83%; LDC-Net took 0.136 s to test an individual CXR. Zanwar et al. [61], in Feb. 2022, proposed a custom DCNN-based Covid-19 classifier using CXRs. The proposed system uses the Cohen datasets containing normal (1540), Covid (1520), and pneumonia (1560) labels; the implemented DCNN attained a 96.59% recognition rate for pneumonia. A multi-channel capsule network (MLCN) architecture was proposed for Covid-19 detection by Sridhar and Sanagavarapu [129] in April 2021. The MLCN uses 1678 healthy and 902 Covid-19 CXR samples for feature extraction and testing and attained an accuracy of 96.8%. An ensemble VGGCapsNet (Capsule Network + VGG16) was proposed by Tiwari et al. [130] for 3-class classification. The proposed CapsNet uses 219 Covid-19, 1345 pneumonia, and 1341 normal images for feature extraction and a PrimaryCaps + XRayCaps architecture for disease classification. CapsNet attained an overall accuracy of 92% for the Covid-19 label.

4 Diagnosis Based on Computed Tomography (CT) Images

4.1 Pre-Trained Deep Learning Models

Yang et al. [92], in Apr. 2020, created a DL-based coronavirus detection approach from HRCT images using DenseNet. The pre-trained network uses a three-layer block with dense, global-average-pooling, and FC layers. They collected a dataset of confirmed Covid-19 patients from Shanghai hospitals containing 295 HRCT images, including healthy (149) and Covid (146) classes, partitioned into train (45%), test (45%), and validation (10%) sets. With a threshold of 0.8, the experiments achieved accuracy, sensitivity, specificity, F1-score, and AUC of 95%, 100%, 90%, 95%, and 99%, respectively. Silva et al. [106], in Sept. 2020, offered a DL approach for detecting Covid-19 based on a voting strategy and cross-dataset evaluation, since real data tend to consist of frames of variable quality originating from different CT machines, replicating the circumstances of the countries and cities where the data are acquired. In this system, the CT images from a given patient are grouped into a cluster and classified through a voting mechanism. The suggested method was verified on two datasets [76, 105] and in a cross-dataset study of Covid-19 CT analysis. The system uses an extended version of EfficientNet (a family of DNNs). The assembled dataset, partitioned randomly into training (80%) and test (20%) sets, contains 3294 CT images, including 1601 Covid19 and 1693 nonCovid19 images. Individually, COVID-CT [105] reached 87.60% accuracy and SARS-CoV-2 CT-scan [76] reached 98.99% accuracy. The suggested method achieved 87.6% accuracy, an 86.19% F1-measure, and a 90.5% AUC; however, the cross-dataset evaluation showed that accuracy drops to 56.16% in the most challenging setting. Anwar et al. [107], in Nov. 2020, proposed DL-based diagnosis of Covid-19 from CT scans using a DNN variant (EfficientNet-B4). The proposed system uses three different learning-rate strategies: reduce-on-plateau, cyclic, and constant learning rates. The dataset, collected from a single online public source [76], contains 1494 CT images, including 702 Covid and 792 nonCovid scans. Data augmentation was applied to increase the number of samples. No separate validation set was used because of the limited dataset; instead, fivefold cross-validation was performed so that each sample appears in a test fold. The EfficientNet architecture offered 89.7% accuracy, an 89.6% F1-score, and an 89.5% AUC. The reduce-on-plateau schedule achieved the highest F1-score of 0.9, whereas the cyclic and constant learning rates achieved F1-scores of 0.86 and 0.82, respectively.
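The reduce-on-plateau schedule that worked best for Anwar et al. is available as a standard Keras callback; a hedged sketch of how such a schedule is attached to training is shown below (the monitored metric, factor, and patience values are illustrative assumptions).

```python
import tensorflow as tf

# Lower the learning rate when the validation loss stops improving.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss",  # quantity watched for a plateau
    factor=0.5,          # halve the learning rate on each trigger
    patience=3,          # epochs with no improvement before reducing
    min_lr=1e-6,
)

# 'model', 'train_ds', and 'val_ds' are assumed to be defined as in the
# earlier sketches.
# model.fit(train_ds, validation_data=val_ds,
#           epochs=50, callbacks=[reduce_lr])
```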

A DL-derived ML classifier was developed by Javor et al. [9] in Dec. 2020 for Covid-19 classification. They collected a private dataset of 6868 CT samples, including 3102 Covid-19-positive and 3766 Covid-19-negative scans, separated into 80:20 training and validation sets. For testing, 90 images from 90 patients were used: 45 Covid-19-positive patients selected randomly and 45 negative patients chosen manually. Images were resized to 448 × 448 pixels. The proposed model reached an overall accuracy of 98.6%, a sensitivity of 99.3%, and a specificity of 75.8%. Ko et al. [38], in 2020, established a 2D DL framework using CT images to diagnose Covid-19 pneumonia and discriminate it from nonCovid pneumonia and non-pneumonia. TL was used to create FCONet (Fast-Track Covid-19 classification network), built on pre-trained DL models (VGG16, ResNet-50, Inception-v3, and Xception). They assembled the dataset from two hospitals and two public platforms [76, 77], containing 3993 CT images. After data augmentation, the image set contains 31,940 CT images, including Covid-19 pneumonia (9550), other pneumonia (10,860), normal lung (7890), and lung cancer (3550). The dataset was separated into training and testing sets at an 80%:20% ratio. Among the four pre-trained FCONet backbones, ResNet50 showed outstanding performance with 99.58% sensitivity, 100% specificity, and 99.87% accuracy. On an external test set of poor-quality CT scans, ResNet-50 attained the highest accuracy of 96.97%, followed by Xception, Inception-v3, and VGG16 with 90.71%, 89.38%, and 87.12%, respectively. Dutta et al. [112], in Jan. 2021, suggested a Covid-19 recognition system using TL with a multilayer DCNN (InceptionV3) on CT images. The dataset was taken from Kaggle [74] for binary classification into ‘Covid+ve’ and ‘Covid-ve’ classes: 279 images of each class for training and 70 images of each class for validation, i.e., an 80:20% train:validation split. The last few layers of the model were replaced with a customized DNN of four layers, including a flatten layer, a dense layer, dropout, and a sigmoid activation function. The proposed structure performed the classification task with 84% accuracy.

Javaheri et al. [40], in Feb. 2021, created CovidCTNet, an open-source DL technique for recognizing Covid-19 from CT scans. In the CovidCTNet framework, compound preprocessing steps were applied to the CT samples using the BCDU-Net architecture, which is designed based on U-Net. They collected a dataset from a public repository [85] and five medical centers in Iran. BCDU-Net differentiates Covid-19 from CAP and other lung illnesses. The assembled dataset contains 89,145 CT images, including 32,230 confirmed Covid19 samples, 25,699 CAP samples, and 31,216 with healthy lungs or other illnesses. All CT slices were resized from 512 × 512 to 128 × 128, and the dataset was divided into 90% for training and 10% for testing. The CovidCTNet system attained 91.66% accuracy, 87.5% sensitivity, 94% specificity, and a 95% AUC on the experimental data. Jiang et al. [36], in Feb. 2021, suggested recognizing Covid-19 from thoracic CT images using DL with five widely used pre-trained models: VGG16, ResNet50, Inception-v3, InceptionResNetV2, and DenseNet169. The originally collected dataset comprises Covid-19 (349) and non-Covid-19 (397) CT images gathered from preprints and 888 lung-cancer CT scans from LUNA16. CycleGAN was used for synthetic image generation, after which the dataset was organized into Covid, nonCovid, and lungCancer classes containing 1300 CT images each. All samples were resized to 512 × 512 pixels, and a 66.33%:33.33% splitting ratio was used for training (1000 images) and testing (300 images). The experiments show that DenseNet169 attained the best performance on all measures, with accuracy, recall, precision, and F1-score of 98.92%, 97.80%, 100.00%, and 98.89% on the synthetic dataset and 98.09%, 97.80%, 97.37%, and 97.92% on the real dataset, respectively.

The ReCOV-101 approach proposed by Rohila et al. [124] in Jun. 2021 uses complete lung CT scans to detect possible Covid-19 infection. They used the pre-trained DCNN models ResNet-50, ResNet-101, DenseNet-169, and DenseNet-201. With ResNet-101 as the backbone of ReCOV-101, the vanishing-gradient problem is resolved through skip connections, which bypass some layers and connect closer to the output, while regularization skips layers that hurt performance. The dataset was collected from MosMedData [123] and partitioned by the degree of severity of the Covid-19 infection (ground-glass opacity (GGO) involvement of the lung parenchyma at 25%, 50%, and 75% thresholds). Category CT-0 contains normal lung tissue; CT-1, parenchymal involvement below 25%; CT-2, between 25 and 50%; CT-3, between 50 and 75%; and CT-4, exceeding 75%. The collected dataset contains 1105 images, including 250 normal images in class CT-0 and 856 Covid slices across classes CT-1 to CT-4, distributed in 60:20:20 ratios for training, validation, and testing. The proposed structure achieved its best performance with ResNet101 using the Adam-1 configuration, reaching 94.9% accuracy.

4.2 Hybrid and Custom Deep Learning Models

Ying et al. [43], in Feb. 2020, created a DL-based CT diagnosis system called DeepPneumonia. The private datasets were collected from three hospitals and medical institutes, and the data were separated into randomized splits of 60%:10%:30% for the train, validation, and test sets. The DRE-Net structure was built on ResNet-50, with an FPN responsible for extracting the top-K features from every image. The proposed DRE-Net achieved 86% accuracy, 95% AUC, 96% recall, 79% precision, and an 87% F1-score. Turkoglu [49], in Jan. 2021, proposed a novel multiple-kernels ELM (Extreme Learning Machine) built on a DCNN, named MKs-ELM-DNN, for Covid-19 recognition. Features were retrieved from CT using a DenseNet201 structure. An online public database [76] comprising Covid19 and nonCovid labels was employed to evaluate the ELM classifier with different activation approaches: ReLU-ELM, PReLU-ELM, and TanhReLU-ELM. Finally, the class label is determined by voting over the estimated outcomes. After data augmentation, the 746 CT images (349 Covid-19 and 397 no-finding) were expanded to 3730 images, with the extended dataset containing 1745 Covid-19 and 1985 nonCovid19 samples. The highest accuracies achieved by the ReLU-ELM activation with multiple-kernels ELM were 87.02% without and 96.75% with data augmentation. The MKs-ELM-DNN classifier attained 98.36% accuracy, 98.28% sensitivity, 98.44% specificity, 98.22% precision, a 98.25% F1-score, and a 98.36% AUC for Covid-19 recognition.

Wu et al. [115], in Feb. 2021, suggested a hybrid structure called COVID-AL for diagnosing Covid-19 using weakly supervised DL from CT images. They assembled a dataset from three public online platforms [84, 114] and used Ronneberger's U-Net for lung-region segmentation. The system considered 962 CT images: 304 coronavirus-pneumonia, 316 pneumonia, and 342 normal. The data were split into training (70%), testing (10%), and validation (20%) sets. The proposed network design combines a 2D U-Net for lung-area segmentation with a 3D residual network for Covid-19 detection. In four downsampling steps, the encoder of the segmentation network retrieves image features through pairs of convolutional and pooling layers, and the decoder uses skip connections to add features from the same stage. The evaluation of COVID-AL gives 86.60% accuracy, 96.20% precision, and a 96.80% AUC.

In another study, Tiwari et al. [131] developed lightweight applications that combine modified TL models (VGG, DenseNet, ResNet, MobileNet) with capsule networks using CT images. All modifications follow a CNN + PrimaryCaps + CTscanCaps layer sequence built from DL variants. The highest classification accuracy of 99% was attained by MobileCapsNet. Modi et al. [132], in July 2021, suggested a capsule network for classifying CT scan images to detect Covid-19. The suggested DOCN was trained on 360 Covid-19 scans and 397 CT scans of other diseases and healthy subjects, and it uses three convolutional and three capsule layers. The system attained a binary classification accuracy of 98%, a sensitivity of 81%, and a specificity of 98.4%.

Apart from the above descriptions, the remaining systems are tabulated in Table 2, which highlights key aspects such as data sources, training models, class-wise numbers of images, data partitioning, and the performance metrics of the examined DL-based Covid-19 diagnostic systems using X-ray and CT frames with pre-trained, hybrid, and custom networks with DTL.

Table 2 Remaining summary of DL-based Covid-19 X-Ray, CT diagnosis systems

5 Diagnosis Based on Multimodal Radiography Images

Kassani et al. [97], in Apr. 2020, suggested DL-based automatic recognition of coronavirus infection from two modalities (X-ray, CT). The system used eight DCNN variants (MobileNet, DenseNet, Xception, ResNet, InceptionV3, InceptionResNetV2, VGGNet, NASNet). The extracted features were fed to an ML classifier to separate subjects into Covid-19 or other, which avoids task-specific data pre-processing. They assembled a multi-source dataset [72, 77] containing Covid-19-positive frames (117 X-ray, 20 CT) and healthy frames (117 X-ray, 20 CT). DenseNet121 feature extraction with a bagging-tree classifier attained the best performance at 99% classification accuracy; the second-best learner was a ResNet50 feature extractor trained with LightGBM at 98% accuracy. Horry et al. [101], in Aug. 2020, developed a Covid-19 identification method employing seven CNN versions on three radiography modalities (X-ray, ultrasound, CT), i.e., multimodal image classification. The CNNs used were VGG16/VGG19, ResNet50V2, InceptionV3, Xception, InceptionResNetV2, DenseNet121, and NASNetLarge. The experiments used four public datasets [76, 82, 118]; a total of 62,476 images were chosen from the source data, including 140 Covid-19, 320 pneumonia, and 60,361 normal X-ray images; 349 Covid-19 and 397 nonCovid CT images; and 399 Covid-19, 275 pneumonia, and 235 normal ultrasound images. The X-ray dataset was curated to eliminate images that were wrongly labeled; after curation, the X-ray set contains 139 Covid-19, 190 pneumonia, and 400 normal samples. After data augmentation, the complete dataset reached 34,560 samples, including Covid-19 (2920), pneumonia (2920), and normal (5840) for X-ray; Covid-19 (6000) and nonCovid (6000) for CT; and Covid-19 (2720), pneumonia (2720), and normal (5440) for ultrasound. N-CLAHE was applied to improve luminosity and sharpness, and images were resized to the classifier defaults of 224 × 224 for VGG and 299 × 299 for InceptionV3. The train and test partitions were selected randomly. Overall results show that ultrasound images deliver higher recognition accuracy than the X-ray and CT modalities. VGG19 achieved significant Covid-19 recognition against pneumonia or normal across the three modalities, with precisions of 86% for X-ray, 100% for ultrasound, and 84% for CT. Nath et al. [34], in Oct. 2020, introduced a DNN structure to detect Covid-19 for 2-class (Covid, nonCovid) and 3-class (Covid, nonCovid, pneumonia) classification from X-ray and CT. The dataset was collected from three public platforms [72, 73, 76]; the CXR dataset includes Covid19 (219), normal (1341), and viralPneumonia (1345) images, and the CT modality includes 349 Covid and 397 nonCovid images. The data were split in an 80:20 proportion for training and testing. The CNN model was constructed with 24 layers, and the SGD-with-momentum optimizer was used for both datasets with LR = 0.001. It achieves 99.68% and 71.81% accuracy for X-ray and CT, respectively.

Panwar et al. [57], in Nov. 2020, suggested a new DTL algorithm for binary classification and tested it on three diverse online radiology datasets [72, 78, 105] of X-ray and CT modalities. The system is a modified DCNN based on VGG, comprising the 19 weighted layers of VGG19. The experiments used PA-view CXRI to examine the lungs better and attained 95.61% accuracy, 94.04% sensitivity, 95.86% specificity, a 95% F1-score, and 96% recall. The weights acquired after training the network on CT also transferred to X-ray to a significant degree. The results reveal that an individual diagnosed with pneumonia had a higher probability of being flagged as a false positive by the proposed network. Alakus et al. [37], in Nov. 2020, proposed a DL-based Covid-19 recognition system trained on six models: four neural network variants (ANN, CNN, LSTM, RNN) and two hybrid models (CNN-LSTM, CNN-RNN). Using laboratory findings, X-ray, and CT images, the models learned to predict Covid-19 infection. The laboratory findings are: hematocrit, hemoglobin, platelets, red blood cells, lymphocytes, leukocytes, basophils, eosinophils, monocytes, serum glucose, neutrophils, urea, C-reactive protein, creatinine, potassium, sodium, alanine transaminase, and aspartate transaminase. The single dataset contains 520 no-finding and 80 Covid19 patients. The DTL experiments used batch sizes of 512, 256, 32, and 16, a learning rate of 0.001, 250 epochs, the ReLU activation function, and the SGD optimizer. The results were categorized by evaluation method: data split and cross-validation. LSTM gave the best results with tenfold cross-validation: 86.66% accuracy, 91.89% F1-measure, 86.75% precision, 99.42% recall, and 62.50% AUC. CNN-LSTM gave the best results among all DL models with the train-test (80:20) split method: 92.30% accuracy, 93% F1-measure, 92.35% precision, 93.68% recall, and 90% AUC.

For automatic Covid-19 detection, Hussain et al. [93], in Jan. 2021, constructed a new CNN model called CoroDet using primary X-ray and CT scan images. The system used seven public online repositories [72, 73, 75, 76, 105, 116, 117], combining, modifying, and preparing the COVID-R dataset. The COVID-R dataset was prepared for binary-class (Covid-19, normal), 3-class (Covid-19, pneumonia, normal), and 4-class (Covid-19, pneumonia-bacterial, pneumonia-viral, normal) classification. The binary classification uses 500 Covid-19 and 800 normal images; the 3-class classification adds 800 pneumonia-bacterial images to the binary-class samples; and the 4-class classification uses 400 pneumonia-viral and 400 pneumonia-bacterial samples together with the binary-class samples. The system attained 99.1% accuracy, 99.27% precision, 98.17% recall, and a 98.51% F1-score for 2-class classification; 94.2% accuracy, 95.37% precision, 97.47% recall, and a 98.62% F1-score for 3-class classification; and 91.2% accuracy, 94.27% precision, 96.17% recall, and a 97.51% F1-score for 4-class classification. The authors consulted clinicians to understand the differences between the CXRI classes. Perumal et al. [23], in Jan. 2021, introduced a new DL approach for classifying distinct pulmonary illnesses, including Covid-19, from CXR and CT scans. Using TL, Perumal showed that Covid-19 is considerably similar to viral-pneumonia lung infection, so knowledge gained by a model trained to spot viral pneumonia can be applied to Covid-19 detection. Haralick features were used for feature extraction to emphasize only the ROI for Covid-19 detection, since noise from infected areas and tissues makes it challenging to sense atypical image features. The proposed model uses three pre-trained networks (VGG16, ResNet50, and InceptionV3). The dataset, downloaded from two public repositories (NIH [82], Mendeley [78]), contains pulmonary-disease (81,176), bacterial-pneumonia (2538), viral-pneumonia (1345), normal (1349), and Covid-19 (205) CXRI, plus 202 Covid-19 CT images. The images were resized to 256 × 256. Histogram equalization (HE, to increase contrast) and Wiener filtering (to eliminate noise) were used to enhance image quality. The analysis shows that, out of 407 samples, 385 Covid-19 images were correctly categorized under the Covid-19 class and 22 were wrongly categorized under the non-viral-pneumonia class, supporting the conclusion that Covid-19 is related to viral pneumonia; the misclassification rate for viral pneumonia was 0.012. When the pneumonia model was tested with the Covid-19 dataset, VGG16 offered the highest accuracy of 93.8%.
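The contrast and denoising steps reported by Perumal et al. (histogram equalization plus Wiener filtering) can be reproduced generically with OpenCV and SciPy; the sketch below is illustrative only, and the filter window size and file path are assumptions.

```python
import cv2
import numpy as np
from scipy.signal import wiener

def enhance_cxr(path, size=(256, 256)):
    """Resize, boost contrast with histogram equalization, then denoise."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, size)
    img = cv2.equalizeHist(img)                    # HE: increase contrast
    img = wiener(img.astype(np.float64), (5, 5))   # Wiener filter: remove noise
    return np.clip(img, 0, 255).astype(np.uint8)

# enhanced = enhance_cxr("sample_cxr.png")
```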

A CNN technique for distinguishing healthy people from Covid-19 and pneumonia patients was created by Gilanie et al. [135] in Apr. 2021. The model was trained on three classes (normal, pneumonia, and Covid) using three openly accessible datasets and a locally created one (Radiology Dept., BVHB, Pakistan). The datasets consist of 15,108 images, including 7021 X-ray and CT images for each of the normal and pneumonia classes and 1066 Covid-19 frames (539 X-ray, 527 CT). Each image was shrunk to 256 × 256. The dataset was separated into 60%, 20%, and 20% splits for the train, cross-validation, and test sets. The proposed method attained an average of 96.68% accuracy, 95.65% specificity, and 96.24% sensitivity. The quantitative information for multimodal imaging is listed in Table 3.

Table 3 Summary of DL-based Covid-19 multimodal diagnosis systems

5.1 Imaging Segmentation for Covid-19 Diagnosis

Voulodimos et al. [96], in May 2020, proposed DL-based classification and semantic-segmentation labeling of diseased lung regions for Covid-19 from CT scans. The suggested system used FCN-8 and U-Net for Covid-region segmentation [136]. Given an input CT image, FCN-8 tends to create coarser borders, whereas U-Net produces smaller, smoother regions than the original annotated region. The dataset for the experiment was collected from Radiopaedia [80] and consists of 939 cross-sectional images, including 447 CT slices annotated as negative and 492 as positive. Of the collected dataset, 85% was used for training and validation and 15% for testing; within that, 90% was used for training and 10% for validation. The system reached 99% accuracy, 89% recall, 91% precision, and an 89% F1-score on the validation set, with a typical execution time per image between 0.01 s and 0.018 s.
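For reference, a very small U-Net-style encoder-decoder of the kind used for Covid-region segmentation can be written in a few lines of Keras. The depth, filter counts, and input size below are illustrative assumptions and are far smaller than the networks used in the reviewed papers.

```python
import tensorflow as tf
from tensorflow.keras import layers

def tiny_unet(input_shape=(128, 128, 1)):
    inputs = tf.keras.Input(shape=input_shape)

    # Encoder: two downsampling stages.
    c1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
    p1 = layers.MaxPooling2D()(c1)
    c2 = layers.Conv2D(32, 3, padding="same", activation="relu")(p1)
    p2 = layers.MaxPooling2D()(c2)

    # Bottleneck.
    b = layers.Conv2D(64, 3, padding="same", activation="relu")(p2)

    # Decoder with skip connections to the encoder features.
    u1 = layers.UpSampling2D()(b)
    u1 = layers.concatenate([u1, c2])
    c3 = layers.Conv2D(32, 3, padding="same", activation="relu")(u1)
    u2 = layers.UpSampling2D()(c3)
    u2 = layers.concatenate([u2, c1])
    c4 = layers.Conv2D(16, 3, padding="same", activation="relu")(u2)

    # Per-pixel probability of belonging to the infected region.
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)
    return tf.keras.Model(inputs, outputs)

model = tiny_unet()
model.compile(optimizer="adam", loss="binary_crossentropy")
```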

Abdel-Basset et al. [137], in Jan. 2021, demonstrated a dual-path DL network for semi-supervised few-shot segmentation of Covid-19 (FSS-2019-nCov) infection. FSS-2019-nCov delivers a precise delineation of Covid-19 infection from a small number of labeled images. Each path uses an encoder-decoder (ED) architecture to extract features while preserving the channel information of the Covid-19 CT slices; the ED structure comprises an encoder, a context-enrichment module, and a decoder. FSS-2019-nCov uses ResNet34 for feature extraction in the encoder. The authors introduced a Smoothed Atrous Convolution block, a Multi-scale Pyramid Pooling block, and an adaptive Recombination-Recalibration (RR) unit, allowing rich information sharing between the two paths. The experimental results achieve a dice similarity coefficient (DSC) of 79.8%, a sensitivity of 80.3%, a specificity of 98.6%, and a mean absolute error (MAE) of 6.5%.
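The DSC and MAE values reported above are straightforward to compute from a predicted and a ground-truth mask; a small illustrative NumPy helper is given below (the smoothing constant is an assumption added to avoid division by zero).

```python
import numpy as np

def dice_coefficient(pred, target, smooth=1e-6):
    """DSC = 2|P ∩ G| / (|P| + |G|) for binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + smooth) / (pred.sum() + target.sum() + smooth)

def mean_absolute_error(pred, target):
    """Pixel-wise MAE between a predicted and a ground-truth mask."""
    return np.abs(pred.astype(float) - target.astype(float)).mean()

# Example with two 4x4 toy masks.
p = np.array([[1, 1, 0, 0]] * 4)
g = np.array([[1, 0, 0, 0]] * 4)
print(dice_coefficient(p, g), mean_absolute_error(p, g))
```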

6 Open Discussion, Challenges, and Future Trends

6.1 Open Discussion

The primary objective of this analysis is to find the most effective DL models for detecting, segmenting, and classifying Covid-19 using CNN variants. Table 4 summarizes the outcomes of the individual systems and compares the different DL experimental setups reported in the studied articles. In the comparative analysis of experimental results in Table 4, each row reports the number of layers used in the model, the kernel size, pool size, stride, batch size, image size, learning rate, number of epochs, performance evaluation metrics, activation function, optimizer, and libraries used for Covid-19 detection. Throughout the study, 64 systems were examined: 36 based on X-ray images, 20 on CT, and 8 on multimodal imaging. Most systems used multi-source data; only a few of the studied articles relied on a single data source. Common to DL techniques for imaging Covid-19 is the need for enormous datasets with reliable reference diagnoses, such as RT-PCR confirmation of Covid-19, and outcomes such as death and discharge duration [138]. Integrating time-series medical data containing repeated samples and blood tests could significantly enlarge training datasets. Some researchers and radiological studies have taken the initiative to release Covid-19 datasets on public platforms [71,72,73,74,75,76,77,78,79,80,81,82, 136].

Table 4 Summary report of work done on experimental setup in the studied papers

Most Covid-19 detection systems assessed accuracy, sensitivity, specificity, and F1-score as performance evaluation metrics (Appendix A). For single-modality imaging, the highest accuracy of 100% was achieved by [99]; 100% specificity by [38, 39, 51, 53, 55, 68]; 100% sensitivity (recall) by [52, 63, 68, 70, 92]; 100% precision by [35, 36, 39, 54, 55, 68]; a 99.99% AUC by [98]; and the best F1-score of 99.99% by [39]. For 2-mode imaging (X-ray, CT), the highest accuracy of 99.68% was achieved by [97], and the best recall (98.17%) and precision (99.27%) by [93]. For 3-mode imaging (X-ray, CT, ultrasound), the highest precision of 100% was achieved by [101]. By classification task, 99.94% accuracy was achieved for 4-class classification by CovidAID [63], 99.72% accuracy for 3-class classification by COVIDetection-Net [41], and 99.97% accuracy for binary classification by [63]. Among custom networks, the highest accuracy of 99.99% was attained by COVID-CheXNet [39]. The X-ray dataset [72] is the most frequently used, appearing 23 times (analysis from Table 1). The slowest detection time of 342.92 s was taken by the hybrid EDL-COVID [60] model, whereas the shortest execution times were 0.013 s and 0.014 s by [47] and [139]. However, custom models carry consequences such as computational efficiency, layer size, epochs, training parameters, number of layers, etc.

Compared with the pre-trained networks, most of the evaluated custom models scored higher. Because different data sizes were used in practically every investigation, the efficiency of the developed models varied with respect to the data source and is therefore not directly comparable. In terms of imaging modality, X-ray-based systems performed best, followed by CT and ultrasound.

However, one instance was the pneumonia analysis from CT [138], which aimed to autonomously detect and measure abnormal structures throughout the chest, allowing detailed examination of non-contrast chest CT scans for scientific purposes. This process detects pneumonia-related lung regions, their subdivisions, and anomalies. It also evaluates larger anomalies, which have been linked to severe conditions. These findings could be used to assess the degree and progression of anomalies in Covid-19 patients [138]. Such evaluation methods should be deployed at the remote console, i.e., the clinician's viewing desk. Attempts are being made to ensure the right user experience and interoperability for such concepts using limited processing power, hardware, and internet connectivity [138]. The overwhelming majority of cutting-edge DNNs are trained on 2D images, and 3D scans such as CT and MRI add to this underlying issue. Because standard DL networks are not fine-tuned for volumetric data, prior experience is invaluable when applying DL models to such imagery [140]. Moreover, some solutions used large sample datasets from other diseases, whereas Covid-19 instances were limited. Across the study, binary to multi-class classifications are considered. The Covid-19 detection systems [41, 42, 46, 72, 91, 112] managed to obtain an equal number of samples in each class to achieve the highest accuracy.

Numerous publications neglected to report the diagnostic efficacy of their technique, even though distinguishing severe Covid-19 instances from normal chest X-rays may not have been challenging [62, 141]. Based on the degree of severity, Covid-19-affected lungs were classified as severe vs. non-severe [142], by severity score [143], or as normal vs. severe [144] only. In [16], the dataset was used for model training to estimate the severity of the affected lungs, but the results were not satisfactory. Furthermore, investigators rarely justified why a particular network architecture was preferred over another and did not compare their results against alternative CNN designs [141]. The study [145] shows how dropout weights can affect the uncertainty of a DL model for Covid-19 prediction. Beyond disease classification, Alazab et al. [146] predicted the Covid-19 outbreak using LSTM and time-series analysis for coastal areas; their study indicates that the Covid-19 outbreak is higher in coastal regions than in non-coastal ones.

The medical observations [45] for Covid-19 CT scans are radial, symmetrical, sub-pleural, multifocal, posterior, frontal, and middle expansion of the airways; broncho-vascular thickening (thickening within the lesion); traction bronchiectasis; and a crazy-paving appearance (GGOs with inter-/intra-lobular septal thickening). Likewise, the following aspects are seen in pneumonia sufferers' CT images: reticular opacity, central distribution of GGO, unilateral involvement, and more widespread distribution along the bronchovascular bundle (vascular and bronchial wall thickening). CAM and GradCAM [65] play an important role in object detection for visualizing the Covid-19-affected lung area on radiograph images and drawing bounding boxes [136]. A lightweight Raspberry Pi-based GUI application (LDC-Net) [47] may be deployed with radiography machines (X-ray, CT, ultrasound), which may further help physicians with diagnostic assistance.
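Since CAM/GradCAM visualizations recur throughout the surveyed systems, a minimal Grad-CAM sketch is given below. The ResNet18 backbone, the choice of `layer4` as target layer, and the random input tensor are illustrative assumptions rather than any cited paper's setup.

```python
# Minimal Grad-CAM sketch: highlight the image regions that drive a CNN's
# Covid-19 prediction (illustrative backbone and target layer).
import torch
import torch.nn.functional as F
import torchvision

model = torchvision.models.resnet18(weights=None)
model.eval()

feats, grads = {}, {}

def save_feats(module, inp, out):
    feats["a"] = out                 # activations of the last conv stage

def save_grads(module, grad_in, grad_out):
    grads["g"] = grad_out[0]         # gradients w.r.t. those activations

model.layer4.register_forward_hook(save_feats)
model.layer4.register_full_backward_hook(save_grads)

x = torch.randn(1, 3, 224, 224)      # stand-in for a preprocessed chest X-ray
score = model(x)[0].max()            # logit of the predicted class
score.backward()

# Grad-CAM: weight each feature map by its average gradient, sum, and rectify.
weights = grads["g"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((weights * feats["a"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # heatmap in [0, 1]
```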

In the case of ultrasound imaging for Covid-19, periodic respiratory inspection and the progression of Covid-19 illness are visible as B-lines from the initial phases to the severe phases [101]. Radiological indicators in LUS could also be exploited in Covid-19 patient management [27]. A 4-level grading scheme with a scale from zero to three was developed for coarse disease assessment. Grade 0 denotes the existence of a continuous pleural line followed by horizontal artifacts known as A-lines [147], which represent a healthy lung area. Grade 1 marks the earliest evidence of abnormality, i.e., the appearance of pleural-line alterations associated with vertical artifacts. Grades 2 and 3 indicate a more advanced pathological condition, with moderate or substantial consolidations appearing, respectively. In addition, grade 3 is confirmed by a larger hyperechogenic region [136] beneath the pleural area, known as the “white lung.” In [148], LUS imaging identification signs were collected from radiology practitioners for healthy lungs (horizontal A-lines), pneumonia-infected lungs (alveolar consolidations), and SARS-CoV-2-infected lungs (sub-pleural consolidation and focal B-lines). B-lines are the most common pathological sign in LUS and are caused by pulmonary edema or non-cardiac sources of interstitial disease [149]. Covid-19 LUS images show patchy B-lines, fragmented/irregular pleural lines, and sub-pleural consolidation as common findings [150]. Unlike an X-ray or CT scan, LUS does not use ionizing radiation [151].

The study shows that the datasets used for the experiments were split either by ratio (hold-back, e.g., 80:10:10 for training, testing, and validation) or by fivefold or tenfold cross-validation [60]. The diagnostic performance was assessed and compared against professional radiologists on an independent test dataset [91]. To increase the sample size, default transformations and custom augmentations are offered by fastai2 [91]. Issues such as large storage requirements, noisy images, image classification, and image retrieval tend to be tedious when developing a DL-based system; to overcome such problems, CBIR (Content-Based Image Retrieval) [152] was created. Similar issues can also be addressed using ML-based classification algorithms such as Decision Tree, SVM, Naive Bayes, and KNN [153].
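A minimal scikit-learn sketch of the two evaluation protocols mentioned above (an 80:10:10 hold-back split and stratified k-fold cross-validation) is given below; the data arrays are placeholders.

```python
# Hold-back split and k-fold cross-validation, as used in the surveyed studies.
import numpy as np
from sklearn.model_selection import train_test_split, StratifiedKFold

X = np.random.rand(1000, 224 * 224)      # flattened images (placeholder)
y = np.random.randint(0, 2, size=1000)   # 1 = Covid-19, 0 = non-Covid

# 80:10:10 split for training, validation, and testing.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=42)

# Fivefold cross-validation over the training portion.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (tr_idx, va_idx) in enumerate(skf.split(X_train, y_train)):
    print(f"fold {fold}: {len(tr_idx)} train / {len(va_idx)} val samples")
```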

6.2 Challenges and Future Trends

There are numerous obstacles to using DL methods to detect the novel coronavirus. Although DL applied to lung ultrasound, CT, and X-ray images demonstrates promising Covid-19 detection outcomes, the technique requires a large dataset and substantial computation to train the networks and develop a robust diagnosis system.

The available datasets are noisy [151] and blurry, contain unnecessary content such as catalog scripts, symbols, and manufacturer-specific user interfaces [150], and are not accurately labeled. The developed systems use data downloaded from the internet, so there is a high possibility of data duplication, missing data, weakly labeled datasets [154], or limited labeled datasets [155]. The articles referred to in this study [46, 47, 63, 65, 69, 70, 93, 94, 101] used different multisource datasets for their experimentation. For this reason, it is quite difficult to determine which system yields the most satisfactory result. With a small amount of data, a DL architecture may overfit, which can degrade the performance of the developed system. To expand the sample size of the Covid-19 class from limited datasets, various authors applied data augmentation [38, 48, 49, 69, 101, 103].
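The following minimal torchvision sketch illustrates the kind of augmentation pipeline such studies apply to enlarge the Covid-19 class; the specific transforms and parameter values are assumptions, not taken from any single cited paper.

```python
# Illustrative on-the-fly augmentation pipeline for chest radiographs.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=10),            # small rotations
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomResizedCrop(224, scale=(0.9, 1.0)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
])

# Applying `augment` inside a Dataset/DataLoader yields a different variant of
# each Covid-19 image every epoch, effectively enlarging the minority class.
```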

Cross-Dataset Performance A significant model performance assessment is the cross-dataset evaluation of the features learned by a CNN: training on one source and testing on another data source. For sensitivity and specificity, this produces widely varying quantitative results, with differences of 10 to 40%, as observed by [141] for AlexNet, ResNet, VGG, SqueezeNet, and DenseNet. In [106], the cross-dataset analysis also resulted in accuracy drops. For clinical decision-making, larger and more diverse datasets should be used to assess the approaches in a real-world situation. Multiple authors used the same dataset for training and testing, so the reported system accuracy is very high; however, this raises a serious concern about inspecting the efficiency of feature maps generated by a CNN on one dataset when tested on features from a different source [141]. Reducing features in the system [39] is not an intelligent approach while training a model, because it leads to lower accuracy in the validation and test phases.
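The cross-dataset protocol itself is simple to express. The sketch below trains on one source and evaluates on another; the logistic-regression classifier and the feature matrices are placeholders standing in for the CNN-derived features examined in [141].

```python
# Cross-dataset evaluation: fit on a source dataset, test on an unseen target
# dataset, which typically exposes the 10-40% performance drop discussed above.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def cross_dataset_eval(X_src, y_src, X_tgt, y_tgt):
    """Train on the source dataset, evaluate on a different target dataset."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_src, y_src)
    in_domain = accuracy_score(y_src, clf.predict(X_src))    # optimistic
    cross_data = accuracy_score(y_tgt, clf.predict(X_tgt))   # generalization
    return in_domain, cross_data
```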

Severity Level Recognition of Covid-19 Covid-19 severity levels such as mild, moderate, and severe are used to build triage systems with high clinical value [156]. Apart from severity, studies of Covid-19 detection in pregnancy and childbirth [24] found that pregnancy did not aggravate the course of symptoms or the CT characteristics of Covid-19. The suggested platform's efficiency, however, was not compared against that of radiologists.

Custom Models In a multi-class problem (with three or more categories), the number of features in the CoVNet-19 feature matrix obtained from a single DCNN could be raised to 64 or 128. CoVNet-19 [48] can be optimized and changed into a lighter version, since the authors employed a stacked ensemble system with two DCNNs and an SVC classifier [48]. The authors state that they used a global filter to sense local features with a filter dimension of 3 × 3, but features may still be missed by the chosen filters. CheXNet [39] contains flaws, such as sample-count unpredictability with respect to data-sequencing variations and sensitivity to adversarial attacks. Using CheXNet to improve the identification of pulmonary abnormalities in CXRs necessitates the creation of ensemble methods, which are presently prone to overfitting when trained on the available images of Covid-19-infected lungs [65].
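To make the stacked-ensemble idea concrete, the following sketch extracts features with two DCNN backbones and trains an SVC on the concatenated feature vectors. The ResNet18/ResNet34 backbones, feature sizes, and dummy data are our assumptions and not the CoVNet-19 implementation.

```python
# Stacked ensemble sketch: two DCNN feature extractors + SVC meta-classifier.
import numpy as np
import torch
import torchvision
from sklearn.svm import SVC

def make_extractor(backbone):
    backbone.fc = torch.nn.Identity()   # drop the classification head
    backbone.eval()
    return backbone

net1 = make_extractor(torchvision.models.resnet18(weights=None))  # 512-d
net2 = make_extractor(torchvision.models.resnet34(weights=None))  # 512-d

@torch.no_grad()
def extract_features(images):           # images: (N, 3, 224, 224) tensor
    f1, f2 = net1(images), net2(images)
    return torch.cat([f1, f2], dim=1).numpy()   # (N, 1024) stacked features

# Illustrative training of the SVC on dummy data with three classes
# (e.g., Covid-19 / pneumonia / normal).
images = torch.randn(16, 3, 224, 224)
labels = np.arange(16) % 3
svc = SVC(kernel="linear").fit(extract_features(images), labels)
```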

Ninety percent of the referred articles focused on developing standalone Covid-19 detection systems, including computer-aided detection (CAD) [157, 158], which run on a single machine. There could therefore be a separate effort toward developing public web or enterprise applications for global data collection, processing, and Covid-19 detection. Nevertheless, a few studies developed web applications [10] on the cloud [159], where a doctor can upload an X-ray or CT image to the application and receive a simple Covid-19(Yes) or Covid-19(No) result [97]. Finding the severity of patients using radiography images is also a challenging task. It is also hard to determine from the available datasets the most common age group of Covid-19 positive patients, or the gender/geography distribution [160]. No author has categorized the Covid-19 phases (first wave, second wave, third wave, Delta variant) based on the detection of Covid-19.
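A cloud-hosted upload-and-predict service of the kind described above could look like the following minimal Flask sketch; the endpoint name, preprocessing, and stand-in classifier are hypothetical placeholders rather than any cited system.

```python
# Minimal web endpoint: upload a chest image, receive a Covid-19 Yes/No answer.
import io
import torch
from PIL import Image
from torchvision import transforms
from flask import Flask, request, jsonify

app = Flask(__name__)

# Stand-in for a trained binary Covid-19 classifier; in practice a trained
# network would be loaded here instead.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 224 * 224, 1))
model.eval()

prep = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

@app.route("/predict", methods=["POST"])
def predict():
    img = Image.open(io.BytesIO(request.files["image"].read()))
    x = prep(img).unsqueeze(0)
    with torch.no_grad():
        prob = torch.sigmoid(model(x))[0, 0].item()
    return jsonify({"covid19": "Yes" if prob >= 0.5 else "No",
                    "probability": round(prob, 3)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```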

In addition, medical investigations would be needed to prove the remarkable precisions reported in the studied literature [161]. The clinical evaluation of a developed system should be verified in collaboration with hospitals and medical research centers. After generating the CoroDet [93] system, the DL researchers approached radiologists to understand the effects of coronavirus and pneumonia on the lung segments [93] and collected the necessary information. It is hard to suggest an excellent yet concise DL network for the screening of Covid-19. For the storage of vast datasets, big data on the cloud [159] plays a significant role in DL [162]. Because LUS pictures are blurry, data filtration, Fourier processing, and deconvolution are used to improve the quality of radiography images. CLAHE, HE, and AHE were used, and contrast-limited AHE [65] was employed before the workflow to improve the clarity of ultrasound images [101]. A U-Net-based semantic segmentation [163] can also distinguish pulmonary pixels from body tissue and background [65]. Data fusion enables us to mix several kinds of data to enhance the classification accuracy of the model. A web application could be developed that accepts radiological lung pictures as input and produces a probability of Covid-19 or pneumonia occurrence along with a heatmap highlighting the likely contaminated areas [69]. We believe that DL-based quantification can help in managing patients with worsening respiratory status and moderate or severe Covid-19 infection.
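The CLAHE enhancement step mentioned above can be reproduced with OpenCV as in the short sketch below; the file names and CLAHE parameters are illustrative.

```python
# Contrast-Limited Adaptive Histogram Equalization (CLAHE) for a blurry
# ultrasound or X-ray frame: equalizes contrast in small tiles while clipping
# the histogram to avoid amplifying noise.
import cv2

img = cv2.imread("lus_frame.png", cv2.IMREAD_GRAYSCALE)

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(img)

cv2.imwrite("lus_frame_clahe.png", enhanced)
```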

6.3 Technical Limitations

See Table 5.

Table 5 Technical limitations for COVID-19 diagnosis

7 Conclusion

This paper reviews the recent DL techniques utilized to classify Covid-19 from different lung and chest imaging modalities. However, follow-up X-ray, CT, and ultrasound studies for Covid-19 classification are limited. Widely accepted changes in pulmonary lesion patterns could be witnessed, including GGO, together with high mortality in the comparatively early phases. Datasets used in the various experiments are provided and discussed in Sect. 2. The significant barriers to the current methods are underlined in Sect. 6. The same set of data accumulated by Cohen [72] is used in most studies. The DL-based methods implemented in the literature for detecting Covid-19 show impressive outcomes. Even with these remarkable results, there is still much room for improvement.

In most cases, a limited set of images of Covid-19 diseased samples is used. It is necessary to create public and diversified data sources. Radiology professionals must authenticate the datasets and categorize them with the respective lung-illness abnormalities. The majority of existing methodologies used binary classification, although various other factors can also cause pneumonia. GAN's encouraging outcomes are worthy of additional research. There is also the possibility of developing a standard Covid-19 detection system that accepts multimodal radiography images, instead of creating an individual system for each modality. The blended utilization of AI and radiography imaging modalities can compensate for limited hospital facilities while assisting in the definitive screening and diagnostic forecasting of Covid-19. Healthcare professionals and software developers, on the other hand, should interact regularly and use their complementary skills to verify the utility of DL methods. We are enthusiastic about these frameworks' great versatility and believe their fundamental constraints can be resolved. We hope that this beginning will assist the audience in narrowing their focus and pursuing such implementations for the classification of Covid-19 variants (Alpha, Beta, Delta, Omicron, Deltacron, etc.) using medical imaging [47] with deep learning techniques.