Abstract
In deep-learning-based lung sound classification, the short-time Fourier transform (STFT) is the most commonly used 2D representation of the input data. Although the STFT has therefore been widely adopted as an analytical tool, other time-frequency (TF) representations have also been developed. This study evaluates and compares the performance of the spectrogram, scalogram, melspectrogram and gammatonegram representations, and provides comparative information on the suitability of these TF techniques for lung sound classification. The lung sound signals were obtained from the ICBHI 2017 respiratory sound database. The recordings were converted into spectrogram, scalogram, melspectrogram and gammatonegram images. The four types of images were fed separately into the VGG16, ResNet-50 and AlexNet deep-learning architectures. Network performance was analyzed and compared in terms of accuracy, precision, recall and F1-score. Across these three commonly used CNN architectures, the gammatonegram and scalogram TF images coupled with ResNet-50 achieved the highest classification accuracies.
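For readers unfamiliar with the four evaluation measures named above, the following is a minimal sketch of how accuracy, precision, recall and F1-score can be computed from true and predicted class labels. It is an illustration only, not the authors' evaluation code: the function name `classification_metrics`, the example label names, and the choice of macro-averaging across classes are all assumptions, since the abstract does not state how per-class scores were aggregated.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1-score.

    Hypothetical helper for illustration; macro-averaging is an assumption.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    # Count true positives, false positives and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample was not p
            fn[t] += 1  # true class t was missed

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Made-up labels for six recordings (not data from the study).
y_true = ["normal", "normal", "crackle", "crackle", "wheeze", "wheeze"]
y_pred = ["normal", "crackle", "crackle", "crackle", "wheeze", "normal"]
metrics = classification_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in metrics.items()})
```

Macro-averaging weights every class equally, which can be informative when class sizes are imbalanced; a study using weighted or micro-averaging would report different precision, recall and F1 values from the same predictions.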
- Research funding: None declared.
- Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
- Competing interests: Authors state no conflict of interest.
- Informed consent: Informed consent was obtained from all individuals included in this study.
- Ethical approval: The local Institutional Review Board deemed the study exempt from review.
© 2022 Walter de Gruyter GmbH, Berlin/Boston