Random Forest Based Deep Hybrid Architecture for Histopathological Breast Cancer Images Classification

  • Conference paper
Computational Science and Its Applications – ICCSA 2022 (ICCSA 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13376)

Abstract

Breast cancer is the most common cancer in women worldwide. Although early diagnosis and treatment can significantly reduce the mortality rate, accurately assessing cancerous cells and tissues remains a challenging task for pathologists. Machine learning techniques therefore play a significant role in assisting pathologists and improving diagnostic results. This paper proposes a hybrid architecture that combines three of the most recent deep learning techniques for feature extraction (DenseNet_201, Inception_V3, and MobileNet_V2) with random forest to classify breast cancer histological images from the BreakHis dataset at its four magnification factors: 40X, 100X, 200X, and 400X. The study evaluated and compared: (1) the developed random forest models with their base learners, (2) the designed random forest models with the same architecture but different numbers of trees, (3) the decision tree classifiers with the best random forest models, and (4) the best random forest models of each feature extractor. The empirical evaluations used four classification performance criteria (accuracy, sensitivity, precision, and F1-score), 5-fold cross-validation, the Scott-Knott statistical test, and the Borda count voting method. The best random forest model achieved a mean accuracy of 85.88% and was constructed using 9 trees, the 200X magnification factor, and Inception_V3 as the feature extractor. The experimental results demonstrated that combining random forest with deep learning models is effective for the automatic classification of malignant and benign tumors using histopathological images of breast cancer.
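The abstract names four performance criteria and a Borda count vote for ranking models. As a rough illustration only, the sketch below computes the four criteria from binary confusion counts and ranks candidate models with a simple Borda count. The model names, the confusion counts, and the tie handling (a model earns one point per model it strictly outperforms on a criterion) are assumptions made for this example. They are not the paper's data or its exact procedure.

```python
from typing import Dict, List


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, sensitivity (recall), precision and F1 from binary confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


def borda_points(scores: Dict[str, Dict[str, float]]) -> Dict[str, int]:
    """Sum Borda points over all criteria.

    For each criterion, a model earns one point for every other model it
    strictly outperforms. Without ties this equals the usual n-1, ..., 0 scheme.
    """
    models = list(scores)
    criteria = list(next(iter(scores.values())))
    points = {m: 0 for m in models}
    for c in criteria:
        for m in models:
            points[m] += sum(scores[m][c] > scores[o][c] for o in models if o != m)
    return points


def borda_ranking(scores: Dict[str, Dict[str, float]]) -> List[str]:
    """Models sorted from most to fewest Borda points."""
    points = borda_points(scores)
    return sorted(points, key=lambda m: points[m], reverse=True)


# Hypothetical confusion counts (tp, fp, tn, fn) for three made-up models.
confusions = {
    "model_a": (80, 10, 90, 20),
    "model_b": (70, 20, 80, 30),
    "model_c": (90, 30, 70, 10),
}
scores = {name: classification_metrics(*c) for name, c in confusions.items()}
points = borda_points(scores)
ranking = borda_ranking(scores)
print(points)
print(ranking)
```

In a setup like the one described, the per-model scores would typically be averages over the cross-validation folds rather than values from a single confusion matrix.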

Acknowledgement

This work was conducted under the research project “Machine Learning based Breast Cancer Diagnosis and Treatment”, 2020–2023. The authors would like to thank the Moroccan Ministry of Higher Education and Scientific Research, Digital Development Agency (ADD), CNRST, and UM6P for their support.

Author information

Corresponding author

Correspondence to Ali Idri.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Nakach, FZ., Zerouaoui, H., Idri, A. (2022). Random Forest Based Deep Hybrid Architecture for Histopathological Breast Cancer Images Classification. In: Gervasi, O., Murgante, B., Hendrix, E.M.T., Taniar, D., Apduhan, B.O. (eds) Computational Science and Its Applications – ICCSA 2022. ICCSA 2022. Lecture Notes in Computer Science, vol 13376. Springer, Cham. https://doi.org/10.1007/978-3-031-10450-3_1

  • DOI: https://doi.org/10.1007/978-3-031-10450-3_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-10449-7

  • Online ISBN: 978-3-031-10450-3

  • eBook Packages: Computer Science, Computer Science (R0)
