RRF-BD: Ranger Random Forest Algorithm for Big Data Classification

  • Conference paper
Computational Intelligence in Data Mining

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 990)

Abstract

Data are now growing at an exponential rate, which makes suitable classification of statistical data a major challenge. The need to extract data and insights and to mine information from datasets quickly and efficiently has drawn attention to the choice of classification strategy. This paper presents a Ranger Random Forest (RRF) algorithm for high-dimensional data classification. Random Forest (RF) is one of the most popular ensemble classification techniques because it provides variable importance measures, out-of-bag error, proximities, etc. To make classification feasible under runtime and memory constraints, we use three different datasets to evaluate runtime and memory utilization while retaining the same efficiency as the traditional random forest. We also show the improvements over Random Forest in computational time and memory, obtained without affecting the efficiency of the traditional Random Forest. Experimental results show that the proposed RRF outperforms the other methods in terms of memory utilization and computation time.
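The out-of-bag error mentioned in the abstract relies on the fact that each tree in a random forest is trained on a bootstrap sample and therefore never sees some of the rows. The following is a minimal sketch of that sampling step only, not the authors' RRF implementation; the function names `bootstrap_split` and `mean_oob_fraction` and the default sizes are assumptions made for illustration.

```python
import random


def bootstrap_split(n_rows, rng):
    """Draw n_rows indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    # Rows never drawn for this tree form its out-of-bag set.
    out_of_bag = sorted(set(range(n_rows)) - set(in_bag))
    return in_bag, out_of_bag


def mean_oob_fraction(n_rows=1000, n_trees=50, seed=0):
    """Average share of rows left out of a tree's bootstrap sample."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    total = 0.0
    for _ in range(n_trees):
        _, out_of_bag = bootstrap_split(n_rows, rng)
        total += len(out_of_bag) / n_rows
    return total / n_trees


if __name__ == "__main__":
    # The expected share approaches 1/e (roughly 0.37) as n_rows grows.
    print(f"mean out-of-bag fraction: {mean_oob_fraction():.3f}")
```

In a full forest, each row would be predicted only by the trees for which it is out-of-bag, and the resulting misclassification rate is the out-of-bag error, which is why no separate validation set is needed for that estimate.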

Acknowledgements

This research work is supported by the Indian Institute of Technology (ISM), Government of India. The authors express their gratitude to the Department of Computer Science and Engineering, Indian Institute of Technology (ISM), Dhanbad, India, for its research support.

Author information

Corresponding author

Correspondence to Dharavath Ramesh.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Rao, G.M., Ramesh, D., Kumar, A. (2020). RRF-BD: Ranger Random Forest Algorithm for Big Data Classification. In: Behera, H., Nayak, J., Naik, B., Pelusi, D. (eds) Computational Intelligence in Data Mining. Advances in Intelligent Systems and Computing, vol 990. Springer, Singapore. https://doi.org/10.1007/978-981-13-8676-3_2
