Elsevier

Energy

Volume 190, 1 January 2020, 116346
Property-based biomass feedstock grading using k-Nearest Neighbour technique

https://doi.org/10.1016/j.energy.2019.116346

Highlights

  • We applied an intelligent classification method to biomass resources.

  • The classification approach is based on the k-NN model.

  • The best classification result was obtained with the Mahalanobis distance function at k = 3.

  • This method offers a high prospect in strategic decision-making.

Abstract

Energy generation from biomass requires a nexus of different sources irrespective of origin. A detailed, scientific understanding of the class to which a biomass resource belongs is therefore essential for energy generation. An intelligent classification of biomass resources based on their properties offers high prospects for analytical, operational and strategic decision-making. This study proposes the k-Nearest Neighbour (k-NN) classification model to classify biomass based on its properties. The study classified 214 biomass data entries obtained from several articles published in reputable journals. Four values of k (k = 1, 2, 3, 4) were tested with various self-normalizing distance functions, and the results were compared for effectiveness and efficiency in order to determine the optimal model. The k-NN model based on the Mahalanobis distance function gave the best result at k = 3, with Root Mean Squared Error (RMSE), Accuracy, Error, Sensitivity, Specificity, False positive rate, Kappa statistic and Computation time (in seconds) of 1.42, 0.703, 0.297, 0.580, 0.953, 0.047, 0.622, and 4.7, respectively. The authors concluded that a k-NN-based classification model is feasible and reliable for biomass classification. The implementation of this classification model shows that k-NN can serve as a handy tool for classifying biomass resources irrespective of their sources and origins.

Introduction

Following the end of the Millennium Development Goals (MDGs), the United Nations (UN) General Assembly adopted the 2030 Sustainable Development Goals (SDGs) in September 2015. Focussing on inclusive development, the new agenda emphasizes a holistic approach to achieving sustainable development for all [1]. Incidentally, many of the proposed SDGs are dependent on biomass [2]. Goals 7, 9, 12 and 13 of the SDGs speak substantially to the need for alternative energy [1]. In light of the SDGs, biomass stands as the main avenue through which large-scale renewable energy can be sustainably achieved in order to ensure access to affordable, reliable, and modern energy for all. This would support action to combat climate change and its impacts by promoting the sustainable use of biomass resources, since power generation from fossil fuel continues to raise environmental concerns. Biomass has been identified as the only naturally occurring alternative to fossil fuel in terms of both carbon content and available quantity [3]. In 2013, 462 TWh of electricity was produced globally from biomass [4]. Since then, the global consumption of biofuel sourced from biomass has been increasing, and this trend promises to continue as long as the major issues surrounding the exploration of bioenergy are directly addressed. In the future, biomass energy has the potential to provide a cost-effective and sustainable energy supply to the population of developing countries [5,6]. There is an abundant supply of biomass feedstocks which can be converted to biofuels. These feedstocks include agricultural crops, residues, forest products, energy crops and algal biomass. Municipal Solid Waste (MSW) is also gaining attention as a viable source of biofuel [7,8].

There is a consensus that biomass fuel is a renewable energy resource, but the lack of widely accepted terminology, classification metrics, and global standards leads to serious misconceptions during investigations. The main concerns surrounding biomass exploration relate to how to extend and improve the basic understanding of the composition and properties of biomass, and how to apply this knowledge profitably in the interest of environmental safety [9]. Knowledge of the proximate properties of biomass and their classification is essential in selecting biomass as an energy feedstock [10]. Among the several criteria which have been adopted in the classification of biomass feedstock, the most prominent are based on origin and properties [11,12]. To the best of the authors' knowledge, there is no known scientific classification method based on artificial intelligence. Since energy generation from biomass often requires a nexus of different sources, a detailed and scientific understanding of the class to which a biomass resource belongs is essential. An intelligent classification of biomass resources based on the proximate properties of biomass feedstock offers high prospects for analytical, operational and strategic decision-making.

Artificial intelligence has opened a new page in the field of data analysis. Machine learning has been applied to several processes, including data mining and data analysis [13,14,15,16,17,18] related to supply chain management [19,20], biomass elemental composition [21], municipal solid waste management [22,23,24], energy consumption prediction [25] and so on. One of the primary objectives of data mining is classification. It is a form of predictive modelling which defines groups within the entire population. The process of classification focusses on finding a model which describes a data class. The aim of classification is to use the derived function to predict the group to which a data point with an unknown class label belongs. By using the classification technique, one can learn the rules that form categories of data [26]. Data classification has been applied in several fields, such as medicine, credit ranking, customer behaviour, and strategic management [27]. Some of the artificial intelligence-based techniques which have been applied to the classification of various data are the Bayesian classifier, Ensemble classifier, Artificial Neural Network (ANN), k-Nearest Neighbour (k-NN), and Support Vector Machine (SVM) [14,28]. k-NN is a common instance-based learning algorithm, which classifies an unknown object by ranking the object's neighbours among the training data. The result of this ranking is then used to predict the class of the new object. The k in k-NN denotes the number of nearest neighbours which are evaluated in order to determine the class to which a data point belongs [26,29]. Apart from its simplicity, k-NN has proven to be an effective and efficient algorithm for solving several real-life classification problems with good generalization and accuracy [30].
k-NN has been appraised as a transparent machine learning method with high predictive ability, even with little or no prior information about the data distribution [15,31]. It stands out as a data mining method of choice for several classification problems due to its notable advantages over competing methods. Some of these advantages include [31]:

  • i. Ability to handle training data that are too large to fit in memory;

  • ii. Ability to measure the similarities between training tuples and test tuples without prior knowledge of the data distribution;

  • iii. Reduction of error due to inaccurate assumptions;

  • iv. Lower computation time and high prediction accuracy.

The literature highlights the success of k-NN in the classification of data for several applications [15,17,31,32].
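The neighbour-ranking and voting procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation (the study reports using MATLAB); the function name `knn_predict`, the Euclidean distance, and the two-feature sample values are assumptions chosen purely for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample.
    dists = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Rank neighbours by distance; the sort is stable, so ties keep input order.
    dists.sort(key=lambda pair: pair[0])
    # Count the class labels of the k closest neighbours and return the winner.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already-scaled two-feature samples; not data from the study.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = ["A", "A", "A", "B", "B", "B"]

prediction = knn_predict(train_X, train_y, (0.2, 0.2), k=3)
print(prediction)
```

An odd k, as in the sketch, reduces the chance of tied votes in two-class problems; with more classes, ties can still occur and here are resolved in favour of the label encountered first among the nearest neighbours.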

This study proposes a proof of concept for the classification of biomass, based on the traditional property-based classification proposed by Khan et al. [11], using the k-NN classification algorithm. The research seeks to develop a consistent, flexible, direct and easy-to-implement classification method for biomass properties. Identifying biomass classes is vital to understanding the mechanisms which lead to varying biomass feedstock behaviour and to improving the prediction of biomass properties [33,34]. This will support informed decision-making about biomass resource management and feedstock production for industrial utilization, especially as related to energy generation. It will also enable power plant developers to ascertain the quality of the feedstock coming from various supply chains. Another objective of this study is to extend the boundary of knowledge on biomass selection decisions towards the hybridization of biomass sources for energy generation.

Section snippets

Literature survey

The application of k-NN as a machine learning approach spans more than 50 years [35]. Although k-NN is believed to have been introduced in 1951 in an unpublished medicine report, it did not gain much traction until 1960. Since then, it has become a renowned pattern recognition and classification technique [26,36]. k-NN has been widely applied to various classification and data recognition problems in several studies. Different k-values, distance metrics, and types of data have been applied.

Methodology

The essence of the classification is to apply the algorithm to information which comes from a known data source. This means that the class to which a data point belongs is evaluated as an input, from which it is transferred to a higher-dimensional space. In this new environment, the data are grouped with a distance function based on the k-NN algorithm. The data are therefore classified into new groups that reflect the behaviour of the algorithm.
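As an illustration of the kind of distance function used for this grouping, the sketch below computes the Mahalanobis distance, which the abstract reports as giving the best result. It is a simplified example restricted to two features so that the covariance inverse can be written in closed form; the helper names `covariance_2d` and `mahalanobis_2d` and the sample values are hypothetical and do not come from the study.

```python
import math


def covariance_2d(points):
    """Sample covariance matrix (2x2) of a list of two-feature points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return ((sxx, sxy), (sxy, syy))


def mahalanobis_2d(a, b, cov):
    """Return sqrt((a-b)^T S^-1 (a-b)) for two features and covariance S."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = a[0] - b[0], a[1] - b[1]
    # Quadratic form with S^-1 = (1/det) * [[syy, -sxy], [-sxy, sxx]].
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(q, 0.0))


# Hypothetical samples whose two features lie on very different scales.
samples = [(70.0, 2.0), (75.0, 4.0), (80.0, 3.0), (85.0, 6.0), (90.0, 5.0)]
cov = covariance_2d(samples)
d = mahalanobis_2d(samples[0], samples[4], cov)
print(round(d, 4))
```

Because the covariance matrix rescales each feature, multiplying one feature by a constant leaves the distance unchanged, which is one sense in which such a distance function can be called self-normalizing. With an identity covariance matrix the expression reduces to the Euclidean distance.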

Results and discussions

The proposed k-NN method employs 70% of the biomass dataset as the training data. The training data were fed into the model to determine its classification ability for k ranging from 1 to 4, based on the same set of data. This was done because different values of k produce different error rates and accuracies. All the performance measures were computed from the confusion matrix [73,74], which was coded in the MATLAB environment.
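To make the confusion-matrix-based measures concrete, the following sketch derives accuracy, error, sensitivity, specificity, false positive rate and Cohen's kappa from a multi-class confusion matrix. It is written in Python rather than MATLAB, the matrix values are invented for illustration, and macro-averaging of the per-class sensitivity and specificity is an assumption, since the snippet does not state how the per-class values were combined.

```python
def classification_metrics(cm):
    """Metrics from a confusion matrix where cm[i][j] counts samples of
    true class i predicted as class j. Assumes every class has samples."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(cm[i][j] for i in range(n_classes)) for j in range(n_classes)]

    accuracy = sum(cm[c][c] for c in range(n_classes)) / total

    sens, spec = [], []
    for c in range(n_classes):
        tp = cm[c][c]
        fn = row_sums[c] - tp          # class c samples predicted as another class
        fp = col_sums[c] - tp          # other classes predicted as class c
        tn = total - tp - fn - fp
        sens.append(tp / (tp + fn))
        spec.append(tn / (tn + fp))

    # Macro-average over classes (an assumption, see the lead-in).
    sensitivity = sum(sens) / n_classes
    specificity = sum(spec) / n_classes

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = sum(row_sums[c] * col_sums[c] for c in range(n_classes)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "error": 1 - accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1 - specificity,
        "kappa": kappa,
    }


# Hypothetical three-class confusion matrix; not results from the study.
cm = [[8, 1, 1],
      [2, 6, 2],
      [0, 1, 9]]
metrics = classification_metrics(cm)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, error is the complement of accuracy and the false positive rate is the complement of specificity, which matches the pairs reported in the abstract (0.703 and 0.297; 0.953 and 0.047).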

Conclusions

In this study, we have presented a novel approach to the classification of biomass feedstock. Our model is a proof of concept and a prototype for using scientific methods such as k-NN in the classification of biomass data. The developed biomass classification model was assessed using several statistical parameters and was tested with 64 new data sets. Further studies should be conducted with larger biomass data which sufficiently cover all the biomass classes identified in

CRediT authorship contribution statement

Obafemi O. Olatunji: Conceptualization, Data curation, Formal analysis, Investigation, Project administration, Writing - original draft, Writing - review & editing. Stephen Akinlabi: Validation, Writing - review & editing. Nkosinathi Madushele: Validation, Writing - review & editing. Paul A. Adedeji: Conceptualization, Formal analysis, Writing - review & editing.

References (87)

  • D. Adeniyi et al.

    Automated web usage data mining and recommendation system using K-Nearest Neighbor (KNN) classification method

    Appl Comput Inf

    (2016)
  • G. Tao et al.

    Biomass properties in association with plant species and assortments I: a synthesis based on literature data of energy properties

    Renew Sustain Energy Rev

    (2012)
  • G. Tao et al.

    Biomass properties in association with plant species and assortments. II: a synthesis based on literature data for ash elements

    Renew Sustain Energy Rev

    (2012)
  • R. Todeschini et al.

    A new concept of higher-order similarity and the role of distance/similarity measures in local classification methods

    Chemometr Intell Lab Syst

    (2016)
  • A.K. Gjertsen

    Accuracy of forest mapping based on Landsat TM data and a kNN-based method

    Remote Sens Environ

    (2007)
  • X. Peng et al.

    Control rod position reconstruction based on K-Nearest Neighbor Method

    Ann Nucl Energy

    (2017)
  • L. Nunes et al.

    Biomass combustion systems: a review on the physical and chemical properties of the ashes

    Renew Sustain Energy Rev

    (2016)
  • J. Cai

    Review of physicochemical properties and analytical characterization of lignocellulosic biomass

    Renew Sustain Energy Rev

    (2017)
  • J.M. Vargas-Moreno et al.

    A review of the mathematical models for predicting the heating value of biomass materials

    Renew Sustain Energy Rev

    (2012)
  • Y.D. Singh et al.

    Comprehensive characterization of lignocellulosic biomass through proximate, ultimate and compositional analysis for bioenergy production

    Renew Energy

    (2017)
  • M. Wang

    To distinguish the primary characteristics of agro-waste biomass by the principal component analysis: an investigation in East China

    Waste Manag

    (2019)
  • W. Stelte

    Pelletizing properties of torrefied spruce

    Biomass Bioenergy

    (2011)
  • R. García et al.

    Pelletization of wood and alternative residual biomass blends for producing industrial quality pellets

    Fuel

    (2019)
  • J. Parikh et al.

    A correlation for calculating HHV from proximate analysis of solid fuels

    Fuel

    (2005)
  • J. Parikh et al.

    A correlation for calculating elemental composition from proximate analysis of biomass materials

    Fuel

    (2007)
  • D.R. Nhuchhen et al.

    Estimation of higher heating value of biomass from proximate analysis: a new approach

    Fuel

    (2012)
  • A. Luque et al.

    The impact of class imbalance in classification performance metrics based on the binary confusion matrix

    Pattern Recognit

    (2019)
  • Y. Qian et al.

    A resampling ensemble algorithm for classification of imbalance problems

    Neurocomputing

    (2014)
  • K.A. Motghare et al.

    Comparative study of different waste biomass for energy application

    Waste Manag

    (Jan 2016)
  • S. Kamel et al.

    Bioenergy potential from agriculture residues for energy generation in Egypt

    Renew Sustain Energy Rev

    (2018)
  • U.N. D.o.P. Information

    Sustainable development Goals

  • A. Müller

    The role of biomass in the Sustainable Development Goals: a reality check and governance implications

    (2015)
  • H. Garg et al.

    Global status on renewable energy

  • T. Kar et al.

    Environmental impacts of biomass combustion for heating and electricity generation

    J Eng Res Appl. Sci.

    (2016)
  • M. Balat et al.

    Biomass energy in the world, use of biomass and potential trends

    Energy Sources

    (2005)
  • R.D. Perlack

    Biomass as feedstock for a bioenergy and bioproducts industry: the technical feasibility of a billion-ton annual supply

    (2005)
  • J.B. Sluiter et al.

    Compositional analysis of lignocellulosic feedstocks. 1. Review and description of methods

    J Agric Food Chem

    (2010)
  • S. Nanda et al.

    Biomass-an overview on classification, composition and characterization

    (2013)
  • A. Sahai

    Evaluation of machine learning techniques for green energy prediction

    (2014)
  • Z. Zhang

    Introduction to machine learning: k-nearest neighbors

    Ann Transl Med

    (2016)
  • T.-F. Wu et al.

    Probability estimates for multi-class classification by pairwise coupling

    J Mach Learn Res

    (2004)
  • C. Cortes et al.

    Support-vector networks

    Mach Learn

    (1995)
  • O.O. Olatunji et al.

    Estimation of the elemental composition of biomass using hybrid adaptive neuro-fuzzy inference system

    Bioenergy Res

    (2019)