Property-based biomass feedstock grading using k-Nearest Neighbour technique

doi:10.1016/j.energy.2019.116346

Energy

Volume 190, 1 January 2020, 116346

https://doi.org/10.1016/j.energy.2019.116346

Highlights

• We applied an intelligent classification method to biomass resources.
• The approach for the classification is based on k-NN model.
• The best classification result was obtained with Mahalanobis distance function at K = 3.
• This method offers a high prospect in strategic decision-making.

Abstract

Energy generation from biomass requires a nexus of different sources irrespective of origin. A detailed and scientific understanding of the class to which a biomass resource belongs is therefore essential for energy generation. An intelligent classification of biomass resources based on their properties offers high prospects for analytical, operational and strategic decision-making. This study proposes the $k$-Nearest Neighbour ($k$-NN) classification model to classify biomass based on its properties. The study classified a dataset of 214 biomass records obtained from several articles published in reputable journals. Four values of $k$ ($k = 1, 2, 3, 4$) were tested with various self-normalizing distance functions, and the results were compared for effectiveness and efficiency in order to determine the optimal model. The $k$-NN model based on the Mahalanobis distance function gave the best result at $k = 3$, with Root Mean Squared Error (RMSE), Accuracy, Error, Sensitivity, Specificity, False positive rate, Kappa statistic and Computation time (in seconds) of 1.42, 0.703, 0.297, 0.580, 0.953, 0.047, 0.622 and 4.7, respectively. The authors concluded that the $k$-NN based classification model is feasible and reliable for biomass classification. The implementation of this classification model shows that $k$-NN can serve as a handy tool for classifying biomass resources irrespective of their sources and origins.

Introduction

Following the conclusion of the Millennium Development Goals (MDGs), the United Nations (UN) General Assembly adopted the 2030 Sustainable Development Goals (SDGs) in September 2015. Focussing on inclusive development, the new agenda emphasizes a holistic approach to achieving sustainable development for all [1]. Many of the proposed SDGs depend on biomass [2]. Goals 7, 9, 12 and 13 of the SDGs speak substantially to the need for alternative energy [1]. In light of the SDGs, biomass stands as the main avenue through which large-scale renewable energy can be sustainably achieved in order to ensure access to affordable, reliable and modern energy for all. This would support action to combat climate change and its impacts by promoting the sustainable use of biomass resources, since power generation from fossil fuel continues to raise environmental concerns. Biomass has been identified as the only naturally occurring substitute for fossil fuel in terms of both carbon content and available quantity [3]. In 2013, 462 TWh of electricity was produced globally from biomass [4]. Since then, the global consumption of biofuel sourced from biomass has been increasing, and this trend is expected to continue as long as the major issues surrounding the exploration of bioenergy are directly addressed. In the future, biomass energy has the potential to provide a cost-effective and sustainable energy supply to the population of developing countries [5,6]. There is an abundant supply of biomass feedstocks that can be converted to biofuels. These feedstocks include agricultural crops, residues, forest products, energy crops and algae biomass. Municipal Solid Waste (MSW) is also gaining attention as a viable source of biofuel [7,8].

There is a consensus that biomass fuel is a renewable energy resource, but the lack of widely accepted terminology, classification metrics and global standards leads to serious misconceptions during investigations. The main concerns surrounding biomass exploration relate to how to extend and improve the basic understanding of the composition and properties of biomass, and how to apply this knowledge profitably in the interest of environmental safety [9]. Knowledge of the proximate properties of biomass and their classification is essential in selecting biomass as an energy feedstock [10]. Among the several criteria that have been adopted for classifying biomass feedstock, the most prominent are based on origin and properties [11,12]. To the best of the authors' knowledge, there is no known scientific classification method based on artificial intelligence. Since energy generation from biomass often requires a nexus of different sources, a detailed and scientific understanding of the class to which a biomass resource belongs is essential. An intelligent classification of biomass resources based on the proximate properties of the feedstock offers high prospects for analytical, operational and strategic decision-making.

Artificial intelligence has opened a new page in the field of data analysis. Machine learning has been applied to several processes, including data mining and data analysis [13,14,15,16,17,18] related to supply chain management [19,20], biomass elemental composition [21], municipal solid waste management [22,23,24], energy consumption prediction [25] and so on. One of the primary objectives of data mining is classification. It is a form of predictive modelling which defines groups within the entire population. The process of classification focusses on finding a model which describes a data class. The aim of classification is to use the derived function to predict the group to which a data point with an unknown class label belongs. By using the classification technique, one can learn the rules that form categories of data [26]. Data classification has been applied in several fields, such as medicine, credit ranking, customer behaviour and strategic management [27]. Some of the artificial intelligence-based techniques which have been applied in the classification of various data are the Bayesian classifier, Ensemble classifier, Artificial Neural Network (ANN), $k$-Nearest Neighbour ($k$-NN) and Support Vector Machine (SVM) [14,28]. The $k$-NN is a common instance-based learning algorithm, which classifies an unknown object by ranking its neighbours among the training data. The result is then used to predict the class of the new object. The $k$ in $k$-NN denotes the number of nearest neighbours which are evaluated in order to determine the class to which a data point belongs [26,29]. Apart from its simplicity, $k$-NN has proven to be an effective and efficient algorithm for solving several real-life classification problems with good generalization and accuracy [30]. $k$-NN has been appraised as a transparent machine learning method with high predictive ability, even with little or no prior information about the data distribution [15,31]. $k$-NN stands out as a data mining method of choice for several classification problems due to its notable advantages, which give it a competitive edge over other methods. Some of its advantages include [31]:

i. Ability to handle training data that are too large to fit in-memory,
ii. It can measure the similarities between training tuples and test tuples without prior knowledge about data distribution,
iii. Reduction of error due to the inaccurate assumption,
iv. Lower computation time and high prediction accuracy.

The literature review has highlighted the success of $k$-NN in the classification of data for several applications [15,17,31,32].
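The instance-based procedure described above (rank the training data by distance to the unknown object, then let the $k$ nearest neighbours vote) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `knn_predict`, the use of plain Euclidean distance, the two features and the class labels "A" and "B" are all assumptions made for the example, and the toy values are invented rather than taken from the study's dataset.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank the training samples by Euclidean distance to the query point.
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    # Count the class labels of the k closest samples.
    votes = Counter(label for _, label in ranked[:k])
    # Return the most frequent label among those neighbours.
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two invented property values per sample,
# with made-up class labels "A" and "B".
train_X = [(80.0, 2.0), (78.0, 3.0), (82.0, 1.5),
           (60.0, 20.0), (58.0, 22.0), (62.0, 18.0)]
train_y = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_X, train_y, (79.0, 2.5), k=3))
print(knn_predict(train_X, train_y, (61.0, 19.0), k=3))
```

Because nothing is fitted in advance, all the work happens at prediction time, which is why the choice of $k$ and of the distance function largely determines the behaviour of the classifier.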

This study proposes a proof of concept for the classification of biomass, based on the traditional property-based classification proposed by Khan et al. [11], using the $k$-NN classification algorithm. The research seeks to develop a consistent, flexible, direct and easy-to-implement classification method for biomass properties. Identifying biomass classes is vital to understanding the mechanisms which lead to the varying behaviour of biomass feedstock and to improving the prediction of biomass properties [33,34]. This will support informed decision-making about biomass resources management and feedstock production for industrial utilization, especially as related to energy generation. Power plant developers will also be able to ascertain the quality of the feedstock coming from various supply chains. Another objective of this study is to extend the boundary of knowledge on biomass selection decisions towards the hybridization of biomass sources for energy generation.

Section snippets

Literature survey

The application of $k$-NN as a machine learning approach spans more than 50 years [35]. Although $k$-NN is believed to have been introduced in 1951 in an unpublished medicine report, it did not gain much traction until 1960. Since then, it has become a renowned pattern recognition and classification technique [26,36]. $k$-NN has been widely applied to various classification and data recognition problems in several studies. Different $k$-values, distance metrics and types of data have been applied.

Methodology

The essence of the classification is to apply the algorithm to information which comes from a known data source. This means that the class to which each data point belongs is taken as an input, after which the data are transferred to a higher-dimensional space. In this new space, the data are grouped using a distance function according to the $k$-NN algorithm. The data are therefore classified into new groups that reflect the behaviour of the algorithm.
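For reference, the Mahalanobis distance named in the Highlights and Abstract has the standard textbook form below. The section excerpt does not reproduce the formula, so the symbols are assumptions introduced here: $\mathbf{x}$ and $\mathbf{y}$ are two feature vectors of biomass properties and $\mathbf{S}$ is the covariance matrix of the training features.

$$
d_M(\mathbf{x}, \mathbf{y}) = \sqrt{(\mathbf{x} - \mathbf{y})^{\top} \mathbf{S}^{-1} (\mathbf{x} - \mathbf{y})}
$$

When $\mathbf{S}$ is the identity matrix this reduces to the Euclidean distance. Weighting by $\mathbf{S}^{-1}$ rescales each property by its variance and accounts for correlations between properties, which is one sense in which such a distance can be called self-normalizing.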

Results and discussions

The proposed $k$-NN method uses 70% of the biomass dataset as training data. The training data were fed into the model to determine its classification ability for $k$ ranging from 1 to 4 on the same set of data. This was done because different values of $k$ produce different error rates and accuracies. All the performance measures were computed from the confusion matrix [73,74], which was coded in the MATLAB environment.
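As a rough illustration of how the measures listed in the Abstract can be derived from a confusion matrix, the sketch below computes accuracy, error, sensitivity, specificity, false positive rate and Cohen's kappa for a multi-class problem. It is written in Python rather than MATLAB and is not the authors' code: the function name `classification_metrics`, the macro-averaging over classes and the example matrix are assumptions made for the example, since the excerpt does not state how per-class values were aggregated.

```python
def classification_metrics(cm):
    """Summary metrics from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Sensitivity and specificity are macro-averaged over classes (an
    assumption). Every class must appear at least once in the matrix,
    otherwise a division by zero occurs.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(cm[i][j] for i in range(n_classes)) for j in range(n_classes)]

    # Overall accuracy: fraction of samples on the main diagonal.
    accuracy = sum(cm[c][c] for c in range(n_classes)) / total

    sens, spec = [], []
    for c in range(n_classes):
        tp = cm[c][c]
        fn = row_sums[c] - tp      # class c samples predicted as another class
        fp = col_sums[c] - tp      # other samples predicted as class c
        tn = total - tp - fn - fp
        sens.append(tp / (tp + fn))
        spec.append(tn / (tn + fp))
    sensitivity = sum(sens) / n_classes
    specificity = sum(spec) / n_classes

    # Cohen's kappa: agreement beyond what chance alone would produce.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "error": 1 - accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1 - specificity,
        "kappa": kappa,
    }


# Hypothetical 3-class confusion matrix (invented counts, not the study's results).
example_cm = [[8, 2, 0],
              [1, 7, 2],
              [0, 1, 9]]
for name, value in classification_metrics(example_cm).items():
    print(f"{name}: {value:.3f}")
```

The values reported in the Abstract are consistent with two of the identities used here: the error (0.297) is one minus the accuracy (0.703), and the false positive rate (0.047) is one minus the specificity (0.953).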

Conclusions

In this study, we have presented a novel approach to the classification of biomass feedstock. Our model is a proof of concept and a prototype of the idea of using scientific methods such as $k$-NN in the classification of biomass data. The developed biomass classification model was assessed using several statistical parameters. The model was tested with 64 new data sets. Further studies should be conducted with larger biomass data which sufficiently cover all the biomass classes identified in

CRediT authorship contribution statement

Obafemi O. Olatunji: Conceptualization, Data curation, Formal analysis, Investigation, Project administration, Writing - original draft, Writing - review & editing. Stephen Akinlabi: Validation, Writing - review & editing. Nkosinathi Madushele: Validation, Writing - review & editing. Paul A. Adedeji: Conceptualization, Formal analysis, Writing - review & editing.

References (87)

M.F. Demirbas et al. Potential contribution of biomass to the sustainable energy development. Energy Convers Manag (2009)
M. Guo et al. Bioenergy and biofuels: history, status, and perspective. Renew Sustain Energy Rev (2015)
A. Khan et al. Biomass combustion in fluidized bed boilers: potential problems and remedies. Fuel Process Technol (2009)
S.V. Vassilev et al. An overview of the chemical composition of biomass. Fuel (2010)
B.R. Hough et al. Application of machine learning to pyrolysis reaction networks: reducing model solution time to enable process optimization. Comput Chem Eng (2017)
C. Crisci et al. A review of supervised machine learning algorithms and their applications to ecological data. Ecol Model (2012)
O.O. Olatunji et al. Competitive advantage of carbon efficient supply chain in manufacturing industry. J Clean Prod (2019)
M. Talaei et al. A robust fuzzy optimization model for carbon-efficient closed-loop supply chain network design problem: a numerical illustration in electronics industry. J Clean Prod (2016)
K.C. Drudi et al. Statistical model for heating value of municipal solid waste in Brazil based on gravimetric composition. Waste Manag (2019)
P.A. Adedeji et al. Non-linear autoregressive neural network (NARNET) with SSA filtering for a university energy consumption forecast. Procedia Manuf (2019)
D. Adeniyi et al. Automated web usage data mining and recommendation system using K-Nearest Neighbor (KNN) classification method. Appl Comput Inf (2016)
G. Tao et al. Biomass properties in association with plant species and assortments I: a synthesis based on literature data of energy properties. Renew Sustain Energy Rev (2012)
G. Tao et al. Biomass properties in association with plant species and assortments. II: a synthesis based on literature data for ash elements. Renew Sustain Energy Rev (2012)
R. Todeschini et al. A new concept of higher-order similarity and the role of distance/similarity measures in local classification methods. Chemometr Intell Lab Syst (2016)
A.K. Gjertsen. Accuracy of forest mapping based on Landsat TM data and a kNN-based method. Remote Sens Environ (2007)
X. Peng et al. Control rod position reconstruction based on K-Nearest Neighbor Method. Ann Nucl Energy (2017)
L. Nunes et al. Biomass combustion systems: a review on the physical and chemical properties of the ashes. Renew Sustain Energy Rev (2016)
J. Cai. Review of physicochemical properties and analytical characterization of lignocellulosic biomass. Renew Sustain Energy Rev (2017)
J.M. Vargas-Moreno et al. A review of the mathematical models for predicting the heating value of biomass materials. Renew Sustain Energy Rev (2012)
Y.D. Singh et al. Comprehensive characterization of lignocellulosic biomass through proximate, ultimate and compositional analysis for bioenergy production. Renew Energy (2017)
M. Wang. To distinguish the primary characteristics of agro-waste biomass by the principal component analysis: an investigation in East China. Waste Manag (2019)
W. Stelte. Pelletizing properties of torrefied spruce. Biomass Bioenergy (2011)
R. García et al. Pelletization of wood and alternative residual biomass blends for producing industrial quality pellets. Fuel (2019)
J. Parikh et al. A correlation for calculating HHV from proximate analysis of solid fuels. Fuel (2005)
J. Parikh et al. A correlation for calculating elemental composition from proximate analysis of biomass materials. Fuel (2007)
D.R. Nhuchhen et al. Estimation of higher heating value of biomass from proximate analysis: a new approach. Fuel (2012)
A. Luque et al. The impact of class imbalance in classification performance metrics based on the binary confusion matrix. Pattern Recognit (2019)
Y. Qian et al. A resampling ensemble algorithm for classification of imbalance problems. Neurocomputing (2014)
K.A. Motghare et al. Comparative study of different waste biomass for energy application. Waste Manag (2016)
S. Kamel et al. Bioenergy potential from agriculture residues for energy generation in Egypt. Renew Sustain Energy Rev (2018)
U.N. D.o.P. Information. Sustainable development Goals
A. Müller. The role of biomass in the Sustainable Development Goals: a reality check and governance implications (2015)
H. Garg et al. Global status on renewable energy
T. Kar et al. Environmental impacts of biomass combustion for heating and electricity generation. J Eng Res Appl Sci (2016)
M. Balat et al. Biomass energy in the world, use of biomass and potential trends. Energy Sources (2005)
R.D. Perlack. Biomass as feedstock for a bioenergy and bioproducts industry: the technical feasibility of a billion-ton annual supply (2005)
J.B. Sluiter et al. Compositional analysis of lignocellulosic feedstocks. 1. Review and description of methods. J Agric Food Chem (2010)
S. Nanda et al. Biomass-an overview on classification, composition and characterization (2013)
A. Sahai. Evaluation of machine learning techniques for green energy prediction (2014)
Z. Zhang. Introduction to machine learning: k-nearest neighbors. Ann Transl Med (2016)
T.-F. Wu et al. Probability estimates for multi-class classification by pairwise coupling. J Mach Learn Res (2004)
C. Cortes et al. Support-vector networks. Mach Learn (1995)
O.O. Olatunji et al. Estimation of the elemental composition of biomass using hybrid adaptive neuro-fuzzy inference system. Bioenergy Res (2019)

