Abstract
Cotton is a major economic crop cultivated predominantly under rainfed conditions. Accurate prediction of cotton yield helps farmers, industries, and policy makers. The final cotton yield is determined mostly by the weather patterns that prevail during the crop growing phase. Crop yield can now be predicted with greater accuracy thanks to innovative technologies that analyse big data using high-performance computing. Machine learning (ML) technologies can deliver reasonable yield predictions faster and with greater flexibility than complex process-based crop simulation models. The present study demonstrates the usability of ML algorithms for yield forecasting and facilitates the comparison of different models. Cotton yield was simulated using weekly weather indices as inputs, and model performance was assessed by nRMSE, MAPE and EF values. Results show that the stacked generalised ensemble model and artificial neural networks (ANN) predicted cotton yield with lower nRMSE and MAPE and higher efficiency than the other models. Variable importance analysis of the LASSO and ENET models identified minimum temperature and relative humidity as the main determinants of cotton yield in all districts. Based on these performance metrics, the models were ranked in the order Stacked generalised ensemble > ANN > PCA-ANN > SMLR-ANN > LASSO > ENET > SVM > PCA-SMLR > SMLR-SVM > SMLR. This study shows that stacked generalised ensembling and ANN methods can be used for reliable yield forecasting at district or county level and can help stakeholders make timely decisions.
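The three performance metrics named in the abstract can be illustrated with a short sketch. The abstract does not give the formulas, so the definitions below are assumptions based on common usage: nRMSE as the root mean square error expressed as a percentage of the mean observed yield, MAPE as the mean absolute percentage error, and EF as the Nash-Sutcliffe-style modelling efficiency. The function names and the yield values are hypothetical and are not taken from the study.

```python
import math


def nrmse(observed, predicted):
    """Normalised RMSE in percent: RMSE divided by the mean observed value (assumed definition)."""
    n = len(observed)
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)


def mape(observed, predicted):
    """Mean absolute percentage error in percent (assumed definition)."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


def ef(observed, predicted):
    """Modelling efficiency: 1 minus residual sum of squares over total sum of squares (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up yields for illustration only; not data from the study.
observed = [400.0, 500.0, 600.0]
predicted = [410.0, 490.0, 620.0]

print(f"nRMSE = {nrmse(observed, predicted):.2f} %")
print(f"MAPE  = {mape(observed, predicted):.2f} %")
print(f"EF    = {ef(observed, predicted):.3f}")
```

Under these definitions, lower nRMSE and MAPE and an EF closer to 1 indicate a better model, which matches the direction of the ranking reported in the abstract.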
Data availability
The data sets developed during the investigation are available upon reasonable request from the corresponding author.
Change history
01 May 2024
A Correction to this paper has been published: https://doi.org/10.1007/s00484-024-02696-4
Acknowledgements
The authors would like to thank the India Meteorological Department (IMD), New Delhi, for funding this study through the FASAL (Forecasting Agricultural output using Space Agrometeorology and Land based observations) program, and the Directorate of Research, Keladi Shivappa Nayaka University of Agricultural and Horticultural Sciences, Iruvakki, Shivamogga, Karnataka, India, for its encouragement and support of this study.
Author information
Authors and Affiliations
Contributions
Girish R Kashyap and Shankarappa Sridhara contributed to the study conception, design, formal analysis, and preparation of the first draft. Data collection and analysis were performed by Girish R Kashyap, Konapura Nagaraja Manoj, Pradeep Gopakkali and Bappa Das. Prakash Kumar Jha and PV Varaprasad contributed to the analysis, editing and review of the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Source code
The source code for the different algorithms used in the study is available from the corresponding author on reasonable request.
Supplementary information
Supplementary file 1
Supplementary Table 1: Weather indices used for developing the different models.
Supplementary Table 2: Coordinates, prevailing weather parameters, average yield, and its standard deviation for 13 major cotton-growing districts of Karnataka, India.
Supplementary Table 3: Cotton yield prediction models for different districts of Karnataka developed using LASSO.
Supplementary Table 4: Cotton yield prediction models for different districts of Karnataka developed using ENET.
Supplementary Table 5: Cotton yield prediction model developed for the study area using the SMLR model.
Supplementary Table 6: Cotton yield prediction model developed for the study area using the PCA-SMLR model.
Supplementary Fig. 1: Geographical map of the research area featuring districts of Karnataka.
Supplementary Fig. 2: Flowchart demonstrating the steps in model development.
(DOCX 472 kb)
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kashyap, G.R., Sridhara, S., Manoj, K.N. et al. Machine learning ensembles, neural network, hybrid and sparse regression approaches for weather based rainfed cotton yield forecast. Int J Biometeorol 68, 1179–1197 (2024). https://doi.org/10.1007/s00484-024-02661-1
DOI: https://doi.org/10.1007/s00484-024-02661-1