Background: Agriculture is a primary source of livelihood in India, sustaining over 58% of rural households, and banana is the country's second most important fruit crop after mango. Banana cultivation spans 3.8 million hectares across 122 countries, with India contributing approximately 25.7% of global production. Crop yield prediction using machine learning can help farmers optimize field operations and plan ahead of harvest.
Methods: The study evaluated machine learning models for predicting banana crop yields across 31 districts of Tamil Nadu, India. Historical yield data for 2011-2018 were collected from governmental sources, along with rainfall data for 2016-2018. After data preparation and pre-processing, three regression techniques were implemented and compared: Multiple Linear Regression, Random Forest Regression, and Polynomial Regression. Multiple Linear Regression was selected to establish baseline linear relationships between cultivation parameters and yield outcomes, providing interpretable coefficients for agricultural decision-making. Random Forest Regression was chosen for its ability to capture complex non-linear interactions among multiple agricultural variables and to tolerate inconsistencies in real-world data. Polynomial Regression was used to examine non-linear relationships in the data, specifically curved patterns between cultivation area and yield performance. The models were trained on key agricultural parameters, including cultivation area, productivity metrics, and rainfall patterns.
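The three-model comparison described in the Methods can be sketched as follows. This is a minimal illustration using scikit-learn, not the study's actual pipeline: the data are synthetic placeholders, and the feature names (`area_ha`, `rainfall_mm`), the choice of productivity as the target, the train/test split, the polynomial degree, and the number of trees are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic placeholder data (NOT the study's dataset): one row per district-year.
rng = np.random.default_rng(0)
n_rows = 200
area_ha = rng.uniform(500, 15000, n_rows)       # hypothetical cultivation area
rainfall_mm = rng.uniform(400, 1500, n_rows)    # hypothetical annual rainfall
# Made-up relationship, chosen only so the models have something to fit.
productivity = 80 - 0.002 * area_ha + 0.001 * rainfall_mm + rng.normal(0, 5, n_rows)

X = np.column_stack([area_ha, rainfall_mm])
y = productivity
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The three regression techniques named in the Methods (hyperparameters assumed).
models = {
    "multiple_linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "polynomial_deg2": make_pipeline(
        StandardScaler(), PolynomialFeatures(degree=2), LinearRegression()
    ),
}

# Fit each model and score it on the held-out rows with root mean square error.
rmse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    rmse[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))

for name, value in sorted(rmse.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {value:.2f} tonnes/ha")
```

Scoring every model on the same held-out rows keeps the comparison fair. Because the data here are synthetic, the resulting ranking says nothing about the ranking reported in the study.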
Results: Analysis revealed a weak negative correlation between cultivation area and productivity, with smaller areas (under 6,000 ha) achieving some of the highest productivity levels (70-90 tonnes/ha). Rainfall showed minimal impact on productivity, suggesting effective irrigation systems and water management practices in the region. The Random Forest model demonstrated superior performance with a 36% higher Root Mean Square value compared to the other models. Polynomial Regression proved less effective due to data nonlinearity, while Multiple Linear Regression provided straightforward predictions but with lower accuracy.
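The two kinds of quantity behind these results, a correlation coefficient and a root-mean-square error, can be computed as in the sketch below. It assumes the correlation is Pearson's r and the error metric is the conventional RMSE, neither of which the abstract states explicitly. The helper names and sample values are hypothetical and are not taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical district-level values, for illustration only.
area_ha = [2000, 4000, 6000, 9000, 12000]
productivity = [85, 60, 78, 55, 62]  # tonnes/ha

r = pearson_r(area_ha, productivity)
print(f"area vs. productivity: r = {r:.2f}")
```

A negative r means productivity tends to fall as cultivation area grows, and the closer r is to zero the weaker that tendency. RMSE is expressed in the units of the target (here tonnes/ha), and lower values indicate closer predictions.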
Conclusion: The study confirms that Random Forest Regression is the most effective of the machine learning techniques evaluated for banana yield prediction in Tamil Nadu's agricultural context. The findings suggest that successful banana cultivation in the region relies more on intensive farming practices in smaller areas than on extensive cultivation.
Keywords: crop yield, machine learning, regression, banana, model