Printed from https://ideas.repec.org/a/gam/jagris/v10y2020i9p400-d412144.html

A Hybrid CFS Filter and RF-RFE Wrapper-Based Feature Extraction for Enhanced Agricultural Crop Yield Prediction Modeling

Author

Listed:
  • Dhivya Elavarasan

    (School of Information Technology and Engineering, Vellore Institute of Technology (VIT), Vellore 632 014, India)

  • Durai Raj Vincent P M

    (School of Information Technology and Engineering, Vellore Institute of Technology (VIT), Vellore 632 014, India)

  • Kathiravan Srinivasan

    (School of Information Technology and Engineering, Vellore Institute of Technology (VIT), Vellore 632 014, India)

  • Chuan-Yu Chang

    (Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology, Yunlin 64002, Taiwan)

Abstract

Advances in science and technology have produced an enormous amount of information for the agrarian sector. Machine learning, with its capacity for large-scale data processing, has emerged as a way to uncover new opportunities in agricultural development and to investigate agrarian problems that are otherwise hard to predict. Applying machine learning models in practice creates a need to improve their performance. Feature selection can improve a model’s performance by defining a subset of significant features and by identifying the variability in the data. This paper presents a novel hybrid feature extraction procedure that combines a correlation-based feature selection (CFS) filter with a random forest recursive feature elimination (RFRFE) wrapper. The proposed approach aims to identify an optimal subset of features from a collection of climate, soil, and groundwater characteristics in order to construct a crop-yield forecasting model with better performance and accuracy. The model’s precision and effectiveness are estimated (i) with all the features in the dataset, (ii) with the essential features obtained using the learning algorithm’s inbuilt ‘feature_importances’ method, and (iii) with the significant features obtained through the proposed hybrid feature extraction technique. Validated in terms of evaluation metrics, predictive accuracy, and diagnostic plot analysis, in comparison with random forest, decision tree, and gradient boosting machine learning algorithms, the hybrid CFS and RFRFE feature extraction approach is found to perform highly satisfactorily.
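To make the filter stage of such a hybrid more concrete, the following is a minimal sketch of a correlation-based filter in plain Python. It is not the authors' implementation. It assumes the commonly used CFS merit, k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean absolute feature-target correlation and r_ff is the mean absolute feature-feature correlation, and it assumes Pearson correlation as the correlation measure. The feature names and values are invented toy data, and the RFRFE wrapper stage is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(subset, features, target):
    """Merit of a feature subset: rewards relevance, penalizes redundancy."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(features, target):
    """Greedy forward search: add the feature that raises the merit most, stop when none does."""
    selected, best = [], 0.0
    remaining = sorted(features)
    while remaining:
        merit, name = max((cfs_merit(selected + [f], features, target), f) for f in remaining)
        if merit <= best:
            break
        selected.append(name)
        remaining.remove(name)
        best = merit
    return selected, best


# Hypothetical toy data (not from the paper): "irrigation" duplicates the
# information in "rainfall", and "wind" is unrelated to the target.
FEATURES = {
    "rainfall": [1, 2, 3, 4, 5, 6, 7, 8],
    "irrigation": [2, 4, 6, 8, 10, 12, 14, 16],
    "soil_index": [1, -1, -1, 1, 1, -1, -1, 1],
    "wind": [1, 1, -1, -1, -1, -1, 1, 1],
}
YIELD = [r + 2 * s for r, s in zip(FEATURES["rainfall"], FEATURES["soil_index"])]

selected, merit = cfs_forward_select(FEATURES, YIELD)
print(selected, round(merit, 4))
```

On this toy data the search keeps the soil feature together with only one of the two perfectly correlated water features and drops the unrelated one. In a hybrid pipeline of the kind the abstract describes, a subset filtered this way would then be passed to a wrapper stage such as recursive feature elimination around a random forest.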

Suggested Citation

  • Dhivya Elavarasan & Durai Raj Vincent P M & Kathiravan Srinivasan & Chuan-Yu Chang, 2020. "A Hybrid CFS Filter and RF-RFE Wrapper-Based Feature Extraction for Enhanced Agricultural Crop Yield Prediction Modeling," Agriculture, MDPI, vol. 10(9), pages 1-27, September.
  • Handle: RePEc:gam:jagris:v:10:y:2020:i:9:p:400-:d:412144

    Download full text from publisher

    File URL: https://www.mdpi.com/2077-0472/10/9/400/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2077-0472/10/9/400/
    Download Restriction: no


    Citations



    Cited by:

    1. Priya Brata Bhoi & Veeresh S. Wali & Deepak Kumar Swain & Kalpana Sharma & Akash Kumar Bhoi & Manlio Bacco & Paolo Barsocchi, 2021. "Input Use Efficiency Management for Paddy Production Systems in India: A Machine Learning Approach," Agriculture, MDPI, vol. 11(9), pages 1-27, August.
    2. Sebastian Kujawa & Gniewko Niedbała, 2021. "Artificial Neural Networks in Agriculture," Agriculture, MDPI, vol. 11(6), pages 1-6, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fatemeh Moodi & Amir Jahangard-Rafsanjani & Sajad Zarifzadeh, 2023. "Feature selection and regression methods for stock price prediction using technical indicators," Papers 2310.09903, arXiv.org, revised Nov 2023.
    2. Bissan Ghaddar & Ignacio Gómez-Casares & Julio González-Díaz & Brais González-Rodríguez & Beatriz Pateiro-López & Sofía Rodríguez-Ballesteros, 2023. "Learning for Spatial Branching: An Algorithm Selection Approach," INFORMS Journal on Computing, INFORMS, vol. 35(5), pages 1024-1043, September.
    3. Nahushananda Chakravarthy H G & Karthik M Seenappa & Sujay Raghavendra Naganna & Dayananda Pruthviraja, 2023. "Machine Learning Models for the Prediction of the Compressive Strength of Self-Compacting Concrete Incorporating Incinerated Bio-Medical Waste Ash," Sustainability, MDPI, vol. 15(18), pages 1-22, September.
    4. Wen, Shaoting & Buyukada, Musa & Evrendilek, Fatih & Liu, Jingyong, 2020. "Uncertainty and sensitivity analyses of co-combustion/pyrolysis of textile dyeing sludge and incense sticks: Regression and machine-learning models," Renewable Energy, Elsevier, vol. 151(C), pages 463-474.
    5. Spiliotis, Evangelos & Makridakis, Spyros & Kaltsounis, Anastasios & Assimakopoulos, Vassilios, 2021. "Product sales probabilistic forecasting: An empirical evaluation using the M5 competition data," International Journal of Production Economics, Elsevier, vol. 240(C).
    6. Fuentes, Sigfredo & Ortega-Farías, Samuel & Carrasco-Benavides, Marcos & Tongson, Eden & Gonzalez Viejo, Claudia, 2024. "Actual evapotranspiration and energy balance estimation from vineyards using micro-meteorological data and machine learning modeling," Agricultural Water Management, Elsevier, vol. 297(C).
    7. Kusiak, Andrew & Zheng, Haiyang & Song, Zhe, 2009. "On-line monitoring of power curves," Renewable Energy, Elsevier, vol. 34(6), pages 1487-1493.
    8. Zhu, Siying & Zhu, Feng, 2019. "Cycling comfort evaluation with instrumented probe bicycle," Transportation Research Part A: Policy and Practice, Elsevier, vol. 129(C), pages 217-231.
    9. Dursun Delen & Hamed M. Zolbanin & Durand Crosby & David Wright, 2021. "To imprison or not to imprison: an analytics model for drug courts," Annals of Operations Research, Springer, vol. 303(1), pages 101-124, August.
    10. Doruk Cengiz & Arindrajit Dube & Attila S. Lindner & David Zentler-Munro, 2021. "Seeing Beyond the Trees: Using Machine Learning to Estimate the Impact of Minimum Wages on Labor Market Outcomes," NBER Working Papers 28399, National Bureau of Economic Research, Inc.
    11. repec:iim:iimawp:14638 is not listed on IDEAS
    12. Zhou, Jing & Li, Wei & Wang, Jiaxin & Ding, Shuai & Xia, Chengyi, 2019. "Default prediction in P2P lending from high-dimensional data based on machine learning," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 534(C).
    13. Lu, Yingjie & Li, Tao & Hu, Hui & Zeng, Xuemei, 2023. "Short-term prediction of reference crop evapotranspiration based on machine learning with different decomposition methods in arid areas of China," Agricultural Water Management, Elsevier, vol. 279(C).
    14. Bohdan M. Pavlyshenko, 2019. "Machine-Learning Models for Sales Time Series Forecasting," Data, MDPI, vol. 4(1), pages 1-11, January.
    15. Matthias Bogaert & Lex Delaere, 2023. "Ensemble Methods in Customer Churn Prediction: A Comparative Analysis of the State-of-the-Art," Mathematics, MDPI, vol. 11(5), pages 1-28, February.
    16. Jason R. W. Merrick & Claire A. Dorsey & Bo Wang & Martha Grabowski & John R. Harrald, 2022. "Measuring Prediction Accuracy in a Maritime Accident Warning System," Production and Operations Management, Production and Operations Management Society, vol. 31(2), pages 819-827, February.
    17. Buzna, Luboš & De Falco, Pasquale & Ferruzzi, Gabriella & Khormali, Shahab & Proto, Daniela & Refa, Nazir & Straka, Milan & van der Poel, Gijs, 2021. "An ensemble methodology for hierarchical probabilistic electric vehicle load forecasting at regular charging stations," Applied Energy, Elsevier, vol. 283(C).
    18. Adler, Werner & Lausen, Berthold, 2009. "Bootstrap estimated true and false positive rates and ROC curve," Computational Statistics & Data Analysis, Elsevier, vol. 53(3), pages 718-729, January.
    19. Döpke, Jörg & Fritsche, Ulrich & Pierdzioch, Christian, 2017. "Predicting recessions with boosted regression trees," International Journal of Forecasting, Elsevier, vol. 33(4), pages 745-759.
    20. Andrea Sciandra & Alessio Surian & Livio Finos, 2021. "Supervised Machine Learning Methods to Disclose Action and Information in “U.N. 2030 Agenda” Social Media Data," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 156(2), pages 689-699, August.
    21. Mirosław Parol & Paweł Piotrowski & Piotr Kapler & Mariusz Piotrowski, 2021. "Forecasting of 10-Second Power Demand of Highly Variable Loads for Microgrid Operation Control," Energies, MDPI, vol. 14(5), pages 1-29, February.
