
A Hybrid CFS Filter and RF-RFE Wrapper-Based Feature Extraction for Enhanced Agricultural Crop Yield Prediction Modeling

Author

Listed:
  • Dhivya Elavarasan

    (School of Information Technology and Engineering, Vellore Institute of Technology (VIT), Vellore 632 014, India)

  • Durai Raj Vincent P M

    (School of Information Technology and Engineering, Vellore Institute of Technology (VIT), Vellore 632 014, India)

  • Kathiravan Srinivasan

    (School of Information Technology and Engineering, Vellore Institute of Technology (VIT), Vellore 632 014, India)

  • Chuan-Yu Chang

    (Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology, Yunlin 64002, Taiwan)

Abstract

Advances in science and technology have generated an enormous amount of information for the agricultural sector. Machine learning, with its capacity for large-scale data processing, has emerged as a way to uncover new opportunities in agricultural development, and it offers a novel approach to investigating and resolving unpredictable agricultural problems. Applying machine learning models in practice creates a need to improve their performance. Feature selection affects a model’s performance by identifying a significant feature subset that improves performance and captures the variability in the data. This paper presents a novel hybrid feature extraction procedure that combines a correlation-based feature selection (CFS) filter with a random forest recursive feature elimination (RF-RFE) wrapper. The proposed approach aims to identify an optimal subset of features from a collection of climate, soil, and groundwater characteristics, in order to build a crop-yield forecasting model with better performance and accuracy. The model’s precision and effectiveness are estimated (i) with all the features in the dataset, (ii) with the essential features obtained using the learning algorithm’s inbuilt ‘feature_importances’ method, and (iii) with the significant features obtained through the proposed hybrid feature extraction technique. The hybrid CFS and RF-RFE approach was validated using evaluation metrics, predictive accuracy, and diagnostic plot analysis, in comparison with random forest, decision tree, and gradient boosting machine learning algorithms, and the results were found to be highly satisfactory.
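
As a rough illustration of the filter stage mentioned in the abstract, the following is a minimal sketch of correlation-based feature selection using the standard CFS merit, k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-target correlation and r_ff is the mean feature-feature correlation. It is not the authors' implementation: the use of absolute Pearson correlation, the greedy forward search, the function names, the feature names (`rainfall`, `soil_ph`, and so on), and the synthetic data are all assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(subset, columns, target):
    """CFS merit of a feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(subset)
    # Mean absolute feature-target correlation.
    r_cf = sum(abs(pearson(columns[f], target)) for f in subset) / k
    if k == 1:
        return r_cf
    # Mean absolute feature-feature correlation over all pairs.
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(columns, target):
    """Greedy forward search (an assumed search strategy): repeatedly add the
    feature that raises the merit most, and stop when no addition improves it."""
    selected, best = [], 0.0
    remaining = sorted(columns)
    while remaining:
        score, feat = max(
            (cfs_merit(selected + [f], columns, target), f) for f in remaining
        )
        if score <= best:
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best


# Synthetic, hypothetical data: yield depends equally on rainfall and soil pH;
# "rainfall_dup" is a redundant near-copy of rainfall and "noise" is irrelevant.
rng = random.Random(42)
n = 300
rainfall = [rng.gauss(0, 1) for _ in range(n)]
soil_ph = [rng.gauss(0, 1) for _ in range(n)]
columns = {
    "rainfall": rainfall,
    "rainfall_dup": [r + 0.3 * rng.gauss(0, 1) for r in rainfall],
    "soil_ph": soil_ph,
    "noise": [rng.gauss(0, 1) for _ in range(n)],
}
crop_yield = [r + s + 0.1 * rng.gauss(0, 1) for r, s in zip(rainfall, soil_ph)]

selected, merit = cfs_forward_select(columns, crop_yield)
print("selected:", selected, "merit:", round(merit, 3))
```

On this toy data the search keeps one rainfall variant plus `soil_ph`, and rejects both the redundant near-copy and the irrelevant column. In a hybrid scheme of the kind the abstract describes, the surviving subset would then go to the wrapper stage, which could be built with, for example, scikit-learn's `RFE` around a `RandomForestRegressor`. That pairing is likewise an assumption rather than something this record confirms.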

Suggested Citation

  • Dhivya Elavarasan & Durai Raj Vincent P M & Kathiravan Srinivasan & Chuan-Yu Chang, 2020. "A Hybrid CFS Filter and RF-RFE Wrapper-Based Feature Extraction for Enhanced Agricultural Crop Yield Prediction Modeling," Agriculture, MDPI, vol. 10(9), pages 1-27, September.
  • Handle: RePEc:gam:jagris:v:10:y:2020:i:9:p:400-:d:412144

    Download full text from publisher

    File URL: https://www.mdpi.com/2077-0472/10/9/400/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2077-0472/10/9/400/
    Download Restriction: no

    References listed on IDEAS

    1. Bommert, Andrea & Sun, Xudong & Bischl, Bernd & Rahnenführer, Jörg & Lang, Michel, 2020. "Benchmark for filter methods for feature selection in high-dimensional classification data," Computational Statistics & Data Analysis, Elsevier, vol. 143(C).
    2. R. Srinivasan & C.P. Lohith, 2017. "Strategic Marketing and Innovation for Indian MSMEs," India Studies in Business and Economics, Springer, number 978-981-10-3590-6, January.
    3. Saikai, Yuji & Patel, Vivak & Mitchell, Paul, 2020. "Machine learning for optimizing complex site-specific management," 2020 Conference (64th), February 12-14, 2020, Perth, Western Australia 305238, Australian Agricultural and Resource Economics Society.
    4. Torres, Alfonso F. & Walker, Wynn R. & McKee, Mac, 2011. "Forecasting daily potential evapotranspiration using machine learning and limited climatic data," Agricultural Water Management, Elsevier, vol. 98(4), pages 553-562, February.
    5. Saeid Hamzeh & Marzieh Mokarram & Azadeh Haratian & Harm Bartholomeus & Arend Ligtenberg & Arnold K. Bregt, 2016. "Feature Selection as a Time and Cost-Saving Approach for Land Suitability Classification (Case Study of Shavur Plain, Iran)," Agriculture, MDPI, vol. 6(4), pages 1-13, October.
    6. Friedman, Jerome H., 2002. "Stochastic gradient boosting," Computational Statistics & Data Analysis, Elsevier, vol. 38(4), pages 367-378, February.

    Citations



    Cited by:

    1. Priya Brata Bhoi & Veeresh S. Wali & Deepak Kumar Swain & Kalpana Sharma & Akash Kumar Bhoi & Manlio Bacco & Paolo Barsocchi, 2021. "Input Use Efficiency Management for Paddy Production Systems in India: A Machine Learning Approach," Agriculture, MDPI, vol. 11(9), pages 1-27, August.
    2. Sebastian Kujawa & Gniewko Niedbała, 2021. "Artificial Neural Networks in Agriculture," Agriculture, MDPI, vol. 11(6), pages 1-6, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fatemeh Moodi & Amir Jahangard-Rafsanjani & Sajad Zarifzadeh, 2023. "Feature selection and regression methods for stock price prediction using technical indicators," Papers 2310.09903, arXiv.org, revised Nov 2023.
    2. Mansoor, Umer & Jamal, Arshad & Su, Junbiao & Sze, N.N. & Chen, Anthony, 2023. "Investigating the risk factors of motorcycle crash injury severity in Pakistan: Insights and policy recommendations," Transport Policy, Elsevier, vol. 139(C), pages 21-38.
    3. Bissan Ghaddar & Ignacio Gómez-Casares & Julio González-Díaz & Brais González-Rodríguez & Beatriz Pateiro-López & Sofía Rodríguez-Ballesteros, 2023. "Learning for Spatial Branching: An Algorithm Selection Approach," INFORMS Journal on Computing, INFORMS, vol. 35(5), pages 1024-1043, September.
    4. Akash Malhotra, 2018. "A hybrid econometric-machine learning approach for relative importance analysis: Prioritizing food policy," Papers 1806.04517, arXiv.org, revised Aug 2020.
    5. Manuel Oviedo-de la Fuente & Carlos Cabo & Celestino Ordóñez & Javier Roca-Pardiñas, 2021. "A Distance Correlation Approach for Optimum Multiscale Selection in 3D Point Cloud Classification," Mathematics, MDPI, vol. 9(12), pages 1-19, June.
    6. Nahushananda Chakravarthy H G & Karthik M Seenappa & Sujay Raghavendra Naganna & Dayananda Pruthviraja, 2023. "Machine Learning Models for the Prediction of the Compressive Strength of Self-Compacting Concrete Incorporating Incinerated Bio-Medical Waste Ash," Sustainability, MDPI, vol. 15(18), pages 1-22, September.
    7. Tim Voigt & Martin Kohlhase & Oliver Nelles, 2021. "Incremental DoE and Modeling Methodology with Gaussian Process Regression: An Industrially Applicable Approach to Incorporate Expert Knowledge," Mathematics, MDPI, vol. 9(19), pages 1-26, October.
    8. Wen, Shaoting & Buyukada, Musa & Evrendilek, Fatih & Liu, Jingyong, 2020. "Uncertainty and sensitivity analyses of co-combustion/pyrolysis of textile dyeing sludge and incense sticks: Regression and machine-learning models," Renewable Energy, Elsevier, vol. 151(C), pages 463-474.
    9. Zhu, Haibin & Bai, Lu & He, Lidan & Liu, Zhi, 2023. "Forecasting realized volatility with machine learning: Panel data perspective," Journal of Empirical Finance, Elsevier, vol. 73(C), pages 251-271.
    10. Spiliotis, Evangelos & Makridakis, Spyros & Kaltsounis, Anastasios & Assimakopoulos, Vassilios, 2021. "Product sales probabilistic forecasting: An empirical evaluation using the M5 competition data," International Journal of Production Economics, Elsevier, vol. 240(C).
    11. Zhang, Ning & Li, Zhiying & Zou, Xun & Quiring, Steven M., 2019. "Comparison of three short-term load forecast models in Southern California," Energy, Elsevier, vol. 189(C).
    12. Smyl, Slawek & Hua, N. Grace, 2019. "Machine learning methods for GEFCom2017 probabilistic load forecasting," International Journal of Forecasting, Elsevier, vol. 35(4), pages 1424-1431.
    13. Barzin, Samira & Avner, Paolo & Maruyama Rentschler, Jun Erik & O’Clery, Neave, 2022. "Where Are All the Jobs? A Machine Learning Approach for High Resolution Urban Employment Prediction in Developing Countries," Policy Research Working Paper Series 9979, The World Bank.
    14. Fuentes, Sigfredo & Ortega-Farías, Samuel & Carrasco-Benavides, Marcos & Tongson, Eden & Gonzalez Viejo, Claudia, 2024. "Actual evapotranspiration and energy balance estimation from vineyards using micro-meteorological data and machine learning modeling," Agricultural Water Management, Elsevier, vol. 297(C).
    15. Eike Emrich & Christian Pierdzioch, 2016. "Volunteering, Match Quality, and Internet Use," Schmollers Jahrbuch : Journal of Applied Social Science Studies / Zeitschrift für Wirtschafts- und Sozialwissenschaften, Duncker & Humblot, Berlin, vol. 136(2), pages 199-226.
    16. Kusiak, Andrew & Zheng, Haiyang & Song, Zhe, 2009. "On-line monitoring of power curves," Renewable Energy, Elsevier, vol. 34(6), pages 1487-1493.
    17. Zhu, Siying & Zhu, Feng, 2019. "Cycling comfort evaluation with instrumented probe bicycle," Transportation Research Part A: Policy and Practice, Elsevier, vol. 129(C), pages 217-231.
    18. Sen Guo & Haoran Zhao & Huiru Zhao, 2017. "A New Hybrid Wind Power Forecaster Using the Beveridge-Nelson Decomposition Method and a Relevance Vector Machine Optimized by the Ant Lion Optimizer," Energies, MDPI, vol. 10(7), pages 1-20, July.
    19. Catherine Ikae & Jacques Savoy, 2022. "Gender identification on Twitter," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 73(1), pages 58-69, January.
    20. Barkan, Oren & Benchimol, Jonathan & Caspi, Itamar & Cohen, Eliya & Hammer, Allon & Koenigstein, Noam, 2023. "Forecasting CPI inflation components with Hierarchical Recurrent Neural Networks," International Journal of Forecasting, Elsevier, vol. 39(3), pages 1145-1162.
