IDEAS home Printed from https://ideas.repec.org/a/ers/journl/vxxviiy2024ispecialbp320-332.html

Imputing Data Gaps in Economic Surveys Using Fuzzy Sets and Artificial Intelligence Technique

Author

Listed:
  • Adam Kiersztyn
  • Krystyna Kiersztyn
  • Korneliusz Pylak
  • Jakub Bis
  • Michal Dolecki
  • Anna Zelazna

Abstract

Purpose: This paper develops a novel approach to impute data gaps in economic surveys. In contrast to classical methods relying on statistical analysis of survey data, more advanced prediction techniques combined with fuzzy sets are applied to effectively address missing data.

Design/Methodology/Approach: The paper proposes an unconventional approach that integrates advanced prediction methods with fuzzy sets for imputing missing data. The effectiveness of the method is tested on the extensive dataset from the Polish Panel Survey (POLPAN), which was conducted every five years from 1988 to 2018. The survey contains a wide range of questions asked over successive waves, enabling a comprehensive analysis of the method for imputing data gaps.

Findings: The results of numerical experiments show that the proposed method performs highly effectively, regardless of the proportion of observations assigned to the training set. Some methods, such as Support Vector Machine (SVM), did not prove suitable for imputing this dataset. The choice and number of explanatory variables play a crucial role in the method's effectiveness, with cases where a single variable was sufficient for accurate imputation.

Practical Implications: The proposed method offers practical applications for improving data quality in economic surveys, especially in large-scale longitudinal surveys like POLPAN. It provides new insights into handling missing data and optimizing the selection of explanatory variables, which can enhance the robustness of imputation techniques in complex surveys.

Originality/Value: This paper contributes an original and valuable approach by combining advanced prediction techniques with fuzzy sets, providing a highly effective tool for imputing missing data. This unconventional method offers new avenues for further research in economic surveys and beyond.
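
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of pairing fuzzy sets with a predictor to fill a survey gap. Everything in it is an assumption made for illustration: the single explanatory variable `age`, the fuzzy-set breakpoints, the toy `rows` data, the helper names `ramp_up`, `fuzzify_age`, `distance`, and `impute`, and the use of a k-nearest-neighbour vote as a stand-in for the unspecified "advanced prediction methods". It is not the authors' method and uses no POLPAN data.

```python
from collections import Counter


def ramp_up(x, lo, hi):
    """Piecewise-linear ramp: 0 below lo, 1 above hi, linear in between."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def fuzzify_age(age):
    """Map an age to memberships in three hypothetical fuzzy sets."""
    young = 1.0 - ramp_up(age, 25, 40)
    old = ramp_up(age, 50, 65)
    middle = min(ramp_up(age, 25, 40), 1.0 - ramp_up(age, 50, 65))
    return (young, middle, old)


def distance(a, b):
    """Squared Euclidean distance between two membership vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def impute(rows, target, k=3):
    """Fill rows whose `target` is None by majority vote of the k
    complete rows that are closest in fuzzy-membership space."""
    known = [r for r in rows if r[target] is not None]
    completed = []
    for r in rows:
        filled = dict(r)  # copy so the input rows are left untouched
        if r[target] is None:
            fz = fuzzify_age(r["age"])
            nearest = sorted(
                known, key=lambda s: distance(fz, fuzzify_age(s["age"]))
            )[:k]
            votes = Counter(s[target] for s in nearest)
            filled[target] = votes.most_common(1)[0][0]
        completed.append(filled)
    return completed


# Invented toy records; None marks a missing answer.
rows = [
    {"age": 20, "employment": "student"},
    {"age": 22, "employment": "student"},
    {"age": 24, "employment": "student"},
    {"age": 42, "employment": "employed"},
    {"age": 45, "employment": "employed"},
    {"age": 48, "employment": "employed"},
    {"age": 66, "employment": "retired"},
    {"age": 70, "employment": "retired"},
    {"age": 72, "employment": "retired"},
    {"age": 23, "employment": None},
    {"age": 68, "employment": None},
]

completed = impute(rows, "employment", k=3)
for r in completed:
    print(r["age"], r["employment"])
```

On this toy data the two missing answers are filled from the three nearest complete records, which illustrates the abstract's remark that a single explanatory variable can sometimes suffice. A real application would need to validate the choice of variables and membership functions on held-out observations.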

Suggested Citation

  • Adam Kiersztyn & Krystyna Kiersztyn & Korneliusz Pylak & Jakub Bis & Michal Dolecki & Anna Zelazna, 2024. "Imputing Data Gaps in Economic Surveys Using Fuzzy Sets and Artificial Intelligence Technique," European Research Studies Journal, European Research Studies Journal, vol. XXVII(Special B), pages 320-332.
  • Handle: RePEc:ers:journl:v:xxvii:y:2024:i:specialb:p:320-332

    Download full text from publisher

    File URL: https://ersj.eu/journal/3492/download
    Download Restriction: no

    References listed on IDEAS

    1. Alain Fayolle & Hans Landström & William B. Gartner & Karin Berglund, 2016. "The institutionalization of entrepreneurship : Questioning the status quo and re-gaining hope for entrepreneurship research," Post-Print hal-02311947, HAL.
    2. Ganzeboom, H.B.G. & de Graaf, P.M. & Treiman, D.J. & de Leeuw, J., 1992. "A standard international socio-economic index of occupational status," WORC Paper 92.01.001/1, Tilburg University, Work and Organization Research Centre.
    3. Friedman, Jerome H., 2002. "Stochastic gradient boosting," Computational Statistics & Data Analysis, Elsevier, vol. 38(4), pages 367-378, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mansoor, Umer & Jamal, Arshad & Su, Junbiao & Sze, N.N. & Chen, Anthony, 2023. "Investigating the risk factors of motorcycle crash injury severity in Pakistan: Insights and policy recommendations," Transport Policy, Elsevier, vol. 139(C), pages 21-38.
    2. Daniela Del Boca & Chiara Monfardini & Sarah Grace See, 2022. "Early Childcare Duration and Students' Later Outcomes in Europe," Working Papers 2022-021, Human Capital and Economic Opportunity Working Group.
    3. Clara H. Mulder & Michael Wagner, 2001. "The Connections between Family Formation and First-time Home Ownership in the Context of West Germany and the Netherlands," European Journal of Population, Springer;European Association for Population Studies, vol. 17(2), pages 137-164, June.
    4. Andersen, Asbjørn Goul & Markussen, Simen & Røed, Knut, 2021. "Pension reform and the efficiency-equity trade-off: Impacts of removing an early retirement subsidy," Labour Economics, Elsevier, vol. 72(C).
    5. Bissan Ghaddar & Ignacio Gómez-Casares & Julio González-Díaz & Brais González-Rodríguez & Beatriz Pateiro-López & Sofía Rodríguez-Ballesteros, 2023. "Learning for Spatial Branching: An Algorithm Selection Approach," INFORMS Journal on Computing, INFORMS, vol. 35(5), pages 1024-1043, September.
    6. Akash Malhotra, 2018. "A hybrid econometric-machine learning approach for relative importance analysis: Prioritizing food policy," Papers 1806.04517, arXiv.org, revised Aug 2020.
    7. Aparicio, Juan & Santin, Daniel, 2018. "A note on measuring group performance over time with pseudo-panels," European Journal of Operational Research, Elsevier, vol. 267(1), pages 227-235.
    8. Nahushananda Chakravarthy H G & Karthik M Seenappa & Sujay Raghavendra Naganna & Dayananda Pruthviraja, 2023. "Machine Learning Models for the Prediction of the Compressive Strength of Self-Compacting Concrete Incorporating Incinerated Bio-Medical Waste Ash," Sustainability, MDPI, vol. 15(18), pages 1-22, September.
    9. Tim Voigt & Martin Kohlhase & Oliver Nelles, 2021. "Incremental DoE and Modeling Methodology with Gaussian Process Regression: An Industrially Applicable Approach to Incorporate Expert Knowledge," Mathematics, MDPI, vol. 9(19), pages 1-26, October.
    10. Wen, Shaoting & Buyukada, Musa & Evrendilek, Fatih & Liu, Jingyong, 2020. "Uncertainty and sensitivity analyses of co-combustion/pyrolysis of textile dyeing sludge and incense sticks: Regression and machine-learning models," Renewable Energy, Elsevier, vol. 151(C), pages 463-474.
    11. Zhu, Haibin & Bai, Lu & He, Lidan & Liu, Zhi, 2023. "Forecasting realized volatility with machine learning: Panel data perspective," Journal of Empirical Finance, Elsevier, vol. 73(C), pages 251-271.
    12. Spiliotis, Evangelos & Makridakis, Spyros & Kaltsounis, Anastasios & Assimakopoulos, Vassilios, 2021. "Product sales probabilistic forecasting: An empirical evaluation using the M5 competition data," International Journal of Production Economics, Elsevier, vol. 240(C).
    13. Zhang, Ning & Li, Zhiying & Zou, Xun & Quiring, Steven M., 2019. "Comparison of three short-term load forecast models in Southern California," Energy, Elsevier, vol. 189(C).
    14. Daniel Kemptner & Jan Marcus, 2013. "Spillover effects of maternal education on child’s health and health behavior," Review of Economics of the Household, Springer, vol. 11(1), pages 29-52, March.
    15. Smyl, Slawek & Hua, N. Grace, 2019. "Machine learning methods for GEFCom2017 probabilistic load forecasting," International Journal of Forecasting, Elsevier, vol. 35(4), pages 1424-1431.
    16. Gabriela Schütz & Heinrich W. Ursprung & Ludger Wößmann, 2008. "Education Policy and Equality of Opportunity," Kyklos, Wiley Blackwell, vol. 61(2), pages 279-308, May.
    17. Barzin, Samira & Avner, Paolo & Maruyama Rentschler, Jun Erik & O’Clery, Neave, 2022. "Where Are All the Jobs? A Machine Learning Approach for High Resolution Urban Employment Prediction in Developing Countries," Policy Research Working Paper Series 9979, The World Bank.
    18. Ralph Hippe & Maciej Jakubowski & Luisa De Sousa Lobo Borges de Araujo, 2018. "Regional inequalities in PISA: the case of Italy and Spain," JRC Research Reports JRC109057, Joint Research Centre.
    19. Eike Emrich & Christian Pierdzioch, 2016. "Volunteering, Match Quality, and Internet Use," Schmollers Jahrbuch : Journal of Applied Social Science Studies / Zeitschrift für Wirtschafts- und Sozialwissenschaften, Duncker & Humblot, Berlin, vol. 136(2), pages 199-226.
    20. Takanori Sumino, 2016. "Level or Concentration? A Cross-national Analysis of Public Attitudes Towards Taxation Policies," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 129(3), pages 1115-1134, December.

    More about this item

    Keywords

    Missing value imputation; fuzzy sets; field surveys; POLPAN

    JEL classification:

    • C6 - Mathematical and Quantitative Methods - - Mathematical Methods; Programming Models; Mathematical and Simulation Modeling
    • C8 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs
    • C83 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Survey Methods; Sampling Methods
    • D7 - Microeconomics - - Analysis of Collective Decision-Making
