Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i17p2081-d623959.html

Imputation for Repeated Bounded Outcome Data: Statistical and Machine-Learning Approaches

Authors

  • Urko Aguirre-Larracoechea

    (Research Unit, Osakidetza Basque Health Service, Barrualde-Galdakao Integrated Health Organisation, Galdakao-Usansolo Hospital, 48960 Galdakao, Spain;
    Kronikgune Institute for Health Services Research, 48902 Barakaldo, Spain;
    Research Network in Health Services in Chronic Diseases (Red de Investigación en Servicios de Salud en Enfermedades Crónicas, REDISSEC), 48960 Galdakao, Spain)

  • Cruz E. Borges

    (Deusto Institute of Technology, Faculty of Engineering, University of Deusto, 48007 Bilbao, Spain)

Abstract

Real-life data are often bounded and heavy-tailed, and zero-one-inflated beta (ZOIB) regression is used to model them. However, there are no appropriate methods for handling missing data in repeated bounded outcomes. We developed an imputation method based on ZOIB (i-ZOIB) and compared its performance with that of naïve and machine-learning methods in a simulation study covering different distribution shapes and settings. Performance was measured using the mean absolute error (MAE), the root-mean-square error (RMSE) and the unscaled mean bounded relative absolute error (UMBRAE). The results varied with the missingness rate and mechanism. The i-ZOIB method and the machine-learning methods, namely artificial neural networks (ANN), support vector regression (SVR) and random forests (RF), showed the best performance.

Suggested Citation

  • Urko Aguirre-Larracoechea & Cruz E. Borges, 2021. "Imputation for Repeated Bounded Outcome Data: Statistical and Machine-Learning Approaches," Mathematics, MDPI, vol. 9(17), pages 1-27, August.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:17:p:2081-:d:623959

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/17/2081/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/17/2081/
    Download Restriction: no


