IDEAS home Printed from https://ideas.repec.org/a/gam/jmathe/v11y2023i1p236-d1023270.html

Machine-Learning Methods on Noisy and Sparse Data

Author

Listed:
  • Konstantinos Poulinakis

    (University of Nicosia, Nicosia CY-2417, Cyprus)

  • Dimitris Drikakis

    (University of Nicosia, Nicosia CY-2417, Cyprus)

  • Ioannis W. Kokkinakis

    (University of Nicosia, Nicosia CY-2417, Cyprus)

  • Stephen Michael Spottswood

    (Air Force Research Laboratory, Wright Patterson AFB, Greene County, OH 45433-7402, USA)

Abstract

Experimental and computational data and field data obtained from measurements are often sparse and noisy. Consequently, interpolating unknown functions under these restrictions to provide accurate predictions is very challenging. This study compares machine-learning methods and cubic splines on the sparsity of training data they can handle, especially when training samples are noisy. We compare deviation from a true function f using the mean square error, signal-to-noise ratio and the Pearson R² coefficient. We show that, given very sparse data, cubic splines constitute a more precise interpolation method than deep neural networks and multivariate adaptive regression splines. In contrast, machine-learning models are robust to noise and can outperform splines after a training data threshold is met. Our study aims to provide a general framework for interpolating one-dimensional signals, often the result of complex scientific simulations or laboratory experiments.
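
The three deviation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The decibel form of the signal-to-noise ratio, the reading of the Pearson coefficient as the squared correlation, the function names, and the sine test signal with a constant offset are all assumptions made for illustration.

```python
import math

def mse(truth, pred):
    """Mean square error between the true values and the predictions."""
    return sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth)

def snr_db(truth, pred):
    """Signal-to-noise ratio in decibels.

    Assumed definition: 10 * log10(signal power / error power); the paper
    may define it differently.
    """
    signal = sum(t ** 2 for t in truth)
    noise = sum((t - p) ** 2 for t, p in zip(truth, pred))
    return math.inf if noise == 0 else 10.0 * math.log10(signal / noise)

def pearson_r2(truth, pred):
    """Squared Pearson correlation between truth and predictions.

    Assumes both sequences have non-zero variance.
    """
    n = len(truth)
    mean_t = sum(truth) / n
    mean_p = sum(pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(truth, pred))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_t * var_p)

# Hypothetical one-dimensional signal: one period of a sine wave on 11 points.
xs = [i / 10 for i in range(11)]
truth = [math.sin(2 * math.pi * x) for x in xs]

# Stand-in for an interpolator's output: the true signal shifted by 0.1.
pred = [t + 0.1 for t in truth]

print("MSE:", mse(truth, pred))
print("SNR (dB):", snr_db(truth, pred))
print("Pearson R^2:", pearson_r2(truth, pred))
```

The constant offset shows why the measures are reported together: the squared correlation stays at essentially 1 because the shape of the signal is preserved, while the mean square error and the signal-to-noise ratio both register the bias.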

Suggested Citation

  • Konstantinos Poulinakis & Dimitris Drikakis & Ioannis W. Kokkinakis & Stephen Michael Spottswood, 2023. "Machine-Learning Methods on Noisy and Sparse Data," Mathematics, MDPI, vol. 11(1), pages 1-19, January.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:1:p:236-:d:1023270

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/1/236/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/1/236/
    Download Restriction: no

    References listed on IDEAS

    1. Chang-Jui Lin & Hsueh-Fang Chen & Tian-Shyug Lee, 2011. "Forecasting Tourism Demand Using Time Series, Artificial Neural Networks and Multivariate Adaptive Regression Splines: Evidence from Taiwan," International Journal of Business Administration, International Journal of Business Administration, Sciedu Press, vol. 2(2), pages 14-24, May.

    Citations


    Cited by:

    1. Luke T. Woods & Zeeshan A. Rana, 2023. "Modelling Sign Language with Encoder-Only Transformers and Human Pose Estimation Keypoint Data," Mathematics, MDPI, vol. 11(9), pages 1-28, May.
    2. Wu, Menglong & Xiong, Jiajie & Li, Ruoyu & Dong, Aihong & Lv, Chang & Sun, Dan & Abdelghany, Ahmed Elsayed & Zhang, Qian & Wang, Yaqiong & Siddique, Kadambot H.M. & Niu, Wenquan, 2024. "Precision forecasting of fertilizer components’ concentrations in mixed variable-rate fertigation through machine learning," Agricultural Water Management, Elsevier, vol. 298(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Eden Xiaoying Jiao & Jason Li Chen, 2019. "Tourism forecasting: A review of methodological developments over the last decade," Tourism Economics, vol. 25(3), pages 469-492, May.
    2. Marcos Álvarez-Díaz & Manuel González-Gómez & María Soledad Otero-Giráldez, 2018. "Forecasting International Tourism Demand Using a Non-Linear Autoregressive Neural Network and Genetic Programming," Forecasting, MDPI, vol. 1(1), pages 1-17, September.
    3. Rashad Aliyev & Sara Salehi & Rafig Aliyev, 2019. "Development of Fuzzy Time Series Model for Hotel Occupancy Forecasting," Sustainability, MDPI, vol. 11(3), pages 1-13, February.
    4. Dr. Murat Çuhadar & Iclal Cogurcu & Ceyda Kukrer, 2014. "Modelling and Forecasting Cruise Tourism Demand to Izmir by Different Artificial Neural Network Architectures," International Journal of Business and Social Research, LAR Center Press, vol. 4(3), pages 12-28, March.
    5. Oscar Claveria & Enric Monte & Salvador Torra, 2016. "Modelling cross-dependencies between Spain’s regional tourism markets with an extension of the Gaussian process regression model," SERIEs: Journal of the Spanish Economic Association, Springer;Spanish Economic Association, vol. 7(3), pages 341-357, August.
    6. Yi-Chung Hu, 2021. "Forecasting tourism demand using fractional grey prediction models with Fourier series," Annals of Operations Research, Springer, vol. 300(2), pages 467-491, May.
    7. Peng, Bo & Song, Haiyan & Crouch, Geoffrey I., 2014. "A meta-analysis of international tourism demand forecasting and implications for practice," Tourism Management, Elsevier, vol. 45(C), pages 181-193.
    8. Oscar Claveria & Enric Monte & Salvador Torra, 2015. "Self-organizing map analysis of agents’ expectations. Different patterns of anticipation of the 2008 financial crisis," AQR Working Papers 201508, University of Barcelona, Regional Quantitative Analysis Group, revised Mar 2015.
    9. Oscar Claveria & Enric Monte & Salvador Torra, 2017. "Regional tourism demand forecasting with machine learning models: Gaussian process regression vs. neural network models in a multiple-input multiple-output setting," AQR Working Papers 201701, University of Barcelona, Regional Quantitative Analysis Group, revised Jan 2017.
    10. Oscar Claveria & Enric Monte & Salvador Torra, 2015. "Regional Forecasting with Support Vector Regressions: The Case of Spain," IREA Working Papers 201507, University of Barcelona, Research Institute of Applied Economics, revised Jan 2015.
    11. Yi-Chung Hu, 2017. "Predicting Foreign Tourists for the Tourism Industry Using Soft Computing-Based Grey–Markov Models," Sustainability, MDPI, vol. 9(7), pages 1-12, July.
    12. Oscar Claveria & Enric Monte & Salvador Torra, 2016. "Combination forecasts of tourism demand with machine learning models," Applied Economics Letters, Taylor & Francis Journals, vol. 23(6), pages 428-431, April.
    13. Dr. Murat Çuhadar & Iclal Cogurcu & Ceyda Kukrer, 2014. "Modelling and Forecasting Cruise Tourism Demand to Izmir by Different Artificial Neural Network Architectures," International Journal of Business and Social Research, MIR Center for Socio-Economic Research, vol. 4(3), pages 12-28, March.
    14. Oscar Claveria & Enric Monte & Salvador Torra, 2014. "A multivariate neural network approach to tourism demand forecasting," AQR Working Papers 201410, University of Barcelona, Regional Quantitative Analysis Group, revised May 2014.
    15. Claveria, Oscar & Torra, Salvador, 2014. "Forecasting tourism demand to Catalonia: Neural networks vs. time series models," Economic Modelling, Elsevier, vol. 36(C), pages 220-228.
    16. Komkrit Wongkhae & Songsak Sriboonchitta & Kanchana Choketaworn & Chukiat Chaiboonsri, 2012. "Does price matter? The FMOLS and DOLS estimation of industrial countries tourists outbound to four ASEAN countries," The Empirical Econometrics and Quantitative Economics Letters, Faculty of Economics, Chiang Mai University, vol. 1(4), pages 107-128, December.
    17. Oscar Claveria & Enric Monte & Salvador Torra, 2018. "A regional perspective on the accuracy of machine learning forecasts of tourism demand based on data characteristics," AQR Working Papers 201802, University of Barcelona, Regional Quantitative Analysis Group, revised Apr 2018.
    18. Changlu Zhang & Jian Zhang & Peng Jiang, 2022. "Assessing the risk of green building materials certification using the back-propagation neural network," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 24(5), pages 6925-6952, May.
    19. Binglei Xie & Yu Sun & Xiaolong Huang & Le Yu & Gangyan Xu, 2020. "Travel Characteristics Analysis and Passenger Flow Prediction of Intercity Shuttles in the Pearl River Delta on Holidays," Sustainability, MDPI, vol. 12(18), pages 1-23, September.
    20. Tea Baldigara, 2013. "Forecasting Tourism Demand in Croatia: A Comparison of Different Extrapolative Methods," Journal of Business Administration Research, Journal of Business Administration Research, Sciedu Press, vol. 2(1), pages 84-92, April.
