
A novel data-characteristic-driven modeling approach for imputing missing value in industrial statistics: A case study of China electricity statistics

Authors
  • Chen, Fan
  • Yu, Lan
  • Mao, Jinqi
  • Yang, Qing
  • Wang, Delu
  • Yu, Chenghao

Abstract

As a direct reference for the operational status and development level of national industry, industrial statistics hold significant value for numerous systematic studies. However, the quality of these statistics is often compromised by missing values, which poses a substantial challenge for analyzing and using industrial statistics and impedes the tasks that rely on them. Given the severity of the missing-value problem in industrial statistical databases and the limitations of the existing literature on missing-value imputation in terms of research objects and modeling approaches, this paper proposes a novel missing-value imputation modeling approach for single-indicator panels of industrial statistics, based on the data-characteristic-driven (DCD) idea. Taking China's inter-provincial “monthly power generation” data as an example, the imputation model was constructed and its validity was tested under different imputed objects (Jiangsu and Jilin), different missing types (continuous and discrete), and different missing rates (5%, 10% and 20%). The results indicate that the proposed DCD modeling approach is highly effective. The imputation model, constructed from the data characteristics of the imputed object, shows clear advantages in handling missing values across missing types and rates: it balances numerical accuracy, directional accuracy, and imputation stability better, yielding a strong overall imputation effect.
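
The evaluation grid described in the abstract (continuous versus discrete missingness at rates of 5%, 10% and 20%) can be illustrated with a minimal sketch. This is not the authors' DCD model: the monthly series is synthetic, linear interpolation stands in as a placeholder imputer, and the function names and metric definitions (MAPE for numerical accuracy, sign agreement of month-on-month changes for directional accuracy) are assumptions made only for illustration.

```python
import math
import random


def make_mask(n, rate, kind, rng):
    """Pick indices to hide; the first and last points are always kept."""
    k = max(1, round(n * rate))
    if kind == "continuous":  # one contiguous gap
        start = rng.randint(1, n - 1 - k)
        return list(range(start, start + k))
    if kind == "discrete":  # isolated, scattered points
        return sorted(rng.sample(range(1, n - 1), k))
    raise ValueError(f"unknown missing type: {kind}")


def interpolate(series, missing):
    """Placeholder imputer (assumption): linear interpolation between known neighbours."""
    hidden = set(missing)
    known = [i for i in range(len(series)) if i not in hidden]
    filled = list(series)
    for i in missing:
        left = max(j for j in known if j < i)
        right = min(j for j in known if j > i)
        w = (i - left) / (right - left)
        filled[i] = series[left] + w * (series[right] - series[left])
    return filled


def mape(true, filled, missing):
    """Numerical accuracy: mean absolute percentage error over imputed points."""
    return 100 * sum(abs(filled[i] - true[i]) / abs(true[i]) for i in missing) / len(missing)


def directional_accuracy(true, filled, missing):
    """Share of imputed points whose month-on-month change matches the true sign."""
    hits = sum((filled[i] - filled[i - 1]) * (true[i] - true[i - 1]) > 0 for i in missing)
    return hits / len(missing)


# Synthetic monthly series (assumption): trend + yearly seasonality + noise.
rng = random.Random(0)
n = 120
series = [100 + 0.5 * t + 10 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 2)
          for t in range(n)]

results = {}
for kind in ("continuous", "discrete"):
    for rate in (0.05, 0.10, 0.20):
        missing = make_mask(n, rate, kind, rng)
        filled = interpolate(series, missing)
        results[(kind, rate)] = (mape(series, filled, missing),
                                 directional_accuracy(series, filled, missing))

for (kind, rate), (err, da) in results.items():
    print(f"{kind:10s} rate={rate:.0%}  MAPE={err:.2f}%  directional accuracy={da:.2f}")
```

Imputation stability could be approximated in the same setup by repeating each cell of the grid with different random masks and reporting the spread of the error; that measure is likewise an assumption rather than the paper's definition.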

Suggested Citation

  • Chen, Fan & Yu, Lan & Mao, Jinqi & Yang, Qing & Wang, Delu & Yu, Chenghao, 2024. "A novel data-characteristic-driven modeling approach for imputing missing value in industrial statistics: A case study of China electricity statistics," Applied Energy, Elsevier, vol. 373(C).
  • Handle: RePEc:eee:appene:v:373:y:2024:i:c:s0306261924012376
    DOI: 10.1016/j.apenergy.2024.123854

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0306261924012376
    Download Restriction: Full text for ScienceDirect subscribers only



