Printed from https://ideas.repec.org/p/arx/papers/2501.06587.html

Optimizing Financial Data Analysis: A Comparative Study of Preprocessing Techniques for Regression Modeling of Apple Inc.'s Net Income and Stock Prices

Authors
  • Kevin Ungar
  • Camelia Oprean-Stan

Abstract

This article presents a comprehensive methodology for processing financial datasets of Apple Inc., encompassing quarterly income and daily stock prices, spanning from March 31, 2009, to December 31, 2023. Leveraging 60 observations for quarterly income and 3774 observations for daily stock prices, sourced from Macrotrends and Yahoo Finance respectively, the study outlines five distinct datasets crafted through varied preprocessing techniques. Through detailed explanations of aggregation, interpolation (linear, polynomial, and cubic spline) and lagged variables methods, the study elucidates the steps taken to transform raw data into analytically rich datasets. Subsequently, the article delves into regression analysis, aiming to decipher which of the five data processing methods best suits capital market analysis, by employing both linear and polynomial regression models on each preprocessed dataset and evaluating their performance using a range of metrics, including cross-validation score, MSE, MAE, RMSE, R-squared, and Adjusted R-squared. The research findings reveal that linear interpolation with polynomial regression emerges as the top-performing method, boasting the lowest validation MSE and MAE values, alongside the highest R-squared and Adjusted R-squared values.
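
As a rough illustration of the pipeline the abstract describes, the sketch below linearly interpolates quarterly figures onto a daily grid, fits a simple linear regression of price on the interpolated income, and reports MSE, MAE, RMSE, R-squared, and Adjusted R-squared. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`interpolate_linear`, `fit_line`, `regression_metrics`) are hypothetical, all numbers are made-up placeholders rather than Apple Inc. data, and the polynomial regression and cross-validation steps mentioned in the abstract are omitted for brevity.

```python
from datetime import date, timedelta
from math import sqrt


def interpolate_linear(known, target_days):
    """Linearly interpolate sorted (date, value) pairs onto target_days."""
    out = []
    for day in target_days:
        # Find the pair of known points that brackets this day.
        for (d0, v0), (d1, v1) in zip(known, known[1:]):
            if d0 <= day <= d1:
                weight = (day - d0).days / (d1 - d0).days
                out.append(v0 + weight * (v1 - v0))
                break
        else:
            raise ValueError(f"{day} lies outside the known date range")
    return out


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regression_metrics(y_true, y_pred, n_features):
    """Return MSE, MAE, RMSE, R-squared and Adjusted R-squared."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
    return {"MSE": mse, "MAE": mae, "RMSE": sqrt(mse), "R2": r2, "AdjR2": adj_r2}


# Placeholder quarterly values (NOT real Apple Inc. figures).
quarterly_income = [
    (date(2023, 3, 31), 10.0),
    (date(2023, 6, 30), 20.0),
    (date(2023, 9, 30), 26.0),
]

# Daily grid covering the quarterly range (calendar days, for simplicity).
start, end = quarterly_income[0][0], quarterly_income[-1][0]
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
daily_income = interpolate_linear(quarterly_income, days)

# Synthetic daily prices with an exact linear relation, used as a sanity check.
daily_price = [3.0 * v + 5.0 for v in daily_income]

slope, intercept = fit_line(daily_income, daily_price)
predicted = [slope * v + intercept for v in daily_income]
metrics = regression_metrics(daily_price, predicted, n_features=1)
print(slope, intercept, metrics)
```

Because the synthetic prices are an exact linear function of the interpolated income, the fit should recover the slope and intercept and report an R-squared very close to 1. With real data, the same metric function could be applied to each preprocessed dataset to compare methods on a held-out validation split.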

Suggested Citation

  • Kevin Ungar & Camelia Oprean-Stan, 2025. "Optimizing Financial Data Analysis: A Comparative Study of Preprocessing Techniques for Regression Modeling of Apple Inc.'s Net Income and Stock Prices," Papers 2501.06587, arXiv.org.
  • Handle: RePEc:arx:papers:2501.06587

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2501.06587
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Jianhong Guo & Che-Jung Chang & Yingyi Huang & Xiaotian Zhang & Andrea Murari, 2022. "An Aggregating Prediction Model for Management Decision Analysis," Complexity, Hindawi, vol. 2022, pages 1-7, May.
    2. Andreas Lanz & Gregor Reich & Ole Wilms, 2022. "Adaptive grids for the estimation of dynamic models," Quantitative Marketing and Economics (QME), Springer, vol. 20(2), pages 179-238, June.
    3. Esra’a Alshdaifat & Doa’a Alshdaifat & Ayoub Alsarhan & Fairouz Hussein & Subhieh Moh’d Faraj S. El-Salhi, 2021. "The Effect of Preprocessing Techniques, Applied to Numeric Features, on Classification Algorithms’ Performance," Data, MDPI, vol. 6(2), pages 1-23, January.
    4. Vasile Brătian & Ana-Maria Acu & Camelia Oprean-Stan & Emil Dinga & Gabriela-Mariana Ionescu, 2021. "Efficient or Fractal Market Hypothesis? A Stock Indexes Modelling Using Geometric Brownian Motion and Geometric Fractional Brownian Motion," Mathematics, MDPI, vol. 9(22), pages 1-20, November.
    5. Gründler, Klaus & Krieger, Tommy, 2022. "Should we care (more) about data aggregation?," European Economic Review, Elsevier, vol. 142(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Borisova, Ekaterina & Gründler, Klaus & Hackenberger, Armin & Harter, Anina & Potrafke, Niklas & Schoors, Koen, 2023. "Crisis experience and the deep roots of COVID-19 vaccination preferences," European Economic Review, Elsevier, vol. 160(C).
    2. Sutirtha Bagchi & Matthew J. Fagerstrom, 2023. "Wealth inequality and democracy," Public Choice, Springer, vol. 197(1), pages 89-136, October.
    3. Samuka Mohanty & Rajashree Dash, 2023. "A New Dual Normalization for Enhancing the Bitcoin Pricing Capability of an Optimized Low Complexity Neural Net with TOPSIS Evaluation," Mathematics, MDPI, vol. 11(5), pages 1-28, February.
    4. Lars Pelke, 2023. "Reanalysing the link between democracy and economic development," International Area Studies Review, Center for International Area Studies, Hankuk University of Foreign Studies, vol. 26(4), pages 361-383, December.
    5. Krieger, Tommy, 2022. "Democracy and the quality of economic institutions: Theory and evidence," ZEW Discussion Papers 22-032, ZEW - Leibniz Centre for European Economic Research.
    6. repec:zbw:vfsc24:302413 is not listed on IDEAS
    7. Tommy Krieger, 2022. "Democracy and the quality of economic institutions: theory and evidence," Public Choice, Springer, vol. 192(3), pages 357-376, September.
    8. Fairouz Hussein & Ayat Al-Ahmad & Subhieh El-Salhi & Esra’a Alshdaifat & Mo’taz Al-Hami, 2022. "Advances in Contextual Action Recognition: Automatic Cheating Detection Using Machine Learning Techniques," Data, MDPI, vol. 7(9), pages 1-13, August.
    9. Sima, Di & Huang, Fali, 2023. "Is democracy good for growth? — Development at political transition time matters," European Journal of Political Economy, Elsevier, vol. 78(C).
    10. Celico, Andrea & Rode, Martin & Rodriguez-Carreño, Ignacio, 2024. "Will the real populists please stand up? A machine learning index of party populism," European Journal of Political Economy, Elsevier, vol. 82(C).
    11. Tommy Krieger, 2022. "Elites and Health Infrastructure Improvements in Industrializing Regimes," CESifo Working Paper Series 9808, CESifo.
    12. Krieger, Tommy, 2022. "Measuring democracy," ZEW Discussion Papers 22-063, ZEW - Leibniz Centre for European Economic Research.

    More about this item

    JEL classification:

    • G14 - Financial Economics - - General Financial Markets - - - Information and Market Efficiency; Event Studies; Insider Trading
    • G21 - Financial Economics - - Financial Institutions and Services - - - Banks; Other Depository Institutions; Micro Finance Institutions; Mortgages
    • C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
    • G32 - Financial Economics - - Corporate Finance and Governance - - - Financing Policy; Financial Risk and Risk Management; Capital and Ownership Structure; Value of Firms; Goodwill

