
Data driven orthogonal basis selection for functional data analysis

Author

Listed:
  • Basna, Rani
  • Nassar, Hiba
  • Podgórski, Krzysztof

Abstract

Functional data analysis is typically performed in two steps: first, functionally representing discrete observations, and then applying functional methods, such as the functional principal component analysis, to the so-represented data. While the initial choice of a functional representation may have a significant impact on the second phase of the analysis, this issue has not gained much attention in the past. Typically, a rather ad hoc choice of some standard basis such as Fourier, wavelets, splines, etc. is used for the data transforming purpose. To address this important problem, we present its mathematical formulation, demonstrate its importance, and propose a data-driven method of functionally representing observations. The method chooses an initial functional basis by an efficient placement of the knots. A simple machine learning style algorithm is utilized for the knot selection and recently introduced orthogonal spline bases - splinets - are eventually taken to represent the data. The benefits are illustrated by examples of analyses of sparse functional data.
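
The two-step workflow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm. It rests on several assumptions made only for illustration: the curves are observed densely on a common uniform grid (the paper targets sparse data), a greedy search over a fixed candidate set stands in for the paper's knot-selection algorithm, and a QR orthonormalisation of B-splines stands in for splinets. All function names (bspline_design, sse, greedy_knots, orthonormal_basis, fpca) are hypothetical.

```python
import numpy as np
from scipy.interpolate import BSpline


def bspline_design(x, interior_knots, degree=3, lo=0.0, hi=1.0):
    """Clamped B-spline basis evaluated at x; shape (len(x), n_basis)."""
    t = np.concatenate([np.full(degree + 1, lo),
                        np.sort(np.asarray(interior_knots, dtype=float)),
                        np.full(degree + 1, hi)])
    n_basis = len(t) - degree - 1
    # Identity coefficients give one column per basis function.
    return BSpline(t, np.eye(n_basis), degree)(x)


def sse(x, Y, knots):
    """Total least-squares residual of all curves (rows of Y) on the spline space."""
    B = bspline_design(x, knots)
    coef, *_ = np.linalg.lstsq(B, Y.T, rcond=None)
    return float(np.sum((Y.T - B @ coef) ** 2))


def greedy_knots(x, Y, n_knots, candidates):
    """Illustrative stand-in for knot selection: add the candidate that lowers SSE most."""
    knots = []
    for _ in range(n_knots):
        remaining = [c for c in candidates if c not in knots]
        best = min(remaining, key=lambda c: sse(x, Y, knots + [c]))
        knots.append(best)
    return sorted(knots)


def orthonormal_basis(x, knots):
    """Orthonormalise the B-splines under a Riemann-sum L2 inner product (uniform grid)."""
    h = x[1] - x[0]
    Q, _ = np.linalg.qr(np.sqrt(h) * bspline_design(x, knots))
    return Q / np.sqrt(h)  # columns O satisfy h * O.T @ O = I


def fpca(x, Y, knots, n_comp=2):
    """With an orthonormal basis, FPCA reduces to PCA of the coefficient vectors."""
    O = orthonormal_basis(x, knots)
    h = x[1] - x[0]
    A = h * Y @ O                      # projection coefficients of each curve
    A -= A.mean(axis=0)                # centre the coefficients
    evals, evecs = np.linalg.eigh(np.cov(A, rowvar=False))
    order = np.argsort(evals)[::-1][:n_comp]
    return evals[order], O @ evecs[:, order]


# Synthetic curves (made up for this sketch) with a sharp bump near x = 0.8.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 201)
scores = rng.normal(size=(30, 2))
Y = (scores[:, [0]] * np.sin(2 * np.pi * x)
     + scores[:, [1]] * np.exp(-((x - 0.8) / 0.03) ** 2)
     + 0.05 * rng.normal(size=(30, x.size)))

candidates = list(np.linspace(0.05, 0.95, 19))
knots = greedy_knots(x, Y, n_knots=5, candidates=candidates)
evals, efuns = fpca(x, Y, knots)
print("selected knots:", np.round(knots, 2))
print("leading eigenvalues:", evals)
```

The orthonormalisation is what makes the second step cheap: once the basis is orthonormal, the eigenfunctions are linear combinations of the basis functions, with weights given by the eigenvectors of the coefficient covariance matrix.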

Suggested Citation

  • Basna, Rani & Nassar, Hiba & Podgórski, Krzysztof, 2022. "Data driven orthogonal basis selection for functional data analysis," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
  • Handle: RePEc:eee:jmvana:v:189:y:2022:i:c:s0047259x21001469
    DOI: 10.1016/j.jmva.2021.104868

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X21001469
    Download Restriction: Full text for ScienceDirect subscribers only



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Janet Niekerk & Haakon Bakka & Håvard Rue, 2023. "Stable Non-Linear Generalized Bayesian Joint Models for Survival-Longitudinal Data," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 85(1), pages 102-128, February.
    2. Johnson, Matthew S., 2007. "Modeling dichotomous item responses with free-knot splines," Computational Statistics & Data Analysis, Elsevier, vol. 51(9), pages 4178-4192, May.
    3. Botts, Carsten H. & Daniels, Michael J., 2008. "A flexible approach to Bayesian multiple curve fitting," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5100-5120, August.
    4. Binder, Harald & Sauerbrei, Willi, 2008. "Increasing the usefulness of additive spline models by knot removal," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5305-5318, August.
    5. Anestis Antoniadis & Irène Gijbels & Mila Nikolova, 2011. "Penalized likelihood regression for generalized linear models with non-quadratic penalties," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 63(3), pages 585-615, June.
    6. Ana-Maria Staicu & Yingxing Li & Ciprian M. Crainiceanu & David Ruppert, 2014. "Likelihood Ratio Tests for Dependent Data with Applications to Longitudinal and Functional Data Analysis," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 41(4), pages 932-949, December.
    7. Amato, Umberto & Antoniadis, Anestis & De Feis, Italia & Goude, Yannig & Lagache, Audrey, 2021. "Forecasting high resolution electricity demand data with additive models including smooth and jagged components," International Journal of Forecasting, Elsevier, vol. 37(1), pages 171-185.
    8. Şentürk, Damla & Ghosh, Samiran & Nguyen, Danh V., 2014. "Exploratory time varying lagged regression: Modeling association of cognitive and functional trajectories with expected clinic visits in older adults," Computational Statistics & Data Analysis, Elsevier, vol. 73(C), pages 1-15.
    9. Wang, Jingxing & Chung, Seokhyun & AlShelahi, Abdullah & Kontar, Raed & Byon, Eunshin & Saigal, Romesh, 2021. "Look-ahead decision making for renewable energy: A dynamic “predict and store” approach," Applied Energy, Elsevier, vol. 296(C).
    10. Heredia, María Belén & Prieur, Clémentine & Eckert, Nicolas, 2022. "Global sensitivity analysis with aggregated Shapley effects, application to avalanche hazard assessment," Reliability Engineering and System Safety, Elsevier, vol. 222(C).
    11. Febrero-Bande, Manuel & González-Manteiga, Wenceslao & Prallon, Brenda & Saporito, Yuri F., 2023. "Functional classification of bitcoin addresses," Computational Statistics & Data Analysis, Elsevier, vol. 181(C).
    12. Li, Pai-Ling & Chiou, Jeng-Min, 2011. "Identifying cluster number for subspace projected functional data clustering," Computational Statistics & Data Analysis, Elsevier, vol. 55(6), pages 2090-2103, June.
    13. Shuyu Meng & Zhensheng Huang, 2024. "Variable Selection in Semi-Functional Partially Linear Regression Models with Time Series Data," Mathematics, MDPI, vol. 12(17), pages 1-23, September.
    14. Xiuli Du & Xiaohu Jiang & Jinguan Lin, 2023. "Multinomial Logistic Factor Regression for Multi-source Functional Block-wise Missing Data," Psychometrika, Springer;The Psychometric Society, vol. 88(3), pages 975-1001, September.
    15. Guangxing Wang & Sisheng Liu & Fang Han & Chong‐Zhi Di, 2023. "Robust functional principal component analysis via a functional pairwise spatial sign operator," Biometrics, The International Biometric Society, vol. 79(2), pages 1239-1253, June.
    16. repec:cte:wsrepe:24606 is not listed on IDEAS
    17. Zhang, Tao & Zhang, Qingzhao & Wang, Qihua, 2014. "Model detection for functional polynomial regression," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 183-197.
    18. Li, Pai-Ling & Chiou, Jeng-Min & Shyr, Yu, 2017. "Functional data classification using covariate-adjusted subspace projection," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 21-34.
    19. Xiongtao Dai & Zhenhua Lin & Hans‐Georg Müller, 2021. "Modeling sparse longitudinal data on Riemannian manifolds," Biometrics, The International Biometric Society, vol. 77(4), pages 1328-1341, December.
    20. Poskitt, D.S. & Sengarapillai, Arivalzahan, 2013. "Description length and dimensionality reduction in functional data analysis," Computational Statistics & Data Analysis, Elsevier, vol. 58(C), pages 98-113.
    21. Park, So Young & Xiao, Luo & Willbur, Jayson D. & Staicu, Ana-Maria & Jumbe, N. L’ntshotsholé, 2018. "A joint design for functional data with application to scheduling ultrasound scans," Computational Statistics & Data Analysis, Elsevier, vol. 122(C), pages 101-114.
