
Highly irregular functional generalized linear regression with electronic health records

Authors

  • Justin Petrovich
  • Matthew Reimherr
  • Carrie Daymont

Abstract

This work presents a new approach, called Multiple Imputation of Sparsely‐sampled Functions at Irregular Times (MISFIT), for fitting generalized functional linear regression models with sparsely and irregularly sampled data. Current methods do not allow for consistent estimation unless one assumes that the number of observed points per curve grows sufficiently quickly with the sample size. In contrast, MISFIT is based on a multiple imputation framework, which, as we demonstrate empirically, has the potential to produce consistent estimates without such an assumption. Just as importantly, it propagates the uncertainty of not having completely observed curves, allowing for a more accurate assessment of the uncertainty of parameter estimates, something that most methods currently cannot accomplish. This work is motivated by a longitudinal study on macrocephaly, or atypically large head size, in which electronic medical records allow for the collection of a great deal of data. However, the sampling is highly variable from child to child. Using MISFIT we are able to clearly demonstrate that the development of pathologic conditions related to macrocephaly is associated with both the overall head circumference of the children as well as the velocity of their head growth.
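The abstract notes that a multiple imputation framework propagates the uncertainty of incompletely observed curves into the parameter estimates. The abstract does not say how MISFIT combines results across imputations, so the following is only a minimal sketch of the standard pooling step in multiple imputation (Rubin's rules), not the authors' implementation. The function name `pool_rubin` and the numbers used are hypothetical.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool M completed-data estimates with Rubin's rules.

    estimates: length-M sequence of point estimates, one per imputed data set
    variances: length-M sequence of their estimated sampling variances
    Returns the pooled estimate and its total variance.
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = estimates.shape[0]
    if m < 2:
        raise ValueError("need at least two imputations")
    pooled = estimates.mean(axis=0)           # average of the M estimates
    within = variances.mean(axis=0)           # average within-imputation variance
    between = estimates.var(axis=0, ddof=1)   # variance of estimates across imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total

# Hypothetical slope estimates from three imputed data sets (illustration only).
est = [0.9, 1.1, 1.0]
var = [0.04, 0.04, 0.04]
pooled, total = pool_rubin(est, var)
```

The total variance exceeds the average within-imputation variance whenever the estimates differ across imputations, which is how the extra uncertainty from unobserved parts of each curve would enter the final standard errors.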

Suggested Citation

  • Justin Petrovich & Matthew Reimherr & Carrie Daymont, 2022. "Highly irregular functional generalized linear regression with electronic health records," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(4), pages 806-833, August.
  • Handle: RePEc:bla:jorssc:v:71:y:2022:i:4:p:806-833
    DOI: 10.1111/rssc.12556

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssc.12556
    Download Restriction: no



