Printed from https://ideas.repec.org/a/bla/jorssc/v71y2022i5p1521-1542.html

Non‐parametric Bayesian covariate‐dependent multivariate functional clustering: An application to time‐series data for multiple air pollutants

Author

Listed:
  • Daewon Yang
  • Taeryon Choi
  • Eric Lavigne
  • Yeonseung Chung

Abstract

Air pollution is a major threat to public health. Understanding the spatial distribution of air pollution concentration is of great interest to government or local authorities, as it informs about target areas for implementing policies for air quality management. Cluster analysis has been popularly used to identify groups of locations with similar profiles of average levels of multiple air pollutants, efficiently summarising the spatial pattern. This study aimed to cluster locations based on the seasonal patterns of multiple air pollutants incorporating the location‐specific characteristics such as socio‐economic indicators. For this purpose, we proposed a novel non‐parametric Bayesian sparse latent factor model for covariate‐dependent multivariate functional clustering. Furthermore, we extend this model to conduct clustering with temporal dependency. The proposed methods are illustrated through a simulation study and applied to time‐series data for daily mean concentrations of ozone ($\mathrm{O}_3$), nitrogen dioxide ($\mathrm{NO}_2$), and fine particulate matter ($\mathrm{PM}_{2.5}$) collected for 25 cities in Canada in 1986–2015.

Suggested Citation

  • Daewon Yang & Taeryon Choi & Eric Lavigne & Yeonseung Chung, 2022. "Non‐parametric Bayesian covariate‐dependent multivariate functional clustering: An application to time‐series data for multiple air pollutants," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(5), pages 1521-1542, November.
  • Handle: RePEc:bla:jorssc:v:71:y:2022:i:5:p:1521-1542
    DOI: 10.1111/rssc.12589

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssc.12589
    Download Restriction: no


    References listed on IDEAS

    1. Crainiceanu, Ciprian M. & Ruppert, David & Wand, Matthew P., 2005. "Bayesian Analysis for Penalized Spline Regression Using WinBUGS," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 14(i14).
    2. Sirio Legramanti & Daniele Durante & David B Dunson, 2020. "Bayesian cumulative shrinkage for infinite factorizations," Biometrika, Biometrika Trust, vol. 107(3), pages 745-752.
    3. Andrea Martino & Andrea Ghiglietti & Francesca Ieva & Anna Maria Paganoni, 2019. "A k-means procedure based on a Mahalanobis type distance for clustering multivariate functional data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 28(2), pages 301-322, June.
    4. Papastamoulis, Panagiotis, 2016. "label.switching: An R Package for Dealing with the Label Switching Problem in MCMC Outputs," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 69(c01).
    5. Francesca Ieva & Anna M. Paganoni & Davide Pigoli & Valeria Vitelli, 2013. "Multivariate functional clustering for the morphological analysis of electrocardiograph curves," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 62(3), pages 401-418, May.
    6. Marie-Eve Héroux & H. Anderson & Richard Atkinson & Bert Brunekreef & Aaron Cohen & Francesco Forastiere & Fintan Hurley & Klea Katsouyanni & Daniel Krewski & Michal Krzyzanowski & Nino Künzli & Inga , 2015. "Quantifying the health impacts of ambient air pollutants: recommendations of a WHO/Europe project," International Journal of Public Health, Springer;Swiss School of Public Health (SSPH+), vol. 60(5), pages 619-627, July.
    7. Shuichi Tokushige & Hiroshi Yadohisa & Koichi Inada, 2007. "Crisp and fuzzy k-means clustering algorithms for multivariate functional data," Computational Statistics, Springer, vol. 22(1), pages 1-16, April.
    8. Heard, Nicholas A. & Holmes, Christopher C. & Stephens, David A., 2006. "A Quantitative Study of Gene Regulation Involved in the Immune Response of Anopheline Mosquitoes: An Application of Bayesian Hierarchical Clustering of Curves," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 18-29, March.
    9. Durante, Daniele, 2017. "A note on the multiplicative gamma process," Statistics & Probability Letters, Elsevier, vol. 122(C), pages 198-204.
    10. Daniel R. Kowal & David S. Matteson & David Ruppert, 2017. "A Bayesian Multivariate Functional Dynamic Linear Model," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(518), pages 733-744, April.
    11. Amandine Schmutz & Julien Jacques & Charles Bouveyron & Laurence Chèze & Pauline Martin, 2020. "Clustering multivariate functional data in group-specific functional subspaces," Computational Statistics, Springer, vol. 35(3), pages 1101-1131, September.
    12. A. Bhattacharya & D. B. Dunson, 2011. "Sparse Bayesian infinite factor models," Biometrika, Biometrika Trust, vol. 98(2), pages 291-306.
    13. Silvia Montagna & Surya T. Tokdar & Brian Neelon & David B. Dunson, 2012. "Bayesian Latent Factor Regression for Functional and Longitudinal Data," Biometrics, The International Biometric Society, vol. 68(4), pages 1064-1073, December.
    14. Shubhankar Ray & Bani Mallick, 2006. "Functional clustering by Bayesian wavelet methods," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(2), pages 305-332, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel R. Kowal & Antonio Canale, 2021. "Semiparametric Functional Factor Models with Bayesian Rank Selection," Papers 2108.02151, arXiv.org, revised May 2022.
    2. Sylvia Fruhwirth-Schnatter, 2023. "Generalized Cumulative Shrinkage Process Priors with Applications to Sparse Bayesian Factor Analysis," Papers 2303.00473, arXiv.org.
    3. Golovkine, Steven & Klutchnikoff, Nicolas & Patilea, Valentin, 2022. "Clustering multivariate functional data using unsupervised binary trees," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    4. Ana Justel & Marcela Svarc, 2018. "A divisive clustering method for functional data with special consideration of outliers," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(3), pages 637-656, September.
    5. Julien Jacques & Cristian Preda, 2014. "Functional data clustering: a survey," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 8(3), pages 231-255, September.
    6. Matthew W. Wheeler, 2019. "Bayesian additive adaptive basis tensor product models for modeling high dimensional surfaces: an application to high‐throughput toxicity testing," Biometrics, The International Biometric Society, vol. 75(1), pages 193-201, March.
    7. Durante, Daniele, 2017. "A note on the multiplicative gamma process," Statistics & Probability Letters, Elsevier, vol. 122(C), pages 198-204.
    8. Michael Vogt & Oliver Linton, 2015. "Classification of nonparametric regression functions in heterogeneous panels," CeMMAP working papers CWP06/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    9. Bruno Scarpa & David B. Dunson, 2009. "Bayesian Hierarchical Functional Data Analysis Via Contaminated Informative Priors," Biometrics, The International Biometric Society, vol. 65(3), pages 772-780, September.
    10. Philip A. White & Alan E. Gelfand, 2021. "Multivariate functional data modeling with time-varying clustering," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(3), pages 586-602, September.
    11. Jim Q. Smith & Paul E. Anderson & Silvia Liverani, 2008. "Separation measures and the geometry of Bayes factor selection for classification," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 957-980, November.
    12. Michael Vogt & Oliver Linton, 2017. "Classification of non-parametric regression functions in longitudinal data models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(1), pages 5-27, January.
    13. Florian Huber & Gary Koop, 2023. "Subspace shrinkage in conjugate Bayesian vector autoregressions," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 38(4), pages 556-576, June.
    14. Fangting Zhou & Kejun He & Kunbo Wang & Yanxun Xu & Yang Ni, 2023. "Functional Bayesian networks for discovering causality from multivariate functional data," Biometrics, The International Biometric Society, vol. 79(4), pages 3279-3293, December.
    15. Michael Vogt & Oliver Linton, 2015. "Classification of nonparametric regression functions in heterogeneous panels," CeMMAP working papers 06/15, Institute for Fiscal Studies.
    16. Jaejoon Lee & Seongil Jo & Jaeyong Lee, 2022. "Robust sparse Bayesian infinite factor models," Computational Statistics, Springer, vol. 37(5), pages 2693-2715, November.
    17. Kelly R. Moran & Elizabeth L. Turner & David Dunson & Amy H. Herring, 2021. "Bayesian hierarchical factor regression models to infer cause of death from verbal autopsy data," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(3), pages 532-557, June.
    18. Amandine Schmutz & Julien Jacques & Charles Bouveyron & Laurence Chèze & Pauline Martin, 2020. "Clustering multivariate functional data in group-specific functional subspaces," Computational Statistics, Springer, vol. 35(3), pages 1101-1131, September.
    19. Dimitris Korobilis & Kenichi Shimizu, 2022. "Bayesian Approaches to Shrinkage and Sparse Estimation," Foundations and Trends(R) in Econometrics, now publishers, vol. 11(4), pages 230-354, June.
    20. Marta Spreafico & Francesca Ieva & Marta Fiocco, 2023. "Modelling time-varying covariates effect on survival via functional data analysis: application to the MRC BO06 trial in osteosarcoma," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 32(1), pages 271-298, March.
