
Graph-based spatial segmentation of areal data

Author

Listed:
  • Goepp, Vivien
  • van de Kassteele, Jan

Abstract

Smoothing is often used to improve the readability and interpretability of noisy areal data. However, there are many instances where the underlying quantity is discontinuous. For such cases, specific methods are needed to estimate the piecewise constant spatial process. A well-known approach in this setting is to segment the signal using the adjacency graph, for example with the graph-based fused lasso. However, this method does not scale well to large graphs. A new method is introduced for piecewise constant spatial estimation that (i) is faster to compute on large graphs and (ii) yields sparser models than the fused lasso (for the same amount of regularization), resulting in estimates that are easier to interpret. The method is illustrated on simulated data and applied to real data on overweight prevalence in the Netherlands. Healthy and unhealthy zones are identified, which cannot be explained by demographic or socio-economic characteristics alone. The method is found capable of identifying such zones and can assist policymakers with their health-improving strategies.
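
The abstract does not spell out any estimator, so the following is only a minimal sketch of the baseline it names, assuming the standard graph fused lasso objective (squared error plus an L1 penalty on differences across adjacent areas). It is not the new method of the paper. The function names `fused_lasso_objective` and `zones`, the tolerance `tol`, and the toy five-area chain are all hypothetical choices made for illustration.

```python
def fused_lasso_objective(y, theta, edges, lam):
    """Assumed standard form: 0.5 * sum (y_i - theta_i)^2 + lam * sum |theta_i - theta_j| over edges."""
    fit = 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, theta))
    penalty = lam * sum(abs(theta[i] - theta[j]) for i, j in edges)
    return fit + penalty


def zones(theta, edges, tol=1e-9):
    """Label each area by its zone: adjacent areas with equal estimates share a label."""
    parent = list(range(len(theta)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges:
        # Merge two adjacent areas only when their estimates coincide.
        if abs(theta[i] - theta[j]) <= tol:
            parent[find(i)] = find(j)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(theta))]


# Toy data (hypothetical): five areas on a chain, one jump between areas 2 and 3.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
y = [1.0, 1.2, 0.9, 3.1, 2.9]          # noisy observations
theta = [1.0, 1.0, 1.0, 3.0, 3.0]      # a piecewise constant candidate estimate

print(zones(theta, edges))
print(fused_lasso_objective(y, theta, edges, lam=0.5))
```

In this sketch a zone is a connected set of areas sharing one estimated value, which is what makes a piecewise constant estimate readable as a map of a few regions rather than one value per area.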

Suggested Citation

  • Goepp, Vivien & van de Kassteele, Jan, 2024. "Graph-based spatial segmentation of areal data," Computational Statistics & Data Analysis, Elsevier, vol. 192(C).
  • Handle: RePEc:eee:csdana:v:192:y:2024:i:c:s0167947323002190
    DOI: 10.1016/j.csda.2023.107908

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947323002190
    Download Restriction: Full text for ScienceDirect subscribers only.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Minge Xie & Qiankun Sun & Joseph Naus, 2009. "A Latent Model to Detect Multiple Clusters of Varying Sizes," Biometrics, The International Biometric Society, vol. 65(4), pages 1011-1020, December.
    2. Leonhard Knorr-Held & Günter Raßer & Nikolaus Becker, 2002. "Disease Mapping of Stage-Specific Cancer Incidence Data," Biometrics, The International Biometric Society, vol. 58(3), pages 492-501, September.
    3. Douglas R. M. Azevedo & Marcos O. Prates & Dipankar Bandyopadhyay, 2021. "MSPOCK: Alleviating Spatial Confounding in Multivariate Disease Mapping Models," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 26(3), pages 464-491, September.
    4. Takumi Saegusa & Tianzhou Ma & Gang Li & Ying Qing Chen & Mei-Ling Ting Lee, 2020. "Variable Selection in Threshold Regression Model with Applications to HIV Drug Adherence Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(3), pages 376-398, December.
    5. Håvard Rue & Ingelin Steinsland & Sveinung Erland, 2004. "Approximating hidden Gaussian Markov random fields," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 66(4), pages 877-892, November.
    6. K C Flórez & A Corberán-Vallet & A Iftimi & J D Bermúdez, 2020. "A Bayesian unified framework for risk estimation and cluster identification in small area health data analysis," PLOS ONE, Public Library of Science, vol. 15(5), pages 1-17, May.
    7. Zhao, Hui & Sun, Dayu & Li, Gang & Sun, Jianguo, 2019. "Simultaneous estimation and variable selection for incomplete event history studies," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 350-361.
    8. Howard D. Bondell & Brian J. Reich, 2009. "Simultaneous Factor Selection and Collapsing Levels in ANOVA," Biometrics, The International Biometric Society, vol. 65(1), pages 169-177, March.
    9. Deborah A. Costain, 2009. "Bayesian Partitioning for Modeling and Mapping Spatial Case–Control Data," Biometrics, The International Biometric Society, vol. 65(4), pages 1123-1132, December.
    10. Congdon, Peter, 2007. "Mixtures of spatial and unstructured effects for spatially discontinuous health outcomes," Computational Statistics & Data Analysis, Elsevier, vol. 51(6), pages 3197-3212, March.
    11. D. G. T. Denison & C. C. Holmes, 2001. "Bayesian Partitioning for Estimating Disease Risk," Biometrics, The International Biometric Society, vol. 57(1), pages 143-149, March.
    12. Marco Alfò & Cecilia Vitiello, 2003. "Finite mixtures approach to ecological regression," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 12(1), pages 93-108, February.
    13. Antonis Christou & Andreas Artemiou, 2023. "Adaptive L0 Regularization for Sparse Support Vector Regression," Mathematics, MDPI, vol. 11(13), pages 1-12, June.
    14. Jian Huang & Yuling Jiao & Lican Kang & Jin Liu & Yanyan Liu & Xiliang Lu, 2022. "GSDAR: a fast Newton algorithm for $\ell_0$ regularized generalized linear models with statistical guarantee," Computational Statistics, Springer, vol. 37(1), pages 507-533, March.
    15. Duncan Lee & Richard Mitchell, 2013. "Locally adaptive spatial smoothing using conditional auto-regressive models," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 62(4), pages 593-608, August.
    16. Luke Smallman & William Underwood & Andreas Artemiou, 2020. "Simple Poisson PCA: an algorithm for (sparse) feature extraction with simultaneous dimension determination," Computational Statistics, Springer, vol. 35(2), pages 559-577, June.
    17. repec:jss:jstsof:36:i10 is not listed on IDEAS
    18. Hosik Choi & Eunjung Song & Seung-sik Hwang & Woojoo Lee, 2018. "A modified generalized lasso algorithm to detect local spatial clusters for count data," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 102(4), pages 537-563, October.
    19. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    20. Guan, Wei & Gray, Alexander, 2013. "Sparse high-dimensional fractional-norm support vector machine via DC programming," Computational Statistics & Data Analysis, Elsevier, vol. 67(C), pages 136-148.
    21. Margherita Giuzio, 2017. "Genetic algorithm versus classical methods in sparse index tracking," Decisions in Economics and Finance, Springer;Associazione per la Matematica, vol. 40(1), pages 243-256, November.
