
Support vector subset scan for spatial pattern detection

Author

Listed:
  • Fitzpatrick, Dylan
  • Ni, Yun
  • Neill, Daniel B.

Abstract

Discovery of localized and irregularly shaped anomalous patterns in spatial data provides useful context for operational decisions across many policy domains. The support vector subset scan (SVSS) integrates the penalized fast subset scan with a kernel support vector machine classifier to accurately detect spatial clusters without imposing hard constraints on the shape or size of the pattern. The method iterates between (1) efficiently maximizing a penalized log-likelihood ratio over subsets of locations to obtain an anomalous pattern, and (2) learning a high-dimensional decision boundary between locations included in and excluded from the anomalous subset. On each iteration, location-specific penalties to the log-likelihood ratio are assigned according to distance to the decision boundary, encouraging patterns that are spatially compact but potentially highly irregular in shape. In simulated experiments, SVSS outperforms competing spatial cluster detection methods at detecting randomly generated patterns. Experiments on publicly available data sets show that SVSS enables discovery of practically useful anomalous patterns for disease surveillance in Chicago, IL, crime hotspot detection in Portland, OR, and pothole cluster detection in Pittsburgh, PA.
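
Step (1) of the iteration can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not state: the score is taken to be an expectation-based Poisson log-likelihood ratio, the relative risk q is searched over a small fixed grid, and the location-specific penalties are supplied by hand rather than derived from distance to an SVM decision boundary. The function name `penalized_subset_scan` and all data values are hypothetical. For a fixed q the penalized score is a sum of per-location terms, so the best subset is every location whose term is positive.

```python
import math


def penalized_subset_scan(counts, baselines, penalties, q_grid):
    """Maximize sum_i [c_i*log(q) - b_i*(q - 1) + penalty_i] over subsets and q.

    Assumed score: expectation-based Poisson log-likelihood ratio plus an
    additive penalty per location. For fixed q the score is additive, so the
    optimal subset keeps exactly the locations with a positive contribution.
    """
    best_score, best_subset, best_q = 0.0, [], None
    for q in q_grid:
        # Per-location contribution to the penalized score at this q.
        contrib = [
            c * math.log(q) - b * (q - 1.0) + d
            for c, b, d in zip(counts, baselines, penalties)
        ]
        subset = [i for i, g in enumerate(contrib) if g > 0]
        score = sum(contrib[i] for i in subset)
        if score > best_score:
            best_score, best_subset, best_q = score, subset, q
    return best_score, best_subset, best_q


# Hypothetical data: five locations with equal expected counts.
counts = [30, 28, 10, 9, 25]
baselines = [10, 10, 10, 10, 10]
q_grid = [1.5, 2.0, 2.5, 3.0]

# No penalties: every location with elevated counts is selected.
_, unpenalized, _ = penalized_subset_scan(counts, baselines, [0] * 5, q_grid)

# A large negative penalty on location 4 (standing in for "far from the
# decision boundary, on the excluded side") removes it from the subset.
_, penalized, _ = penalized_subset_scan(
    counts, baselines, [0, 0, 0, 0, -20], q_grid
)

print(unpenalized)  # [0, 1, 4]
print(penalized)    # [0, 1]
```

In the full method described above, the penalties would be recomputed on each iteration from the classifier learned in step (2); here they are fixed inputs so that the effect of a penalty on the selected subset is easy to see.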

Suggested Citation

  • Fitzpatrick, Dylan & Ni, Yun & Neill, Daniel B., 2021. "Support vector subset scan for spatial pattern detection," Computational Statistics & Data Analysis, Elsevier, vol. 157(C).
  • Handle: RePEc:eee:csdana:v:157:y:2021:i:c:s0167947320302401
    DOI: 10.1016/j.csda.2020.107149

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947320302401
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Daniel B. Neill, 2012. "Fast subset scan for spatial pattern detection," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 74(2), pages 337-360, March.
    2. Jochen Gorski & Frank Pfeuffer & Kathrin Klamroth, 2007. "Biconvex sets and optimization with biconvex functions: a survey and extensions," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 66(3), pages 373-407, December.
    3. Duczmal, Luiz & Assuncao, Renato, 2004. "A simulated annealing strategy for the detection of arbitrarily shaped spatial clusters," Computational Statistics & Data Analysis, Elsevier, vol. 45(2), pages 269-286, March.
    4. Neill, Daniel B., 2009. "Expectation-based scan statistics for monitoring spatial time series data," International Journal of Forecasting, Elsevier, vol. 25(3), pages 498-517, July.
    5. Duczmal, Luiz & Cancado, Andre L.F. & Takahashi, Ricardo H.C. & Bessegato, Lupercio F., 2007. "A genetic algorithm for irregularly shaped spatial scan statistics," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 43-52, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. de Lima, Max Sousa & Duczmal, Luiz Henrique, 2014. "Adaptive likelihood ratio approaches for the detection of space–time disease clusters," Computational Statistics & Data Analysis, Elsevier, vol. 77(C), pages 352-370.
    2. Wan, You & Pei, Tao & Zhou, Chenghu & Jiang, Yong & Qu, Chenxu & Qiao, Youlin, 2012. "ACOMCD: A multiple cluster detection algorithm based on the spatial scan statistic and ant colony optimization," Computational Statistics & Data Analysis, Elsevier, vol. 56(2), pages 283-296.
    3. Inkyung Jung, 2019. "Spatial scan statistics for matched case-control data," PLOS ONE, Public Library of Science, vol. 14(8), pages 1-10, August.
    4. Silva, Ivair R. & Duczmal, Luiz & Kulldorff, Martin, 2021. "Confidence intervals for spatial scan statistic," Computational Statistics & Data Analysis, Elsevier, vol. 158(C).
    5. Zhou, Ruoyu & Shu, Lianjie & Su, Yan, 2015. "An adaptive minimum spanning tree test for detecting irregularly-shaped spatial clusters," Computational Statistics & Data Analysis, Elsevier, vol. 89(C), pages 134-146.
    6. Chadoeuf, J. & Certain, G. & Bellier, E. & Bar-Hen, A. & Couteron, P. & Monestiez, P. & Bretagnolle, V., 2011. "Estimating inter-group interaction radius for point processes with nested spatial structures," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 627-640, January.
    7. Lin, Yun Hui & Wang, Yuan & He, Dongdong & Lee, Loo Hay, 2020. "Last-mile delivery: Optimal locker location under multinomial logit choice model," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 142(C).
    8. Ma, Shujie & Linton, Oliver & Gao, Jiti, 2021. "Estimation and inference in semiparametric quantile factor models," Journal of Econometrics, Elsevier, vol. 222(1), pages 295-323.
    9. Lee, Myeonggyun & Jung, Inkyung, 2019. "Modified spatial scan statistics using a restricted likelihood ratio for ordinal outcome data," Computational Statistics & Data Analysis, Elsevier, vol. 133(C), pages 28-39.
    10. Zhiqing Meng & Min Jiang & Rui Shen & Leiyan Xu & Chuangyin Dang, 2021. "An objective penalty function method for biconvex programming," Journal of Global Optimization, Springer, vol. 81(3), pages 599-620, November.
    11. Ibrahim Musa & Hyun Woo Park & Lkhagvadorj Munkhdalai & Keun Ho Ryu, 2018. "Global Research on Syndromic Surveillance from 1993 to 2017: Bibliometric Analysis and Visualization," Sustainability, MDPI, vol. 10(10), pages 1-20, September.
    12. Smida, Zaineb & Laurent, Thibault & Cucala, Lionel, 2024. "A Hotelling spatial scan statistic for functional data: application to economic and climate data," TSE Working Papers 24-1583, Toulouse School of Economics (TSE).
    13. Dimitris Bertsimas & Xuan Vinh Doan & Karthik Natarajan & Chung-Piaw Teo, 2010. "Models for Minimax Stochastic Linear Optimization Problems with Risk Aversion," Mathematics of Operations Research, INFORMS, vol. 35(3), pages 580-602, August.
    14. Zhao, Yue & Chen, Zhi & Lim, Andrew & Zhang, Zhenzhen, 2022. "Vessel deployment with limited information: Distributionally robust chance constrained models," Transportation Research Part B: Methodological, Elsevier, vol. 161(C), pages 197-217.
    15. Kun Chen & Kung-Sik Chan & Nils Chr. Stenseth, 2014. "Source-Sink Reconstruction Through Regularized Multicomponent Regression Analysis-With Application to Assessing Whether North Sea Cod Larvae Contributed to Local Fjord Cod in Skagerrak," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(506), pages 560-573, June.
    16. Xing Zhao & Xiao-Hua Zhou & Zijian Feng & Pengfei Guo & Hongyan He & Tao Zhang & Lei Duan & Xiaosong Li, 2013. "A Scan Statistic for Binary Outcome Based on Hypergeometric Probability Model, with an Application to Detecting Spatial Clusters of Japanese Encephalitis," PLOS ONE, Public Library of Science, vol. 8(6), pages 1-7, June.
    17. Xiaolan Wu & Tony Grubesic, 2010. "Identifying irregularly shaped crime hot-spots using a multiobjective evolutionary algorithm," Journal of Geographical Systems, Springer, vol. 12(4), pages 409-433, December.
    18. Tong Wang & Cynthia Rudin, 2022. "Causal Rule Sets for Identifying Subgroups with Enhanced Treatment Effects," INFORMS Journal on Computing, INFORMS, vol. 34(3), pages 1626-1643, May.
    19. Cucala, Lionel, 2009. "A flexible spatial scan test for case event data," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 2843-2850, June.
    20. Sevvandi Kandanaarachchi & Rob J Hyndman & Kate Smith-Miles, 2020. "Early classification of spatio-temporal events using partial information," PLOS ONE, Public Library of Science, vol. 15(8), pages 1-39, August.
