
Reducing False Discoveries in Statistically-Significant Regional-Colocation Mining: A Summary of Results

Author

Listed:
  • Subhankar Ghosh
  • Jayant Gupta
  • Arun Sharma
  • Shuai An
  • Shashi Shekhar

Abstract

Given a set \emph{S} of spatial feature types, their feature instances, a study area, and a neighbor relationship, the goal is to find pairs $\langle C, r_{g} \rangle$ such that \emph{C} is a statistically significant regional-colocation pattern in $r_{g}$. This problem is important for applications in various domains, including ecology, economics, and sociology. The problem is computationally challenging because the number of regional colocation patterns and candidate regions is exponential. Previously, we proposed a miner \cite{10.1145/3557989.3566158} that finds statistically significant regional colocation patterns. However, the numerous simultaneous statistical inferences raise the risk of false discoveries (also known as the multiple comparisons problem) and carry a high computational cost. We propose a novel algorithm, the multiple comparisons regional colocation miner (MultComp-RCM), which uses a Bonferroni correction. Theoretical analysis, experimental evaluation, and case study results show that the proposed method reduces both the false discovery rate and computational cost.
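
To make the multiple comparisons problem mentioned in the abstract concrete, the following is a minimal Python sketch of a textbook Bonferroni correction. It is not the authors' MultComp-RCM algorithm. The function names and the p-values are hypothetical, chosen only for illustration, and the family-wise error rate formula assumes the tests are independent.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return one boolean per test: True where p <= alpha / m.

    m is the number of simultaneous tests. Dividing alpha by m keeps the
    probability of at least one false discovery at or below alpha.
    """
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m
    return [p <= threshold for p in p_values]


def familywise_error_rate(alpha, m):
    """Chance of at least one false discovery across m uncorrected tests.

    Assumes the m tests are independent (an assumption made for this sketch).
    """
    return 1.0 - (1.0 - alpha) ** m


# Hypothetical p-values, one per candidate (pattern, region) pair.
p_values = [0.001, 0.02, 0.04, 0.30]

# Without correction, every p-value at or below 0.05 is declared significant.
uncorrected = [p <= 0.05 for p in p_values]

# With Bonferroni, the per-test threshold becomes 0.05 / 4 = 0.0125.
corrected = bonferroni_reject(p_values, alpha=0.05)

print(uncorrected)
print(corrected)
print(familywise_error_rate(0.05, 100))
```

In this toy input, three candidates pass the uncorrected test but only one survives the corrected threshold. The last line illustrates why many simultaneous inferences are risky: with 100 independent tests at the 0.05 level, at least one false discovery is almost certain.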

Suggested Citation

  • Subhankar Ghosh & Jayant Gupta & Arun Sharma & Shuai An & Shashi Shekhar, 2024. "Reducing False Discoveries in Statistically-Significant Regional-Colocation Mining: A Summary of Results," Papers 2407.02536, arXiv.org.
  • Handle: RePEc:arx:papers:2407.02536

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2407.02536
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Julian Besag & Peter J. Diggle, 1977. "Simple Monte Carlo Tests for Spatial Pattern," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 26(3), pages 327-333, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Giuseppe Espa & Giuseppe Arbia & Diego Giuliani, 2013. "Conditional versus unconditional industrial agglomeration: disentangling spatial dependence and spatial heterogeneity in the analysis of ICT firms’ distribution in Milan," Journal of Geographical Systems, Springer, vol. 15(1), pages 31-50, January.
    2. Dufour, Jean-Marie, 2006. "Monte Carlo tests with nuisance parameters: A general approach to finite-sample inference and nonstandard asymptotics," Journal of Econometrics, Elsevier, vol. 133(2), pages 443-477, August.
    3. Diks, Cees, 2003. "Detecting serial dependence in tail events: a test dual to the BDS test," Economics Letters, Elsevier, vol. 79(3), pages 319-324, June.
    4. François Bavaud, 2013. "Testing spatial autocorrelation in weighted networks: the modes permutation test," Journal of Geographical Systems, Springer, vol. 15(3), pages 233-247, July.
    5. Diego Giuliani & Giuseppe Arbia & Giuseppe Espa, 2014. "Weighting Ripley’s K-Function to Account for the Firm Dimension in the Analysis of Spatial Concentration," International Regional Science Review, vol. 37(3), pages 251-272, July.
    6. Grabarnik, Pavel & Myllymäki, Mari & Stoyan, Dietrich, 2011. "Correct testing of mark independence for marked point patterns," Ecological Modelling, Elsevier, vol. 222(23), pages 3888-3894.
    7. Davidson, Marty, 2024. "Strategic Point Processes," OSF Preprints g5r9t, Center for Open Science.
    8. Xiaolan Wu & Tony Grubesic, 2010. "Identifying irregularly shaped crime hot-spots using a multiobjective evolutionary algorithm," Journal of Geographical Systems, Springer, vol. 12(4), pages 409-433, December.
    9. repec:elg:eechap:14395_6 is not listed on IDEAS
    10. Junfu Zhang, 2003. "Revisiting Residential Segregation by Income: A Monte Carlo Test," International Journal of Business and Economics, School of Management Development, Feng Chia University, Taichung, Taiwan, vol. 2(1), pages 27-37, April.
    11. Giuseppe Arbia & Giuseppe Espa & Diego Giuliani & Maria Michela Dickson, 2017. "Effects of missing data and locational errors on spatial concentration measures based on Ripley’s K-function," Spatial Economic Analysis, Taylor & Francis Journals, vol. 12(2-3), pages 326-346, July.
    12. Zhu, Li-Xing & Neuhaus, Georg, 2003. "Conditional tests for elliptical symmetry," Journal of Multivariate Analysis, Elsevier, vol. 84(2), pages 284-298, February.
    13. Giuseppe Arbia & Giuseppe Espa & Diego Giuliani & Rocco Micciolo, 2017. "A spatial analysis of health and pharmaceutical firm survival," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(9), pages 1560-1575, July.
    14. Zhonghao Zhang & Rui Xiao & Ashton Shortridge & Jiaping Wu, 2014. "Spatial Point Pattern Analysis of Human Settlements and Geographical Associations in Eastern Coastal China — A Case Study," IJERPH, MDPI, vol. 11(3), pages 1-16, March.
    15. Saroja Selvanathan, 1991. "Regional Consumption Patterns in Australia: A System‐Wide Analysis," The Economic Record, The Economic Society of Australia, vol. 67(4), pages 338-345, December.
    16. Christian Wehenkel & João Marcelo Brazão-Protázio & Artemio Carrillo-Parra & José Hugo Martínez-Guerrero & Felipe Crecente-Campo, 2015. "Spatial Distribution Patterns in the Very Rare and Species-Rich Picea chihuahuana Tree Community (Mexico)," PLOS ONE, Public Library of Science, vol. 10(10), pages 1-19, October.
    17. S Openshaw, 1979. "A Methodology for Using Models for Planning Purposes," Environment and Planning A, vol. 11(8), pages 879-896, August.
    18. Golay, Jean & Kanevski, Mikhail & Vega Orozco, Carmen D. & Leuenberger, Michael, 2014. "The multipoint Morisita index for the analysis of spatial patterns," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 406(C), pages 191-202.
    19. John Kornak & Mark Irwin & Noel Cressie, 2006. "Spatial Point Process Models of Defensive Strategies: Detecting Changes," Statistical Inference for Stochastic Processes, Springer, vol. 9(1), pages 31-46, May.
    20. Diks Cees & Manzan Sebastiano, 2002. "Tests for Serial Independence and Linearity Based on Correlation Integrals," Studies in Nonlinear Dynamics & Econometrics, De Gruyter, vol. 6(2), pages 1-22, July.
    21. Frisén, Marianne & Andersson, Eva, 2008. "Semiparametric surveillance of outbreaks," Research Reports 2007:11, University of Gothenburg, Statistical Research Unit, School of Business, Economics and Law.
