
Spatial Econometric Modelling Of Massive Datasets: The Contribution Of Data Mining

Author

Listed:
  • MYRIAM TABASSO
  • GIUSEPPE ARBIA

Abstract

In this paper we provide a brief overview of some of the most recent empirical research on spatial econometric models and spatial data mining. Data mining in general is the search for hidden patterns that may exist in large databases. Spatial data mining is the process of discovering interesting, potentially useful, high-utility patterns embedded in large spatial datasets. The field has been influenced by many other disciplines: database technology, artificial intelligence, machine learning, probabilistic statistics, visualization, information science, and pattern recognition. Spatial data mining is more complex than conventional data mining because of the complexities inherent in spatial data, which are multi-sourced, multi-typed, multi-scaled, heterogeneous, and dynamic. The main difference between the two is that spatial data mining tasks use not only non-spatial attributes (as is usual when mining non-spatial data) but also spatial attributes. We suggest some directions along which spatial econometric modeling could benefit from cross-fertilization with spatial data mining techniques such as Classification and Regression Trees (CART). We use the CART algorithm to fit empirical data and produce a tree of optimal size for different specifications of econometric models. We also examine some diagnostic measures to evaluate the spatial autocorrelation of the pseudo-residuals obtained from the regression tree analysis, and we compare the accuracy and performance of different versions of CART that take into account the effects of spatial dependence.
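The abstract does not name the diagnostic measures it uses. One common choice for checking spatial autocorrelation in residuals is Moran's I, sketched below as an illustration only. The distance-band weights, the 4x4 toy grid, and the two residual patterns are assumptions made for this example and are not taken from the paper.

```python
import math

def distance_band_weights(coords, radius):
    """Row-standardised weights: site j neighbours site i if within `radius`."""
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        nbrs = [j for j, (xj, yj) in enumerate(coords)
                if j != i and math.hypot(xj - xi, yj - yi) <= radius]
        for j in nbrs:
            W[i][j] = 1.0 / len(nbrs)  # each non-empty row sums to 1
    return W  # a site with no neighbours keeps an all-zero row

def morans_i(values, W):
    """Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, z = deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in W)
    num = sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * (num / den)

# Hypothetical 4x4 lattice of sites with two made-up residual patterns.
coords = [(x, y) for x in range(4) for y in range(4)]
checker = [1.0 if (x + y) % 2 == 0 else -1.0 for x, y in coords]  # alternating signs
trend = [float(x) for x, y in coords]                             # smooth west-east drift

# Two weights specifications: radius 1.0 (rook-like) and 1.5 (queen-like).
W_rook = distance_band_weights(coords, 1.0)
W_queen = distance_band_weights(coords, 1.5)

for name, W in (("rook", W_rook), ("queen", W_queen)):
    print(name, "checker:", round(morans_i(checker, W), 3),
          "trend:", round(morans_i(trend, W), 3))
```

Positive values indicate that similar residuals cluster in space and negative values that neighbouring residuals tend to differ. The same residuals can give different values under different weights matrices, so the choice of specification matters for the diagnosis.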
To address this issue, we start by examining a non-spatial regression tree; we then include the geographical coordinates of the data in the covariate set; and finally we consider one of the most common spatial econometric models, the Spatial Lag model, combined with two versions of regression trees: a non-spatial regression tree and a regression tree based on geographical coordinates. This allows us to determine the strength and possible role of the spatial arrangement of the variables in the predictive model and to reduce the effect of spatial autocorrelation on prediction errors. In particular, we test the sensitivity of various regression trees to different spatial weights matrix specifications, with the aim of removing spatial autocorrelation from the pseudo-residuals and improving the accuracy of spatial predictive models.
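The first two steps described above (a non-spatial regression tree, then the same tree with geographical coordinates added as covariates) can be illustrated with a minimal sketch. It is not the authors' implementation: the tree is a bare greedy CART-style regression tree with a fixed depth and no pruning, so it does not select an optimal tree size. The synthetic 6x6 grid, the covariate `x1`, and the regional shift of 10 are hypothetical values chosen only for the example. The Spatial Lag step is not covered.

```python
def sse(ys):
    """Sum of squared deviations from the mean (regression node impurity)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)

def grow(X, y, max_depth=3, min_leaf=2):
    """Greedy binary regression tree, returned as a nested dict (no pruning)."""
    node = {"value": sum(y) / len(y)}
    if max_depth == 0 or len(y) < 2 * min_leaf:
        return node
    best = None
    for j in range(len(X[0])):
        levels = sorted(set(row[j] for row in X))
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between adjacent levels
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            if min(len(left), len(right)) < min_leaf:
                continue
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    if best is None or best[0] >= sse(y):
        return node  # no admissible split reduces the impurity
    _, j, t, left, right = best
    node.update(
        feature=j, threshold=t,
        left=grow([X[i] for i in left], [y[i] for i in left], max_depth - 1, min_leaf),
        right=grow([X[i] for i in right], [y[i] for i in right], max_depth - 1, min_leaf),
    )
    return node

def predict(node, row):
    """Follow the splits down to a leaf and return its mean response."""
    while "feature" in node:
        node = node["left"] if row[node["feature"]] <= node["threshold"] else node["right"]
    return node["value"]

def residual_sse(tree, X, y):
    """In-sample sum of squared pseudo-residuals."""
    return sum((yi - predict(tree, row)) ** 2 for row, yi in zip(X, y))

# Hypothetical data: a 6x6 grid, one non-spatial covariate x1, and a response
# that is shifted upwards by 10 in the eastern half of the grid (gx >= 3).
grid = [(float(gx), float(gy)) for gx in range(6) for gy in range(6)]
x1 = [float((2 * int(gx) + 3 * int(gy)) % 5) for gx, gy in grid]
y = [2.0 * v + (10.0 if gx >= 3 else 0.0) for v, (gx, gy) in zip(x1, grid)]

X_plain = [[v] for v in x1]                               # non-spatial tree
X_geo = [[v, gx, gy] for v, (gx, gy) in zip(x1, grid)]    # coordinates as covariates

tree_plain = grow(X_plain, y)
tree_geo = grow(X_geo, y)
sse_plain = residual_sse(tree_plain, X_plain, y)
sse_geo = residual_sse(tree_geo, X_geo, y)
print("SSE without coordinates:", round(sse_plain, 1))
print("SSE with coordinates:   ", round(sse_geo, 1))
```

In this toy setting the coordinate-based tree can split on the easting and so absorb the regional shift, while the non-spatial tree leaves it in the pseudo-residuals. Real data would call for pruning or cross-validation to choose the tree size, and for out-of-sample rather than in-sample error.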

Suggested Citation

  • Myriam Tabasso & Giuseppe Arbia, 2013. "Spatial Econometric Modelling Of Massive Datasets: The Contribution Of Data Mining," ERSA conference papers ersa13p1004, European Regional Science Association.
  • Handle: RePEc:wiw:wiwrsa:ersa13p1004

    Download full text from publisher

    File URL: https://www-sre.wu.ac.at/ersa/ersaconfs/ersa13/ERSA2013_paper_01004.pdf
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Luc Anselin, 2010. "Thirty years of spatial econometrics," Papers in Regional Science, Wiley Blackwell, vol. 89(1), pages 3-25, March.
    2. Tsimpanos, Apostolos & Tsimbos, Cleon & Kalogirou, Stamatis, 2018. "Assessing spatial variation and heterogeneity of fertility in Greece at local authority level," MPRA Paper 100406, University Library of Munich, Germany.
    3. Baltagi, Badi H. & Bresson, Georges & Pirotte, Alain, 2012. "Forecasting with spatial panel data," Computational Statistics & Data Analysis, Elsevier, vol. 56(11), pages 3381-3397.
    4. Baltagi, Badi H. & Fingleton, Bernard & Pirotte, Alain, 2014. "Spatial lag models with nested random effects: An instrumental variable procedure with an application to English house prices," Journal of Urban Economics, Elsevier, vol. 80(C), pages 76-86.
    5. Daniel C. Monchuk & Dermot J. Hayes & John A. Miranowski & Dayton M. Lambert, 2011. "Inference Based On Alternative Bootstrapping Methods In Spatial Models With An Application To County Income Growth In The United States," Journal of Regional Science, Wiley Blackwell, vol. 51(5), pages 880-896, December.
    6. Philipp Otto & Wolfgang Schmid, 2018. "Spatiotemporal analysis of German real-estate prices," The Annals of Regional Science, Springer;Western Regional Science Association, vol. 60(1), pages 41-72, January.
    7. Mohamed Mekki Ben Jemaa, 2016. "Economic, Political and Cultural Proximity and Growth Propagation: A Network Model with Endogenous Proximity Matrix," Working Papers 1047, Economic Research Forum, revised 09 Jan 2016.
    8. Bernard Fingleton & Julie Le Gallo, 2008. "Estimating spatial models with endogenous variables, a spatial lag and spatially dependent disturbances: Finite sample properties," Papers in Regional Science, Wiley Blackwell, vol. 87(3), pages 319-339, August.
    9. Atreya, Ajita & Susana, Ferreira, 2012. "Analysis of Spatial Variation in Flood Risk Perception," 2012 Annual Meeting, February 4-7, 2012, Birmingham, Alabama 119738, Southern Agricultural Economics Association.
    10. Holly, Sean & Hashem Pesaran, M. & Yamagata, Takashi, 2011. "The spatial and temporal diffusion of house prices in the UK," Journal of Urban Economics, Elsevier, vol. 69(1), pages 2-23, January.
    11. Marcos Herrera & Manuel Ruiz & Jesús Mur, 2013. "Detecting Dependence Between Spatial Processes," Spatial Economic Analysis, Taylor & Francis Journals, vol. 8(4), pages 469-497, February.
    12. AMBA OYON, Claude Marius & Mbratana, Taoufiki, 2018. "Simultaneous Generalized Method of Moments Estimator for Panel Data Models with Spatially Correlated Error Components," MPRA Paper 84746, University Library of Munich, Germany.
    13. Torben Klarl, 2014. "Is Spatial Bootstrapping A Panacea For Valid Inference?," Journal of Regional Science, Wiley Blackwell, vol. 54(2), pages 304-312, March.
    14. Xu, Wan & Lambert, Dayton M., 2011. "Business Establishment Growth in the Appalachian Region, 2000-2007: An Application of Smooth Transition Spatial Process Models," Journal of Agricultural and Applied Economics, Southern Agricultural Economics Association, vol. 43(3), pages 1-16, August.
    15. Baltagi, Badi H. & Fingleton, Bernard & Pirotte, Alain, 2019. "A time-space dynamic panel data model with spatial moving average errors," Regional Science and Urban Economics, Elsevier, vol. 76(C), pages 13-31.
    16. Pedro Amaral & Mauro Lemos & Rodrigo Simões & Flávia Chein, 2010. "Regional Imbalances and Market Potential in Brazil," Spatial Economic Analysis, Taylor & Francis Journals, vol. 5(4), pages 463-482.
    17. Robert Garthoff & Philipp Otto, 2017. "Control charts for multivariate spatial autoregressive models," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 101(1), pages 67-94, January.
    18. Arnab Bhattacharjee & Sean Holly, 2011. "Structural interactions in spatial panels," Empirical Economics, Springer, vol. 40(1), pages 69-94, February.
    19. Filippova, Olga & Sheng, Mingyue, 2020. "Impact of bus rapid transit on residential property prices in Auckland, New Zealand," Journal of Transport Geography, Elsevier, vol. 86(C).
    20. Sabina Buczkowska & Nicolas Coulombel & Matthieu Lapparent, 2019. "A comparison of Euclidean Distance, Travel Times, and Network Distances in Location Choice Mixture Models," Networks and Spatial Economics, Springer, vol. 19(4), pages 1215-1248, December.

    More about this item

    Keywords

    spatial econometric models; spatial data mining; CART; spatial autocorrelation;

    JEL classification:

    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C31 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models; Quantile Regressions; Social Interaction Models
    • C52 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Evaluation, Validation, and Selection
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • R10 - Urban, Rural, Regional, Real Estate, and Transportation Economics - - General Regional Economics - - - General
