
Small-area estimates from consumer trace data

Authors

  • Arthur Acolin (University of Washington)
  • Ari Decter-Frain (Cornell University)
  • Matt Hall (Cornell University)

Abstract

Background: Timely, accurate, and precise demographic estimates at various levels of geography are crucial for planning, policymaking, and analysis. In the United States, data from the decennial census and the annual American Community Survey (ACS) serve as the main sources for subnational demographic estimates. While estimates derived from these sources are widely regarded as accurate, their timeliness is limited and their variability is sizable for small geographic units like towns and neighborhoods.

Objective: This paper investigates the potential for using nonrepresentative consumer trace data assembled by commercial vendors to produce valid and timely estimates. We focus on data purchased from Data Axle, which contains the names and addresses of over 150 million Americans annually.

Methods: We identify the predictors of over- and undercounts of households as measured with consumer trace data and compare a range of calibration approaches to assess the extent to which systematic errors in the data can be adjusted for over time. We also demonstrate the utility of the data for predicting contemporaneous (nowcasting) tract-level household counts in the 2020 Decennial Census.

Results: We find that adjusted counts at the county, ZIP Code Tabulation Area (ZCTA), and tract levels deviate from ACS survey-based estimates by an amount roughly equivalent to the ACS margins of error. Machine-learning methods perform best for calibration of county- and tract-level data. The estimates are stable over time and across regions of the country. We also find that, in nowcasts, incorporating Data Axle estimates reduces prediction bias relative to using the most recent ACS five-year estimates alone.

Contribution: Despite its affordability and timeliness compared to survey-based measures, consumer trace data remains underexplored by demographers. This paper examines one consumer trace data source and demonstrates that challenges with representativeness can be overcome to produce household estimates that align with survey-based estimates and improve demographic forecasts. At the same time, the analysis underscores the need for researchers to examine the limits of the data carefully before using them for specific applications.
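As a rough illustration of the calibration idea described in the abstract, the sketch below fits a simple linear adjustment from raw consumer-trace household counts to benchmark survey estimates and then checks how often the adjusted counts fall within the benchmark's margin of error. This is not the authors' method: the abstract reports that machine-learning methods performed best, and ordinary least squares is swapped in here only for brevity. The function names and all numbers are hypothetical toy values, not data from the paper.

```python
import numpy as np

def fit_linear_calibration(raw_counts, benchmark):
    """Least-squares fit of benchmark ~ intercept + slope * raw_counts."""
    raw = np.asarray(raw_counts, dtype=float)
    target = np.asarray(benchmark, dtype=float)
    design = np.column_stack([np.ones_like(raw), raw])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef  # array([intercept, slope])

def apply_calibration(coef, raw_counts):
    """Adjust raw counts with a previously fitted intercept and slope."""
    raw = np.asarray(raw_counts, dtype=float)
    return coef[0] + coef[1] * raw

def share_within_moe(adjusted, benchmark, moe):
    """Fraction of areas whose adjusted count lies within the benchmark's margin of error."""
    diff = np.abs(np.asarray(adjusted, dtype=float) - np.asarray(benchmark, dtype=float))
    return float(np.mean(diff <= np.asarray(moe, dtype=float)))

# Hypothetical toy values for five areas (NOT from the paper).
raw = np.array([900, 1500, 2100, 3200, 4100])   # raw consumer-trace household counts
acs = np.array([1000, 1600, 2300, 3400, 4400])  # benchmark survey estimates
moe = np.array([120, 150, 180, 220, 260])       # benchmark margins of error

coef = fit_linear_calibration(raw, acs)
adjusted = apply_calibration(coef, raw)
share = share_within_moe(adjusted, acs, moe)
print("intercept, slope:", coef)
print("share of areas within margin of error:", share)
```

In practice the adjustment would be fitted on one period or set of areas and evaluated on another, so that the comparison against margins of error reflects out-of-sample performance rather than in-sample fit.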

Suggested Citation

  • Arthur Acolin & Ari Decter-Frain & Matt Hall, 2022. "Small-area estimates from consumer trace data," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 47(27), pages 843-882.
  • Handle: RePEc:dem:demres:v:47:y:2022:i:27
    DOI: 10.4054/DemRes.2022.47.27

    Download full text from publisher

    File URL: https://www.demographic-research.org/volumes/vol47/27/47-27.pdf
    Download Restriction: no


    More about this item

    Keywords

    small area estimation; nontraditional data; consumer data; calibration techniques

    JEL classification:

    • J1 - Labor and Demographic Economics - - Demographic Economics
    • Z0 - Other Special Topics - - General