
Comparing Price Indices of Clothing and Footwear for Scanner Data and Web Scraped Data

Author

Listed:
  • Antonio G. Chessa
  • Robert Griffioen

Abstract

Statistical institutes are considering web scraping of online prices of consumer goods as a feasible alternative to scanner data. The lack of transaction data raises the question of whether web scraped data are suited for price index calculation. This article investigates this question by comparing price indices based on web scraped and scanner data for clothing and footwear in the same webshop. Scanner data and web scraped prices are often equal, with the latter being slightly higher on average. Numbers of web scraped product prices and products sold show remarkably high correlations. Given the high churn rates of clothing products, a multilateral method (Geary-Khamis) was used to calculate price indices. For 16 product categories, the indices show small overall differences between the two data sources, with year-on-year indices differing by only 0.3 percentage points at COICOP level (men's and women's clothing). It remains to be investigated whether such promising results for web scraped data will also be found for other retailers.
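
The abstract names the Geary-Khamis method but gives no formulas. As a rough illustration only, the sketch below implements the usual textbook formulation: each product gets a reference price equal to the quantity-weighted average of its deflated prices over all periods, and the index for a period is expenditure divided by the value of that period's quantities at reference prices, solved by fixed-point iteration. The function name `geary_khamis`, the list-of-lists input layout, and the toy numbers are assumptions made for this example; they are not taken from the article, and the article's actual implementation (including what stands in for quantities in web scraped data) may differ.

```python
def geary_khamis(prices, quantities, tol=1e-10, max_iter=1000):
    """Textbook Geary-Khamis index, normalised to 1.0 in period 0.

    prices[t][i] and quantities[t][i] are the price and quantity of
    product i in period t. A product absent in period t gets quantity 0,
    so its price in that period has no influence. Assumes every product
    has a positive quantity in at least one period (hypothetical input
    layout chosen for this sketch).
    """
    n_periods = len(prices)
    n_products = len(prices[0])
    total_q = [sum(quantities[t][i] for t in range(n_periods))
               for i in range(n_products)]
    index = [1.0] * n_periods  # starting guess

    for _ in range(max_iter):
        # Reference price of each product: quantity-weighted average of
        # its prices deflated by the current index values.
        ref = [
            sum(quantities[t][i] * prices[t][i] / index[t]
                for t in range(n_periods)) / total_q[i]
            for i in range(n_products)
        ]
        # Index of each period: expenditure divided by the value of the
        # period's quantities at reference prices.
        raw = [
            sum(p * q for p, q in zip(prices[t], quantities[t]))
            / sum(v * q for v, q in zip(ref, quantities[t]))
            for t in range(n_periods)
        ]
        new_index = [r / raw[0] for r in raw]  # normalise to period 0
        change = max(abs(a - b) for a, b in zip(new_index, index))
        index = new_index
        if change < tol:
            break
    return index


# Toy data: two products whose prices both double between the two periods,
# while the quantities shift from the first product to the second.
toy_prices = [[10.0, 20.0], [20.0, 40.0]]
toy_quantities = [[5.0, 1.0], [1.0, 5.0]]
toy_index = geary_khamis(toy_prices, toy_quantities)
print([round(x, 6) for x in toy_index])  # close to [1.0, 2.0]
```

Because every price doubles in the toy data, the index should come out close to 2 for the second period regardless of the shift in quantities, which is a simple sanity check on the iteration.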

Suggested Citation

  • Antonio G. Chessa & Robert Griffioen, 2019. "Comparing Price Indices of Clothing and Footwear for Scanner Data and Web Scraped Data," Economie et Statistique / Economics and Statistics, Institut National de la Statistique et des Etudes Economiques (INSEE), issue 509, pages 49-68.
  • Handle: RePEc:nse:ecosta:ecostat_2019_509_4
    DOI: https://doi.org/10.24187/ecostat.2019.509.1984

    Download full text from publisher

    File URL: https://www.insee.fr/en/statistiques/fichier/4203548/509_Chessa-Griffioen-EN.pdf
    Download Restriction: no



    Citations


    Cited by:

    1. Patrick Bajari & Zhihao Cen & Victor Chernozhukov & Manoj Manukonda & Suhas Vijaykumar & Jin Wang & Ramon Huerta & Junbo Li & Ling Leng & George Monokroussos & Shan Wan, 2023. "Hedonic Prices and Quality Adjusted Price Indices Powered by AI," Papers 2305.00044, arXiv.org.
    2. Laureti Tiziana & Polidoro Federico, 2022. "Using Scanner Data for Computing Consumer Spatial Price Indexes at Regional Level: An Empirical Application for Grocery Products in Italy," Journal of Official Statistics, Sciendo, vol. 38(1), pages 23-56, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. W. Erwin Diewert & Robert C. Feenstra, 2021. "Estimating the Benefits of New Products," NBER Chapters, in: Big Data for Twenty-First-Century Economic Statistics, pages 437-473, National Bureau of Economic Research, Inc.
    2. Diewert, W. Erwin & Feenstra, Robert, 2017. "Estimating the Benefits and Costs of New and Disappearing Products," Microeconomics.ca working papers tina_marandola-2017-12, Vancouver School of Economics, revised 19 Dec 2017.
    3. Bentley Alan, 2022. "Rentals for Housing: A Property Fixed-Effects Estimator of Inflation from Administrative Data," Journal of Official Statistics, Sciendo, vol. 38(1), pages 187-211, March.
    4. Robert J. Hill & Michael Scholz & Chihiro Shimizu & Miriam Steurer, 2020. "Rolling-Time-Dummy House Price Indexes: Window Length, Linking and Options for Dealing with the Covid-19 Shutdown," Graz Economics Papers 2020-14, University of Graz, Department of Economics.
    5. Jacek Białek, 2023. "Improving quality of the scanner CPI: proposition of new multilateral methods," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(3), pages 2893-2921, June.
    6. W. Erwin Diewert & Kiyohiko G. Nishimura & Chihiro Shimizu & Tsutomu Watanabe, 2020. "Measuring the Services of Durables and Owner Occupied Housing," Advances in Japanese Business and Economics, in: Property Price Index, chapter 0, pages 223-298, Springer.
    7. W. Erwin Diewert & Kevin J. Fox, 2020. "Measuring Real Consumption and CPI Bias under Lockdown Conditions," NBER Working Papers 27144, National Bureau of Economic Research, Inc.
    8. W. Erwin Diewert, 2022. "Scanner Data, Elementary Price Indexes and the Chain Drift Problem," Springer Books, in: Duangkamon Chotikapanich & Alicia N. Rambaldi & Nicholas Rohde (ed.), Advances in Economic Measurement, chapter 0, pages 445-606, Springer.
    9. Diewert, Erwin, 2019. "Quality Adjustment and Hedonics: A Unified Approach," Microeconomics.ca working papers erwin_diewert-2019-2, Vancouver School of Economics, revised 14 Mar 2019.
    10. Diewert, Erwin & Marandola, Tina, 2018. "Scanner Data, Elementary Price Indexes and the Chain Drift Problem," Microeconomics.ca working papers tina_marandola-2018-9, Vancouver School of Economics, revised 10 Oct 2018.
    11. Hannah de Nobrega & Johannes Coetsee & MG Ferreira & Rowan Walter, 2024. "Updating the SARB Index of Commodity Prices," Occasional Bulletin of Economic Notes 11053, South African Reserve Bank.
    12. Marie Leclair & Isabelle Léonard & Guillaume Rateau & Patrick Sillard & Gaëtan Varlet & Pierre Vernédal, 2019. "Scanner Data: Advances in Methodology and New Challenges for Computing Consumer Price," Economie et Statistique / Economics and Statistics, Institut National de la Statistique et des Etudes Economiques (INSEE), issue 509, pages 13-29.
    13. W. Erwin Diewert & Chihiro Shimizu, 2022. "Residential Property Price Indexes: Spatial Coordinates Versus Neighborhood Dummy Variables," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 68(3), pages 770-796, September.
    14. Zhenkun Zhou & Zikun Song & Tao Ren, 2022. "Predicting China's CPI by Scanner Big Data," Papers 2211.16641, arXiv.org, revised Oct 2023.
    15. Diewert W. Erwin & Fox Kevin J., 2022. "Measuring Inflation under Pandemic Conditions," Journal of Official Statistics, Sciendo, vol. 38(1), pages 255-285, March.
    16. Jacek Białek & Maciej Beręsewicz, 2020. "Scanner data in inflation measurement: from raw data to price indices," Papers 2005.11233, arXiv.org.
    17. Zhang Li-Chun & Johansen Ingvild & Nygaard Ragnhild, 2019. "Tests for Price Indices in a Dynamic Item Universe," Journal of Official Statistics, Sciendo, vol. 35(3), pages 683-697, September.
    18. Jan de Haan & Rens Hendriks & Michael Scholz, 2021. "Price Measurement Using Scanner Data: Time‐Product Dummy Versus Time Dummy Hedonic Indexes," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 67(2), pages 394-417, June.
    19. Li-Chun Zhang & Ingvild Johansen & Ragnhild Nygaard, 2018. "Tests for price indices in a dynamic item universe," Papers 1808.08995, arXiv.org, revised Oct 2018.
    20. Daniel Melser & Michael Webster, 2021. "Multilateral Methods, Substitution Bias, and Chain Drift: Some Empirical Comparisons," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 67(3), pages 759-785, September.

    More about this item

    JEL classification:

    • C43 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Index Numbers and Aggregation
    • E31 - Macroeconomics and Monetary Economics - - Prices, Business Fluctuations, and Cycles - - - Price Level; Inflation; Deflation
