Printed from https://ideas.repec.org/p/hhs/nhhfms/2023_001.html

A two sample size estimator for large data sets

Author

Listed:

  • Martin O’Connell
  • Howard Smith
  • Øyvind Thomassen

Abstract

In GMM estimators, moment conditions with additive error terms involve an observed component and a predicted component. If the predicted component is computationally costly to evaluate, it may not be feasible to estimate the model with all the available data. We propose an estimator that uses the full data set for the computationally cheap observed component, but a reduced sample size for the predicted component. We establish consistency and asymptotic normality, and we derive standard errors and a practical criterion for when our estimator is variance-reducing. We demonstrate the estimator’s properties on a range of models through Monte Carlo studies and an empirical application to alcohol demand.
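
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a hypothetical scalar model y = theta * x + e, uses x as its own instrument, and lets the cheap linear function `predicted_component` stand in for a prediction that would be costly in practice. All function and variable names are illustrative. The observed moment is averaged once over all N rows, while the predicted moment is averaged over a random subsample of n rows each time theta changes.

```python
import random

def predicted_component(x_i, theta):
    # Hypothetical stand-in for a costly model prediction (assumed linear in theta).
    return theta * x_i

def moment(theta, observed_mean, x, sub_idx):
    """Full-sample observed moment minus the predicted moment on the subsample."""
    predicted_mean = sum(
        x[i] * predicted_component(x[i], theta) for i in sub_idx
    ) / len(sub_idx)
    return observed_mean - predicted_mean

def solve_theta(observed_mean, x, sub_idx, lo=-10.0, hi=10.0, tol=1e-10):
    """Bisection on the scalar moment; assumes one sign change on [lo, hi]."""
    f_lo = moment(lo, observed_mean, x, sub_idx)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = moment(mid, observed_mean, x, sub_idx)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

# Simulated data under the assumed model (values chosen for illustration only).
rng = random.Random(0)
N, n, theta_true = 20000, 2000, 1.5
x = [rng.gauss(0.0, 1.0) for _ in range(N)]
y = [theta_true * x_i + rng.gauss(0.0, 1.0) for x_i in x]

# Cheap observed component: computed once on all N observations.
observed_mean = sum(x_i * y_i for x_i, y_i in zip(x, y)) / N

# Costly predicted component: evaluated only on a random subsample of size n.
sub_idx = rng.sample(range(N), n)

theta_hat = solve_theta(observed_mean, x, sub_idx)
print(theta_hat)
```

The sketch only shows how the two sample sizes enter the moment condition. It says nothing about precision; as the abstract notes, the paper derives a criterion for when the estimator is variance-reducing.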

Suggested Citation

  • O’Connell, Martin & Smith, Howard & Thomassen, Øyvind, 2023. "A two sample size estimator for large data sets," Discussion Papers 2023/1, Norwegian School of Economics, Department of Business and Management Science.
  • Handle: RePEc:hhs:nhhfms:2023_001

    Download full text from publisher

    File URL: https://hdl.handle.net/11250/3051932
    File Function: Full text
    Download Restriction: no

    References listed on IDEAS

    1. Sokbae Lee & Serena Ng, 2020. "An Econometric Perspective on Algorithmic Subsampling," Annual Review of Economics, Annual Reviews, vol. 12(1), pages 45-80, August.
    2. Meghan R. Busse & Christopher R. Knittel & Florian Zettelmeyer, 2013. "Are Consumers Myopic? Evidence from New and Used Car Purchases," American Economic Review, American Economic Association, vol. 103(1), pages 220-256, February.
    3. Hunt Allcott & Benjamin B Lockwood & Dmitry Taubinsky, 2019. "Regressive Sin Taxes, with an Application to the Optimal Soda Tax," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 134(3), pages 1557-1626.
    4. Steven T. Berry & Philip A. Haile, 2021. "Foundations of Demand Estimation," NBER Working Papers 29305, National Bureau of Economic Research, Inc.
    5. Pacini, David & Windmeijer, Frank, 2016. "Robust inference for the Two-Sample 2SLS estimator," Economics Letters, Elsevier, vol. 146(C), pages 50-54.
    6. Rachel Griffith & Martin O’Connell & Kate Smith, 2022. "Price Floors and Externality Correction," The Economic Journal, Royal Economic Society, vol. 132(646), pages 2273-2289.
    7. Hamish Low & Costas Meghir, 2017. "The Use of Structural Models in Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 33-58, Spring.
    8. Guido W. Imbens & Tony Lancaster, 1994. "Combining Micro and Macro Data in Microeconometric Models," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 61(4), pages 655-680.
    9. Saez, Emmanuel, 2002. "The desirability of commodity taxation under non-linear income taxation and heterogeneous tastes," Journal of Public Economics, Elsevier, vol. 83(2), pages 217-230, February.
    10. Øyvind Thomassen & Howard Smith & Stephan Seiler & Pasquale Schiraldi, 2017. "Multi-category Competition and Market Power: A Model of Supermarket Pricing," American Economic Review, American Economic Association, vol. 107(8), pages 2308-2351, August.
    11. Atsushi Inoue & Gary Solon, 2010. "Two-Sample Instrumental Variables Estimators," The Review of Economics and Statistics, MIT Press, vol. 92(3), pages 557-561, August.
    12. Hal R. Varian, 2014. "Big Data: New Tricks for Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 3-28, Spring.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. O'Connell, Martin & Smith, Kate, 2020. "Corrective Tax Design and Market Power," CEPR Discussion Papers 14582, C.E.P.R. Discussion Papers.
    2. Brett Hollenbeck & Kosuke Uetake, 2021. "Taxation and market power in the legal marijuana industry," RAND Journal of Economics, RAND Corporation, vol. 52(3), pages 559-595, September.
    3. Spencer Bastani & Sebastian Koehne, 2022. "How Should Consumption Be Taxed?," CESifo Working Paper Series 10038, CESifo.
    4. Rachel Griffith & Martin O’Connell & Kate Smith, 2022. "Price Floors and Externality Correction," The Economic Journal, Royal Economic Society, vol. 132(646), pages 2273-2289.
    5. Thomas F. Crossley & Peter Levell & Stavros Poupakis, 2022. "Regression with an imputed dependent variable," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(7), pages 1277-1294, November.
    6. Anja Gaentzsch & Gabriela Zapata Román, 2018. "More educated, less mobile? Diverging trends in income and educational mobility in Chile and Peru," Global Development Institute Working Paper Series 312018, GDI, The University of Manchester.
    7. Xiang, Di & Zhan, Lue & Bordignon, Massimo, 2020. "A reconsideration of the sugar sweetened beverage tax in a household production model," Food Policy, Elsevier, vol. 95(C).
    8. Choudhury, Sanchari, 2023. "Non-random selection into entrepreneurship in the realm of government decentralization and corruption," European Journal of Political Economy, Elsevier, vol. 78(C).
    9. Bhaven Sampat & Heidi L. Williams, 2019. "How Do Patents Affect Follow-On Innovation? Evidence from the Human Genome," American Economic Review, American Economic Association, vol. 109(1), pages 203-236, January.
    10. Marco Francesconi & Jonathan James, 2022. "Alcohol Price Floors and Externalities: The Case of Fatal Road Crashes," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 41(4), pages 1118-1156, September.
    11. Crossley, Thomas F. & Levell, Peter & Low, Hamish, 2024. "House price rises and borrowing to invest," Journal of Economic Behavior & Organization, Elsevier, vol. 223(C), pages 86-105.
    12. Buchinsky, Moshe & Li, Fanghua & Liao, Zhipeng, 2022. "Estimation and inference of semiparametric models using data from several sources," Journal of Econometrics, Elsevier, vol. 226(1), pages 80-103.
    13. Sokbae Lee & Serena Ng, 2020. "Least Squares Estimation Using Sketched Data with Heteroskedastic Errors," Papers 2007.07781, arXiv.org, revised Jun 2022.
    14. Belchior, Carlos Alberto & Gonzaga, Gustavo & Ulyssea, Gabriel, 2023. "Unpacking Neighborhood Effects: Experimental Evidence from a Large-Scale Housing Program in Brazil," IZA Discussion Papers 16113, Institute of Labor Economics (IZA).
    15. Pierre Dubois & Rachel Griffith & Martin O'Connell, 2022. "The Use of Scanner Data for Economics Research," Annual Review of Economics, Annual Reviews, vol. 14(1), pages 723-745, August.
    16. Øyvind Thomassen & Howard Smith & Stephan Seiler & Pasquale Schiraldi, 2017. "Multi-category Competition and Market Power: A Model of Supermarket Pricing," American Economic Review, American Economic Association, vol. 107(8), pages 2308-2351, August.
    17. Chen, Jiaying & Park, Albert, 2021. "School entry age and educational attainment in developing countries: Evidence from China's compulsory education law," Journal of Comparative Economics, Elsevier, vol. 49(3), pages 715-732.
    18. Jonas Hjort & Xuan Li & Heather Sarsons, 2020. "Across-Country Wage Compression in Multinationals," NBER Working Papers 26788, National Bureau of Economic Research, Inc.
    19. Hirukawa, Masayuki & Prokhorov, Artem, 2018. "Consistent estimation of linear regression models using matched data," Journal of Econometrics, Elsevier, vol. 203(2), pages 344-358.
    20. Jean-Pierre Dubé & Ali Hortaçsu & Joonhwi Joo, 2020. "Random-Coefficients Logit Demand Estimation with Zero-Valued Market Shares," Working Papers 2020-15, Becker Friedman Institute for Research In Economics.

    More about this item

    Keywords

    GMM; estimation; micro data

    JEL classification:

    • C20 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - General
    • C51 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Construction and Estimation
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
