
Random-Effects, Fixed-Effects and the within-between Specification for Clustered Data in Observational Health Studies: A Simulation Study

Author

Listed:
  • Joseph L Dieleman
  • Tara Templin

Abstract

Background: When unaccounted-for group-level characteristics affect an outcome variable, traditional linear regression is inefficient and can be biased. The random- and fixed-effects estimators (RE and FE, respectively) are two competing methods that address these problems. While each estimator controls for otherwise unaccounted-for effects, the two estimators require different assumptions. Health researchers tend to favor RE estimation, while researchers from some other disciplines tend to favor FE estimation. In addition to RE and FE, an alternative method called within-between (WB) was suggested by Mundlak in 1978, although it is used infrequently.

Methods: We conduct a simulation study to compare RE, FE, and WB estimation across 16,200 scenarios. The scenarios vary in the number of groups, the size of the groups, within-group variation, goodness-of-fit of the model, and the degree to which the model is correctly specified. Estimator preference is determined by the lowest mean squared error of the estimated marginal effect and root mean squared error of the fitted values.

Results: Although there are scenarios in which each estimator is most appropriate, the cases in which traditional RE estimation is preferred are less common. In finite samples, the WB approach outperforms both traditional estimators. The Hausman test guides the practitioner to the estimator with the smallest absolute error only 61% of the time, and for many sample sizes simply applying the WB approach produces smaller absolute errors than following the suggestion of the test.

Conclusions: Specification and estimation should be carefully considered and ultimately guided by the objective of the analysis and characteristics of the data. The WB approach has been underutilized, particularly for inference on marginal effects in small samples. Blindly applying any estimator can lead to bias, inefficiency, and flawed inference.
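To make the estimators in the abstract concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' simulation design. The data-generating process, the parameter values, and the helper names (`simulate`, `group_means`, `ols`) are all assumptions chosen for illustration. It contrasts pooled linear regression, the FE (within) estimator, and a within-between specification that regresses the outcome on the group-demeaned covariate and the group mean. For simplicity the WB regression is fitted here by ordinary least squares, whereas applied work usually fits it with a random-effects estimator.

```python
import numpy as np

def simulate(n_groups=50, group_size=20, beta=1.0, seed=0):
    """Hypothetical clustered data: the group effect u is correlated with x."""
    rng = np.random.default_rng(seed)
    g = np.repeat(np.arange(n_groups), group_size)   # group index per row
    u = rng.normal(size=n_groups)[g]                 # unobserved group effect
    x = u + rng.normal(size=g.size)                  # covariate correlated with u
    y = beta * x + 2.0 * u + rng.normal(size=g.size)
    return g, x, y

def group_means(values, g):
    """Return, for every observation, the mean of its group."""
    return (np.bincount(g, weights=values) / np.bincount(g))[g]

def ols(columns, y):
    """Least-squares coefficients with an intercept in position 0."""
    X = np.column_stack([np.ones_like(y)] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

g, x, y = simulate()
x_bar, y_bar = group_means(x, g), group_means(y, g)
x_within = x - x_bar

# Pooled regression ignores the grouping and absorbs the group effect into the slope.
beta_pooled = ols([x], y)[1]

# FE (within) estimator: regress group-demeaned y on group-demeaned x.
beta_fe = (x_within @ (y - y_bar)) / (x_within @ x_within)

# Within-between specification: separate within and between components of x.
_, beta_wb_within, beta_wb_between = ols([x_within, x_bar], y)

print(f"pooled: {beta_pooled:.3f}")
print(f"FE (within): {beta_fe:.3f}")
print(f"WB within: {beta_wb_within:.3f}, WB between: {beta_wb_between:.3f}")
```

In this sketch the true marginal effect is 1.0. The pooled slope is pulled away from it because the group effect is correlated with the covariate, while the within coefficient of the WB regression coincides with the FE estimate, since the demeaned covariate is orthogonal to every group-level term.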

Suggested Citation

  • Joseph L Dieleman & Tara Templin, 2014. "Random-Effects, Fixed-Effects and the within-between Specification for Clustered Data in Observational Health Studies: A Simulation Study," PLOS ONE, Public Library of Science, vol. 9(10), pages 1-17, October.
  • Handle: RePEc:plo:pone00:0110257
    DOI: 10.1371/journal.pone.0110257

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0110257
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0110257&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Sophia Rabe‐Hesketh & Anders Skrondal, 2006. "Multilevel modelling of complex survey data," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 169(4), pages 805-827, October.
    2. Hahn, Jinyong & Ham, John & Moon, Hyungsik Roger, 2011. "Test of random versus fixed effects with small within variation," Economics Letters, Elsevier, vol. 112(3), pages 293-297, September.
    3. Hausman, Jerry, 2015. "Specification tests in econometrics," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 38(2), pages 112-134.
    4. D. Pfeffermann & C. J. Skinner & D. J. Holmes & H. Goldstein & J. Rasbash, 1998. "Weighting for unequal selection probabilities in multilevel models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(1), pages 23-40.
    5. Gary Chamberlain, 1980. "Analysis of Covariance with Qualitative Data," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 47(1), pages 225-238.
    6. Peter Kennedy, 2003. "A Guide to Econometrics, 5th Edition," MIT Press Books, The MIT Press, edition 5, volume 1, number 026261183x, April.
    7. Jeffrey M Wooldridge, 2010. "Econometric Analysis of Cross Section and Panel Data," MIT Press Books, The MIT Press, edition 2, volume 1, number 0262232588, April.
    8. Desai, M. & Begg, M.D., 2008. "A comparison of regression approaches for analyzing clustered data," American Journal of Public Health, American Public Health Association, vol. 98(8), pages 1425-1429.
    9. Duncan, Craig & Jones, Kelvyn & Moon, Graham, 1998. "Context, composition and heterogeneity: Using multilevel models in health research," Social Science & Medicine, Elsevier, vol. 46(1), pages 97-117, January.
    10. Nicolas Debarsy, 2012. "The Mundlak Approach in the Spatial Durbin Panel Data Model," Spatial Economic Analysis, Taylor & Francis Journals, vol. 7(1), pages 109-131, March.
    11. Jesse A. Berlin & Stephen E. Kimmel & Thomas R. Ten Have & Mary D. Sammel, 1999. "An Empirical Comparison of Several Clustered Data Approaches Under Confounding Due to Cluster Effects in the Analysis of Complications of Coronary Angioplasty," Biometrics, The International Biometric Society, vol. 55(2), pages 470-476, June.
    12. Mundlak, Yair, 1978. "On the Pooling of Time Series and Cross Section Data," Econometrica, Econometric Society, vol. 46(1), pages 69-85, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Seonho Shin, 2021. "Were they a shock or an opportunity?: The heterogeneous impacts of the 9/11 attacks on refugees as job seekers—a nonlinear multi-level approach," Empirical Economics, Springer, vol. 61(5), pages 2827-2864, November.
    2. Georges Bresson & Guy Lacroix & Mohammad Arshad Rahman, 2021. "Bayesian panel quantile regression for binary outcomes with correlated random effects: an application on crime recidivism in Canada," Empirical Economics, Springer, vol. 60(1), pages 227-259, January.
    3. Anders Skrondal & Sophia Rabe-Hesketh, 2022. "The Role of Conditional Likelihoods in Latent Variable Modeling," Psychometrika, Springer;The Psychometric Society, vol. 87(3), pages 799-834, September.
    4. Yang, Yimin, 2022. "A correlated random effects approach to the estimation of models with multiple fixed effects," Economics Letters, Elsevier, vol. 213(C).
    5. Chen, Yi-Yi & Schmidt, Peter & Wang, Hung-Jen, 2014. "Consistent estimation of the fixed effects stochastic frontier model," Journal of Econometrics, Elsevier, vol. 181(2), pages 65-76.
    6. Arnd Kölling & Claus Schnabel, 2022. "Owners, external managers and industrial relations in German establishments," British Journal of Industrial Relations, London School of Economics, vol. 60(2), pages 424-443, June.
    7. Vigren, Andreas, 2020. "The Distance Factor in Swedish Bus Contracts How far are operators willing to go?," Transportation Research Part A: Policy and Practice, Elsevier, vol. 132(C), pages 188-204.
    8. Nadjia Mehraban & Christoph Kubitza & Zulkifli Alamsyah & Matin Qaim, 2021. "Oil palm cultivation, household welfare, and exposure to economic risk in the Indonesian small farm sector," Journal of Agricultural Economics, Wiley Blackwell, vol. 72(3), pages 901-915, September.
    9. Maggie Xiaoyang Chen & Aaditya Mattoo, 2008. "Regionalism in standards: good or bad for trade?," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 41(3), pages 838-863, August.
    10. Papineau, Maya & Yassin, Kareman & Newsham, Guy & Brice, Sarah, 2021. "Conditional demand analysis as a tool to evaluate energy policy options on the path to grid decarbonization," Renewable and Sustainable Energy Reviews, Elsevier, vol. 149(C).
    11. Kondo, M., 2018. "Schooling and Within-Sector Labor Productivity Outcome in Uganda: Joint Estimation of Returns to Education and Labor Supply," 2018 Conference, July 28-August 2, 2018, Vancouver, British Columbia 277473, International Association of Agricultural Economists.
    12. Bluhm, Richard & Crombrugghe, Denis de & Szirmai, Adam, 2012. "Explaining the dynamics of stagnation: An empirical examination of the North, Wallis and Weingast approach," MERIT Working Papers 2012-040, United Nations University - Maastricht Economic and Social Research Institute on Innovation and Technology (MERIT).
    13. Murtazashvili, Irina & Wooldridge, Jeffrey M., 2016. "A control function approach to estimating switching regression models with endogenous explanatory variables and endogenous switching," Journal of Econometrics, Elsevier, vol. 190(2), pages 252-266.
    14. Toni Mora & Joan Gil & Antoni Sicras-Mainar, 2015. "The influence of obesity and overweight on medical costs: a panel data perspective," The European Journal of Health Economics, Springer;Deutsche Gesellschaft für Gesundheitsökonomie (DGGÖ), vol. 16(2), pages 161-173, March.
    15. Hajivassiliou, Vassilis, 2019. "Estimation and specification testing of panel data models with non-ignorable persistent heterogeneity, contemporaneous and intertemporal simultaneity and observable and unobservable dynamics," LSE Research Online Documents on Economics 102843, London School of Economics and Political Science, LSE Library.
    16. Antonio Ruiz Porras, 2016. "La investigación econométrica mediante paneles de datos:historia, modelos y usos en México," Archivos Revista Economía y Política., Facultad de Ciencias Económicas y Administrativas, Universidad de Cuenca., vol. 24, pages 11-32, Julio.
    17. Manudeep Bhuller & Christian N. Brinch & Sebastian Königs, 2017. "Time Aggregation and State Dependence in Welfare Receipt," Economic Journal, Royal Economic Society, vol. 127(604), pages 1833-1873, September.
    18. Alberto Gude & Inmaculada Álvarez & Luis Orea, 2018. "Heterogeneous spillovers among Spanish provinces: a generalized spatial stochastic frontier model," Journal of Productivity Analysis, Springer, vol. 50(3), pages 155-173, December.
    19. H. Allen Klaiber & Klaus Salhofer & Stanley R. Thompson, 2017. "Capitalisation of the SPS into Agricultural Land Rental Prices under Harmonisation of Payments," Journal of Agricultural Economics, Wiley Blackwell, vol. 68(3), pages 710-726, September.
    20. Valizadeh, Pourya & Ng, Shu Wen, 2020. "The New school food standards and nutrition of school children: Direct and Indirect Effect Analysis," Economics & Human Biology, Elsevier, vol. 39(C).

