
Within study comparisons and risk of bias in international development: Systematic review and critical appraisal

Author

Listed:
  • Paul Fenton Villar
  • Hugh Waddington

Abstract

Background: Many systematic reviews incorporate nonrandomised studies of effects, sometimes called quasi-experiments or natural experiments. However, the extent to which nonrandomised studies produce unbiased effect estimates is unclear, both in expectation and in practice. The usual way that systematic reviews quantify bias is through "risk of bias assessment" and indirect comparison of findings across studies using meta-analysis. A more direct, practical way to quantify the bias in nonrandomised studies is through "internal replication research", which compares the findings from a nonrandomised study with the estimate from a benchmark randomised controlled trial conducted in the same population. Despite the existence of many risk of bias tools, none is conceptualised to assess comprehensively nonrandomised approaches with selection on unobservables, such as regression discontinuity designs (RDDs). The few that are conceptualised with these studies in mind do not draw on the extensive literature on internal replications (within-study comparisons) of randomised trials.

Objectives: Our research objectives were as follows. Objective 1: to undertake a systematic review of nonrandomised internal study replications of international development interventions. Objective 2: to develop a risk of bias tool for RDDs, an increasingly common method in social and economic programme evaluation.

Methods: For Objective 1, we searched systematically for nonrandomised internal study replications of benchmark randomised experiments of social and economic interventions in low- and middle-income countries (L&MICs). We assessed the risk of bias in the benchmark randomised experiments and synthesised evidence on the relative bias effect sizes produced by the benchmark and nonrandomised comparison arms. For Objective 2, we used document review and expert consultation to develop further the risk of bias tool for nonrandomised studies of interventions (ROBINS-I) for use with RDDs.

Results: For Objective 1, we located 10 nonrandomised internal study replications of randomised trials in L&MICs, six of which are RDDs; the remainder use a combination of statistical matching and regression techniques. We found that the benchmark experiments used in internal replications in international development are, in the main, well conducted but raise "some concerns" about threats to validity, usually arising from the methods of outcome data collection. Most internal replication studies report a range of different specifications for both the benchmark estimate and the nonrandomised replication estimate. We extracted and standardised 604 bias coefficient effect sizes from these studies and present average results narratively. For Objective 2, RDDs are characterised by prospective assignment of participants based on a threshold variable. Our review of the literature indicated that there are two main types of RDD. The more common type is designed retrospectively: the researcher identifies post hoc the relationship between outcomes and a threshold variable that determined assignment to the intervention at pretest. These designs usually draw on routine data collection such as administrative records or household surveys. The other, less common, type is a prospective design in which the researcher is also involved in allocating participants to treatment groups from the outset. We developed a risk of bias tool for RDDs.
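The Results refer to 604 standardised "bias coefficient" effect sizes. The abstract does not spell out the standardisation used, but a common convention in the within-study comparison literature is to take the difference between the nonrandomised estimate and the benchmark randomised estimate of the same parameter and express it on a standardised scale; a minimal sketch of that convention (the notation is illustrative, not taken from the article) is:

$$
\hat{B} \;=\; \frac{\hat{\theta}_{\mathrm{NRS}} - \hat{\theta}_{\mathrm{RCT}}}{SD_{\mathrm{pooled}}},
$$

where $\hat{\theta}_{\mathrm{NRS}}$ is the effect estimate from the nonrandomised replication arm, $\hat{\theta}_{\mathrm{RCT}}$ is the benchmark randomised estimate for the same population and outcome, and $SD_{\mathrm{pooled}}$ is the pooled outcome standard deviation, so that $\hat{B}$ reads like a standardised mean difference: values near zero indicate that the nonrandomised design reproduced the benchmark, and the sign indicates the direction of the discrepancy.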
Conclusions: Internal study replications provide the grounds on which bias assessment tools can be evidenced. We conclude that existing risk of bias tools need to be developed further for use by Campbell Collaboration authors, and there is a wide range of risk of bias tools and internal study replications to draw on in designing better tools. We have suggested a promising approach for RDDs. Further work is needed on other methodologies common in programme evaluation, for example statistical matching approaches. We also highlight that broader efforts to identify all existing internal replication studies should consider more specialised systematic search strategies within particular literatures, so as to overcome the lack of systematic indexing of this evidence.
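The Results above describe RDDs as designs in which assignment to the intervention is determined by whether a running variable crosses a threshold. As an illustration only (this is not the authors' estimator, and the function, variable names, and simulated data below are hypothetical), a minimal sharp-RDD effect estimate can be obtained by local linear regression on either side of the cutoff:

```python
import numpy as np
import statsmodels.api as sm

def sharp_rdd_estimate(running, outcome, cutoff, bandwidth):
    """Minimal sharp RDD sketch: local linear regression around the cutoff.

    Keeps observations within `bandwidth` of `cutoff`, then regresses the
    outcome on a treatment indicator, the centred running variable, and
    their interaction (allowing different slopes on each side of the
    threshold). Returns the estimated jump at the cutoff and its robust
    standard error.
    """
    centred = np.asarray(running, dtype=float) - cutoff
    keep = np.abs(centred) <= bandwidth
    treated = (centred >= 0).astype(float)
    X = sm.add_constant(
        np.column_stack([treated, centred, treated * centred])[keep]
    )
    fit = sm.OLS(np.asarray(outcome, dtype=float)[keep], X).fit(cov_type="HC1")
    return fit.params[1], fit.bse[1]  # coefficient on the treatment indicator

# Hypothetical usage with simulated data: a programme assigned to units
# scoring at or above 50 on an eligibility index, true effect = 2.0.
rng = np.random.default_rng(0)
score = rng.uniform(0, 100, 5_000)
y = 2.0 * (score >= 50) + 0.05 * score + rng.normal(0, 1, 5_000)
effect, se = sharp_rdd_estimate(score, y, cutoff=50, bandwidth=10)
```

The bandwidth choice is left as an explicit argument because, in practice, RDD estimates can be sensitive to how narrowly the analysis window is drawn around the cutoff.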

Suggested Citation

  • Paul Fenton Villar & Hugh Waddington, 2019. "Within study comparisons and risk of bias in international development: Systematic review and critical appraisal," Campbell Systematic Reviews, John Wiley & Sons, vol. 15(1-2), June.
  • Handle: RePEc:wly:camsys:v:15:y:2019:i:1-2:n:e1027
    DOI: 10.1002/cl2.1027


    Citations



    Cited by:

    1. Pierre Marion & Etienne Lwamba & Andrea Floridi & Suvarna Pande & Megha Bhattacharyya & Sarah Young & Paul Fenton Villar & Shannon Shisler, 2024. "The effects of agricultural output market access interventions on agricultural, socio‐economic, food security, and nutrition outcomes in low‐ and middle‐income countries: A systematic review," Campbell Systematic Reviews, John Wiley & Sons, vol. 20(2), June.
    2. Ariel M. Aloe & Ruth Garside, 2021. "Editorial: Types of methods research papers in the journal Campbell Systematic Reviews," Campbell Systematic Reviews, John Wiley & Sons, vol. 17(2), June.
    3. Davidson, Angus Alexander & Young, Michael Denis & Leake, John Espie & O’Connor, Patrick, 2022. "Aid and forgetting the enemy: A systematic review of the unintended consequences of international development in fragile and conflict-affected situations," Evaluation and Program Planning, Elsevier, vol. 92(C).
    4. Paul Fenton Villar & Tomasz Kozakiewicz & Vinitha Bachina & Sarah Young & Shannon Shisler, 2023. "PROTOCOL: The effects of agricultural output market access interventions on agricultural, socio‐economic and food and nutrition security outcomes in low‐ and middle‐income countries: A systematic review," Campbell Systematic Reviews, John Wiley & Sons, vol. 19(3), September.
    5. Hugh Sharma Waddington & Paul Fenton Villar & Jeffrey C. Valentine, 2023. "Can Non-Randomised Studies of Interventions Provide Unbiased Effect Estimates? A Systematic Review of Internal Replication Studies," Evaluation Review, vol. 47(3), pages 563-593, June.
    6. Dagim Dawit Gonsamo & Herman Hay Ming Lo & Ko Ling Chan, 2021. "The Role of Stomach Infrastructures on Children’s Work and Child Labour in Africa: Systematic Review," IJERPH, MDPI, vol. 18(16), pages 1-26, August.
    7. Morton, Matthew H. & Kugley, Shannon & Epstein, Richard & Farrell, Anne, 2020. "Interventions for youth homelessness: A systematic review of effectiveness studies," Children and Youth Services Review, Elsevier, vol. 116(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jorgen Hansen & Xingfei Liu, 2015. "Estimating labour supply responses and welfare participation: Using a natural experiment to validate a structural labour supply model," Canadian Journal of Economics, Canadian Economics Association, vol. 48(5), pages 1831-1854, December.
    2. Smith, Jeffrey A. & Todd, Petra E., 2005. "Does matching overcome LaLonde's critique of nonexperimental estimators?," Journal of Econometrics, Elsevier, vol. 125(1-2), pages 305-353.
    3. Pieter Gautier & Paul Muller & Bas van der Klaauw & Michael Rosholm & Michael Svarer, 2018. "Estimating Equilibrium Effects of Job Search Assistance," Journal of Labor Economics, University of Chicago Press, vol. 36(4), pages 1073-1125.
    4. Jeffrey Smith & Arthur Sweetman, 2016. "Viewpoint: Estimating the causal effects of policies and programs," Canadian Journal of Economics, Canadian Economics Association, vol. 49(3), pages 871-905, August.
    5. Jeremy Lise & Shannon Seitz & Jeffrey Smith, 2015. "Evaluating search and matching models using experimental data," IZA Journal of Labor Economics, Springer; Forschungsinstitut zur Zukunft der Arbeit GmbH (IZA), vol. 4(1), pages 1-35, December.
    6. de Crombrugghe, D.P.I. & Espinoza, H. & Heijke, J.A.M., 2010. "Job-training programmes with low completion rates: the case of Projoven-Peru," ROA Research Memorandum 004, Maastricht University, Research Centre for Education and the Labour Market (ROA).
    7. Guido W. Imbens & Jeffrey M. Wooldridge, 2009. "Recent Developments in the Econometrics of Program Evaluation," Journal of Economic Literature, American Economic Association, vol. 47(1), pages 5-86, March.
    8. Peter R. Mueser & Kenneth R. Troske & Alexey Gorislavsky, 2007. "Using State Administrative Data to Measure Program Performance," The Review of Economics and Statistics, MIT Press, vol. 89(4), pages 761-783, November.
    9. V. Joseph Hotz & Guido W. Imbens & Jacob A. Klerman, 2006. "Evaluating the Differential Effects of Alternative Welfare-to-Work Training Components: A Reanalysis of the California GAIN Program," Journal of Labor Economics, University of Chicago Press, vol. 24(3), pages 521-566, July.
    10. Fredrik Andersson & Harry J. Holzer & Julia I. Lane & David Rosenblum & Jeffrey Smith, 2024. "Does Federally Funded Job Training Work? Nonexperimental Estimates of WIA Training Impacts Using Longitudinal Data on Workers and Firms," Journal of Human Resources, University of Wisconsin Press, vol. 59(4), pages 1244-1283.
    11. Astrid Grasdal, 2001. "The performance of sample selection estimators to control for attrition bias," Health Economics, John Wiley & Sons, Ltd., vol. 10(5), pages 385-398, July.
    12. Lechner, Michael & Wunsch, Conny, 2013. "Sensitivity of matching-based program evaluations to the availability of control variables," Labour Economics, Elsevier, vol. 21(C), pages 111-121.
    13. Kenneth Fortson & Philip Gleason & Emma Kopa & Natalya Verbitsky-Savitz, "undated". "Horseshoes, Hand Grenades, and Treatment Effects? Reassessing Bias in Nonexperimental Estimators," Mathematica Policy Research Reports 1c24988cd5454dd3be51fbc2c, Mathematica Policy Research.
    14. Guido W. Imbens, 2004. "Nonparametric Estimation of Average Treatment Effects Under Exogeneity: A Review," The Review of Economics and Statistics, MIT Press, vol. 86(1), pages 4-29, February.
    15. Deborah A. Cobb‐Clark & Thomas Crossley, 2003. "Econometrics for Evaluations: An Introduction to Recent Developments," The Economic Record, The Economic Society of Australia, vol. 79(247), pages 491-511, December.
    16. Vivian C. Wong & Peter M. Steiner, 2018. "Designs of Empirical Evaluations of Nonexperimental Methods in Field Settings," Evaluation Review, vol. 42(2), pages 176-213, April.
    17. Carlos A. Flores & Oscar A. Mitnik, 2009. "Evaluating Nonexperimental Estimators for Multiple Treatments: Evidence from Experimental Data," Working Papers 2010-10, University of Miami, Department of Economics.
    18. Fortson, Kenneth & Gleason, Philip & Kopa, Emma & Verbitsky-Savitz, Natalya, 2015. "Horseshoes, hand grenades, and treatment effects? Reassessing whether nonexperimental estimators are biased," Economics of Education Review, Elsevier, vol. 44(C), pages 100-113.
    19. Bryson, Alex & Dorsett, Richard & Purdon, Susan, 2002. "The use of propensity score matching in the evaluation of active labour market policies," LSE Research Online Documents on Economics 4993, London School of Economics and Political Science, LSE Library.
    20. Burt S. Barnow & Jeffrey Smith, 2015. "Employment and Training Programs," NBER Chapters, in: Economics of Means-Tested Transfer Programs in the United States, Volume 2, pages 127-234, National Bureau of Economic Research, Inc.
