
Detecting Fraudulent Interviewers by Improved Clustering Methods – The Case of Falsifications of Answers to Parts of a Questionnaire

Author

Listed:
  • De Haas, Samuel

    (University of Giessen, Chair of Industrial Organisation, Regulation and Antitrust, and Chair of Statistics and Econometrics, Licher Strasse 64, 35394 Giessen, Germany.)

  • Winker, Peter

    (University of Giessen, Chair of Industrial Organisation, Regulation and Antitrust, and Chair of Statistics and Econometrics, Licher Strasse 64, 35394 Giessen, Germany.)

Abstract

Falsified interviews represent a serious threat to empirical research based on survey data. The identification of such cases is important to ensure data quality. Applying cluster analysis to a set of indicators helps to identify suspicious interviewers when a substantial share of all of their interviews are complete falsifications, as shown by previous research. This analysis is extended to the case when only a share of questions within all interviews provided by an interviewer is fabricated. The assessment is based on synthetic datasets with a priori set properties. These are constructed from a unique experimental dataset containing both real and fabricated data for each respondent. Such a bootstrap approach makes it possible to evaluate the robustness of the method when the share of fabricated answers per interview decreases. The results indicate a substantial loss of discriminatory power in the standard cluster analysis if the share of fabricated answers within an interview becomes small. Using a novel cluster method which allows imposing constraints on cluster sizes, performance can be improved, in particular when only few falsifiers are present. This new approach will help to increase the robustness of survey data by detecting potential falsifiers more reliably.
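
The idea of clustering interviewers on a set of indicators while constraining cluster sizes can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the exhaustive search, the within-cluster sum-of-squares objective, and the toy indicator values are all assumptions made for illustration only. The sketch splits interviewers into two clusters and caps the size of the smaller "flagged" cluster at `max_flagged`.

```python
from itertools import combinations


def within_ss(rows):
    """Sum of squared distances of the rows to their centroid."""
    if not rows:
        return 0.0
    dim = len(rows[0])
    centroid = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
    return sum((r[j] - centroid[j]) ** 2 for r in rows for j in range(dim))


def size_constrained_two_clusters(indicators, max_flagged):
    """Search all two-cluster partitions in which the flagged cluster has
    between 1 and max_flagged members, and return the flagged index set
    with the lowest total within-cluster sum of squares.

    Exhaustive search is only feasible for small inputs; it is an
    assumption of this sketch, not a method taken from the article.
    """
    n = len(indicators)
    best_cost, best_flagged = float("inf"), ()
    for k in range(1, max_flagged + 1):
        for flagged in combinations(range(n), k):
            chosen = set(flagged)
            small = [indicators[i] for i in flagged]
            large = [indicators[i] for i in range(n) if i not in chosen]
            cost = within_ss(small) + within_ss(large)
            if cost < best_cost:
                best_cost, best_flagged = cost, flagged
    return set(best_flagged)


# Hypothetical indicator values, one row per interviewer (invented toy data).
toy = [
    [0.10, 0.20],
    [0.00, 0.10],
    [0.20, 0.00],
    [0.10, 0.10],
    [2.00, 2.10],
    [2.20, 1.90],
]

flagged = size_constrained_two_clusters(toy, max_flagged=2)
print(sorted(flagged))
```

With the toy data, the two interviewers whose indicator values lie far from the rest end up in the flagged cluster. For realistic numbers of interviewers an exhaustive search is infeasible, so a heuristic optimiser would be needed in its place.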

Suggested Citation

  • De Haas, Samuel & Winker, Peter, 2016. "Detecting Fraudulent Interviewers by Improved Clustering Methods – The Case of Falsifications of Answers to Parts of a Questionnaire," Journal of Official Statistics, Sciendo, vol. 32(3), pages 643-660, September.
  • Handle: RePEc:vrs:offsta:v:32:y:2016:i:3:p:643-660:n:5
    DOI: 10.1515/jos-2016-0033

    Download full text from publisher

    File URL: https://doi.org/10.1515/jos-2016-0033
    Download Restriction: no

    References listed on IDEAS

    1. Gilli, Manfred & Maringer, Dietmar & Schumann, Enrico, 2011. "Numerical Methods and Optimization in Finance," Elsevier Monographs, Elsevier, edition 1, number 9780123756626.
    2. Bredl, Sebastian & Storfinger, Nina & Menold, Natalja, 2011. "A literature review of methods to detect fabricated survey data," Discussion Papers 56, Justus Liebig University Giessen, Center for international Development and Environmental Research (ZEU).
    3. Bredl, Sebastian & Winker, Peter & Kötschau, Kerstin, 2008. "A statistical approach to detect cheating interviewers," Discussion Papers 39, Justus Liebig University Giessen, Center for international Development and Environmental Research (ZEU).
    4. Finn, Arden & Ranchhod, Vimal, 2013. "Genuine Fakes: The prevalence and implications of fieldworker fraud in a large South African survey," SALDRU Working Papers 115, Southern Africa Labour and Development Research Unit, University of Cape Town.
    5. Kemper, Christoph & Trofimow, Viktoria & Rammstedt, Beatrice & Menold, Natalja, 2011. "Indicators for the ex post detection of faking in survey data constructed from responses to the Big Five Inventory-10 (BFI-10)," Publications of Darmstadt Technical University, Institute for Business Studies (BWL) 65242, Darmstadt Technical University, Department of Business Administration, Economics and Law, Institute for Business Studies (BWL).

    Citations

    Cited by:

    1. Olbrich, Lukas & Kosyakova, Yuliya & Sakshaug, Joseph W., 2022. "The reliability of adult self-reported height: The role of interviewers," Economics & Human Biology, Elsevier, vol. 45(C).
    2. Eugenio Paglino & Tom Emery, 2020. "Evaluating interviewer manipulation in the new round of the Generations and Gender Survey," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 43(50), pages 1461-1494.
    3. Kosyakova, Yuliya & Olbrich, Lukas & Sakshaug, Joseph & Schwanhäuser, Silvia, 2019. "Identification of interviewer falsification in the IAB-BAMF-SOEP Survey of Refugees in Germany," FDZ Methodenreport 201902_en, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany].

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kosyakova, Yuliya & Olbrich, Lukas & Sakshaug, Joseph & Schwanhäuser, Silvia, 2019. "Identification of interviewer falsification in the IAB-BAMF-SOEP Survey of Refugees in Germany," FDZ Methodenreport 201902_en, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany].
    2. Mario Gyori & Tatiana Martínez Zavala & Jessica Baier & Maria Hernandez & Sofie Olsson & Alexis Lefevre, 2017. "Social and Behaviour Change Communication (SBCC) project in Manica, Mozambique: baseline survey report," Working Papers 162, International Policy Centre for Inclusive Growth.
    3. Storfinger, Nina & Winker, Peter, 2011. "Robustness of clustering methods for identification of potential falsifications in survey data," Discussion Papers 57, Justus Liebig University Giessen, Center for international Development and Environmental Research (ZEU).
    4. Marc S. Paolella, 2014. "Fast Methods For Large-Scale Non-Elliptical Portfolio Optimization," Annals of Financial Economics (AFE), World Scientific Publishing Co. Pte. Ltd., vol. 9(02), pages 1-32.
    5. Kapetanios, George & Marcellino, Massimiliano & Papailias, Fotis, 2016. "Forecasting inflation and GDP growth using heuristic optimisation of information criteria and variable reduction methods," Computational Statistics & Data Analysis, Elsevier, vol. 100(C), pages 369-382.
    6. Cahuc, Pierre & Malherbet, Franck & Prat, Julien, 2019. "The Detrimental Effect of Job Protection on Employment: Evidence from France," IZA Discussion Papers 12384, Institute of Labor Economics (IZA).
    7. Manfred Gilli & Enrico Schumann, 2012. "Heuristic optimisation in financial modelling," Annals of Operations Research, Springer, vol. 193(1), pages 129-158, March.
    8. Josten Michael & Trappmann Mark, 2016. "Interviewer Effects on a Network-Size Filter Question," Journal of Official Statistics, Sciendo, vol. 32(2), pages 349-373, June.
    9. Manuel Kleinknecht & Wing Lon Ng, 2015. "Minimizing Basel III Capital Requirements with Unconditional Coverage Constraint," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 22(4), pages 263-281, October.
    10. Longbing Cao, 2021. "AI in Finance: Challenges, Techniques and Opportunities," Papers 2107.09051, arXiv.org.
    11. Capuozzo, Pietro & Panella, Emanuele & Schettini Gherardini, Tancredi & Vvedensky, Dimitri D., 2021. "Path integral Monte Carlo method for option pricing," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 581(C).
    12. Kerstin Ruckdeschel & Lenore Sauer & Robert Naderi, 2016. "Reliability of retrospective event histories within the German Generations and Gender Survey," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 34(11), pages 321-358.
    13. Singh, Vikas Vikram & Lisser, Abdel & Arora, Monika, 2021. "An equivalent mathematical program for games with random constraints," Statistics & Probability Letters, Elsevier, vol. 174(C).
    14. Samuel Fernández-Lorenzo & Diego Porras & Juan José García-Ripoll, 2020. "Hybrid quantum-classical optimization for financial index tracking," Papers 2008.12050, arXiv.org, revised Oct 2021.
    15. Essers, Dennis, 2013. "South African labour market transitions during the global financial and economic crisis: Micro-level evidence from the NIDS panel and matched QLFS cross-sections," IOB Working Papers 2013.12, Universiteit Antwerpen, Institute of Development Policy (IOB).
    16. Kingori, Patricia & Gerrets, René, 2016. "Morals, morale and motivations in data fabrication: Medical research fieldworkers views and practices in two Sub-Saharan African contexts," Social Science & Medicine, Elsevier, vol. 166(C), pages 150-159.
    17. Hatice Uenal & David Hampel, 2017. "Economic Aspects of the Missing Data Problem - the Case of the Patient Registry," Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, Mendel University Press, vol. 65(5), pages 1779-1791.
    18. Stefan Andreea-Mirabela, 2020. "Metaheuristic hybridization: Memetic algorithm," Annals of University of Craiova - Economic Sciences Series, University of Craiova, Faculty of Economics and Business Administration, vol. 1(48), pages 155-164, August.
    19. Miśkiewicz, Janusz, 2013. "Power law classification scheme of time series correlations. On the example of G20 group," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 392(9), pages 2150-2162.
    20. Andrea Scozzari & Fabio Tardella & Sandra Paterlini & Thiemo Krink, 2013. "Exact and heuristic approaches for the index tracking problem with UCITS constraints," Annals of Operations Research, Springer, vol. 205(1), pages 235-250, May.
