Printed from https://ideas.repec.org/a/sae/somere/v37y2008i1p75-104.html

A New Mixture Model for Misclassification With Applications for Survey Data

Authors

  • Simon Cheng (University of Connecticut, Storrs, simon.cheng@uconn.edu)
  • Yingmei Xi (University of Connecticut, Storrs)
  • Ming-Hui Chen (University of Connecticut, Storrs)

Abstract

Social scientists often rely on survey data to examine group differences. A problem with survey data is the potential misclassification of group membership due to poorly trained interviewers, inconsistent responses, or errors in marking questions. In data containing unequal subsample sizes, the consequences of misclassification can be considerable, especially for groups with small sample sizes. In this study, the authors develop a new mixture model that allows researchers to address the problem using the data they have. By supplying additional information from the data, this two-stage model is estimated using a Bayesian method. The method is illustrated with the Early Childhood Longitudinal Study data. As anticipated, the more information supplied to adjust for group membership, the better the model performs. Even when small amounts of information are supplied, the model produces reasonably robust estimates and improves the fit compared to the no-adjustment model. Sensitivity analyses are conducted on choices of priors.
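The abstract's point about unequal subsample sizes can be illustrated with simple arithmetic. The sketch below is not the authors' mixture model. It assumes a hypothetical two-group survey in which each recorded label is wrong independently with the same probability, and the function name and the sample sizes are chosen purely for illustration.

```python
def contamination_share(n_group: int, n_other: int, p_error: float) -> float:
    """Expected share of records labeled as a group that truly belong to the other group.

    Assumption (illustrative only): two groups, and every record's label is
    flipped independently with probability p_error.
    """
    kept = n_group * (1.0 - p_error)   # true members still labeled correctly
    leaked_in = n_other * p_error      # other-group members mislabeled into this group
    return leaked_in / (kept + leaked_in)


# Hypothetical survey: 100 true members of a small group, 9,900 of a large one,
# and a 1% labeling error rate.
small_share = contamination_share(100, 9900, 0.01)
large_share = contamination_share(9900, 100, 0.01)

print(f"small group contamination: {small_share:.3f}")
print(f"large group contamination: {large_share:.6f}")
```

Under these assumed numbers, about half of the records labeled as the small group are expected to be misclassified members of the large group, while the large group is almost unaffected. This asymmetry is the kind of problem an adjustment for group membership is meant to address.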

Suggested Citation

  • Simon Cheng & Yingmei Xi & Ming-Hui Chen, 2008. "A New Mixture Model for Misclassification With Applications for Survey Data," Sociological Methods & Research, vol. 37(1), pages 75-104, August.
  • Handle: RePEc:sae:somere:v:37:y:2008:i:1:p:75-104
    DOI: 10.1177/0049124107313854



