Printed from https://ideas.repec.org/p/cen/wpaper/15-44.html

Simultaneous Edit-Imputation for Continuous Microdata

Author

Listed:
  • Hang J. Kim
  • Lawrence H. Cox
  • Alan F. Karr
  • Jerome P. Reiter
  • Quanli Wang

Abstract

Many statistical organizations collect data that are expected to satisfy linear constraints; as examples, component variables should sum to total variables, and ratios of pairs of variables should be bounded by expert-specified constants. When reported data violate constraints, organizations identify and replace values potentially in error in a process known as edit-imputation. To date, most approaches separate the error localization and imputation steps, typically using optimization methods to identify the variables to change followed by hot deck imputation. We present an approach that fully integrates editing and imputation for continuous microdata under linear constraints. Our approach relies on a Bayesian hierarchical model that includes (i) a flexible joint probability model for the underlying true values of the data with support only on the set of values that satisfy all editing constraints, (ii) a model for latent indicators of the variables that are in error, and (iii) a model for the reported responses for variables in error. We illustrate the potential advantages of the Bayesian editing approach over existing approaches using simulation studies. We apply the model to edit faulty data from the 2007 U.S. Census of Manufactures. Supplementary materials for this article are available online.
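
To make the two kinds of linear constraints named in the abstract concrete, here is a minimal sketch that checks one record against a balance edit (components sum to a total) and a ratio edit (the ratio of two variables is bounded by constants). The variable names, the bounds 20 and 100, and the tolerance are hypothetical and are not taken from the paper. The sketch only detects violated edits; it does not implement the Bayesian edit-imputation model or any error localization.

```python
from typing import Dict, List, Sequence

Record = Dict[str, float]


def check_balance(record: Record, components: Sequence[str], total: str,
                  tol: float = 1e-9) -> bool:
    """Balance edit: the component variables must sum to the total variable."""
    return abs(sum(record[c] for c in components) - record[total]) <= tol


def check_ratio(record: Record, num: str, den: str,
                lower: float, upper: float) -> bool:
    """Ratio edit in linear form: lower * den <= num <= upper * den.

    Assumes the denominator variable is positive, so the linear form is
    equivalent to lower <= num / den <= upper.
    """
    return lower * record[den] <= record[num] <= upper * record[den]


def failed_edits(record: Record) -> List[str]:
    """Return the names of the (hypothetical) edits that the record violates."""
    failures = []
    if not check_balance(record, ["wages_production", "wages_other"], "wages_total"):
        failures.append("balance: wages")
    if not check_ratio(record, "wages_total", "employees", lower=20.0, upper=100.0):
        failures.append("ratio: wages_total/employees")
    return failures


# A record that satisfies both edits, and one with a misreported total.
clean = {"wages_production": 300.0, "wages_other": 200.0,
         "wages_total": 500.0, "employees": 10.0}
faulty = dict(clean, wages_total=5000.0)

print(failed_edits(clean))
print(failed_edits(faulty))
```

Detecting that a record fails an edit does not say which variable is wrong: in the faulty record above, the total, a component, or the employee count could each be the source of the error. Deciding which values to change is the error localization problem that the abstract describes as being handled jointly with imputation.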

Suggested Citation

  • Hang J. Kim & Lawrence H. Cox & Alan F. Karr & Jerome P. Reiter & Quanli Wang, 2015. "Simultaneous Edit-Imputation for Continuous Microdata," Working Papers 15-44, Center for Economic Studies, U.S. Census Bureau.
  • Handle: RePEc:cen:wpaper:15-44

    Download full text from publisher

    File URL: https://www2.census.gov/ces/wp/2015/CES-WP-15-44.pdf
    File Function: First version, 2015
    Download Restriction: no

    References listed on IDEAS

    1. Hang J. Kim & Jerome P. Reiter & Quanli Wang & Lawrence H. Cox & Alan F. Karr, 2014. "Multiple Imputation of Missing or Faulty Values Under Linear Constraints," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 32(3), pages 375-386, July.

    Citations

    Cited by:

    1. Hang J. Kim & Jörg Drechsler & Katherine J. Thompson, 2021. "Synthetic microdata for establishment surveys under informative sampling," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 255-281, January.
    2. Orozco Vázquez Miguel, 2023. "Misallocation of Resources, Firm Characteristics, and Structural Factors: Evidence from Mexico," Working Papers 2023-11, Banco de México.
    3. Hang J. Kim & Jerome P. Reiter & Alan F. Karr, 2018. "Simultaneous edit-imputation and disclosure limitation for business establishment data," Journal of Applied Statistics, Taylor & Francis Journals, vol. 45(1), pages 63-82, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jared S. Murray & Jerome P. Reiter, 2016. "Multiple Imputation of Missing Categorical and Continuous Values via Bayesian Mixture Models With Local Dependence," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1466-1479, October.
    2. Ton de Waal & Arnout van Delden & Sander Scholtus, 2020. "Multi‐source Statistics: Basic Situations and Methods," International Statistical Review, International Statistical Institute, vol. 88(1), pages 203-228, April.
    3. Danhyang Lee & Jae Kwang Kim, 2022. "Semiparametric imputation using conditional Gaussian mixture models under item nonresponse," Biometrics, The International Biometric Society, vol. 78(1), pages 227-237, March.
    4. Ton de Waal & Wieger Coutinho, 2017. "Preserving Logical Relations while Estimating Missing Values," Romanian Statistical Review, Romanian Statistical Review, vol. 65(3), pages 47-59, September.
    5. Hang J. Kim & Jörg Drechsler & Katherine J. Thompson, 2021. "Synthetic microdata for establishment surveys under informative sampling," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 255-281, January.
    6. Paiva Thais & Reiter Jerome P., 2017. "Stop or Continue Data Collection: A Nonignorable Missing Data Approach for Continuous Variables," Journal of Official Statistics, Sciendo, vol. 33(3), pages 579-599, September.
    7. Nicole M. Dalzell & Jerome P. Reiter & Gale Boyd, 2017. "File Matching with Faulty Continuous Matching Variables," Working Papers 17-45, Center for Economic Studies, U.S. Census Bureau.
    8. Hang J. Kim & Jerome P. Reiter & Alan F. Karr, 2018. "Simultaneous edit-imputation and disclosure limitation for business establishment data," Journal of Applied Statistics, Taylor & Francis Journals, vol. 45(1), pages 63-82, January.
    9. Thais Paiva & Jerry Reiter, 2014. "Using Imputation Techniques To Evaluate Stopping Rules In Adaptive Survey Design," Working Papers 14-40, Center for Economic Studies, U.S. Census Bureau.

    More about this item

    Keywords

    Bayesian; Economic; Editing; Missing; Mixture; Survey;

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics
