
Statistical Disclosure Control Methods for Census Frequency Tables

Author

Listed:
  • Natalie Shlomo

Abstract

This paper provides a review of common statistical disclosure control (SDC) methods implemented at statistical agencies for standard tabular outputs containing whole population counts from a census (either enumerated or based on a register). These methods include record swapping on the microdata prior to its tabulation and rounding of entries in the tables after they are produced. The approach for assessing SDC methods is based on a disclosure risk–data utility framework and the need to find a balance between managing disclosure risk and maximizing the amount of information that can be released to users while ensuring high quality outputs. To carry out the analysis, quantitative measures of disclosure risk and data utility are defined and the methods are compared. Conclusions from the analysis show that record swapping as a sole SDC method leaves a high disclosure risk. Targeted record swapping lowers the disclosure risk, but there is more distortion of distributions. Small cell adjustments (rounding) give protection to census tables by eliminating small cells, but only one set of variables and geographies can be disseminated in order to avoid disclosure by differencing nested tables. Full random rounding offers more protection against disclosure by differencing, but margins are typically rounded separately from the internal cells and tables are not additive. Rounding procedures protect against the perception of disclosure risk compared to record swapping since no small cells appear in the tables. Combining rounding with record swapping raises the level of protection but increases the loss of utility to census tabular outputs. For some statistical analyses, the combination of record swapping and rounding balances, to some degree, the opposing effects that the methods have on the utility of the tables.
This article reviews the statistical disclosure control (SDC) methods used by statistical agencies when producing statistical tables derived from census data. These include pre-processing techniques such as record swapping (a partial exchange of information between individuals) and rounding methods applied after the tables are produced. The approach to SDC methods presented here stresses the need to balance the management of disclosure risk against maximizing the amount of information that can be provided to users. Quantitative measures of risk and of utility are proposed and compared. The analysis shows that record swapping can lead to disclosure in tables containing cells with small counts. The same technique applied to "targeted" individuals reduces the risk, but at the expense of the statistical distributions. Rounding protects the tables by eliminating cells with small counts, but only one set of variables and geographies should be published, to avoid the risk of disclosure by differencing when tables are linked to one another. Random rounding gives more protection against disclosure by differencing, but some cells can be reconstructed by comparison with the margins. Rounding techniques protect against the perception of risk better than record swapping does. Combining record swapping and rounding raises the level of protection but increases the loss of quality in terms of the utility of the statistical outputs. In some statistical analyses, however, the two approaches used together can produce a balanced effect.
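
The abstract names two rounding procedures without defining them. The sketch below is one common reading, not the paper's own implementation: unbiased random rounding to base 3 for every cell, and a small cell adjustment that applies the same rule only to counts below the base. The base of 3, the function names, the example table and the seeds are all assumptions chosen for illustration.

```python
import random


def random_round(count, base=3, rng=random):
    """Unbiased random rounding of a non-negative count to a multiple of base.

    Assumed rule: round up with probability remainder/base, otherwise down,
    so the expected value of the rounded count equals the original count.
    """
    remainder = count % base
    if remainder == 0:
        return count
    lower = count - remainder
    if rng.random() < remainder / base:
        return lower + base
    return lower


def small_cell_adjust(count, base=3, rng=random):
    """Round only the small cells (0 < count < base); leave other cells as-is."""
    if 0 < count < base:
        return random_round(count, base, rng)
    return count


def protect_table(rows, rounder, base=3, seed=0):
    """Apply a rounding rule independently to every internal cell of a table."""
    rng = random.Random(seed)
    return [[rounder(cell, base, rng) for cell in row] for row in rows]


# Hypothetical 2x3 frequency table (illustrative counts, not census data).
table = [[1, 4, 0], [2, 7, 12]]

fully_rounded = protect_table(table, random_round)
adjusted = protect_table(table, small_cell_adjust)

# Margins are rounded separately from the internal cells, so the published
# total need not equal the sum of the published cells (non-additivity).
true_total = sum(sum(row) for row in table)
published_total = random_round(true_total, 3, random.Random(1))
sum_of_published_cells = sum(sum(row) for row in fully_rounded)

print(fully_rounded, adjusted, published_total, sum_of_published_cells)
```

Under this rule each published cell differs from the true count by less than the base, and no cell equal to 1 or 2 survives in either output. Comparing `published_total` with `sum_of_published_cells` shows how separately rounded margins can break additivity.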

Suggested Citation

  • Natalie Shlomo, 2007. "Statistical Disclosure Control Methods for Census Frequency Tables," International Statistical Review, International Statistical Institute, vol. 75(2), pages 199-217, August.
  • Handle: RePEc:bla:istatr:v:75:y:2007:i:2:p:199-217
    DOI: 10.1111/j.1751-5823.2007.00010.x


    Citations

    Cited by:

    1. James O. Chipperfield, 2014. "Disclosure-Protected Inference with Linked Microdata Using a Remote Analysis Server," Journal of Official Statistics, Sciendo, vol. 30(1), pages 123-146, March.
    2. Natalie Shlomo & Laszlo Antal & Mark Elliot, 2015. "Measuring Disclosure Risk and Data Utility for Flexible Table Generators," Journal of Official Statistics, Sciendo, vol. 31(2), pages 305-324, June.
    3. Jerome P. Reiter, 2009. "Using Multiple Imputation to Integrate and Disseminate Confidential Microdata," International Statistical Review, International Statistical Institute, vol. 77(2), pages 179-195, August.
    4. Christine M. O'Keefe & James O. Chipperfield, 2013. "A Summary of Attack Methods and Confidentiality Protection Measures for Fully Automated Remote Analysis Systems," International Statistical Review, International Statistical Institute, vol. 81(3), pages 426-455, December.
