Obtaining cell counts for contingency tables from rounded conditional frequencies

Authors

  • Sage, Andrew J.
  • Wright, Stephen E.

Abstract

We present an integer linear programming formulation and solution procedure for determining the tightest bounds on cell counts in a multi-way contingency table, given knowledge of a corresponding derived two-way table of rounded conditional probabilities and the sample size. The problem has application in statistical disclosure limitation, which is concerned with releasing useful data to the public and researchers while also preserving privacy and confidentiality. Previous work on this problem invoked the simplifying assumption that the conditionals were released as fractions in lowest terms, rather than the more realistic and complicated setting of rounded decimal values that is treated here. The proposed procedure finds all possible counts for each cell and runs fast enough to handle moderately sized tables.
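The setting in the abstract can be illustrated with a small brute-force sketch. This is not the authors' integer linear programming formulation; it is an exact enumeration that is only practical for tiny tables. It assumes a two-way table in which each row holds conditional frequencies (cell count divided by row total), every row total is at least 1, and a released value r with d decimals is consistent with any true ratio in the closed interval [r - 0.5·10^-d, r + 0.5·10^-d]. The closed interval sidesteps the tie-breaking rule, so it may admit slightly more counts than a specific rounding convention would. The function names and the example table are hypothetical.

```python
from fractions import Fraction
from math import ceil, floor


def cell_ranges(row, m, half):
    """Integer range (lo, hi) allowed for each cell when the row total is m."""
    ranges = []
    for p in row:
        lo = max(0, ceil((p - half) * m))   # smallest count that rounds to p
        hi = min(m, floor((p + half) * m))  # largest count that rounds to p
        ranges.append((lo, hi))
    return ranges


def possible_counts(rounded, n, decimals):
    """Return, for every cell, the set of counts consistent with the
    rounded conditionals `rounded` and the known sample size `n`."""
    half = Fraction(1, 2 * 10**decimals)    # assumed half-width of rounding
    rows = [[Fraction(str(p)) for p in row] for row in rounded]

    # For each row, keep the totals m for which integer cells can sum to m.
    feasible = []
    for row in rows:
        ok = {}
        for m in range(1, n + 1):
            rng = cell_ranges(row, m, half)
            lo_sum = sum(lo for lo, _ in rng)
            hi_sum = sum(hi for _, hi in rng)
            if all(lo <= hi for lo, hi in rng) and lo_sum <= m <= hi_sum:
                ok[m] = rng
        feasible.append(ok)

    def reachable(skip):
        """Totals that the rows other than `skip` can jointly produce."""
        sums = {0}
        for i, ok in enumerate(feasible):
            if i != skip:
                sums = {s + m for s in sums for m in ok if s + m <= n}
        return sums

    result = [[set() for _ in row] for row in rows]
    for i, ok in enumerate(feasible):
        others = reachable(i)
        for m, rng in ok.items():
            if n - m not in others:         # remaining rows cannot reach n - m
                continue
            lo_sum = sum(lo for lo, _ in rng)
            hi_sum = sum(hi for _, hi in rng)
            for j, (lo, hi) in enumerate(rng):
                for v in range(lo, hi + 1):
                    # The other cells of this row must be able to sum to m - v.
                    if lo_sum - lo <= m - v <= hi_sum - hi:
                        result[i][j].add(v)
    return result


# Hypothetical release: true counts [[1, 2], [3, 4]], conditionals to 2 decimals.
released = [[0.33, 0.67], [0.43, 0.57]]
print(possible_counts(released, 10, 2))  # -> [[{1}, {2}], [{3}, {4}]]
```

In this hypothetical example the rounded conditionals together with the sample size of 10 pin down every cell exactly, which is the kind of disclosure risk the paper's bounds are meant to detect. For larger tables an integer programming solver would replace the enumeration over row totals.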

Suggested Citation

  • Sage, Andrew J. & Wright, Stephen E., 2016. "Obtaining cell counts for contingency tables from rounded conditional frequencies," European Journal of Operational Research, Elsevier, vol. 250(1), pages 91-100.
  • Handle: RePEc:eee:ejores:v:250:y:2016:i:1:p:91-100
    DOI: 10.1016/j.ejor.2015.09.011

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221715008358
    Download Restriction: Full text for ScienceDirect subscribers only

    References listed on IDEAS

    1. Krishnamurty Muralidhar & Rathindra Sarathy, 2006. "Data Shuffling--A New Masking Approach for Numerical Data," Management Science, INFORMS, vol. 52(5), pages 658-670, May.
    2. Castro, Jordi, 2006. "Minimum-distance controlled perturbation methods for large-scale tabular data protection," European Journal of Operational Research, Elsevier, vol. 171(1), pages 39-52, May.
    3. James Kelly & Bruce Golden & Arjang Assad, 1990. "Using Simulated Annealing to Solve Controlled Rounding Problems," INFORMS Journal on Computing, INFORMS, vol. 2(2), pages 174-185, May.
    4. Aleksandra Slavković & Xiaotian Zhu & Sonja Petrović, 2015. "Fibers of multi-way contingency tables given conditionals: relation to marginals, cell bounds and Markov bases," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 67(4), pages 621-648, August.
    5. Sanil, Ashish & Gomatam, Shanti & Karr, Alan F., 2003. "NISS WebSwap: A Web Service for Data Swapping," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 8(i07).
    6. Castro, Jordi, 2012. "Recent advances in optimization techniques for statistical tabular data protection," European Journal of Operational Research, Elsevier, vol. 216(2), pages 257-269.
    7. James P. Kelly & Bruce L. Golden & Arjang A. Assad & Edward K. Baker, 1990. "Controlled Rounding of Tabular Data," Operations Research, INFORMS, vol. 38(5), pages 760-772, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Castro, Jordi, 2012. "Recent advances in optimization techniques for statistical tabular data protection," European Journal of Operational Research, Elsevier, vol. 216(2), pages 257-269.
    2. Castro, Jordi, 2006. "Minimum-distance controlled perturbation methods for large-scale tabular data protection," European Journal of Operational Research, Elsevier, vol. 171(1), pages 39-52, May.
    3. George, John A. & Kuan, Chong Juin & Ring, Brendan J., 1995. "Confidentiality control of tabulated data: Some practical network models," European Journal of Operational Research, Elsevier, vol. 85(3), pages 454-472, September.
    4. Ring, Brendan J. & George, John A. & Kuan, Chong Juin, 1997. "A fast algorithm for large-scale controlled rounding of 3-dimensional census tables," Socio-Economic Planning Sciences, Elsevier, vol. 31(1), pages 41-55, March.
    5. Juan-José Salazar-González, 2005. "A Unified Mathematical Programming Framework for Different Statistical Disclosure Limitation Methods," Operations Research, INFORMS, vol. 53(5), pages 819-829, October.
    6. Sedeño-Noda, A. & González-Dávila, E. & González-Martín, C. & González-Yanes, A., 2009. "Preemptive benchmarking problem: An approach for official statistics in small areas," European Journal of Operational Research, Elsevier, vol. 196(1), pages 360-369, July.
    7. Trottini, Mario & Muralidhar, Krish & Sarathy, Rathindra, 2011. "Maintaining tail dependence in data shuffling using t copula," Statistics & Probability Letters, Elsevier, vol. 81(3), pages 420-428, March.
    8. Amanda M. Y. Chu & Benson S. Y. Lam & Agnes Tiwari & Mike K. P. So, 2019. "An Empirical Study of Applying Statistical Disclosure Control Methods to Public Health Research," IJERPH, MDPI, vol. 16(22), pages 1-17, November.
    9. Sumit Dutta Chowdhury & George T. Duncan & Ramayya Krishnan & Stephen F. Roehrig & Sumitra Mukherjee, 1999. "Disclosure Detection in Multivariate Categorical Databases: Auditing Confidentiality Protection Through Two New Matrix Operators," Management Science, INFORMS, vol. 45(12), pages 1710-1723, December.
    10. Kazuhiro Minami & Yutaka Abe, 2017. "Statistical Disclosure Control for Tabular Data in R," Romanian Statistical Review, Romanian Statistical Review, vol. 65(4), pages 67-76, December.
    11. Daniel Baena & Jordi Castro & Antonio Frangioni, 2020. "Stabilized Benders Methods for Large-Scale Combinatorial Optimization, with Application to Data Privacy," Management Science, INFORMS, vol. 66(7), pages 3051-3068, July.
    12. Matthew J. Schneider & Dawn Iacobucci, 2020. "Protecting survey data on a consumer level," Journal of Marketing Analytics, Palgrave Macmillan, vol. 8(1), pages 3-17, March.
    13. repec:crs:wpidms:m2016-07 is not listed on IDEAS
    14. Templ, Matthias & Kowarik, Alexander & Meindl, Bernhard, 2015. "Statistical Disclosure Control for Micro-Data Using the R Package sdcMicro," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 67(i04).
    15. Hugo E. Caceres & Ben Moews, 2024. "Evaluating utility in synthetic banking microdata applications," Papers 2410.22519, arXiv.org.
    16. Jordi Castro & Antonio Frangioni & Claudio Gentile, 2014. "Perspective Reformulations of the CTA Problem with L2 Distances," Operations Research, INFORMS, vol. 62(4), pages 891-909, August.
    17. Haibing Lu & Jaideep Vaidya & Vijayalakshmi Atluri & Yingjiu Li, 2015. "Statistical Database Auditing Without Query Denial Threat," INFORMS Journal on Computing, INFORMS, vol. 27(1), pages 20-34, February.
    18. Alexander Naidenov, 2016. "Contemporary methods for statistical disclosure control," Economic Thought journal, Bulgarian Academy of Sciences - Economic Research Institute, issue 2, pages 125-134.
    19. Jordi Castro & Jordi Cuesta, 2013. "Solving L1-CTA in 3D tables by an interior-point method for primal block-angular problems," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 21(1), pages 25-47, April.
    20. Nigel Melville & Michael McQuaid, 2012. "Research Note ---Generating Shareable Statistical Databases for Business Value: Multiple Imputation with Multimodal Perturbation," Information Systems Research, INFORMS, vol. 23(2), pages 559-574, June.
    21. Ross, Anthony D., 2000. "A two-phased approach to the supply network reconfiguration problem," European Journal of Operational Research, Elsevier, vol. 122(1), pages 18-30, April.
