
Data Shuffling--A New Masking Approach for Numerical Data

Author

Listed:
  • Krishnamurty Muralidhar

    (Gatton College of Business and Economics, University of Kentucky, Lexington, Kentucky 40506)

  • Rathindra Sarathy

    (Spears School of Business, Oklahoma State University, Stillwater, Oklahoma 74078)

Abstract

This study discusses a new procedure for masking confidential numerical data--a procedure called data shuffling--in which the values of the confidential variables are "shuffled" among observations. The shuffled data provides a high level of data utility and minimizes the risk of disclosure. From a practical perspective, data shuffling overcomes reservations about using perturbed or modified confidential data because it retains all the desirable properties of perturbation methods and performs better than other masking techniques in both data utility and disclosure risk. In addition, data shuffling can be implemented using only rank-order data, and thus provides a nonparametric method for masking. We illustrate the applicability of data shuffling for small and large data sets.
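The rank-based idea in the abstract can be illustrated with a minimal sketch. This is not the authors' published procedure; it is a simplified illustration under stated assumptions: one confidential variable `x`, one non-confidential variable `s`, dependence modeled through normal scores of the ranks, and hypothetical function names (`ranks`, `normal_scores`, `shuffle_confidential`) chosen only for this example. The original values of `x` are reassigned among records according to the rank order of perturbed scores, so the released column contains exactly the original values in a different order.

```python
import math
import random
from statistics import NormalDist


def ranks(values):
    """Return 1-based ranks; ties are broken by position (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def normal_scores(values):
    """Map each value to a standard normal quantile based on its rank."""
    n = len(values)
    dist = NormalDist()
    return [dist.inv_cdf(r / (n + 1)) for r in ranks(values)]


def pearson(a, b):
    """Plain product-moment correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def shuffle_confidential(x, s, seed=0):
    """Reassign the values of x among records using rank information only.

    Illustrative sketch: perturbed scores depend on s plus independent
    noise, never directly on a record's own x value.
    """
    if len(x) != len(s) or len(x) < 2:
        raise ValueError("x and s must have the same length (at least 2)")
    rng = random.Random(seed)
    zx, zs = normal_scores(x), normal_scores(s)
    rho = pearson(zx, zs)
    # Clamp to guard against rounding pushing 1 - rho^2 slightly below zero.
    noise_sd = math.sqrt(max(0.0, 1.0 - rho * rho))
    perturbed = [rho * z + noise_sd * rng.gauss(0.0, 1.0) for z in zs]
    # Reverse mapping: the record with the k-th smallest perturbed score
    # receives the k-th smallest original value of x.
    sorted_x = sorted(x)
    return [sorted_x[r - 1] for r in ranks(perturbed)]


if __name__ == "__main__":
    x = [52.0, 18.5, 77.0, 40.0, 63.0, 29.0, 91.0, 35.5]  # confidential
    s = [3.0, 1.0, 6.0, 4.0, 5.0, 2.0, 7.0, 8.0]          # non-confidential
    print(shuffle_confidential(x, s, seed=42))
```

Because the output is a permutation of the input, the marginal distribution of the confidential variable is preserved exactly, while its relationship to `s` is preserved only approximately through the rank-based scores.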

Suggested Citation

  • Krishnamurty Muralidhar & Rathindra Sarathy, 2006. "Data Shuffling--A New Masking Approach for Numerical Data," Management Science, INFORMS, vol. 52(5), pages 658-670, May.
  • Handle: RePEc:inm:ormnsc:v:52:y:2006:i:5:p:658-670
    DOI: 10.1287/mnsc.1050.0503

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/mnsc.1050.0503
    Download Restriction: no


    References listed on IDEAS

    1. Krishnamurty Muralidhar & Rahul Parsa & Rathindra Sarathy, 1999. "A General Additive Data Perturbation Method for Database Security," Management Science, INFORMS, vol. 45(10), pages 1399-1415, October.
    2. Robert T. Clemen & Terence Reilly, 1999. "Correlations and Copulas for Decision and Risk Analysis," Management Science, INFORMS, vol. 45(2), pages 208-224, February.
    3. Ram Gopal & Robert Garfinkel & Paulo Goes, 2002. "Confidentiality via Camouflage: The CVC Approach to Disclosure Limitation When Answering Queries to Databases," Operations Research, INFORMS, vol. 50(3), pages 501-516, June.

    Citations


    Cited by:

    1. Alexander Naidenov, 2016. "Contemporary methods for statistical disclosure control," Economic Thought journal, Bulgarian Academy of Sciences - Economic Research Institute, issue 2, pages 125-134.
    2. Castro, Jordi, 2012. "Recent advances in optimization techniques for statistical tabular data protection," European Journal of Operational Research, Elsevier, vol. 216(2), pages 257-269.
    3. Sage, Andrew J. & Wright, Stephen E., 2016. "Obtaining cell counts for contingency tables from rounded conditional frequencies," European Journal of Operational Research, Elsevier, vol. 250(1), pages 91-100.
    4. Natsuki Sano, 2022. "Utility and Risk Evaluation of Synthetic Data by Orthogonal Transformation," The Review of Socionetwork Strategies, Springer, vol. 16(1), pages 71-79, April.
    5. Nigel Melville & Michael McQuaid, 2012. "Research Note ---Generating Shareable Statistical Databases for Business Value: Multiple Imputation with Multimodal Perturbation," Information Systems Research, INFORMS, vol. 23(2), pages 559-574, June.
    6. Lomax, Nik & Loukides, Grigorios, 2021. "Privacy-preserving data publishing through anonymization, statistical disclosure control, and de-identification," OSF Preprints 2fvj7, Center for Open Science.
    7. Chu, Amanda M.Y. & Ip, Chun Yin & Lam, Benson S.Y. & So, Mike K.P., 2022. "Vine copula statistical disclosure control for mixed-type data," Computational Statistics & Data Analysis, Elsevier, vol. 176(C).
    8. repec:crs:wpidms:m2016-07 is not listed on IDEAS
    9. Seokho Lee & Marc G. Genton & Reinaldo B. Arellano-Valle, 2010. "Perturbation of Numerical Confidential Data via Skew-t Distributions," Management Science, INFORMS, vol. 56(2), pages 318-333, February.
    10. Templ, Matthias & Kowarik, Alexander & Meindl, Bernhard, 2015. "Statistical Disclosure Control for Micro-Data Using the R Package sdcMicro," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 67(i04).
    11. Trottini, Mario & Muralidhar, Krish & Sarathy, Rathindra, 2011. "Maintaining tail dependence in data shuffling using t copula," Statistics & Probability Letters, Elsevier, vol. 81(3), pages 420-428, March.
    12. Matthew J. Schneider & Dawn Iacobucci, 2020. "Protecting survey data on a consumer level," Journal of Marketing Analytics, Palgrave Macmillan, vol. 8(1), pages 3-17, March.
    13. Yi Qian & Hui Xie, 2015. "Drive More Effective Data-Based Innovations: Enhancing the Utility of Secure Databases," Management Science, INFORMS, vol. 61(3), pages 520-541, March.
    14. Haibing Lu & Jaideep Vaidya & Vijayalakshmi Atluri & Yingjiu Li, 2015. "Statistical Database Auditing Without Query Denial Threat," INFORMS Journal on Computing, INFORMS, vol. 27(1), pages 20-34, February.
    15. Amanda M. Y. Chu & Benson S. Y. Lam & Agnes Tiwari & Mike K. P. So, 2019. "An Empirical Study of Applying Statistical Disclosure Control Methods to Public Health Research," IJERPH, MDPI, vol. 16(22), pages 1-17, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rathindra Sarathy & Krishnamurty Muralidhar & Rahul Parsa, 2002. "Perturbing Nonnormal Confidential Attributes: The Copula Approach," Management Science, INFORMS, vol. 48(12), pages 1613-1627, December.
    2. Seokho Lee & Marc G. Genton & Reinaldo B. Arellano-Valle, 2010. "Perturbation of Numerical Confidential Data via Skew-t Distributions," Management Science, INFORMS, vol. 56(2), pages 318-333, February.
    3. A. E. Ades & Karl Claxton & Mark Sculpher, 2006. "Evidence synthesis, parameter correlation and probabilistic sensitivity analysis," Health Economics, John Wiley & Sons, Ltd., vol. 15(4), pages 373-381, April.
    4. Ho-Yin Mak & Zuo-Jun Max Shen, 2014. "Pooling and Dependence of Demand and Yield in Multiple-Location Inventory Systems," Manufacturing & Service Operations Management, INFORMS, vol. 16(2), pages 263-269, May.
    5. Plischke, Elmar & Borgonovo, Emanuele, 2019. "Copula theory and probabilistic sensitivity analysis: Is there a connection?," European Journal of Operational Research, Elsevier, vol. 277(3), pages 1046-1059.
    6. Alexei Alexandrov & Özlem Bedre-Defolie, 2014. "The Equivalence of Bundling and Advance Sales," Marketing Science, INFORMS, vol. 33(2), pages 259-272, March.
    7. P. Daniel Wright & Matthew J. Liberatore & Robert L. Nydick, 2006. "A Survey of Operations Research Models and Applications in Homeland Security," Interfaces, INFORMS, vol. 36(6), pages 514-529, December.
    8. Tianyang Wang & James S. Dyer & Warren J. Hahn, 2017. "Sensitivity analysis of decision making under dependent uncertainties using copulas," EURO Journal on Decision Processes, Springer;EURO - The Association of European Operational Research Societies, vol. 5(1), pages 117-139, November.
    9. Charles J. Corbett & Kumar Rajaram, 2006. "A Generalization of the Inventory Pooling Effect to Nonnormal Dependent Demand," Manufacturing & Service Operations Management, INFORMS, vol. 8(4), pages 351-358, August.
    10. Benoumechiara Nazih & Bousquet Nicolas & Michel Bertrand & Saint-Pierre Philippe, 2020. "Detecting and modeling critical dependence structures between random inputs of computer models," Dependence Modeling, De Gruyter, vol. 8(1), pages 263-297, January.
    11. Donald L. Keefer & Craig W. Kirkwood & James L. Corner, 2004. "Perspective on Decision Analysis Applications, 1990–2001," Decision Analysis, INFORMS, vol. 1(1), pages 4-22, March.
    12. Trottini, Mario & Muralidhar, Krish & Sarathy, Rathindra, 2011. "Maintaining tail dependence in data shuffling using t copula," Statistics & Probability Letters, Elsevier, vol. 81(3), pages 420-428, March.
    13. Penikas, Henry & Simakova, Varvara, 2009. "Interest Rate Risk Management Based on Copula-GARCH Models," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 13(1), pages 3-36.
    14. Wang, Fan & Li, Heng & Dong, Chao & Ding, Lieyun, 2019. "Knowledge representation using non-parametric Bayesian networks for tunneling risk analysis," Reliability Engineering and System Safety, Elsevier, vol. 191(C).
    15. Amanda M. Y. Chu & Benson S. Y. Lam & Agnes Tiwari & Mike K. P. So, 2019. "An Empirical Study of Applying Statistical Disclosure Control Methods to Public Health Research," IJERPH, MDPI, vol. 16(22), pages 1-17, November.
    16. Chenguang (Allen) Wu & Achal Bassamboo & Ohad Perry, 2019. "Service System with Dependent Service and Patience Times," Management Science, INFORMS, vol. 65(3), pages 1151-1172, March.
    17. Durante Fabrizio & Puccetti Giovanni & Scherer Matthias & Vanduffel Steven, 2017. "My introduction to copulas: An interview with Roger Nelsen," Dependence Modeling, De Gruyter, vol. 5(1), pages 88-98, January.
    18. Jason R. W. Merrick, 2008. "Getting the Right Mix of Experts," Decision Analysis, INFORMS, vol. 5(1), pages 43-52, March.
    19. Pham, Linh & Nguyen, Canh Phuc, 2021. "Asymmetric tail dependence between green bonds and other asset classes," Global Finance Journal, Elsevier, vol. 50(C).
