
Modifying Transactional Databases to Hide Sensitive Association Rules

Author

Listed:
  • Syam Menon

    (Jindal School of Management, University of Texas at Dallas, Richardson, Texas 75080)

  • Abhijeet Ghoshal

    (Gies College of Business, University of Illinois Urbana–Champaign, Champaign, Illinois 61820)

  • Sumit Sarkar

    (Jindal School of Management, University of Texas at Dallas, Richardson, Texas 75080)

Abstract

Firms have been sharing transactional data with business partners ever since electronic data interchange was introduced to the retail industry in the 1980s. The potential benefits of data sharing notwithstanding, there has been continued reluctance on the part of data owners to share their data, for fear of sensitive information potentially making its way to competitors. Approaches that can help hide sensitive information could alleviate such concerns and increase the number of firms that are willing to share. Sensitive information in transactional databases often manifests itself in the form of association rules. Association rules can be concealed by altering transactions such that these sensitive rules stay hidden when the data are mined. The problem of hiding sensitive association rules is NP-hard, and to date, it has only been addressed via heuristic approaches. In this paper, we introduce a nonlinear integer formulation to hide sensitive association rules while maximizing the accuracy of the altered database. We then separate it into two problems: the sanitization problem, which hides sensitive association rules from a specific transaction while altering the transaction as minimally as possible, and the accuracy maximization problem, which maximizes the accuracy of the altered database, given a solution to the sanitization problem. We show how the sanitization problem can be represented as an integer program and propose a heuristic based on intuition from this formulation to solve it. Next, we formulate the accuracy maximization problem as a nonlinear integer program, show how it can be linearized, and derive various results that help reduce the size of the problem to be solved. Computational experiments are conducted on real and synthetic data sets, the largest of which has 100 million transactions. Our results show that although the nonlinear integer formulations are not practical, the linearizations and problem-reduction steps make a significant impact on solvability and solution time. We also find that there are substantial gains to be realized vis-à-vis existing approaches in terms of both solution quality and solution time and that hiding the rules directly (rather than by hiding the associated itemsets) can result in significantly fewer transactions being sanitized.
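To make the idea of concealing a rule by altering transactions concrete, the following is a minimal sketch and not the formulation or heuristic from the paper. It assumes the common convention that a rule A -> B is mined only when its support and confidence both meet minimum thresholds, and it uses a naive greedy pass that deletes one consequent item from supporting transactions until the rule drops below a threshold. All names (`support`, `confidence`, `is_hidden`, `naive_hide`), the toy data, and the threshold values are hypothetical and chosen for illustration only.

```python
from typing import List, Set, Tuple

Transaction = Set[str]


def support(db: List[Transaction], itemset: Set[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in db if itemset <= t) / len(db)


def confidence(db: List[Transaction], antecedent: Set[str], consequent: Set[str]) -> float:
    """Share of transactions containing the antecedent that also contain the consequent."""
    with_antecedent = sum(1 for t in db if antecedent <= t)
    if with_antecedent == 0:
        return 0.0
    with_both = sum(1 for t in db if (antecedent | consequent) <= t)
    return with_both / with_antecedent


def is_hidden(db: List[Transaction], antecedent: Set[str], consequent: Set[str],
              min_sup: float, min_conf: float) -> bool:
    """Assumed convention: a rule is hidden once it misses either mining threshold."""
    rule_items = antecedent | consequent
    return (support(db, rule_items) < min_sup
            or confidence(db, antecedent, consequent) < min_conf)


def naive_hide(db: List[Transaction], antecedent: Set[str], consequent: Set[str],
               min_sup: float, min_conf: float) -> Tuple[List[Transaction], int]:
    """Illustrative greedy pass (not the paper's method): remove one consequent
    item from supporting transactions until the rule is hidden.
    Returns the altered copy and the number of transactions changed."""
    altered_db = [set(t) for t in db]      # work on a copy; the input is left intact
    victim = sorted(consequent)[0]         # deterministic choice of item to delete
    changed = 0
    for t in altered_db:
        if is_hidden(altered_db, antecedent, consequent, min_sup, min_conf):
            break
        if (antecedent | consequent) <= t:
            t.discard(victim)
            changed += 1
    return altered_db, changed


# Toy data: the sensitive rule is {bread} -> {milk}.
db = [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"bread", "milk"}, {"bread"}, {"eggs"}]
hidden_db, changed = naive_hide(db, {"bread"}, {"milk"}, min_sup=0.4, min_conf=0.6)
print(changed, confidence(hidden_db, {"bread"}, {"milk"}))
```

In this toy case the rule falls below the confidence threshold after a single transaction is altered. A greedy pass like this ignores the effect of each deletion on the nonsensitive rules, which is the accuracy concern that the abstract's optimization formulations are meant to address.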

Suggested Citation

  • Syam Menon & Abhijeet Ghoshal & Sumit Sarkar, 2022. "Modifying Transactional Databases to Hide Sensitive Association Rules," Information Systems Research, INFORMS, vol. 33(1), pages 152-178, March.
  • Handle: RePEc:inm:orisre:v:33:y:2022:i:1:p:152-178
    DOI: 10.1287/isre.2021.1033


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Karimi, Majid & Zaerpour, Nima, 2022. "Put your money where your forecast is: Supply chain collaborative forecasting with cost-function-based prediction markets," European Journal of Operational Research, Elsevier, vol. 300(3), pages 1035-1049.
    2. Graça, Paula & Camarinha-Matos, Luís M., 2017. "Performance indicators for collaborative business ecosystems — Literature review and trends," Technological Forecasting and Social Change, Elsevier, vol. 116(C), pages 237-255.
    3. Li Chen & Wei Luo & Kevin Shang, 2017. "Measuring the Bullwhip Effect: Discrepancy and Alignment Between Information and Material Flows," Manufacturing & Service Operations Management, INFORMS, vol. 19(1), pages 36-51, February.
    4. Tetsuo Iida & Paul Zipkin, 2010. "Competition and Cooperation in a Two-Stage Supply Chain with Demand Forecasts," Operations Research, INFORMS, vol. 58(5), pages 1350-1363, October.
    5. Sechan Oh & Özalp Özer, 2013. "Mechanism Design for Capacity Planning Under Dynamic Evolutions of Asymmetric Demand Forecasts," Management Science, INFORMS, vol. 59(4), pages 987-1007, April.
    6. Bharadwaj Kadiyala & Özalp Özer & Alain Bensoussan, 2020. "A Mechanism Design Approach to Vendor Managed Inventory," Management Science, INFORMS, vol. 66(6), pages 2628-2652, June.
    7. Eksoz, Can & Mansouri, S. Afshin & Bourlakis, Michael, 2014. "Collaborative forecasting in the food supply chain: A conceptual framework," International Journal of Production Economics, Elsevier, vol. 158(C), pages 120-135.
    8. Li Chen & Hau L. Lee, 2009. "Information Sharing and Order Variability Control Under a Generalized Demand Model," Management Science, INFORMS, vol. 55(5), pages 781-797, May.
    9. Li, Tian & Zhang, Hongtao, 2015. "Information sharing in a supply chain with a make-to-stock manufacturer," Omega, Elsevier, vol. 50(C), pages 115-125.
    10. Choi, Tsan-Ming & Sethi, Suresh, 2010. "Innovative quick response programs: A review," International Journal of Production Economics, Elsevier, vol. 127(1), pages 1-12, September.
    11. Sari, Kazim, 2008. "On the benefits of CPFR and VMI: A comparative simulation study," International Journal of Production Economics, Elsevier, vol. 113(2), pages 575-586, June.
    12. Ma, Yungao & Wang, Nengmin & He, Zhengwen & Lu, Jizhou & Liang, Huigang, 2015. "Analysis of the bullwhip effect in two parallel supply chains with interacting price-sensitive demands," European Journal of Operational Research, Elsevier, vol. 243(3), pages 815-825.
    13. Li Chen & Hau L. Lee, 2012. "Bullwhip Effect Measurement and Its Implications," Operations Research, INFORMS, vol. 60(4), pages 771-784, August.
    14. Ramanathan, Usha & Muyldermans, Luc, 2010. "Identifying demand factors for promotional planning and forecasting: A case of a soft drink company in the UK," International Journal of Production Economics, Elsevier, vol. 128(2), pages 538-545, December.
    15. Kaijie Zhu & Ulrich W. Thonemann, 2004. "Modeling the Benefits of Sharing Future Demand Information," Operations Research, INFORMS, vol. 52(1), pages 136-147, February.
    16. Vladimir Kovtun & Avi Giloni & Clifford Hurvich, 2014. "Assessing the value of demand sharing in supply chains," Naval Research Logistics (NRL), John Wiley & Sons, vol. 61(7), pages 515-531, October.
    17. Yossi Aviv, 2003. "A Time-Series Framework for Supply-Chain Inventory Management," Operations Research, INFORMS, vol. 51(2), pages 210-227, April.
    18. Warburton, Roger D.H. & Hodgson, J.P.E. & Nielsen, E.H., 2014. "Exact solutions to the supply chain equations for arbitrary, time-dependent demands," International Journal of Production Economics, Elsevier, vol. 151(C), pages 195-205.
    19. Xu, Xiaoyan & Choi, Tsan-Ming & Chung, Sai-Ho & Guo, Shu, 2023. "Collaborative-commerce in supply chains: A review and classification of analytical models," International Journal of Production Economics, Elsevier, vol. 263(C).
    20. R Fildes & K Nikolopoulos & S F Crone & A A Syntetos, 2008. "Forecasting and operational research: a review," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 59(9), pages 1150-1172, September.
