Abstract
Many data sharing applications require that publishing data should protect sensitive information pertaining to individuals, such as diseases of patients, the credit rating of a customer, and the salary of an employee. Meanwhile, certain information is required to be published. In this paper, we consider data-publishing applications where the publisher specifies both sensitive information and shared information. An adversary can infer the real value of a sensitive entry with a high confidence by using publishing data. The goal is to protect sensitive information in the presence of data inference using derived association rules on publishing data. We formulate the inference attack framework, and develop complexity results. We show that computing a safe partial table is an NP-hard problem. We classify the general problem into subcases based on the requirements of publishing information, and propose algorithms for finding a safe partial table to publish. We have conducted an empirical study to evaluate these algorithms on real data. The test results show that the proposed algorithms can produce approximate maximal published data and improve the performance of existing algorithms.
Similar content being viewed by others
References
Byun J W, Bertino E. Micro-views, or on how to protect privacy while enhancing data usability—concepts and challenges. SIGMOD Record, 2006, 35(1): 9–13
Sweeney L. Datafly: A system for providing anonymity in medical data. In: Lin T Y, Qian S, eds. Database Securty XI: Status and Prospects, IFIP TC11 WG11.3 Eleventh International Conference on Database Security (DBSec), IFIP Conference Proceedings, vol. 113. California: Chapman & Hall, 1997. 356–381
Yang X, Li C. Secure xml publishing without information leakage in the presence of data inference. In: Nascimento M A, Özsu M T, Kossmann D, et al., eds. Proceedings of the Thirtieth International Conference on Very Large Data Bases (VLDB), Toronto: Morgan Kaufmann, 2004. 96–107
Aggarwal G, Feder T, Kenthapadi K, et al. Anonymizing tables. In: Eiter T, Libkin L, eds. Database Theory (ICDT), Lect Notes in Computer Science, vol. 3363. Berlin: Springer-Verlag, 2005. 246–258
Meyerson A, Williams R. On the complexity of optimal k-anonymity. In: Deutsch A, ed. Proceedings of the Twenty-third ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), Paris: ACM, 2004. 223–228
Sweeney L. k-Anonymity: a model for protecting privacy. Int J Uncertain Fuzziness Knowl.-Based Syst, 2002, 10(5): 557–570
Agrawal R, Srikant R. Privacy-preserving data mining. In: Chen W, Naughton J F, Bernstein P A, eds. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, Texas: ACM, 2000. 439–450
Atallah M, Bertino E, Elmagarmid A, et al. Disclosure limitation of sensitive rules. In: Proceedings of 1999 IEEE Knowledge and Data Engineering Exchange Workshop (KDEX’99), Chicago, IL., 1999. 45–52
Saygin Y, Verykios V S, Clifton C. Using unknowns to prevent discovery of association rules. SIGMOD Record, 2001, 30(4): 45–54
Verykios V S, Elmagarmid A K, Bertino E, et al. Association rule hiding. IEEE Trans Knowl Data Eng, 2004, 16(4): 434–447
Damiani E, Vimercati S D C D, et al. Design and implementation of an access control processor for xml documents. Comput Netw, 2000, 33(1–6): 59–75
Damiani E, Vimercati S D C D, Paraboschi S, et al. A fine-grained access control system for xml documents. ACM Trans Inf Syst Secur, 2001, 5(2): 169–202
Miklau G, Suciu D. Controlling access to published data using cryptography. In: Freytag J C, Lockemann P C, Abiteboul S, et al., eds. Proceedings of 29th International Conference on Very Large Data Bases (VLDB). Berlin: Morgan Kaufmann, 2003. 898–909
Brodskyand A, Farkas C, Jajodia S. Secure databases: Constraints, inference channels, and monitoring disclosures. IEEE Trans Knowl Data Eng, 2000, 12(6): 900–919
Dawson S, Vimercati S D C D, Lincoln P, et al. Minimal data upgrading to prevent inference and association attacks. In: Proceedings of the Eighteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), Philadelphia, Pennsylvania: ACM Press, 1999. 114–125
Jajodia S, Meadows C. Inference problems in multilevel secure database management systems. In: Information Security: An Integrated Collection of Essays, 1995. 570–584
Qian X, Lunt T. A semantic framework of the multilevel secure relational model. IEEE Trans Knowl Data Eng, 1997, 9(2): 292–301
Bertino E, Carminati B, Ferrari E. A secure publishing service for digital libraries of xml documents. In: Davida G I, Frankel Y, eds. Information Security, Proceedings of 4th International Conference (ISC) Malaga, Spain, Lect Notes in Computer Science, vol. 2200. Berlin: Springer-Verlag, 2001. 347–362
Gabillon A, Bruno E. Regulating access to xml documents. In: Olivier M S, Spooner D L, eds. Database and Application Security XV, IFIP TC11/WG11.3 Fifteenth Annual Working Conference on Database and Application Security (DBSec), Ontario, Canada, vol. 215 of IFIP Conference Proceedings, Kluwer, 2001. 299–314
Zhang H, He Y P, Shi Z G. A formal model for access control with supporting spatial context. Sci China Ser F-Inf Sci, 2007, 50(3): 419–439
Agrawal R, Srikant R. Fast algorithms for mining association rules in large databases. In: Bocca J B, Jarke M, Zaniolo C, eds. Proceedings of 20th International Conference on Very Large Data Bases (VLDB), Santiago de Chile, Chile: Morgan Kaufmann, 1994. 487–499
Rajagopalan S, Vazirani V V. Primal-dual RNC approximation algorithms for set cover and covering integer programs. SIAM J Comput, 1999, 28(2): 525–540
Graham R L, Knuth D E, Patashnik O. Concrete mathematics: a foundation for computer science. Boston, MA: Addison-Wesley Longman Publishing Co., Inc. 1989
Author information
Authors and Affiliations
Corresponding author
Additional information
Supported by the Program for New Century Excellent Talents in Universities (Grant No. NCET-06-0290), the National Natural Science Foundation of China (Grant Nos. 60828004, 60503036), and the Fok Ying Tong Education Foundation Award (Grant No. 104027)
Rights and permissions
About this article
Cite this article
Yang, X., Wang, B. & Yu, G. Efficient secure data publishing algorithms for supporting information sharing. Sci. China Ser. F-Inf. Sci. 52, 627–644 (2009). https://doi.org/10.1007/s11432-009-0023-y
Received:
Accepted:
Published:
Issue date:
DOI: https://doi.org/10.1007/s11432-009-0023-y