Paper title
General Confidentiality and Utility Metrics for Privacy-Preserving Data Publishing Based on the Permutation Model
Paper authors
Paper abstract
Anonymization for privacy-preserving data publishing, also known as statistical disclosure control (SDC), can be viewed under the lens of the permutation model. According to this model, any SDC method for individual data records is functionally equivalent to a permutation step plus a noise addition step, where the noise added is marginal, in the sense that it does not alter ranks. Here, we propose metrics to quantify the data confidentiality and utility achieved by SDC methods based on the permutation model. We distinguish two privacy notions: in our work, anonymity refers to subjects and hence mainly to protection against record re-identification, whereas confidentiality refers to the protection afforded to attribute values against attribute disclosure. Thus, our confidentiality metrics are useful even if using a privacy model ensuring an anonymity level ex ante. The utility metric is a general-purpose metric that can be conveniently traded off against the confidentiality metrics, because all of them are bounded between 0 and 1. As an application, we compare the utility-confidentiality trade-offs achieved by several anonymization approaches, including privacy models (k-anonymity and $ε$-differential privacy) as well as SDC methods (additive noise, multiplicative noise and synthetic data) used without privacy models.
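The decomposition described in the abstract (a permutation step followed by noise that does not alter ranks) can be illustrated for a single numerical attribute. The sketch below is not the paper's metric code. It assumes a rank-based reverse mapping in which each anonymized value is replaced by the original value of the same rank. The function names `ranks` and `reverse_map` and the sample values are hypothetical, chosen only for illustration.

```python
def ranks(values):
    """Return the 0-based rank of each value (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def reverse_map(original, anonymized):
    """Permute `original` so that it has the same ranks as `anonymized`.

    Assumption: one attribute, equal-length lists, one value per record.
    """
    sorted_original = sorted(original)
    return [sorted_original[r] for r in ranks(anonymized)]


# Hypothetical data: original attribute values and an anonymized release.
original = [10.0, 20.0, 30.0, 40.0, 50.0]
anonymized = [23.5, 9.1, 31.0, 55.2, 38.7]

# Step 1: the permutation of the original values implied by the release.
permuted = reverse_map(original, anonymized)  # [20.0, 10.0, 30.0, 50.0, 40.0]

# Step 2: the residual noise, which by construction leaves ranks unchanged.
residual = [y - z for y, z in zip(anonymized, permuted)]

print(permuted)
print(ranks(permuted) == ranks(anonymized))  # True
```

Here `permuted` contains exactly the original values in a different order, and adding `residual` to it recovers the anonymized values without changing any rank. This is the sense in which the abstract calls the added noise marginal.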