An Enhanced Data Perturbation Approach for Small Data Sets

DECISION SCIENCES · 2005

Cited by 31

Renmin University of China AABS 3

Krishnamurty Muralidhar · University of Kentucky (corresponding author)
Rathindra Sarathy · Oklahoma State University (corresponding author)

Summary

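Existing perturbation methods are not suitable for small data sets. To address this, the authors improve the General Additive Data Perturbation technique so that the perturbed data lower disclosure risk while the results of commonly used statistical analyses stay the same as for the original data.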

Abstract

As modern organizations gather, analyze, and share large quantities of data, issues of privacy and confidentiality are becoming increasingly important. Perturbation methods are used to protect confidentiality when confidential, numerical data are shared or disseminated for analysis. Unfortunately, existing perturbation methods are not suitable for protecting small data sets. With small data sets, existing perturbation methods result in reduced protection against disclosure risk due to sampling error. Sampling error may also cause analyses of the perturbed data to produce results that differ from those for the original data, reducing data utility. In this study, we develop an enhancement of an existing perturbation technique, General Additive Data Perturbation, that can be used to effectively mask both large and small data sets. The proposed enhancement minimizes the risk of disclosure while ensuring that the results of commonly performed statistical analyses are identical for both the original and the perturbed data.
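
The sampling-error issue described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes NumPy, a toy data set of 10 records with 2 confidential variables, and no non-confidential variables. The function name `moment_matched_values` and all parameter choices are hypothetical. The sketch draws random values and rescales them so that their sample mean vector and sample covariance matrix equal those of the original data exactly, which is the property the abstract describes for commonly performed analyses. For contrast, it also adds plain independent noise, where the small-sample mean of the noise is not exactly zero.

```python
import numpy as np


def moment_matched_values(x, rng):
    """Return random values whose sample mean and sample covariance equal
    those of x exactly. Illustrative only; requires more rows than columns."""
    n, k = x.shape
    mean = x.mean(axis=0)
    cov = np.atleast_2d(np.cov(x, rowvar=False))  # sample covariance (ddof=1)

    # Draw noise independently of the values in x.
    z = rng.standard_normal((n, k))

    # Remove sampling error in the noise: force an exact zero sample mean ...
    z = z - z.mean(axis=0)
    # ... and an exact identity sample covariance (whitening).
    l_z = np.linalg.cholesky(np.atleast_2d(np.cov(z, rowvar=False)))
    z = z @ np.linalg.inv(l_z).T

    # Re-color the whitened noise with the covariance of x and shift by its mean.
    l_x = np.linalg.cholesky(cov)
    return mean + z @ l_x.T


rng = np.random.default_rng(0)

# Toy "confidential" data: a small sample, where sampling error matters most.
x = rng.normal(loc=[50.0, 100.0], scale=[5.0, 20.0], size=(10, 2))

# Simple noise addition: the sample mean drifts because the noise mean
# over only 10 records is not exactly zero.
y_noise = x + rng.normal(scale=2.0, size=x.shape)

# Moment-matched values: sample mean and covariance are preserved exactly.
y_exact = moment_matched_values(x, rng)

print("mean shift, simple noise addition:", y_noise.mean(axis=0) - x.mean(axis=0))
print("mean shift, moment-matched values:", y_exact.mean(axis=0) - x.mean(axis=0))
```

Any analysis that depends only on the mean vector and covariance matrix, such as linear regression coefficients, gives the same result on `y_exact` as on `x`, up to floating-point error. The technique described in the abstract also conditions on non-confidential variables and addresses disclosure risk, neither of which this sketch attempts.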

Data privacy · Data security · Data mining · Statistical confidentiality
