Disclosure Risk and Disclosure Avoidance for Microdata

Journal of Business & Economic Statistics · 1988
Cited by 113 · Top 10% in the same journal and year
Renmin University AABS 4

Summary

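The paper estimates the fraction of identifiable records in microdata that contain no direct identifiers such as name and address, taking into account noise in the data and the sample nature of the files. It finds no risk of disclosure when the investigator's additional knowledge is limited, but a high risk when that knowledge is extensive. In the latter case, balancing privacy against data quality requires either massive modification of the data or organizational and legal restrictions.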

Abstract

Under given concrete exogenous conditions, the fraction of identifiable records in a microdata file without positive identifiers such as name and address is estimated. The effect of possible noise in the data, as well as the sample property of microdata files, is taken into account. Using real microdata files, it is shown that there is no risk of disclosure if the information content of characteristics known to the investigator (additional knowledge) is limited. Files with additional knowledge of large information content yield a high risk of disclosure. This can be eliminated only by massive modifications of the data records, which, however, involve large biases for complex statistical evaluations. In this case, the requirement for privacy protection and high-quality data perhaps may be fulfilled only if the linkage of such files with extensive additional knowledge is prevented by appropriate organizational and legal restrictions.
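
As a rough illustration of what "fraction of identifiable records" can mean, the sketch below counts records whose combination of known characteristics is unique in a file. This is a simplified proxy, not the paper's estimator: it ignores noise in the data and the sampling of the file, both of which the abstract says are taken into account. The function name `fraction_unique`, the variable names, and the toy records are all hypothetical.

```python
from collections import Counter


def fraction_unique(records, key_vars):
    """Share of records whose values on key_vars occur exactly once in the file.

    Assumption: a record is treated as "identifiable" when its combination
    of key variables is unique. Noise and sampling effects are not modeled.
    """
    if not records:
        return 0.0
    keys = [tuple(r[v] for v in key_vars) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


# Made-up records for illustration only.
records = [
    {"age_group": "30-39", "region": "N", "occupation": "teacher"},
    {"age_group": "30-39", "region": "N", "occupation": "teacher"},
    {"age_group": "30-39", "region": "S", "occupation": "nurse"},
    {"age_group": "40-49", "region": "S", "occupation": "nurse"},
]

# Limited additional knowledge: the investigator knows only the region.
limited = fraction_unique(records, ["region"])

# Extensive additional knowledge: the investigator knows all three variables.
extensive = fraction_unique(records, ["age_group", "region", "occupation"])

print(limited, extensive)
```

In this toy file no record is unique on region alone, while half of the records become unique once all three characteristics are known, which mirrors the qualitative contrast the abstract draws between limited and extensive additional knowledge.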

Microdata · Disclosure risk · Disclosure avoidance · Identifiable records