Missing Data

ORGANIZATIONAL RESEARCH METHODS · 2014

Cited by 1366

Journal rankings: Renmin University of China list A · ABS 4

Daniel A. Newman (corresponding author) · University of Illinois at Urbana-Champaign

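Summary

This review covers the levels, mechanisms, problems, and treatments of missing data, and offers five practical guidelines to help social science researchers reduce the bias and error that missing data introduce.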

Abstract

Missing data (a) reside at three missing data levels of analysis (item-, construct-, and person-level), (b) arise from three missing data mechanisms (missing completely at random, missing at random, and missing not at random) that range from completely random to systematic missingness, (c) can engender two missing data problems (biased parameter estimates and inaccurate hypothesis tests/inaccurate standard errors/low power), and (d) mandate a choice from among several missing data treatments (listwise deletion, pairwise deletion, single imputation, maximum likelihood, and multiple imputation). Whereas all missing data treatments are imperfect and are rooted in particular statistical assumptions, some missing data treatments are worse than others, on average (i.e., they lead to more bias in parameter estimates and less accurate hypothesis tests). Social scientists still routinely choose the more biased and error-prone techniques (listwise and pairwise deletion), likely due to poor familiarity with and misconceptions about the less biased/less error-prone techniques (maximum likelihood and multiple imputation). The current user-friendly review provides five easy-to-understand practical guidelines, with the goal of reducing missing data bias and error in the reporting of research results. Syntax is provided for correlation, multiple regression, and structural equation modeling with missing data.

Statistics · Econometrics · Social Science Research Methods · Data Mining
