An empirical total survey error decomposition using data combination
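Using administrative records from New York State linked to three major household surveys, the paper decomposes total survey error into coverage error, non-response error, and measurement error, finds that measurement error is the largest source, and offers survey designers and users a method for assessing the impact of these errors.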
Survey error is known to be pervasive and to bias even simple but important estimates of means, rates, and totals, such as the poverty rate and the unemployment rate. To summarize and analyze the extent, sources, and consequences of survey error, we define empirical counterparts of key components of the Total Survey Error Framework that can be estimated using data combination. Specifically, we estimate total survey error and decompose it into three high-level sources of error: generalized coverage error, item non-response error, and measurement error. We further decompose these sources into lower-level sources, such as failure to report a positive amount and errors in amounts conditional on reporting a positive value. For errors in dollars paid by two large government transfer programs, we use administrative records on the universe of program payments in New York State, linked to three major household surveys, to estimate these error components. We find that total survey error is large and varies in its size and composition, but measurement error is always by far the largest source of error. Our application shows that data combination makes it possible to routinely measure total survey error and its components. Our results allow survey producers to assess error reduction strategies and survey users to mitigate the consequences of survey errors or gauge the reliability of their conclusions.
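As a rough illustration of how a decomposition of this kind can be computed from linked data, the Python sketch below splits the error in an estimated dollar total into three additive parts. It is a minimal sketch under stated assumptions, not the estimator used in the paper: the `Household` record, the `decompose` function, the toy numbers, and the specific additive definitions (coverage as the weighted administrative total of linked sample households minus the administrative population total, with the remaining error assigned to imputed versus reported values) are all hypothetical choices made for illustration, and the linked administrative amount is treated as the truth.

```python
from dataclasses import dataclass


@dataclass
class Household:
    weight: float   # survey weight (hypothetical)
    admin: float    # linked administrative amount, treated as truth here
    survey: float   # amount recorded in the survey (reported or imputed)
    imputed: bool   # True if the survey amount was imputed (item non-response)


def decompose(sample, admin_population_total):
    """Split the error in the survey-estimated total into three additive parts.

    Assumed definitions (illustrative only):
      coverage    = weighted admin total of the sample - admin population total
      nonresponse = weighted (survey - admin) over imputed records
      measurement = weighted (survey - admin) over reported records
    """
    survey_total = sum(h.weight * h.survey for h in sample)
    weighted_admin = sum(h.weight * h.admin for h in sample)

    coverage = weighted_admin - admin_population_total
    nonresponse = sum(h.weight * (h.survey - h.admin) for h in sample if h.imputed)
    measurement = sum(h.weight * (h.survey - h.admin) for h in sample if not h.imputed)

    return {
        "total": survey_total - admin_population_total,
        "coverage": coverage,
        "nonresponse": nonresponse,
        "measurement": measurement,
    }


# Toy linked sample (made-up numbers, not data from the paper).
sample = [
    Household(weight=100.0, admin=500.0, survey=400.0, imputed=False),  # under-reported amount
    Household(weight=100.0, admin=300.0, survey=0.0, imputed=False),    # failed to report receipt
    Household(weight=100.0, admin=200.0, survey=250.0, imputed=True),   # imputed too high
    Household(weight=100.0, admin=0.0, survey=0.0, imputed=False),      # true non-recipient
]

result = decompose(sample, admin_population_total=110_000.0)
for name, value in result.items():
    print(f"{name:12s} {value:12.1f}")
```

With these definitions the three components sum to the total error by construction, which is what makes it possible to compare their sizes directly. The measurement component could be split further in the same way, for example into records with a zero survey amount but a positive administrative amount versus records where both amounts are positive.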