Density Estimation for Clustered Data

Econometric Reviews · 2001
Cited by: 16
Renmin University journal list: A- · ABS: 3

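Summary

This paper studies the effect of cluster sampling on kernel density estimation. It finds that the optimal window width for i.i.d. data no longer applies, and it proposes a new optimal bandwidth based on a higher-order kernel. Simulations show that this bandwidth yields a smaller integrated mean squared error.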

Abstract

The commonly used survey technique of clustering introduces dependence into sample data. Such data is frequently used in economic analysis, though the dependence induced by the sample structure of the data is often ignored. In this paper, the effect of clustering on the non-parametric, kernel estimate of the density, f(x), is examined. The window width commonly used for density estimation for the case of i.i.d. data is shown to no longer be optimal. A new optimal bandwidth using a higher-order kernel is proposed and is shown to give a smaller integrated mean squared error than two window widths which are widely used for the case of i.i.d. data. Several illustrations from simulation are provided.

Keywords: clustered data; density estimation; kernel estimation; optimal bandwidth
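
The setting described in the abstract can be illustrated with a minimal simulation sketch. It is not the paper's method. It assumes a simple random-effects design in which observations in the same cluster share a common component, uses a Gaussian kernel, and uses Silverman's rule of thumb as a stand-in for a window width designed for i.i.d. data. The paper's higher-order-kernel bandwidth is not implemented, and all function names and parameter values below are hypothetical.

```python
import numpy as np


def simulate_clustered(n_clusters, cluster_size, rho, rng):
    """Draw x_ij = sqrt(rho) * a_i + sqrt(1 - rho) * e_ij.

    Each x_ij is marginally N(0, 1); two observations in the same
    cluster have correlation rho (assumed design, not from the paper).
    """
    a = rng.standard_normal((n_clusters, 1))             # shared cluster effect
    e = rng.standard_normal((n_clusters, cluster_size))  # idiosyncratic noise
    return (np.sqrt(rho) * a + np.sqrt(1.0 - rho) * e).ravel()


def silverman_bandwidth(x):
    """Silverman's rule-of-thumb bandwidth, derived for i.i.d. data."""
    n = x.size
    sd = x.std(ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    return 0.9 * min(sd, (q75 - q25) / 1.34) * n ** (-0.2)


def gaussian_kde(x, grid, h):
    """Kernel density estimate on `grid` with a Gaussian kernel and bandwidth h."""
    u = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (x.size * h * np.sqrt(2.0 * np.pi))


def integrated_squared_error(x, grid, h):
    """Riemann-sum approximation of the ISE against the true N(0, 1) density."""
    true_pdf = np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi)
    diff = gaussian_kde(x, grid, h) - true_pdf
    return float((diff**2).sum() * (grid[1] - grid[0]))


def mean_ise(rho, n_reps=200, n_clusters=50, cluster_size=4, seed=0):
    """Average ISE over replications, using the i.i.d. rule-of-thumb bandwidth."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-6.0, 6.0, 601)
    total = 0.0
    for _ in range(n_reps):
        x = simulate_clustered(n_clusters, cluster_size, rho, rng)
        total += integrated_squared_error(x, grid, silverman_bandwidth(x))
    return total / n_reps


if __name__ == "__main__":
    # Compare independent data (rho = 0) with clustered data (rho = 0.5).
    for rho in (0.0, 0.5):
        print(f"rho = {rho}: mean ISE = {mean_ise(rho):.5f}")
```

Running the script prints the average integrated squared error for independent and for clustered samples of the same size, which gives a rough sense of how within-cluster dependence affects an estimate whose bandwidth was chosen as if the data were i.i.d.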