DENSITY ESTIMATION FOR CLUSTERED DATA
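This paper examines the effect of cluster sampling on kernel density estimation, finds that the optimal window width for i.i.d. data no longer applies, and proposes a new optimal bandwidth using a higher-order kernel; simulations show that it yields a smaller integrated mean squared error.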
The commonly used survey technique of clustering introduces dependence into sample data. Such data are frequently used in economic analysis, though the dependence induced by the sample structure is often ignored. In this paper, the effect of clustering on the nonparametric kernel estimate of the density, f(x), is examined. The window width commonly used for density estimation with i.i.d. data is shown to be no longer optimal. A new optimal bandwidth using a higher-order kernel is proposed and is shown to give a smaller integrated mean squared error than two window widths that are widely used for i.i.d. data. Several illustrations from simulation are provided.
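The setting described above can be illustrated with a small simulation. The sketch below is not the estimator proposed in the paper: the abstract does not give the formula for the new bandwidth or the higher-order kernel, so neither is reproduced here. As assumptions made only for illustration, clustered data are generated from a random-effects model with standard normal marginals and intra-cluster correlation `rho`, the kernel is Gaussian, and the i.i.d. window width is taken to be the normal-reference rule of thumb 1.06 σ n^(-1/5). All function names are hypothetical.

```python
import numpy as np


def simulate_clustered(n_clusters, cluster_size, rho, rng):
    """Draw N(0, 1) marginals with intra-cluster correlation rho.

    Assumed model: x_ij = a_i + e_ij, Var(a_i) = rho, Var(e_ij) = 1 - rho.
    Setting rho = 0 gives i.i.d. data.
    """
    cluster_effect = rng.normal(0.0, np.sqrt(rho), size=(n_clusters, 1))
    noise = rng.normal(0.0, np.sqrt(1.0 - rho), size=(n_clusters, cluster_size))
    return (cluster_effect + noise).ravel()


def rule_of_thumb_bandwidth(x):
    """Normal-reference window width, derived under the i.i.d. assumption."""
    return 1.06 * x.std(ddof=1) * x.size ** (-0.2)


def gaussian_kde(x, grid, h):
    """Kernel density estimate on `grid` with a Gaussian kernel and bandwidth h."""
    u = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (x.size * h * np.sqrt(2.0 * np.pi))


def mean_ise(n_clusters, cluster_size, rho, n_reps, rng):
    """Monte Carlo average of the integrated squared error against N(0, 1)."""
    grid = np.linspace(-5.0, 5.0, 201)
    dx = grid[1] - grid[0]
    f_true = np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi)
    total = 0.0
    for _ in range(n_reps):
        x = simulate_clustered(n_clusters, cluster_size, rho, rng)
        f_hat = gaussian_kde(x, grid, rule_of_thumb_bandwidth(x))
        total += np.sum((f_hat - f_true) ** 2) * dx  # Riemann sum on a uniform grid
    return total / n_reps


rng = np.random.default_rng(0)
# Same total sample size (500) in both designs; only the dependence differs.
mise_iid = mean_ise(n_clusters=50, cluster_size=10, rho=0.0, n_reps=100, rng=rng)
mise_clustered = mean_ise(n_clusters=50, cluster_size=10, rho=0.5, n_reps=100, rng=rng)
print(f"average ISE, i.i.d. data:     {mise_iid:.5f}")
print(f"average ISE, clustered data:  {mise_clustered:.5f}")
```

Comparing the two printed averages shows how the same i.i.d. window width performs once intra-cluster dependence is present. Varying `rho` and `cluster_size` gives a rough sense of how sensitive the estimate is to the sample design.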