Estimating the Number of Classes via Sample Coverage

Journal of the American Statistical Association · 1992
Cited by 245
ABS 4

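Summary

Proposes a nonparametric estimation method that uses sample coverage (the sum of the probabilities of the observed classes) to estimate the unknown number of classes in a population, and evaluates its performance through Monte Carlo simulations.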

Abstract

Assume that a random sample is drawn from a population with an unknown number of classes and possibly unequal class probabilities. A nonparametric estimation technique is proposed to estimate the number of classes using the idea of sample coverage, which is defined as the sum of the cell probabilities of the observed classes. Since expected sample coverage can be well estimated, we were motivated to find its role in the estimation of the number of classes. This work generalizes the result of Esty to a nonparametric approach and extends Darroch and Ratcliff to incorporate the heterogeneity of the class probabilities. The coefficient of variation of the class sizes is shown to play an important role in the recommended estimation procedures. The performance of the proposed estimators is investigated by means of Monte Carlo simulations.
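
The abstract gives no formulas, so the following is only a minimal sketch of a coverage-based estimate in the form usually attributed to this paper in later literature, not a reproduction of the authors' recommended procedures. It assumes the Good-Turing coverage estimate C = 1 - f1/n, where f1 is the number of classes seen exactly once, and a squared coefficient-of-variation term that corrects the equal-probability estimate D/C for heterogeneity. The function name `coverage_estimate` and its return layout are hypothetical.

```python
from collections import Counter


def coverage_estimate(labels):
    """Return (D, C_hat, N_hat) for a sample of class labels.

    D     : number of distinct classes observed
    C_hat : estimated sample coverage (Good-Turing form, assumed here)
    N_hat : estimated total number of classes, adjusted by an
            estimated squared coefficient of variation (assumed form)
    """
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two observations")

    class_counts = Counter(labels)         # class -> times observed
    d = len(class_counts)
    freq = Counter(class_counts.values())  # i -> number of classes seen i times

    f1 = freq.get(1, 0)
    c_hat = 1.0 - f1 / n
    if c_hat == 0.0:
        raise ValueError("every class is a singleton; estimated coverage is zero")

    # Estimate under equal class probabilities.
    n0 = d / c_hat

    # Estimated squared coefficient of variation, truncated at zero.
    s = sum(i * (i - 1) * f for i, f in freq.items())
    gamma2 = max(n0 * s / (n * (n - 1)) - 1.0, 0.0)

    # Heterogeneity-adjusted estimate.
    n_hat = n0 + n * (1.0 - c_hat) / c_hat * gamma2
    return d, c_hat, n_hat


if __name__ == "__main__":
    print(coverage_estimate(list("aaabbcde")))
```

When the estimated coefficient of variation is zero, the estimate reduces to D divided by the estimated coverage, which is the homogeneous case the abstract says the paper extends.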

Nonparametric statistics · Estimation methods · Monte Carlo methods · Sample coverage