Mean Integrated Squared Error Sampling
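This paper studies stratified sampling that uses the mean integrated squared error in place of the mean squared error, estimates the entire distribution with a truncated series, and generalizes series term inclusion rules to sampling without replacement. Two special cases, bivariate binary vectors and continuous distributions, show that the method can yield smaller error at small sample sizes and that its sample size selection rules differ markedly from conventional ones.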
Abstract
Stratified sampling is considered, where (a) the mean integrated squared error (MISE) metric is used in place of the mean squared error (MSE) metric; (b) the entire distribution [i.e., f(x)], rather than a property of the distribution [e.g., E(x)], is the target of the procedure; (c) the distribution f(x) is estimated by a truncated series (to balance model complexity against the available sample size); and (d) samples are taken both with and without replacement. For case (d), series term inclusion rules are generalized to handle samples taken without replacement from a finite population. Two special cases are treated in detail. The first shows that even for sample sizes as small as three and for data as elementary as bivariate binary vectors, an orthogonal series representation can lead to smaller expected error than is achievable with conventional representations and methods. The second demonstrates that in the continuous case, MISE-based and conventional sample size selection rules can differ substantially.
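For orientation, the quantities named in the abstract can be written out as a short sketch. The notation here is an assumption and is not taken from the paper: \phi_k is a real orthonormal basis, a_k its coefficients, K the set of retained terms, n the sample size, and N the finite population size. The decomposition is the standard one for an unbiased coefficient estimator under simple random sampling from a single population. The paper's stratified setting and its exact term inclusion rules may differ in detail.

```latex
% Target and truncated-series estimator (notation assumed, not the paper's)
f(x) = \sum_{k} a_k \phi_k(x), \qquad
a_k = \int f(x)\,\phi_k(x)\,dx = E[\phi_k(X)],
\qquad
\hat f_K(x) = \sum_{k \in K} \hat a_k \phi_k(x), \quad
\hat a_k = \frac{1}{n}\sum_{i=1}^{n} \phi_k(X_i).

% MISE splits into a variance part (retained terms) and a bias part (dropped terms)
\mathrm{MISE}(\hat f_K)
  = E\int \bigl(\hat f_K(x) - f(x)\bigr)^2 dx
  = \sum_{k \in K} \mathrm{Var}(\hat a_k) + \sum_{k \notin K} a_k^2 .

% Hence including term k lowers the MISE exactly when
\mathrm{Var}(\hat a_k) < a_k^2 .

% Variance of the coefficient estimator, with \sigma_k^2 = \mathrm{Var}(\phi_k(X)):
\text{with replacement: } \mathrm{Var}(\hat a_k) = \frac{\sigma_k^2}{n},
\qquad
\text{without replacement: } \mathrm{Var}(\hat a_k) = \frac{\sigma_k^2}{n}\cdot\frac{N-n}{N-1}.
```

Because the finite population correction (N-n)/(N-1) is at most one, sampling without replacement shrinks each coefficient's variance. Under this sketch, a term can therefore qualify for inclusion at a smaller n than it would under sampling with replacement.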