Optimal Cost Sampling for Decision Making with Multiple Regression Models
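Gives an explicit relationship among sample size, sampling error, and cost for multiple regression models in observational studies, provides graphs and formulas for determining the optimal sample size, and finds that the imprecision of estimates and the total cost are insensitive to increases in sample size beyond 20.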
ABSTRACT This paper develops an explicit relationship among sample size, sampling error, and related costs for the application of multiple regression models in observational studies. Graphs and formulas for determining optimal sample sizes and related factors are provided to facilitate the application of the derived models. These graphs reveal that, in most cases, the imprecision of estimates and the minimum total cost are relatively insensitive to increases in sample size beyond n = 20. Because of the intrinsic variation of the regression model, even when larger samples are optimal, the relative change in the total cost function is small when the cost of imprecision is a quadratic function. A model-utility approach, however, may impose a lower bound on sample size that requires the sample size to be larger than indicated by the estimation or cost-minimization approaches. Graphs are provided to illustrate these lower-bound conditions on sample size. The optimal sample size in view of all considerations is obtained by the maximin criterion: the maximum of the minimum sample sizes required by the individual approaches.
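As a rough illustration of the cost-minimization and maximin ideas described in the abstract, the following Python sketch may help. The cost function here is an assumption made for illustration only and is not taken from the paper: it adds a linear sampling cost to an imprecision cost proportional to an assumed error variance, `sigma2 * (1 + p / (n - p - 1))`, standing in for an expected quadratic loss. The names `total_cost`, `cost_minimizing_n`, and `maximin_sample_size`, along with all numeric inputs, are hypothetical.

```python
def total_cost(n, p, unit_cost, imprecision_cost, sigma2):
    """Hypothetical total cost for a sample of size n with p predictors.

    Assumption (not from the paper): sampling cost is linear in n, and the
    imprecision cost is proportional to sigma2 * (1 + p / (n - p - 1)).
    """
    if n <= p + 1:
        raise ValueError("n must exceed p + 1")
    variance = sigma2 * (1.0 + p / (n - p - 1))
    return unit_cost * n + imprecision_cost * variance


def cost_minimizing_n(p, unit_cost, imprecision_cost, sigma2, n_max=1000):
    """Scan integer sample sizes and return the one with the lowest total cost."""
    candidates = range(p + 2, n_max + 1)
    return min(
        candidates,
        key=lambda n: total_cost(n, p, unit_cost, imprecision_cost, sigma2),
    )


def maximin_sample_size(minimums):
    """Maximin rule: the largest of the minimum sample sizes from each approach."""
    return max(minimums.values())


# Illustrative inputs only; none of these numbers come from the paper.
n_cost = cost_minimizing_n(p=3, unit_cost=1.0, imprecision_cost=100.0, sigma2=1.0)
n_final = maximin_sample_size({"estimation": 15, "cost": n_cost, "utility": 30})
print(n_cost, n_final)
```

Under these assumed inputs the total cost is flat near its minimum: moving one unit either side of the cost-minimizing n changes it by well under one percent, which matches the kind of insensitivity the abstract reports. The lower bound from the hypothetical model-utility approach is the largest of the three minimums, so it determines the final sample size.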