Parametric modelling of cost data: some simulation evidence
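Simulation experiments compare the performance of the sample mean and the lognormal mean estimator on cost data, and find that the sample mean is more robust when the true distribution is unknown, especially in small samples.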
Recently, commentators have suggested that the distributional form of cost data should be explicitly modelled to gain efficiency in estimating the population mean. We perform a series of simulation experiments to evaluate the usual sample mean and the mean estimator of a lognormal distribution, in the context of both theoretical distributions and three large empirical datasets. The sample mean is always unbiased, but is somewhat less efficient when the population distribution is truly lognormal. However, the lognormal estimator can perform appallingly when the true distribution is not lognormal. In practical situations, where the true distribution is unknown, the sample mean generally remains the estimator of choice, especially when limited sample size prohibits detailed modelling of the cost data distribution.
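The kind of comparison described above can be illustrated with a small Monte Carlo sketch. It is not the study's actual design: the gamma alternative, all parameter values, the sample size and the helper names (`sample_mean`, `lognormal_mean`, `simulate`) are assumptions chosen for illustration, and the lognormal estimator is taken to be the simple plug-in form exp(m + s^2/2), where m and s^2 are the mean and variance of the log costs.

```python
import math
import random
import statistics


def sample_mean(costs):
    """Usual arithmetic mean of the observed costs."""
    return statistics.fmean(costs)


def lognormal_mean(costs):
    """Plug-in lognormal mean: exp(m + s^2 / 2) computed from the log costs.

    Assumed form for illustration; other variants of the lognormal
    estimator exist and may behave differently.
    """
    logs = [math.log(c) for c in costs]
    m = statistics.fmean(logs)
    s2 = statistics.variance(logs)  # sample variance of the log costs
    return math.exp(m + s2 / 2.0)


def simulate(draw, true_mean, n=50, reps=2000, seed=0):
    """Return {estimator name: (bias, RMSE)} over `reps` samples of size `n`.

    `draw` is a function taking a random.Random instance and returning
    one simulated cost.
    """
    rng = random.Random(seed)
    estimators = {"sample": sample_mean, "lognormal": lognormal_mean}
    errors = {name: [] for name in estimators}
    for _ in range(reps):
        costs = [draw(rng) for _ in range(n)]
        for name, estimate in estimators.items():
            errors[name].append(estimate(costs) - true_mean)
    return {
        name: (
            statistics.fmean(errs),
            math.sqrt(statistics.fmean(e * e for e in errs)),
        )
        for name, errs in errors.items()
    }


if __name__ == "__main__":
    # Case 1 (assumed): costs truly lognormal with mu=0, sigma=1,
    # so the population mean is exp(0.5).
    truly_lognormal = simulate(
        lambda rng: rng.lognormvariate(0.0, 1.0), true_mean=math.exp(0.5)
    )
    # Case 2 (assumed): costs follow a gamma distribution with shape 0.5
    # and scale 2.0, so the population mean is shape * scale = 1.0.
    not_lognormal = simulate(
        lambda rng: rng.gammavariate(0.5, 2.0), true_mean=1.0
    )
    for label, result in [("lognormal data", truly_lognormal),
                          ("gamma data", not_lognormal)]:
        for name, (bias, rmse) in result.items():
            print(f"{label:15s} {name:10s} bias={bias:+.3f} rmse={rmse:.3f}")
```

Under these assumed settings the sample mean should show bias close to zero in both cases, while the plug-in lognormal estimator should be noticeably biased upward on the gamma data. The exact figures depend on the seed, the sample size and the distributions chosen, so they should not be read as reproducing the reported results.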