Metrics—When and Why Nonaveraging Statistics Work

Management Science · 2008

Cited by 16

Renmin University of China A+ · FT50 · UTD24 · ABS 4*

Steven M. Shugan · University of Florida
Debanjan Mitra · University of Florida

Summary

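Using empirical evidence, theoretical proofs, and simulations, the paper shows that in certain environments (e.g., where sample size correlates with latent ability, or where failures are common), nonaveraging statistics (e.g., maximum, variance) summarize information more effectively than averaging statistics. It also explains the two effects behind this result, the "Muth effect" and the "Anna Karenina effect."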

Abstract

Good metrics are well-defined formulae (often involving averaging) that transmute multiple measures of raw numerical performance (e.g., dollar sales, referrals, number of customers) to create informative summary statistics (e.g., average share of wallet, average customer tenure). Despite myriad uses (benchmarking, monitoring, allocating resources, diagnosing problems, explanatory variables), most uses require metrics that contain information summarizing multiple observations. On this criterion, we show empirically (with people data) that although averaging has remarkable theoretical properties, supposedly inferior nonaveraging metrics (e.g., maximum, variance) are often better. We explain theoretically (with exact proofs) and numerically (with simulations) when and why. For example, when the environment causes a correlation between observed sample sizes (e.g., number of past purchases, projects, observations) and latent underlying parameters (e.g., the likelihood of favorable outcomes), the maximum statistic is a better metric than the mean. We refer to this environmental effect as the Muth effect, which occurs when rational markets provide more opportunities (i.e., more observations) to individuals and organizations with greater innate ability. Moreover, when environments are adverse (e.g., failure-rich), nonaveraging metrics correctly overweight favorable outcomes. We refer to this environmental effect as the Anna Karenina effect, which occurs when less-favorable outcomes convey less information. These environmental effects impact metric construction, selection, and employment.
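The Muth effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model. It assumes uniformly distributed latent ability, Gaussian outcome noise, a linear rule that gives higher-ability individuals more observations, and the correlation between a metric and latent ability as the measure of how informative that metric is. The names `simulate`, `pearson`, `noise_sd`, and the specific constants are all hypothetical choices made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(muth, individuals=5000, noise_sd=2.0, seed=0):
    """Return (corr of mean with ability, corr of max with ability).

    muth=True: the number of observations grows with latent ability
    (assumed rule: 1 + int(30 * ability)), mimicking a market that
    gives more opportunities to more able individuals.
    muth=False: every individual gets the same 15 observations.
    """
    rng = random.Random(seed)
    abilities, means, maxima = [], [], []
    for _ in range(individuals):
        ability = rng.random()  # latent ability, uniform on [0, 1)
        n_obs = 1 + int(30 * ability) if muth else 15
        # Each observed outcome is ability plus Gaussian noise.
        outcomes = [ability + rng.gauss(0.0, noise_sd) for _ in range(n_obs)]
        abilities.append(ability)
        means.append(sum(outcomes) / n_obs)
        maxima.append(max(outcomes))
    return pearson(abilities, means), pearson(abilities, maxima)


if __name__ == "__main__":
    for muth in (True, False):
        r_mean, r_max = simulate(muth)
        label = "sample size tracks ability" if muth else "fixed sample size"
        print(f"{label}: corr(mean)={r_mean:.3f}, corr(max)={r_max:.3f}")
```

Under these assumptions, the maximum tends to track ability more closely than the mean when sample size depends on ability, and the ordering reverses when sample size is fixed. One plausible reading is that the maximum rises with the number of observations, so it implicitly picks up the information carried by sample size, which the mean discards.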

Keywords: metrics · nonaveraging statistics · Muth effect · maximum statistic
