Support points: a new way to compactly represent continuous probability distributions

Support points

Annals of Statistics · 2018
Cited by 113 · Top 8% of papers in the same journal and year
ABS 4★

Summary

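This paper proposes a method for obtaining representative points (support points) by minimizing the energy distance. These points compactly represent a continuous probability distribution, give smaller errors than Monte Carlo methods in integration and uncertainty propagation, and can be used to compress MCMC samples.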

Abstract

This paper introduces a new way to compact a continuous probability distribution $F$ into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo [InterStat 5 (2004) 1–6] for testing goodness-of-fit. The energy distance has two appealing features. First, its distance-based structure allows us to exploit the duality between powers of the Euclidean distance and its Fourier transform for theoretical analysis. Using this duality, we show that support points converge in distribution to $F$, and enjoy an improved error rate to Monte Carlo for integrating a large class of functions. Second, the minimization of the energy distance can be formulated as a difference-of-convex program, which we manipulate using two algorithms to efficiently generate representative point sets. In simulation studies, support points provide improved integration performance to both Monte Carlo and a specific quasi-Monte Carlo method. Two important applications of support points are then highlighted: (a) as a way to quantify the propagation of uncertainty in expensive simulations and (b) as a method to optimally compact Markov chain Monte Carlo (MCMC) samples in Bayesian computation.
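
To make the objective concrete, the following is a minimal sketch of the sample version of the energy distance between a candidate point set $\{x_i\}_{i=1}^n$ and a large sample $\{y_m\}_{m=1}^N$ from $F$, namely $\frac{2}{nN}\sum_{i,m}\|x_i-y_m\|_2 - \frac{1}{n^2}\sum_{i,j}\|x_i-x_j\|_2 - \frac{1}{N^2}\sum_{m,m'}\|y_m-y_{m'}\|_2$. It only evaluates this quantity; it does not implement the difference-of-convex algorithms mentioned in the abstract. The function names (`pairwise_dist`, `energy_distance`), the standard normal target, and the sample sizes are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def energy_distance(points, sample):
    """Sample (V-statistic) energy distance between two point sets.

    Hypothetical helper for illustration: `points` plays the role of the
    candidate representative points, `sample` a large draw from F.
    """
    cross = pairwise_dist(points, sample).mean()
    within_points = pairwise_dist(points, points).mean()
    within_sample = pairwise_dist(sample, sample).mean()
    return 2.0 * cross - within_points - within_sample

# Assumed target: a 2-D standard normal, approximated by 500 draws.
rng = np.random.default_rng(0)
sample = rng.normal(size=(500, 2))

# Two candidate sets of 20 points: a random subset of the sample,
# and the same subset shifted far away from the bulk of F.
subset = sample[rng.choice(len(sample), size=20, replace=False)]
shifted = subset + 5.0

print("subset :", energy_distance(subset, sample))
print("shifted:", energy_distance(shifted, sample))
```

A point set that sits where $F$ has mass gives a small value, while the shifted set gives a much larger one. Support points, as described in the abstract, are the points that minimize this kind of objective rather than a random subset.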

Statistics · Monte Carlo methods · Bayesian computation · Numerical integration