Boundary Estimation
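We propose a distribution-free boundary estimation method for partitioning gridded data along an unknown boundary, with applications in quality control, epidemiology, and other fields.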
Abstract A data set consists of independent observations taken at the nodes of a grid. An unknown boundary partitions the grid into two regions. All the observations coming from a particular region share a common distribution, but the distributions are different for the two different regions. These two distributions are entirely unknown and need not differ in their means, medians, or any other measure of “level.” The grid is of arbitrary dimension, and its mesh is rectangular. Our objective is to estimate the boundary without making any distributional assumptions. We propose a class of estimators and obtain strong consistency for them (including rates of convergence and a bound on the error probability). The boundary estimate is selected from an appropriate collection of candidate boundaries, which must be specified by the user. The candidate boundaries as well as the true boundary must satisfy certain intuitively natural regularity assumptions, including a “smoothness” condition. The boundary estimation problem has applications in diverse fields, including quality control, epidemiology, forestry, marine science, meteorology, and geology. Our method provides (as special cases) estimators for the change point problem, the epidemic change model, templates, linear bisection of the plane, and Lipschitz boundaries. Each of these examples is explicitly analyzed. A simulation study provides numerical evidence that the boundary estimators work well; in this simulation, the two distributions actually share the same mean, median, variance, and skewness. Finally, as an illustration, a boundary estimate is calculated on a data grid of cancer mortality rates in the United States.
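The idea of selecting a boundary from a user-specified collection of candidates, without distributional assumptions, can be illustrated with a minimal sketch. It is not the estimator class defined in the paper: the function names `ecdf_gap` and `estimate_boundary`, the use of the largest gap between empirical distribution functions as the discrepancy, and the balance weight are all assumptions made here for illustration. Each candidate splits the grid nodes into two sides, and the sketch keeps the candidate whose two sides look most different. The toy data reduce to the change point problem on a one-dimensional grid, where both segments have mean 0.

```python
from bisect import bisect_right


def ecdf_gap(sample_a, sample_b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def estimate_boundary(values, candidates):
    """Pick the candidate partition whose two sides look most different.

    values     -- observations at the grid nodes, flattened to a list
    candidates -- dict mapping a label to a list of booleans (True = region A)
    """
    n = len(values)
    best_label, best_score = None, float("-inf")
    for label, in_a in candidates.items():
        side_a = [v for v, flag in zip(values, in_a) if flag]
        side_b = [v for v, flag in zip(values, in_a) if not flag]
        if not side_a or not side_b:
            continue  # a candidate must leave observations on both sides
        # Assumed weight (illustrative): favors balanced splits, so that a
        # tiny region with a noisy empirical CDF does not dominate.
        weight = (len(side_a) / n) * (len(side_b) / n)
        score = weight * ecdf_gap(side_a, side_b)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Toy one-dimensional grid: 40 values alternating -1, +1, then 60 zeros.
# The two segments share the same mean (0) but differ in distribution.
series = [-1.0 if i % 2 == 0 else 1.0 for i in range(40)] + [0.0] * 60
n = len(series)

# Candidate boundaries: every split point k that leaves at least 5 nodes
# on each side; node i belongs to region A when i < k.
candidates = {k: [i < k for i in range(n)] for k in range(5, n - 4)}

k_hat, score = estimate_boundary(series, candidates)
print(k_hat, round(score, 3))
```

On this toy series the selected split should land next to the true change point at index 40, even though a comparison of segment means would see no difference. Because a candidate is just a labeling of the flattened grid nodes, the same function accepts candidates built from a higher-dimensional grid, such as the half-planes of a linear bisection.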