Inference for Dependent Data with Learned Clusters

Review of Economics and Statistics · 2024
Cited by: 3
Journal rankings: Renmin University list A · FT50 · ABS 4

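Summary

The paper proposes a method that groups spatially dependent data with an unsupervised clustering algorithm and then carries out cluster-based statistical hypothesis tests. The number of clusters is allowed to be determined by the data. The paper proves the procedure's asymptotic properties and checks its finite-sample behavior in simulations.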

Abstract

This article presents and analyzes an approach to cluster-based inference for dependent data. The primary setting considered here is with spatially indexed data in which the dependence structure of observed random variables is characterized by a known, observed dissimilarity measure over spatial indices. Observations are partitioned into clusters with the use of an unsupervised clustering algorithm applied to the dissimilarity measure. Once the partition into clusters is learned, a cluster-based inference procedure is applied to a statistical hypothesis testing procedure. The procedure proposed in the article allows the number of clusters to depend on the data, which gives researchers a principled method for choosing an appropriate clustering level. The article gives conditions under which the proposed procedure asymptotically attains correct size. A simulation study shows that the proposed procedure attains near nominal size in finite samples in a variety of statistical testing problems with dependent data.
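The two-step idea in the abstract (learn a partition from the dissimilarity measure, then run a cluster-based test) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the article's own implementation: the choice of a simple k-medoids loop as the unsupervised clustering step, the t-statistic built from cluster-level means as the inference step, the function names `k_medoids` and `cluster_t_stat`, and the toy data. The number of clusters `k` is fixed by hand here, whereas the article lets it depend on the data.

```python
import math
import statistics


def k_medoids(D, k, n_iter=100):
    """Partition n observations into k clusters from an n-by-n dissimilarity matrix D.

    Illustrative alternating scheme: assign each point to its nearest medoid,
    then move each medoid to the member with the smallest total dissimilarity.
    """
    n = len(D)
    medoids = [i * n // k for i in range(k)]  # deterministic, evenly spaced start
    labels = [0] * n
    for _ in range(n_iter):
        # Assignment step: nearest medoid under the given dissimilarity.
        labels = [min(range(k), key=lambda g: D[i][medoids[g]]) for i in range(n)]
        # Update step: best representative inside each cluster.
        new_medoids = []
        for g in range(k):
            members = [i for i in range(n) if labels[i] == g]
            if not members:  # keep the old medoid if a cluster is empty
                new_medoids.append(medoids[g])
                continue
            new_medoids.append(min(members, key=lambda m: sum(D[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels


def cluster_t_stat(y, labels, mu0=0.0):
    """t-statistic for H0: E[y] = mu0, treating cluster means as the observations.

    Returns the statistic and its degrees of freedom (number of clusters minus 1).
    """
    groups = sorted(set(labels))
    means = [statistics.mean(y[i] for i in range(len(y)) if labels[i] == g) for g in groups]
    G = len(means)
    t = math.sqrt(G) * (statistics.mean(means) - mu0) / statistics.stdev(means)
    return t, G - 1


# Toy example (hypothetical data): nine locations on a line in three tight groups.
locations = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 10.0, 10.1, 10.2]
outcomes = [1.0, 1.2, 0.8, 2.0, 2.2, 1.8, 3.0, 3.2, 2.8]
D = [[abs(a - b) for b in locations] for a in locations]

labels = k_medoids(D, k=3)
t_stat, df = cluster_t_stat(outcomes, labels, mu0=0.0)
print(labels, round(t_stat, 3), df)
```

The statistic would then be compared with a critical value from a t distribution with `df` degrees of freedom. Only the dissimilarity matrix enters the clustering step, which mirrors the abstract's setting of a known, observed dissimilarity measure over spatial indices.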

Keywords: cluster-based inference, dependent data, spatial data, unsupervised clustering