Some Methods for Analyzing Big Dependent Data

Journal of Business & Economic Statistics · 2016
Cited by 31
Renmin University of China AABS 4

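Summary

The paper proposes transforming big dependent data into functional time series of densities, then analyzing and forecasting them with K-means clustering, tree-based classification, a threshold approximate-factor model, and a Hellinger distance autoregressive model. Applications include electricity demand, stock returns, and income distributions.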

Abstract

We consider an approach to analyze big data of time series. Big dependent data are first transformed into functional time series of densities via nonparametric density estimation. We then discuss some tools for exploratory data analysis of the resulting functional time series. The tools employed include K-means cluster analysis and tree-based classification. For modeling, we propose a threshold approximate-factor model and a Hellinger distance autoregressive model for functional time series of continuous densities. The latent factors of factor models are estimated by functional principal component analysis. Cross-validation and Hellinger distance are used to select the number of principal component functions. For prediction of high-dimensional time series, we use the results of cluster analysis to obtain parsimonious models. We demonstrate the proposed analysis by considering the demand of electricity, the behavior of daily U.S. stock returns, and U.S. income distributions.
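
As a rough illustration of the first step described in the abstract, the sketch below turns each period's cross-section of observations into an estimated density and compares the densities by Hellinger distance. It is not the authors' implementation: the simulated panel, the Gaussian kernel, the Silverman rule-of-thumb bandwidth, the grid, the Hellinger convention with the factor 1/2, and the helper names `kde_on_grid` and `hellinger` are all assumptions made for the example.

```python
import numpy as np

def kde_on_grid(sample, grid, bandwidth=None):
    """Gaussian kernel density estimate of `sample`, evaluated on an evenly spaced `grid`."""
    sample = np.asarray(sample, dtype=float)
    if bandwidth is None:
        # Silverman's rule of thumb (an assumption, not the paper's choice).
        bandwidth = 1.06 * sample.std(ddof=1) * sample.size ** (-1 / 5)
    z = (grid[:, None] - sample[None, :]) / bandwidth
    dens = np.exp(-0.5 * z**2).sum(axis=1) / (sample.size * bandwidth * np.sqrt(2 * np.pi))
    # Renormalize so the Riemann sum over the truncated grid equals 1.
    return dens / (dens.sum() * (grid[1] - grid[0]))

def hellinger(f, g, grid):
    """Hellinger distance sqrt(0.5 * integral (sqrt f - sqrt g)^2 dx), via a Riemann sum."""
    dx = grid[1] - grid[0]
    return np.sqrt(0.5 * np.sum((np.sqrt(f) - np.sqrt(g)) ** 2) * dx)

rng = np.random.default_rng(0)
grid = np.linspace(-8.0, 8.0, 401)

# Simulated panel: 6 time periods, 500 series observed per period.
# The cross-sectional mean shifts from 0 to 2 after the third period.
panel = [rng.normal(loc=0.0 if t < 3 else 2.0, scale=1.0, size=500) for t in range(6)]

# One estimated density per period: a (short) functional time series of densities.
densities = np.array([kde_on_grid(x, grid) for x in panel])

# Pairwise Hellinger distances between the period densities.
dist = np.array([[hellinger(f, g, grid) for g in densities] for f in densities])
print(np.round(dist, 2))
```

Under this setup, periods drawn from the same regime should sit close together in Hellinger distance and far from periods in the other regime. A matrix like `dist`, or the square-root densities themselves, is the kind of input a clustering step could then work from.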

Big dependent data analysis; functional time series; threshold approximate-factor model