Online renewable smooth quantile regression
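For quantile regression on streaming data, this work proposes an infinitely differentiable, convex smooth quantile loss and builds an online renewable framework on it. Under limited memory, the resulting estimator attains asymptotic normality and the oracle property, which makes the method suitable for settings where data arrive continuously.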
This paper concerns quantile regression for streaming data, where large amounts of data arrive batch by batch. Limited memory and the non-smoothness of the quantile regression loss both pose challenges for computation and for theoretical development. To address these challenges, we first introduce a convex smooth quantile loss that is infinitely differentiable and converges uniformly to the quantile loss. We then propose an online renewable framework in which the quantile regression estimator is renewed using only the current data batch and summary statistics of the historical data. In theory, we establish the estimation consistency and asymptotic normality of the renewable estimator without any restriction on the total number of data batches. This yields the oracle property and guarantees that the new method adapts to the situation where streaming data sets arrive perpetually. Numerical experiments on both synthetic and real data verify the theoretical results and illustrate the good performance of the new method.
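The two ingredients described above can be illustrated with a minimal sketch. The abstract does not give the form of the smooth loss or of the renewal rule, so everything below is an assumption: the loss is a softplus-type smoothing, tau*u + h*log(1 + exp(-u/h)), which is convex, infinitely differentiable, and within h*log(2) of the quantile loss uniformly in u; the renewal is a one-step Newton-type update that keeps only the current estimate and an accumulated Hessian as summary statistics. The names `smooth_loss`, `RenewableSmoothQR`, the bandwidth `h`, and the least-squares initialization are hypothetical and chosen for illustration only.

```python
import numpy as np

def check_loss(u, tau):
    """Standard non-smooth quantile (check) loss."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def smooth_loss(u, tau, h):
    """Assumed softplus-type smoothing: tau*u + h*log(1 + exp(-u/h))."""
    u = np.asarray(u, dtype=float)
    return tau * u + h * np.logaddexp(0.0, -u / h)

def _score_and_weights(r, tau, h):
    """First and second derivatives of smooth_loss with respect to u, at residuals r."""
    s = np.exp(-np.logaddexp(0.0, r / h))  # stable 1 / (1 + exp(r / h))
    return tau - s, s * (1.0 - s) / h

class RenewableSmoothQR:
    """Illustrative renewable estimator: stores only beta and a p x p matrix J."""

    def __init__(self, tau=0.5, h=0.5):
        self.tau, self.h = tau, h
        self.beta = None  # current estimate
        self.J = None     # accumulated Hessian (summary of past batches)

    def update(self, X, y):
        if self.beta is None:
            # Assumption: initialize with least squares on the first batch.
            self.beta = np.linalg.lstsq(X, y, rcond=None)[0]
            self.J = np.zeros((X.shape[1], X.shape[1]))
        psi, w = _score_and_weights(y - X @ self.beta, self.tau, self.h)
        # Add the current batch's Hessian, then take one Newton-type step.
        self.J += X.T @ (w[:, None] * X)
        self.beta = self.beta + np.linalg.solve(self.J, X.T @ psi)
        return self.beta

# Synthetic stream: each batch is used once and then discarded.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, 2.0, -1.0])
model = RenewableSmoothQR(tau=0.5, h=0.5)
for _ in range(20):
    X = np.column_stack([np.ones(500), rng.normal(size=(500, 2))])
    y = X @ beta_true + rng.normal(size=500)
    model.update(X, y)
print(model.beta)
```

Memory use in this sketch is independent of the number of batches, since only a length-p vector and a p-by-p matrix persist between updates. Whether the paper's own update takes this exact form is not stated in the abstract.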