Scalable probabilistic forecasting in retail with gradient boosted trees: A practitioner’s approach

International Journal of Production Economics · 2024
Cited by: 3
ABS 3

Summary

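To address the large data volumes and strong intermittency typical of large e-commerce companies, the paper proposes a two-layer hierarchical forecasting framework: probabilistic forecasts are first produced at the aggregated level with gradient boosted trees and then disaggregated to the decision level. Scalability and forecasting performance are validated on several datasets.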

Abstract

The recent M5 competition has advanced the state-of-the-art in retail forecasting. However, there are important differences between the competition challenge and the challenges we face in a large e-commerce company. The datasets in our scenario are larger (hundreds of thousands of time series), and e-commerce can afford to have a larger stock assortment than brick-and-mortar retailers, leading to more intermittent data. To scale to larger dataset sizes with feasible computational effort, we investigate a two-layer hierarchy, namely the decision level with product unit sales and an aggregated level, e.g., through warehouse-product aggregation, which reduces both the number of series and the degree of intermittency. We propose a top-down approach that forecasts at the aggregated level and then disaggregates to obtain decision-level forecasts. Probabilistic forecasts are generated under distributional assumptions. The proposed scalable method is evaluated on a large proprietary dataset as well as the publicly available Corporación Favorita and M5 datasets. We show the differences in characteristics between the e-commerce and brick-and-mortar retail datasets. Notably, our top-down forecasting framework enters the top 50 of the original M5 competition, even with models trained at a higher level under a much simpler setting.
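
The top-down step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which disaggregation rule or which distribution the authors use, so everything specific here is an assumption made for illustration: the aggregated-level mean forecast is taken as given (in the paper it would come from a gradient boosted tree model), it is split by historical sales shares, and quantiles are read off a Poisson distribution. The function names `historical_shares`, `poisson_quantile`, and `top_down_forecast` are hypothetical.

```python
import math


def historical_shares(history):
    """Share of total past sales per decision-level series.

    history: dict mapping a series id to a list of past unit sales.
    ASSUMPTION: proportional disaggregation by historical totals.
    """
    totals = {key: sum(values) for key, values in history.items()}
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No sales recorded anywhere: fall back to equal shares.
        return {key: 1.0 / len(history) for key in history}
    return {key: total / grand_total for key, total in totals.items()}


def poisson_quantile(mean, q):
    """Smallest integer k with P(X <= k) >= q for X ~ Poisson(mean).

    ASSUMPTION: Poisson demand, and 0 < q < 1. Intended for the small
    means of intermittent series; exp(-mean) underflows for very large means.
    """
    if mean <= 0:
        return 0
    pmf = math.exp(-mean)  # P(X = 0)
    cdf = pmf
    k = 0
    while cdf < q:
        k += 1
        pmf *= mean / k    # recurrence: P(X = k) = P(X = k - 1) * mean / k
        cdf += pmf
    return k


def top_down_forecast(aggregate_mean, history, quantiles=(0.5, 0.9)):
    """Split an aggregated-level mean forecast into decision-level forecasts."""
    forecasts = {}
    for key, share in historical_shares(history).items():
        mean = aggregate_mean * share
        forecasts[key] = {
            "mean": mean,
            "quantiles": {q: poisson_quantile(mean, q) for q in quantiles},
        }
    return forecasts


# Toy data: three decision-level series under one aggregated series.
history = {"series_A": [0, 1, 0, 2], "series_B": [1, 0, 0, 0], "series_C": [0, 0, 0, 0]}
forecast = top_down_forecast(aggregate_mean=8.0, history=history)
for key, values in forecast.items():
    print(key, values)
```

Because only one model is fitted per aggregated series, the cost of training grows with the number of aggregated series rather than the number of decision-level series; the disaggregation itself is a cheap per-series calculation.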

Keywords: Retail forecasting · Probabilistic forecasting · Gradient boosted trees · Time series · E-commerce