Online Planning in Nonstationary Environments
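Proposes a new computational method that balances real-time decisions with long-term performance in nonstationary environments. It applies to problems such as coupon assignment, order fulfillment, and resource allocation, and numerical tests show that it outperforms existing methods.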
Researchers Cheung and Lyu have developed a computational approach to a central challenge in online planning: balancing real-time decisions against long-term performance in dynamic systems. Traditional solutions often rely on dynamic programming, which is frequently intractable; the new method achieves near-optimal performance efficiently, even in nonstationary environments. The framework covers a broad class of planning problems with concave objectives and convex constraints, with applications such as coupon assignment, order fulfillment, and resource allocation. The study considers two decision-making settings: one with unlimited access to data samples through simulation, and one with only a finite set of samples. Using gradient information obtained from offline simulations, the researchers propose an offline-to-online framework that performs well in both settings. Notably, in the sampling setting, performance improves as more data or longer planning horizons become available. Numerical tests on coupon assignment and supply chain management problems show significant gains over existing methods.
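To make the offline-to-online idea concrete, here is a minimal Python sketch. It is not the authors' algorithm; it swaps in a textbook dual-price heuristic for a toy resource allocation problem. All names (`simulate_path`, `learn_price`, `run_online`), the drifting reward model, and the step-size schedule are assumptions made purely for illustration. The offline phase runs subgradient descent on simulated paths to learn a price for one unit of budget, and the online phase accepts a request only when its reward exceeds that price.

```python
import random


def simulate_path(horizon, rng):
    """Hypothetical nonstationary arrivals: the mean reward drifts upward over time."""
    return [rng.uniform(0.0, 1.0) + t / horizon for t in range(horizon)]


def learn_price(horizon, budget, n_paths=200, step=0.5, seed=0):
    """Offline phase (illustrative): subgradient descent on the dual price of the budget."""
    rng = random.Random(seed)
    price = 0.0
    for k in range(1, n_paths + 1):
        path = simulate_path(horizon, rng)
        # Under the current price, a request is accepted when its reward exceeds the price.
        used = sum(1 for reward in path if reward > price)
        # A subgradient of the dual p*B + sum(max(r - p, 0)) with respect to p is (B - used).
        subgradient = (budget - used) / horizon
        # Diminishing step size; the price is projected back onto [0, inf).
        price = max(0.0, price - step / k ** 0.5 * subgradient)
    return price


def run_online(path, budget, price):
    """Online phase: accept a request only if budget remains and its reward beats the price."""
    total, remaining = 0.0, budget
    for reward in path:
        if remaining > 0 and reward > price:
            total += reward
            remaining -= 1
    return total, budget - remaining


if __name__ == "__main__":
    horizon, budget = 100, 30
    price = learn_price(horizon, budget)
    test_path = simulate_path(horizon, random.Random(1))
    priced_total, priced_used = run_online(test_path, budget, price)
    greedy_total, greedy_used = run_online(test_path, budget, 0.0)  # price 0 = accept everything
    print(f"learned price: {price:.3f}")
    print(f"priced policy: reward {priced_total:.2f} using {priced_used} units")
    print(f"greedy policy: reward {greedy_total:.2f} using {greedy_used} units")
```

Running the script compares the priced policy with a greedy baseline that spends the budget on the earliest requests. A single static price is a deliberate simplification: it accounts for the drift only through the simulated paths it was trained on, whereas the framework described above is reported to handle general concave objectives and convex constraints.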