超越O(T)遗憾:在线线性规划中学习与决策的解耦

Beyond O(T) Regret: Decoupling Learning and Decision Making in Online Linear Programming

Operations Research · 2026
被引 0
人大 AFT50UTD24ABS 4*

中文导读

提出一种新框架,通过解耦学习与决策,使快速一阶方法在在线线性规划中达到与LP方法相当的遗憾保证,适用于收益管理、在线广告和云计算等场景。

Abstract

Breaking the [Formula: see text] Barrier in Online Linear Programming How can fast first-order methods for online linear programming (OLP) match the performance of more computationally intensive LP-based approaches? In the paper “Beyond [Formula: see text] Regret: Decoupling Learning and Decision Making in Online Linear Programming,” the authors develop a new framework that improves the regret guarantees of scalable first-order algorithms for online linear programming. They show that when the dual problem satisfies a mild error bound condition, first-order methods can achieve [Formula: see text] regret, including [Formula: see text] in continuous support settings and [Formula: see text] in finite support settings. The key insight is to decouple learning and decision making. The algorithm first learns an approximate dual solution and then localizes decisions to a neighborhood around that solution using an adaptive procedure. This decoupling narrows the gap between efficient gradient-based methods and LP-based benchmarks, providing strong theoretical guarantees while maintaining scalability for applications such as revenue management, online advertising, and cloud computing.

在线线性规划遗憾分析一阶方法收益管理在线广告