Integrated Online Learning and Adaptive Control in Queueing Systems with Uncertain Payoffs

Operations Research · 2021

Cited by 19

Journal lists: Renmin University of China (人大) A · FT50 · UTD24 · ABS 4*

Wei-Kang Hsu · Purdue University
Jiaming Xu · Duke University
Xiaojun Lin · Purdue University
Mark R. Bell · Purdue University

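Summary

Online service platforms must balance payoffs from existing clients against exploration of new clients under capacity constraints. To address this, the paper proposes a utility-guided assignment algorithm that integrates online learning with adaptive control and delivers high system payoffs with performance guarantees.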

Abstract

Many online service platforms have dedicated algorithms to match their available resources to incoming clients to maximize client satisfaction. One of the key challenges is to balance generating higher payoffs from existing clients against exploring new clients’ unknown characteristics, while at the same time satisfying the resource capacity constraints. In “Integrated Online Learning and Adaptive Control in Queueing Systems with Uncertain Payoffs,” Hsu, Xu, Lin, and Bell show that traditional approaches, such as maximizing instantaneous payoffs with current knowledge or using queue-length-based controls guided by “shadow prices,” lead to suboptimal long-term payoffs. Instead, they propose a novel utility-guided assignment algorithm that integrates online learning and adaptive control to provide high system payoffs with performance guarantees. The theoretical performance bound also offers system design insights into the impact of uncertain client dynamics, payoff learning, and backlogged clients. They further develop a decentralized version of the algorithm, which is applicable to large systems and performs well even when the service rates are random.

Keywords: queueing theory · online learning · adaptive control · resource allocation · operations management
