资源分配、竞争与预留的休止赌博机模型

A Restless Bandit Model for Resource Allocation, Competition, and Reservation

Operations Research · 2021

被引 15

人大 AFT50UTD24ABS 4*

Jing Fu · 墨尔本大学
Bill Moran · 墨尔本大学
Peter Taylor · 墨尔本大学

中文导读

研究有限容量资源在多个请求间分配的问题，建模为受资源约束的异质休止多臂赌博机，提出一种索引策略并证明其渐近最优条件，数值实验显示有限情形下也有效。

Abstract

In “A Restless Bandit Model for Resource Allocation, Competition and Reservation,” J. Fu, B. Moran, and P. G. Taylor study a resource allocation problem with varying requests and with resources of limited capacity shared by multiple requests. This problem is modeled as a set of heterogeneous restless multi-armed bandit problems (RMABPs) connected by constraints imposed by resource capacity. Following Whittle’s idea of relaxing the constraints and Weber and Weiss’s proof of asymptotic optimality, the authors propose an index policy and establish conditions for it to be asymptotically optimal in a regime where both arrival rates and capacities increase. In particular, they provide a simple sufficient condition for asymptotic optimality of the policy and, in complete generality, propose a method that generates a set of candidate policies for which asymptotic optimality can be checked. Via numerical experiments, they demonstrate the effectiveness of these results even in the pre-limit case.

资源分配多臂赌博机渐近最优算法运筹学数学经济学

阅读原文 ↗