Technical Note—Online Matching with Bayesian Rewards

Operations Research · 2023

Cited by 1

Journal lists: Renmin University (人大) A · FT50 · UTD24 · ABS 4*

Xinshang Wang · Shanghai Jiao Tong University
David Simchi-Levi · Massachusetts Institute of Technology
Rui Sun · Massachusetts Institute of Technology

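Summary

The paper studies an online matching problem in which a platform must allocate limited resources to users who arrive sequentially. The reward of each match depends on the resource type and the arrival time and is initially unknown; the platform uses a Bayesian approach to learn the true rewards in real time and improve its allocation decisions.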

Abstract

In This Issue: Navigating Dynamic Resource Allocation: A Bayesian Approach

In “Online Matching with Bayesian Rewards,” D. Simchi-Levi, R. Sun, and X. Wang address an online matching problem in which a central platform must allocate limited resources to user groups that arrive sequentially over time. The reward of each matching option varies with both the resource type and the user’s arrival time. These rewards are initially unknown but are assumed to be drawn from known probability distributions, so the platform must learn the true rewards in real time from the matching results it observes. The resulting study of online Bayesian matching offers insights for improving resource allocation in dynamic environments.
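To make the setting concrete, the following is a minimal Python sketch of the kind of loop the abstract describes: a known prior over each resource's reward, a posterior update after each observed match, and an allocation that respects limited capacity. It is an illustration under stated assumptions, not the policy analyzed in the paper. The Beta-Bernoulli reward model, the greedy posterior-mean rule, and the names `Resource`, `greedy_match`, and `run` are all hypothetical, and the dependence of rewards on arrival time is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Resource:
    """One resource type with limited inventory and a Beta(alpha, beta) belief.

    The Beta-Bernoulli model is an assumption made for this sketch only.
    """
    capacity: int
    alpha: float = 1.0  # prior pseudo-count of successful matches
    beta: float = 1.0   # prior pseudo-count of unsuccessful matches

    def posterior_mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def update(self, success: bool) -> None:
        # Conjugate update after observing the outcome of one match.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def greedy_match(resources: Dict[str, Resource]) -> Optional[str]:
    """Return the in-stock resource with the highest posterior mean, if any."""
    available = {name: r for name, r in resources.items() if r.capacity > 0}
    if not available:
        return None
    return max(available, key=lambda name: available[name].posterior_mean())


def run(resources: Dict[str, Resource],
        outcomes: List[Dict[str, bool]]) -> List[str]:
    """Serve arrivals in order; outcomes[t][name] is the result that would be
    observed if arrival t were matched to resource `name`."""
    history = []
    for outcome in outcomes:
        choice = greedy_match(resources)
        if choice is None:  # all inventory exhausted
            break
        resources[choice].capacity -= 1
        resources[choice].update(outcome[choice])
        history.append(choice)
    return history


if __name__ == "__main__":
    pool = {"a": Resource(capacity=1, alpha=2.0), "b": Resource(capacity=2)}
    arrivals = [{"a": True, "b": False}] * 3
    print(run(pool, arrivals))
```

A purely greedy rule like this never explores and ignores the value of saving inventory for later arrivals, which is exactly the tension that makes the problem in the paper nontrivial; the sketch only shows how prior, observation, and capacity interact.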

Keywords: online matching · Bayesian methods · resource allocation · dynamic environments
