Scalable Reinforcement Learning for Multiagent Networked Systems

Operations Research · 2022

Cited by 26

Renmin University of China list A · FT50 · UTD24 · ABS 4*

Guannan Qu · Carnegie Mellon University
Adam Wierman · California Institute of Technology
Na Li · Harvard University

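Summary

In large-scale networked systems (such as energy, transportation, and communication networks), the state and action spaces grow exponentially with the number of nodes. To address this, the paper proposes a scalable reinforcement learning method, SAC, that exploits the network structure to obtain an exponential decay property, with the aim of breaking the curse of dimensionality.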

Abstract

Highlighted by success stories like AlphaGo, reinforcement learning (RL) has emerged as a powerful tool for decision making in complex environments. However, the success of RL has thus far been limited to small-scale or single-agent systems. To apply RL to large-scale networked systems such as energy, transportation, and communication networks, a critical hurdle is the curse of dimensionality, because for these systems, the state and action space can be exponentially large in the number of nodes in the network. This article attempts to break this curse of dimensionality and designs a scalable RL method, named scalable actor critic (SAC), for large networked systems. The key technical contribution is to exploit the network structure to derive an exponential decay property, which enables the design of the SAC approach.
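
To make the scale of the problem concrete, the sketch below counts joint state-action pairs as the number of nodes grows. It is only an illustration: the function names, the local state and action counts, and the line-graph neighborhood are assumptions chosen for the example, and the abstract does not specify how SAC itself uses the network structure.

```python
def joint_space_size(n_nodes: int, local_states: int, local_actions: int) -> int:
    """Joint state-action pairs when each node has its own local state and action."""
    return (local_states * local_actions) ** n_nodes


def neighborhood_space_size(kappa: int, local_states: int, local_actions: int) -> int:
    """State-action pairs within kappa hops of an interior node on a line graph.

    The line-graph topology is a hypothetical choice for this example only.
    """
    nodes_in_neighborhood = 2 * kappa + 1  # the node itself plus kappa neighbors per side
    return (local_states * local_actions) ** nodes_in_neighborhood


if __name__ == "__main__":
    # Two local states and two local actions per node (assumed values).
    for n in (5, 10, 20):
        print(f"n={n}: joint pairs = {joint_space_size(n, 2, 2)}")
    print(f"1-hop neighborhood pairs = {neighborhood_space_size(1, 2, 2)}")
```

Under these assumed numbers, the joint count depends on the total number of nodes, while the neighborhood count depends only on the neighborhood radius. That gap is the kind of saving a method that works with local information could hope for.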

Reinforcement learning · Multiagent systems · Networked systems · Scalability
