LDR: Learning Discrete Representation to Improve Noise Robustness in Multiagent Tasks
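Proposes LDR, a method that uses a quantization module and a segment mechanism to encode observations and teammate actions as discrete representations, reducing the impact of noise on decision-making; it outperforms existing algorithms on StarCraft II and multiagent MuJoCo tasks.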
In real-world applications of multiagent reinforcement learning (MARL), agents often receive inaccurate information about the environment due to unavoidable noise, which challenges their robustness. However, little prior work addresses such noise in observations, which hinders the deployment of multiagent systems. In this article, we propose a method named learning discrete representation (LDR) to improve robustness against noise in multiagent tasks. Specifically, LDR employs a quantization module with a segment mechanism to encode observations and teammate actions, generating discrete representations from learnable codebooks. These representations are then processed by a combiner for decision-making. Through discretization, LDR mitigates the impact of minor noise on decision-making. To improve learning efficiency, we incorporate a set-input block that treats the joint observations of agents as a permutation-invariant set, thereby reducing the complexity of the joint observation space. Additionally, we theoretically analyze the expressiveness of the discrete representation and the boundedness of the discrete distortion. We evaluate the proposed method on StarCraft II micromanagement tasks and multiagent MuJoCo with noisy observations. Empirical results demonstrate that LDR outperforms existing algorithms, improving robustness in noisy cooperative MARL tasks while maintaining superior performance with clean observations.
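
The discretization idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each segment of an encoded vector is mapped to its nearest codebook entry under squared Euclidean distance (the usual vector-quantization lookup, which the abstract does not specify), and it uses small fixed codebooks in place of learned ones. The names `nearest_code`, `quantize`, and `CODEBOOKS` are hypothetical.

```python
import random


def nearest_code(segment, codebook):
    """Return the index of the codebook entry closest to `segment` (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((s - c) ** 2 for s, c in zip(segment, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def quantize(vector, codebooks):
    """Split `vector` into one equal-length segment per codebook and quantize each."""
    n_segments = len(codebooks)
    if len(vector) % n_segments != 0:
        raise ValueError("vector length must be divisible by the number of segments")
    seg_len = len(vector) // n_segments
    indices, quantized = [], []
    for k, codebook in enumerate(codebooks):
        segment = vector[k * seg_len:(k + 1) * seg_len]
        idx = nearest_code(segment, codebook)
        indices.append(idx)
        quantized.extend(codebook[idx])
    return indices, quantized


# Assumption: fixed toy codebooks, one per segment; in LDR these are learnable.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],
    [[0.0, 0.0], [1.0, -1.0], [2.0, 2.0]],
]

# A hypothetical encoded observation and a slightly perturbed copy of it.
clean = [0.9, 1.1, 1.9, 2.1]
rng = random.Random(0)
noisy = [x + rng.uniform(-0.05, 0.05) for x in clean]

clean_idx, clean_q = quantize(clean, CODEBOOKS)
noisy_idx, noisy_q = quantize(noisy, CODEBOOKS)

print("clean indices:", clean_idx)
print("noisy indices:", noisy_idx)
```

In this toy setting the perturbation is far smaller than the distance between codebook entries, so the clean and noisy inputs select the same codes and any downstream component would receive identical inputs. Larger noise can cross a decision boundary and change the selected code, which is consistent with the abstract's claim being limited to minor noise.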