Reinforcement learning applied to production planning and control

International Journal of Production Research · 2022

Cited by 137 · Top 2% in the same journal and year

ABS 3

Ana Esteso
David Peidro
Josefa Mula (corresponding author)
Manuel Díaz‐Madroñero

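Summary

This review analyses 181 articles and finds that reinforcement learning has been applied mainly to production scheduling and to purchase and supply management. Its results appear promising compared with traditional mathematical programming and heuristic methods, particularly for problems involving uncertainty or non-linearity.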

Abstract

The objective of this paper is to examine the use and applications of reinforcement learning (RL) techniques in the production planning and control (PPC) field addressing the following PPC areas: facility resource planning, capacity planning, purchase and supply management, production scheduling and inventory management. The main RL characteristics, such as method, context, states, actions, reward and highlights, were analysed. The considered number of agents, applications and RL software tools, specifically, programming language, platforms, application programming interfaces and RL frameworks, among others, were identified, and 181 articles were reviewed. The results showed that RL was applied mainly to production scheduling problems, followed by purchase and supply management. The most reviewed RL algorithms were model-free and single-agent and were applied to simplified PPC environments. Nevertheless, their results seem to be promising compared to traditional mathematical programming and heuristics/metaheuristics solution methods, and even more so when they incorporate uncertainty or non-linear properties. Finally, RL value-based approaches are the most widely used, specifically Q-learning and its variants and, for deep RL, deep Q-networks. In recent years, however, the most widely used approach has been the actor-critic method, such as the advantage actor critic, proximal policy optimisation, deep deterministic policy gradient and trust region policy optimisation.

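Production planning and control · Reinforcement learning · Production scheduling · Inventory management · Operations research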