EKG-AC: A New Paradigm for Process Industrial Optimization Based on Offline Reinforcement Learning With Expert Knowledge Guidance
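Proposes EKG-AC, an offline actor-critic algorithm that generates actions with a diffusion transformer and incorporates expert knowledge to address data imbalance and safety sensitivity in the process industry. Validation on two real-world industrial cases shows optimization performance that is highly consistent with expert decisions.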
Operation optimization plays a crucial role in process control, directly influencing product quality and profitability. Reinforcement learning (RL), with its capacity for autonomous learning and dynamic adaptation, has become a promising solution in this domain. However, its real-world application is constrained by the high cost and risk of interacting with the environment. Offline RL, which learns from a fixed dataset without further interaction, offers an alternative but faces significant challenges in the process industry because data are imbalanced across multiple operating conditions and the processes are highly safety-sensitive. To address these challenges, this article introduces a novel offline actor-critic algorithm with expert knowledge guidance (EKG-AC). The method begins with a diffusion-transformer-based action generation framework that mitigates the out-of-distribution problem by capturing the evolution of decision sequences and the interdependencies between states and actions. An expert knowledge guidance mechanism is then integrated, steering the model to generate safe and adaptive candidate actions aligned with the current operating conditions and expert knowledge. Subsequently, within the actor-critic framework, the optimal action is selected from the candidate pool according to its evaluated Q-value, thereby setting the operational variables for the optimization task. The proposed algorithm is validated on two real-world industrial processes, demonstrating superior optimization performance and behavior closely aligned with expert decision-making, which underscores its practical value.
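The final selection step described in the abstract, choosing the candidate action with the highest evaluated Q-value, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `select_action`, the toy critic `toy_q`, and the hard-coded candidate pool are hypothetical stand-ins. In EKG-AC the candidates would come from the expert-guided diffusion-transformer generator, and the Q-values from a critic trained on the offline dataset.

```python
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]
Action = Sequence[float]


def select_action(
    state: State,
    candidates: Sequence[Action],
    q_fn: Callable[[State, Action], float],
) -> Tuple[Action, List[float]]:
    """Return the candidate with the highest Q-value, plus all Q-values."""
    if not candidates:
        raise ValueError("candidate pool must not be empty")
    # Score every candidate action with the critic for the current state.
    q_values = [q_fn(state, action) for action in candidates]
    # Pick the index of the largest Q-value (first one wins on ties).
    best_index = max(range(len(candidates)), key=q_values.__getitem__)
    return candidates[best_index], q_values


def toy_q(state: State, action: Action) -> float:
    """Hypothetical critic: prefers actions close to the first state entries."""
    return -sum((a - s) ** 2 for a, s in zip(action, state))


# Hypothetical data: one state and a pool of three candidate actions.
state = [0.5, -0.2, 1.0]
candidates = [[0.0, 0.0], [0.4, -0.1], [1.0, 1.0]]

best_action, q_values = select_action(state, candidates, toy_q)
print(best_action)  # the second candidate, [0.4, -0.1]
```

The sketch returns all Q-values alongside the chosen action, so the ranking of the candidate pool can be inspected or logged. That is a design choice of this example, not something the abstract specifies.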