Deep reinforcement learning for solving steelmaking-continuous casting scheduling problems under time-of-use tariffs
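An intelligent scheduling method based on deep reinforcement learning is proposed that, for the first time, optimises steelmaking-continuous casting scheduling under peak-valley (time-of-use) electricity tariffs, reducing both total sojourn time and electricity cost; experiments show that it substantially outperforms the comparison algorithms in solution quality and computation time.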
This paper proposes a novel intelligent scheduling method based on deep reinforcement learning (DRL) that, for the first time, solves the multi-objective steelmaking-continuous casting (SCC) scheduling problem under time-of-use (TOU) tariffs. An intelligent scheduling system architecture is designed, and a mathematical model is established to minimise the total sojourn time and the electricity cost. To reduce production costs by avoiding peak-tariff periods, the ‘start time’ of the system is generated within a Markov Decision Process (MDP) formulation, heuristic scheduling rules related to power cost serve as the action space, and a reward function is designed for each of the two objectives according to its characteristics. A backward strategy is developed to satisfy the continuous-casting constraint, which is specific to SCC. Additionally, a branching duelling double deep Q-network (BD3QN) is adapted to guide action selection and avoid blind search during iteration, and is then applied to real-time scheduling. Numerical experiments demonstrate that the proposed method outperforms the comparison algorithms by a large margin in both solution quality and CPU time.
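To illustrate why the choice of start time matters under TOU tariffs, the following is a minimal Python sketch of an electricity-cost calculation for a single operation. The tariff table and its prices, the constant-power assumption and the function name `electricity_cost` are illustrative assumptions; none of them are taken from the paper's mathematical model.

```python
from typing import List, Tuple

# Hypothetical daily tariff: (start_hour, end_hour, price per kWh).
# The periods and prices are illustrative only.
TARIFF: List[Tuple[float, float, float]] = [
    (0.0, 8.0, 0.4),    # off-peak
    (8.0, 12.0, 1.2),   # peak
    (12.0, 18.0, 0.8),  # mid-peak
    (18.0, 22.0, 1.2),  # peak
    (22.0, 24.0, 0.4),  # off-peak
]


def electricity_cost(
    start: float,
    end: float,
    power_kw: float,
    tariff: List[Tuple[float, float, float]] = TARIFF,
) -> float:
    """Cost of a constant load of power_kw running over [start, end).

    Times are in hours from midnight of day 0; the tariff repeats every 24 h.
    """
    if end < start:
        raise ValueError("end must not precede start")
    cost = 0.0
    day = int(start // 24)
    # Walk through each calendar day the operation touches.
    while day * 24 < end:
        for p_start, p_end, price in tariff:
            # Overlap between the operation and this tariff period.
            lo = max(start, day * 24 + p_start)
            hi = min(end, day * 24 + p_end)
            if hi > lo:
                cost += (hi - lo) * power_kw * price
        day += 1
    return cost


if __name__ == "__main__":
    # Same 2 h operation at 100 kW, started in a peak or a mid-peak period.
    print(electricity_cost(8, 10, 100))
    print(electricity_cost(12, 14, 100))
```

With these assumed prices, delaying the same two-hour operation from the peak period to the mid-peak period lowers its cost. This is the kind of trade-off against sojourn time that start-time generation and cost-related scheduling rules have to balance.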