Reinforcement learning for robotic flow shop scheduling with processing time variations
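This paper studies scheduling in a robotic flow shop where the processing times of two part types vary. Reinforcement learning is used to obtain robot task sequences that minimise makespan, and a comparison with the FIFO and reverse sequence rules verifies the effectiveness of the method.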
We address a robotic flow shop scheduling problem in which two part types are each processed on a given set of dedicated machines. A single robot moving on a fixed rail transports one part at a time, and the processing times of the parts on the machines vary within given intervals. We use a reinforcement learning (RL) approach to obtain efficient robot task sequences that minimise makespan. We model the problem with a Petri net, which serves as the RL environment, and develop a lower bound for the makespan. We then define states, actions, and rewards based on the Petri net model. We show that the RL approach outperforms the first-in-first-out (FIFO) rule and the reverse sequence (RS), which is extensively used for cyclic scheduling of robotic flow shops. Moreover, the gap between the makespan obtained by the proposed algorithm and the lower bound is small. Finally, the makespan from the RL method is compared with an optimal solution of a relaxed problem. This research demonstrates the applicability of RL to the scheduling of robotic flow shops and its efficiency relative to FIFO, RS, and a lower bound. The approach can be readily extended to several other variants of robotic flow shop scheduling problems.
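To make the idea of a Petri net serving as an RL environment concrete, the following is a minimal sketch and not the model used in this work. It assumes a toy net with one machine and two parts, takes the marking as the state, the enabled transitions as the actions, and the negative sampled duration of the fired transition as the reward. All names (`PetriNetEnv`, `build_toy_env`, `run_first_enabled`, the places and transitions) and the duration intervals are hypothetical, and transitions fire one after another, so the concurrency between the robot and the machines is ignored.

```python
import random


class PetriNetEnv:
    """Toy Petri net wrapped as an RL environment (illustrative only)."""

    def __init__(self, marking, transitions, goal_place, goal_tokens, seed=0):
        # transitions: name -> (pre, post, (lo, hi)); pre/post map place -> tokens
        self.initial = dict(marking)
        self.transitions = transitions
        self.goal_place = goal_place
        self.goal_tokens = goal_tokens
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.marking = dict(self.initial)
        self.clock = 0.0
        return self.state()

    def state(self):
        # State = current marking as a hashable tuple
        return tuple(sorted(self.marking.items()))

    def enabled(self):
        # A transition is enabled when every input place holds enough tokens
        return [name for name, (pre, _, _) in self.transitions.items()
                if all(self.marking.get(p, 0) >= n for p, n in pre.items())]

    def step(self, action):
        if action not in self.enabled():
            raise ValueError(f"transition {action!r} is not enabled")
        pre, post, (lo, hi) = self.transitions[action]
        for place, n in pre.items():
            self.marking[place] -= n
        for place, n in post.items():
            self.marking[place] = self.marking.get(place, 0) + n
        # Duration varies within a given interval (assumed uniform here)
        duration = self.rng.uniform(lo, hi)
        self.clock += duration
        done = self.marking.get(self.goal_place, 0) >= self.goal_tokens
        # Rewards sum to minus the elapsed time of this sequential toy model
        return self.state(), -duration, done


def build_toy_env(seed=0):
    # Hypothetical net: two parts, one machine, one robot
    marking = {"in": 2, "robot_free": 1, "m1_free": 1, "m1_busy": 0, "out": 0}
    transitions = {
        "load": ({"in": 1, "robot_free": 1, "m1_free": 1},
                 {"m1_busy": 1, "robot_free": 1}, (1.0, 1.0)),
        "unload": ({"m1_busy": 1, "robot_free": 1},
                   {"m1_free": 1, "out": 1, "robot_free": 1}, (3.0, 5.0)),
    }
    return PetriNetEnv(marking, transitions, "out", 2, seed=seed)


def run_first_enabled(env):
    # Baseline policy: always fire the first enabled transition
    env.reset()
    total_reward, done, fired = 0.0, False, []
    while not done:
        action = env.enabled()[0]
        _, reward, done = env.step(action)
        total_reward += reward
        fired.append(action)
    return fired, total_reward
```

In this sketch, a learning agent would replace `run_first_enabled` with a policy that chooses among `env.enabled()` at each step. Representing makespan faithfully would additionally require timed semantics in which machines process parts while the robot moves.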