Multi-objective reinforcement learning-based framework for solving selective maintenance problems in reconfigurable cyber-physical manufacturing systems
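For reconfigurable cyber-physical manufacturing systems, this paper proposes the first selective maintenance model that accounts for imperfect repairs and uncertainty in health-status observations, and uses deep reinforcement learning to solve the resulting multi-objective optimisation problem of maximising reliability while minimising cost and variance.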
Unlike mass production manufacturing systems, whose configurations rarely change after the initial design, reconfigurable cyber-physical manufacturing systems (RCPMS) change their own structures throughout missions and thereby adjust production in response to demand requirements. Such a paradigm calls for an enhanced selective maintenance strategy that optimises the scheduling of maintenance actions, selects configuration layouts for capacity and product family changes, and reduces maintenance cost while maximising reliability. This paper is the first to propose a robust model for a selective maintenance problem with imperfect repairs in the RCPMS context. The model also integrates uncertainties originating from imperfect observations of components' health status. Its objectives are to maximise the expected reliability and to minimise the variance and the maintenance cost under maintenance resource constraints. Moreover, we propose a new deep reinforcement learning framework for solving the resulting multi-objective combinatorial optimisation problem. In addition, we use decision values to enhance the scalarisation process, so that the priorities of specific objectives can be adjusted after the learning process. Furthermore, we employ the Analytic Hierarchy Process (AHP) to adjust the static priorities with respect to the objective functions and the actual learning context. Finally, extensive experiments are conducted to highlight the performance of the proposed model and resolution framework.
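To illustrate how AHP-derived priorities could feed a scalarised objective, the following is a minimal sketch and not the paper's implementation. It assumes a simple linear weighted sum over the three objectives (expected reliability, variance, maintenance cost) and uses the row geometric-mean approximation of AHP priorities. The function names `ahp_weights` and `scalarise`, the pairwise judgements, and the objective values are all hypothetical; the abstract does not specify how the decision values enter the scalarisation.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    `pairwise` is a square reciprocal matrix: pairwise[i][j] states how much
    more important objective i is than objective j (hypothetical judgements).
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def scalarise(reliability, variance, cost, weights):
    """Collapse three objectives into one scalar (assumed linear weighting).

    Reliability is rewarded; variance and cost are penalised.
    """
    w_rel, w_var, w_cost = weights
    return w_rel * reliability - w_var * variance - w_cost * cost


# Hypothetical judgements, ordered as: reliability, variance, cost.
pairwise = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]

weights = ahp_weights(pairwise)

# Hypothetical normalised objective values for one maintenance plan.
reward = scalarise(reliability=0.95, variance=0.02, cost=0.30, weights=weights)
print(weights, reward)
```

Because the weights are applied outside the learned objective estimates in this sketch, changing the pairwise judgements and recomputing `weights` re-ranks candidate maintenance plans without retraining, which is the kind of post-learning priority adjustment the abstract describes.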