Safety-driven decentralized decisions in networks: An indicator-based reinforcement learning approach
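Proposes a reinforcement learning method based on continuous safety indicators for decentralized maintenance scheduling in city-scale networks, optimizing performance while ensuring safety, and validates it on a water distribution network case study.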
Network scheduling must balance performance against safety, particularly in city-scale networks where safety is critical. Traditional optimization adopts measures such as cost or time as objectives and overlooks the safety component of these systems. Existing safety-driven decision approaches fall into two groups: model-based techniques such as reachability analysis or control barrier functions, which require a precise model of the safety-relevant system dynamics, and model-free methods that rely on simplistic binary safety metrics and constraints, which fail to capture the nuances of safety in complex systems. In this paper, we adopt the concept of safety learning in decentralized Markov decision processes and present a safety-driven maintenance scheduling framework for networks consisting of multiple components. Our approach introduces a novel reinforcement learning methodology for decentralized network scheduling that integrates continuous safety indicators to assess the safety of each policy more precisely. The proposed method demonstrates superior performance in identifying and eliminating unsafe policies while maintaining finite-time guarantees. We validate the approach through numerical experiments and a real-world case study of a water distribution network, in which we optimize maintenance scheduling under contamination risks. This work provides governors of city-scale networks with a practical framework for evaluating maintenance strategies in the face of disruptive events.
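To illustrate how a continuous safety indicator can be used to eliminate unsafe policies after finitely many samples, the following is a minimal sketch and not the algorithm of this paper. It assumes the indicator of each policy is a noisy value in [0, 1], that a policy counts as safe when its mean indicator is at least a chosen threshold, and it substitutes a standard Hoeffding confidence interval for the paper's own guarantees. The names `screen_policies`, `confidence_radius`, `TRUE_SAFETY`, and the two policy labels are hypothetical.

```python
import math
import random


def confidence_radius(n, delta):
    """Hoeffding radius for the mean of n samples bounded in [0, 1].

    The log(4 n^2 / delta) term is a union bound over all sample counts n,
    so the intervals for one policy hold simultaneously with probability
    at least 1 - delta (an assumption of this sketch, applied per policy).
    """
    return math.sqrt(math.log(4.0 * n * n / delta) / (2.0 * n))


def screen_policies(policies, sample_indicator, threshold, delta=0.05, max_rounds=5000):
    """Classify policies as safe or unsafe from sampled safety indicators."""
    totals = {p: 0.0 for p in policies}
    counts = {p: 0 for p in policies}
    undecided = set(policies)
    safe, unsafe = set(), set()
    for _ in range(max_rounds):
        if not undecided:
            break
        for p in sorted(undecided):
            totals[p] += sample_indicator(p)
            counts[p] += 1
            mean = totals[p] / counts[p]
            radius = confidence_radius(counts[p], delta)
            if mean + radius < threshold:
                # Even the optimistic estimate is below the threshold: eliminate.
                unsafe.add(p)
                undecided.discard(p)
            elif mean - radius >= threshold:
                # Even the pessimistic estimate meets the threshold: keep.
                safe.add(p)
                undecided.discard(p)
    return safe, unsafe, undecided


# Hypothetical example: two maintenance policies with made-up mean safety levels.
rng = random.Random(0)
TRUE_SAFETY = {"inspect_often": 0.9, "inspect_rarely": 0.3}


def sample_indicator(policy):
    # Continuous indicator: true level plus bounded noise, always within [0, 1].
    return TRUE_SAFETY[policy] + rng.uniform(-0.1, 0.1)


safe, unsafe, undecided = screen_policies(
    list(TRUE_SAFETY), sample_indicator, threshold=0.6
)
print("safe:", safe, "unsafe:", unsafe, "undecided:", undecided)
```

A binary safe/unsafe flag would discard the margin information that the interval test uses here: with a continuous indicator, a policy far from the threshold is decided after few samples, whereas a policy near the threshold remains undecided until more evidence is collected.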