Cost optimization and reliability analysis of fault tolerant system with service interruption and reboot

Reliability Engineering and System Safety · 2024
Cited by 8
ABS 3

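Summary

This paper studies the reliability of a fault-tolerant system with imperfect coverage. Optimal control parameters are found by direct search and particle swarm optimization so as to minimize the cost-effective ratio; the results apply to the design of real-time systems.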

Abstract

• We discuss the reliability analysis of a fault-tolerant system with imperfect coverage.
• We explore system reliability, MTTF, and other queueing measures.
• The cost-effective ratio is evaluated to upgrade and improve availability.
• We obtain optimal control parameters via a direct search approach and PSO.

Owing to their widespread use in real-time systems, reliability modeling and cost optimization of fault-tolerant systems have drawn the attention of practitioners. Fault tolerance in these systems is provided by maintenance and redundant components, which keep the system operating smoothly despite the failure of some active components. This investigation deals with the performance modeling of a fault-tolerant system consisting of a finite number of active (online) and standby components. During the switching from active to standby, a recovery procedure is performed, which may be imperfect. If recovery is imperfect, the system reboots. Maintenance of all components is managed by a repairman (server), which is itself subject to failure. When the server's service is interrupted, operation does not stop, because the system switches over from perfect working mode to working breakdown mode. The system keeps working even while the server is on working vacation, and repair of failed components continues. A machine repair model based on a Markov process is developed to derive the transient probabilities and other performance indices of the fault-tolerant system using Laplace transforms and the matrix-analytic method. A cost-benefit analysis is carried out using a direct search strategy and particle swarm optimization (PSO). The optimal design of the control parameters for the fault-tolerant system is presented by framing a cost-effective ratio function. The model is examined computationally through numerical simulation and cost optimization.
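To make the MTTF measure concrete, here is a minimal Python sketch of a much simpler machine repair model than the one the abstract describes. It assumes perfect coverage, no reboot, and a fully reliable repairman, so it is not the paper's model. The names `M`, `S`, `lam`, `nu`, and `mu` are illustrative assumptions: the number of active components, the number of warm standbys, the active failure rate, the standby failure rate, and the repair rate. The system is taken to fail once more than `S` components are down, and the state (the number of failed components) then forms a birth-death chain whose expected time to absorption can be accumulated step by step.

```python
def mttf_birth_death(fail_rates, repair_rates):
    """Expected time to reach the absorbing failure state from state 0.

    fail_rates[n]   : rate of moving from n to n + 1 failed components
    repair_rates[n] : rate of moving from n to n - 1 (index 0 is ignored)
    The absorbing state is len(fail_rates).
    """
    total = 0.0
    step = 0.0  # expected time to move from state n - 1 up to state n
    for n, lam_n in enumerate(fail_rates):
        mu_n = repair_rates[n] if n > 0 else 0.0
        # First-passage recursion: h_n = (1 + mu_n * h_{n-1}) / lam_n
        step = (1.0 + mu_n * step) / lam_n
        total += step
    return total


def machine_repair_mttf(M, S, lam, nu, mu):
    """MTTF of a simplified machine repair model (assumptions, not the paper's).

    M active components each fail at rate lam, S warm standbys each fail at
    rate nu, and one reliable repairman repairs at rate mu. A failed active
    component is replaced instantly and always successfully by a standby.
    """
    K = S + 1  # the system fails when K components are down
    fail_rates = [M * lam + (S - n) * nu for n in range(K)]
    repair_rates = [0.0] + [mu] * (K - 1)
    return mttf_birth_death(fail_rates, repair_rates)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    print(machine_repair_mttf(M=3, S=2, lam=0.05, nu=0.01, mu=1.0))
```

Imperfect coverage, reboot, and server breakdown would add states and transitions that break this simple birth-death structure, which is presumably why the paper turns to Laplace transforms and a matrix-analytic treatment.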

Reliability engineering · Fault-tolerant systems · Cost optimization · Queueing theory · Markov processes