A Model-Free Stealthy Attack for Cyber-Physical Systems Based on Deep Reinforcement Learning
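From the attacker's perspective, this work proposes a stealthy attack method that requires no prior knowledge of the system model. The attack policy is trained using a constrained Markov decision process and an actor-critic reinforcement learning algorithm; the method applies to nonlinear systems and guarantees state convergence through a Lyapunov function.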
Taking the attacker’s standpoint, this article develops a model-free stealthy attack that steers the system state to a predefined target value while evading detection, without prior knowledge of the system dynamics. A constrained Markov decision process (CMDP) is first formulated to characterize the objective of the stealthy attack. Based on this CMDP, an actor–critic reinforcement learning algorithm is proposed to train the attacker’s policy. Furthermore, by introducing into the algorithm a Lyapunov function constructed from the action-value function, convergence of the attacked system’s state to the target is theoretically guaranteed. Unlike existing model-free stealthy attacks, which are suitable only for linear systems, the proposed approach is also applicable to nonlinear systems. A linear numerical example and a nonlinear example based on an industrial flotation system are provided to validate the effectiveness of the proposed stealthy attack.
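
As a rough illustration of how a stealthy-attack objective can be cast as a CMDP, the following minimal Python sketch separates a tracking cost (distance of the state from the target) from a detection cost (whether a residual-based detector would raise an alarm) and combines them with a Lagrange multiplier. All names here (`step_costs`, `lagrangian_cost`, `update_multiplier`) are hypothetical, and the squared-distance cost, the threshold detector, and the dual-ascent multiplier update are assumptions made for illustration; they are not taken from the article, and the Lyapunov-based convergence mechanism it describes is not shown.

```python
from dataclasses import dataclass


@dataclass
class StepCosts:
    tracking: float   # objective cost: squared distance from the target state
    detection: float  # constraint cost: 1.0 if the detector would alarm, else 0.0


def step_costs(state, target, residual, threshold):
    """Per-step CMDP costs (assumed forms, for illustration only)."""
    tracking = sum((s - t) ** 2 for s, t in zip(state, target))
    # Assumed detector: alarm when the squared residual norm exceeds a threshold.
    residual_energy = sum(r ** 2 for r in residual)
    detection = 1.0 if residual_energy > threshold else 0.0
    return StepCosts(tracking, detection)


def lagrangian_cost(costs, multiplier):
    """Scalar cost a critic could learn from: objective plus weighted constraint."""
    return costs.tracking + multiplier * costs.detection


def update_multiplier(multiplier, mean_detection_cost, budget, step_size):
    """Projected dual ascent: raise the multiplier when alarms exceed the budget."""
    return max(0.0, multiplier + step_size * (mean_detection_cost - budget))
```

In such a setup, an actor-critic learner would minimize `lagrangian_cost` over trajectories while `update_multiplier` is called periodically with the observed alarm rate, so that the penalty on detection grows only while the stealthiness budget is violated.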