Defending Deep Learning-Based Raw Malware Detectors Against Adversarial Attacks: A Sequence Modeling Approach
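Proposes the AMGD framework, which defends deep learning-based detectors by generating raw malware variants without requiring knowledge of the detectors' internals. It uses independently recurrent neural networks (IndRNNs) to generate byte-level malware sequences and improves detector robustness.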
Malware detectors are the first line of defense against cyber-attacks that damage Information Technology (IT) infrastructure. Recently, deep learning (DL)-based malware detectors have yielded breakthrough results in identifying unseen attacks without requiring feature engineering or expensive dynamic malware analysis in a sandbox. However, these detectors are susceptible to adversarial malware attacks. Emulating effective adversarial malware variants is instrumental in revealing the vulnerabilities of such systems and in developing automated cyber defense. Current methods for launching such attacks often assume access to insider knowledge about the architecture of the malware detector, cannot operate directly on raw malware files, or both. We propose Adversarial Malware example Generation and Defense (AMGD), a novel framework that defends detectors by automatically generating malware variants from raw executables without assuming any prior knowledge of the detector. AMGD generalizes across detectors because it can be trained on multiple malware detectors simultaneously. AMGD employs Independently Recurrent Neural Networks (IndRNNs) to build a novel generative byte-level malware sequence model, named Mal-IndRNN, that evades DL-based malware detectors. Mal-IndRNN effectively evades three well-known DL-based malware detectors and outperforms benchmark methods. We use the malware variants generated by Mal-IndRNN to improve the robustness of malware detectors against adversarial attacks on a real dataset. AMGD offers a practical approach to proactively accounting for the Artificial Intelligence (AI)-enabled adversary in the design and development phase of DL-based malware detectors, rather than relying on reactive measures after deployment.
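To make the IndRNN building block concrete, the following is a minimal sketch of the standard IndRNN recurrence, h_t = ReLU(W x_t + u ⊙ h_{t-1} + b), applied to a byte sequence. It is not the Mal-IndRNN model itself: the function names (`indrnn_step`, `one_hot`, `encode_bytes`), the one-hot byte encoding, the hidden size of 8, and the random untrained weights are all assumptions made for illustration.

```python
import random


def indrnn_step(x, h_prev, W, u, b):
    """One IndRNN update: h_t = relu(W x_t + u * h_{t-1} + b).

    Unlike a vanilla RNN, the recurrent weight u is a vector, so each
    hidden unit only sees its own previous state.
    """
    h = []
    for i in range(len(u)):
        pre = sum(W[i][j] * x[j] for j in range(len(x)))
        pre += u[i] * h_prev[i] + b[i]
        h.append(max(0.0, pre))  # ReLU activation
    return h


def one_hot(byte, size=256):
    """Encode a single byte value (0..255) as a one-hot vector (assumed encoding)."""
    v = [0.0] * size
    v[byte] = 1.0
    return v


def encode_bytes(data, hidden_size=8, seed=0):
    """Run an untrained IndRNN over a byte string and return the final state."""
    rng = random.Random(seed)
    # Illustrative random weights; a real model would learn these.
    W = [[rng.uniform(-0.1, 0.1) for _ in range(256)] for _ in range(hidden_size)]
    u = [rng.uniform(0.0, 1.0) for _ in range(hidden_size)]
    b = [0.0] * hidden_size
    h = [0.0] * hidden_size
    for byte in data:
        h = indrnn_step(one_hot(byte), h, W, u, b)
    return h


if __name__ == "__main__":
    # Summarize a short, harmless byte string into a hidden state.
    state = encode_bytes(b"MZ\x90\x00")
    print(len(state))
```

Because each hidden unit depends only on its own previous value, the recurrent weights in `u` can be constrained per unit, which is the property that makes IndRNNs attractive for long sequences such as raw executable bytes.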