Abstract:
Adversarial manipulation of static malware features can derail learned detectors. This
research builds and audits a complete pipeline that measures and strengthens
robustness for Windows executable screening under explicit, realistic threat models.
Windows binaries were represented as fixed-length feature vectors and classified by
three complementary learners: a feed-forward neural network (FNN) with bounded
inputs, a one-dimensional convolutional neural network (1D-CNN) with
standardized inputs, and a gradient-boosted tree model (LightGBM) on raw features.
Evasion was evaluated with the Fast Gradient Sign Method (FGSM) and Projected
Gradient Descent (PGD) for the neural models and with a decision-guided routine for
the tree model. Defenses combined adversarial training, feature-squeezing detectors
calibrated at a fixed 1% false-positive rate, and simple probability-averaging
ensembling. All experiments used fixed seeds and produced saved artifacts for audit.
Clean baselines on the full test set were strong: the gradient-boosted trees reached
97.63% accuracy and 99.62% area under the receiver operating characteristic curve
(ROC-AUC), the convolutional network reached 96.37% and 99.24%, and the
feed-forward network reached 95.68% and 98.48%. Under iterative adversarial attacks,
the convolutional network fell to 0.23% accuracy at a large PGD budget and the
feed-forward network to 28.43% at a moderate budget. Adversarial training
improved robustness only marginally while reducing clean accuracy; the detectors
recovered up to a 44.00% true-positive rate at a 1% false-positive rate; and
ensembling stabilized predictions but did not neutralize strong attacks.