| dc.description.abstract |
In this paper, we benchmark and compare several machine learning models on the
CIC-IDS2017 dataset for cyber-attack detection. The models compared are Logistic Regression,
Decision Tree, Random Forest, XGBoost, and LightGBM. The goal of the work is to compare
the performance of these models in terms of accuracy, recall, and precision for
distinguishing malicious from benign network traffic. The most significant features
for attack detection were selected using Random Forest, through feature importance rankings
and correlations between attack categories. Experimental results show that the
ensemble learning models, especially XGBoost and LightGBM, achieve
better performance, including accuracy and false positive rate, on malicious traffic detection
than other widely used models such as Logistic Regression and Decision Tree.
In addition, a statistical analysis using the Wilcoxon rank-sum test found
no significant difference in recall among the models. The findings emphasize the potential
of such ensemble techniques for online intrusion detection of cyber-attacks
and the factors that contribute to improving intrusion detection systems. |
en_US |