| dc.description.abstract |
The rapid growth of digital connectivity in sectors such as finance, healthcare, and the military has increased exposure to cyberattacks and revealed the weaknesses of traditional signature-based intrusion detection systems (IDS). Machine learning (ML) offers adaptive, intelligent alternatives. This paper develops an explainable ML-based IDS framework using the CIC-IDS2017 dataset, which contains more than 2.8 million records. Data pre-processing consisted of noise elimination, label encoding, feature scaling, and Random Forest-based feature selection. The best-performing model was XGBoost, with hyperparameters tuned using RandomizedSearchCV. Model performance was compared using accuracy, precision, recall, F1-score, and AUC-ROC, and interpretability was provided by SHAP (SHapley Additive Explanations). XGBoost achieved an accuracy of 1.00 across all attack types, with F1-scores of 0.99 for Web Attacks and 0.89 for Infiltration. These findings indicate that XGBoost handles both common and rare cyberattacks well and can therefore be useful in real-world intrusion detection systems. High recall and low false positive rates further suggest that the model is suitable for large-scale, real-time cybersecurity systems in which accuracy and responsiveness are paramount. The SHAP analysis identified packet length variability, destination port, and flow-based attributes as the features most important for understanding and interpreting the model's decisions. The study concludes that XGBoost, with its accuracy, efficiency, and explainability, is well suited to building scalable, interpretable IDS.
The proposed framework offers a robust, reproducible approach that can be seamlessly deployed across diverse, high-throughput network environments, providing significant value to cybersecurity efforts. |
en_US |