DSpace Repository

SQLi Attack Detection Using Machine Learning Techniques for Web Application Security

dc.contributor.author Hasan, Md. Siam
dc.date.accessioned 2026-04-27T04:25:02Z
dc.date.available 2026-04-27T04:25:02Z
dc.date.issued 2025-12-27
dc.identifier.citation SWT en_US
dc.identifier.uri http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/17073
dc.description Thesis Report en_US
dc.description.abstract This thesis addresses the persistent threat of SQL injection (SQLi) attacks, which remain one of the most critical vulnerabilities in web applications despite the widespread use of firewalls and input filters. Such traditional defenses often fail to generalize to previously unseen attack patterns. To address this limitation, we develop and evaluate a machine-learning-based SQLi detection framework designed to be integrated into existing web security mechanisms. Incoming SQL queries are first preprocessed and transformed into TF-IDF feature vectors that capture both benign and malicious query patterns. On these features, we train and compare six supervised classifiers: Logistic Regression, Linear Support Vector Machine, Decision Tree, Random Forest, Complement Naive Bayes and XGBoost. Models are assessed using ROC-AUC, Precision-Recall AUC (PR-AP), confusion matrices, and class-wise precision, recall and F1-score on a validation set of 3,981 samples. All models achieve strong validation performance (ROC-AUC ≥ 99.57%, PR-AP ≥ 98.91%), with Random Forest and Logistic Regression showing particularly high accuracy. Logistic Regression is selected as the primary model because it has the best validation PR-AP (99.90%) and consistently high F1-scores for both classes. On an independent test set of 4,280 requests, the selected model attains a ROC-AUC of 99.97% and a PR-AP of 99.99%. After the decision threshold is optimized using an F2-score constraint and a cost-sensitive objective that heavily penalizes missed attacks, the deployed configuration reaches 99.93% overall accuracy and a macro-F1 of 99.64%, detecting 4,057 of 4,058 SQLi queries and misclassifying only two benign requests as attacks. These results demonstrate that a carefully tuned, interpretable linear model on TF-IDF features can deliver near-perfect SQLi detection performance, offering a practical and easily deployable enhancement to existing web security mechanisms. en_US
dc.description.sponsorship DIU en_US
dc.language.iso en_US en_US
dc.publisher Daffodil International University en_US
dc.subject SQL Injection en_US
dc.subject Attack en_US
dc.subject Detection en_US
dc.subject Machine learning en_US
dc.subject Web Security en_US
dc.title SQLi Attack Detection Using Machine Learning Techniques for Web Application Security en_US
dc.type Thesis en_US
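
The headline test-set figures in the abstract can be cross-checked from the counts it reports. The sketch below is not taken from the thesis; it assumes that the 4,280 test requests consist of the 4,058 SQLi queries plus 222 benign requests (a derived figure, not one stated in the record) and that macro-F1 is the unweighted mean of the two per-class F1-scores.

```python
# Counts reported in the abstract for the independent test set.
TOTAL = 4280      # test requests
ATTACKS = 4058    # SQLi queries in the test set
TP = 4057         # SQLi queries detected
FP = 2            # benign requests flagged as attacks

# Derived counts (assumption: every request is either SQLi or benign).
FN = ATTACKS - TP            # missed attacks
TN = TOTAL - ATTACKS - FP    # benign requests correctly passed


def f1(tp: int, fp: int, fn: int) -> float:
    """F1-score for one class, written in terms of raw counts."""
    return 2 * tp / (2 * tp + fp + fn)


accuracy = (TP + TN) / TOTAL
f1_attack = f1(TP, FP, FN)
# For the benign class the roles of the error types are swapped.
f1_benign = f1(TN, FN, FP)
macro_f1 = (f1_attack + f1_benign) / 2

print(f"accuracy = {accuracy:.2%}")
print(f"macro-F1 = {macro_f1:.2%}")
```

Under these assumptions the computed accuracy and macro-F1 round to the 99.93% and 99.64% quoted in the abstract.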

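The abstract describes choosing the decision threshold with an F2-score constraint and a cost-sensitive objective that penalizes missed attacks. The record does not give the constraint value or the cost weights, so the following is only a minimal sketch of one way such a rule could work: the function names, the `min_f2`, `cost_fn` and `cost_fp` parameters, and the toy scores are all hypothetical.

```python
def confusion(scores, labels, threshold):
    """Count TP, FP, FN when score >= threshold is predicted as an attack."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted_attack = score >= threshold
        if predicted_attack and label == 1:
            tp += 1
        elif predicted_attack and label == 0:
            fp += 1
        elif not predicted_attack and label == 1:
            fn += 1
    return tp, fp, fn


def fbeta(tp, fp, fn, beta=2.0):
    """F-beta score from counts; beta=2 weights recall above precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def pick_threshold(scores, labels, min_f2=0.8, cost_fn=10.0, cost_fp=1.0):
    """Among thresholds meeting the F2 floor, return the cheapest one.

    Hypothetical cost model: cost = cost_fn * FN + cost_fp * FP, with
    missed attacks (FN) weighted more heavily than false alarms (FP).
    """
    best = None
    for t in sorted(set(scores)):
        tp, fp, fn = confusion(scores, labels, t)
        if fbeta(tp, fp, fn) < min_f2:
            continue  # violates the F2 constraint
        cost = cost_fn * fn + cost_fp * fp
        if best is None or cost < best[0]:
            best = (cost, t)
    return None if best is None else best[1]


# Toy classifier scores (illustrative only); label 1 = SQLi, 0 = benign.
scores = [0.02, 0.10, 0.35, 0.40, 0.55, 0.80, 0.90, 0.97]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
chosen = pick_threshold(scores, labels)
print("chosen threshold:", chosen)
```

On this toy data the rule settles on the lowest threshold that still catches every attack, accepting one false alarm rather than a missed attack, which is the trade-off the abstract describes.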
