Phishing Website Detection Using Ensemble-Based Machine Learning Approaches

Rahat, Syed Naimur Rahman

URI: http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/17061

Date: 2025-12-27

Abstract:

Phishing is one of the most common and dangerous cybersecurity threats; it tricks users into exposing sensitive information through fake websites. These attacks use dynamic structures and advanced obfuscation techniques that render traditional detection methods ineffective. This work presents a pragmatic analysis of URL-based phishing detection and proposes a machine learning framework that distinguishes legitimate URLs from phishing URLs with high accuracy. To address class imbalance, the data was balanced using the SMOTE technique, and three efficient models were then trained: LightGBM, Random Forest and XGBoost. These base classifiers were combined into a hybrid stacking ensemble model, called HyPhish-Net, to achieve better detection performance. To improve both interpretability and performance, feature selection and correlation analysis were conducted comprehensively. The experimental results show that all base models achieve robust accuracy, but the proposed HyPhish-Net model scores highest, with an accuracy of 98.6%, a precision of 0.987 and a recall of 0.985. The evaluation showed that the model substantially reduces false positives and false negatives, ensuring highly reliable phishing detection. The performance metrics, including confusion matrices and ROC analysis, showed excellent generalizability of HyPhish-Net across training and testing data. The system attained a robust balance between sensitivity and specificity, outperforming the best individual models on all key performance metrics. Because of its robustness and scalability, the proposed model can be deployed in browsers, email systems and cybersecurity infrastructures.
Ultimately, the study concludes that ensemble-based learning, if implemented carefully, provides an effective solution for automated phishing detection. In summary, this thesis contributes to more secure and adaptive detection mechanisms that protect users in real-world digital environments.
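
The abstract names SMOTE as the balancing step but does not describe the implementation. As a rough illustration of the idea only, the sketch below generates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote_oversample`, its parameters, and the toy feature vectors are assumptions made for this example; they are not taken from the thesis, and a real pipeline would more likely use an established library implementation.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    For each synthetic point: pick a minority sample, pick one of its k
    nearest minority neighbours, and take a random point on the segment
    joining the two. This is a simplified SMOTE-style sketch (hypothetical
    helper, not the thesis code).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest other minority samples (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy, made-up feature vectors (say, URL length and digit count); not thesis data.
phishing = [(72.0, 9.0), (80.0, 12.0), (65.0, 7.0), (90.0, 15.0)]
legitimate_count = 10  # assumed size of the majority class

# Generate enough synthetic phishing samples to match the majority class.
new_points = smote_oversample(phishing, legitimate_count - len(phishing))
balanced_phishing = phishing + new_points
print(len(balanced_phishing))
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already covered by the minority class. Oversampling of this kind is normally applied to the training split only, so that the test data keeps its original class distribution.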