URL Based Fake Website Detection Using Machine Learning Algorithm

Dowla, MD Mursalin; Ahmed, K. M. Nayem

URI: http://dspace.daffodilvarsity.edu.bd:8080/handle/123456789/8802

Date: 2022-01-02

Abstract:

Online platforms, especially social media platforms, have become a very important part of our lives, and we are accustomed to online content and sites. However, many URLs lead to fake sites that are intentionally created to mislead users into giving up certain information. This generally leads to account hacking or information theft. To identify these sites and stop users from visiting such URLs, we discuss machine learning algorithms and apply them to a dataset. The dataset was collected from various online open-source platforms. A total of 20,000 URLs were collected and used, half of which were fake and the other half real. First, we extracted many features from the initial dataset, which were later used to train our model. We used an Anaconda environment to implement the project and a Jupyter notebook to write the necessary code. We successfully extracted the necessary features and applied the machine learning algorithms. The dataset was divided in an 80:20 ratio for training and testing. The best supervised machine learning algorithms were chosen to train our model. Of these, the Random Forest Classifier performed best, achieving the highest accuracy at 97.50%. Finally, the model was saved for later improvement. With this, we believe we have the best machine learning approach for detecting fake content or sites online. We hope it will help detect fake URLs and protect users from such attacks.
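
The abstract does not list the extracted features or show the split, so the following is only a minimal sketch of what URL-based feature extraction and an 80:20 train/test split might look like. The feature names, the helper functions `extract_features` and `split_80_20`, the sample URLs, and their labels are all assumptions made for illustration; they are not the authors' feature set or data. Only the Python standard library is used.

```python
import random
from urllib.parse import urlparse


def extract_features(url: str) -> dict:
    """Return a few illustrative lexical features for one URL.

    These features are hypothetical examples, not the report's feature set.
    """
    # urlparse needs a scheme to locate the host, so add one if it is missing.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
        # Crude check: a host made only of digits and dots is treated as an IP.
        "host_is_ip": int(host.replace(".", "").isdigit()),
        "path_depth": len([part for part in parsed.path.split("/") if part]),
    }


def split_80_20(rows: list, seed: int = 42) -> tuple:
    """Shuffle a copy of rows and split it 80:20 into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Hypothetical sample; label 1 = fake, 0 = real (labels are made up).
    sample = [
        ("http://192.168.0.1/login/verify", 1),
        ("https://www.example.com/about", 0),
        ("http://secure-update.example-login.com/account@confirm", 1),
        ("https://docs.example.org/guide/intro", 0),
        ("http://example.net", 0),
    ]
    rows = [(extract_features(url), label) for url, label in sample]
    train, test = split_80_20(rows)
    print(len(train), "training rows,", len(test), "test rows")
```

The resulting feature dictionaries could then be converted to numeric vectors and passed to a supervised classifier, such as a random forest, for training on the 80% portion and evaluation on the remaining 20%.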
