Abstract:
Phishing is a deceptive practice and a form of cyber-attack whose sole intention is to collect
confidential information by imitating the appearance of a legitimate website. Most people conduct
a broad range of business online: they offer and purchase merchandise, perform various banking
transactions, and even take part in political and social selection through online voting. The
purchaser and the vendor need not meet for any type of transaction, so a purchaser can in some
cases be trading with a deceptive business that does not really exist.
A common hazard comes from so-called phishing websites, which have become a problem for online
banking and e-commerce clients. Phishing websites attempt to trick individuals into revealing
secure data so that the fraudster can gain access to their accounts. These websites look like
legitimate entities and target users who lack knowledge of browser cues and security
indicators.
The aim of this study is to propose an intelligent framework for detecting phishing URLs that
generates a scientific report by evaluating various multi-layer approaches. This scientific report
identifies the best architecture for phishing URL detection and helps developers of anti-phishing
tools make an initial decision about which approach to follow.
This paper proposes a novel phishing URL detection architecture using (a) a Deep Neural Network
(DNN), (b) a Neural Network (NN), and (c) stacking. In the first level of stacking, the base
classifiers provide temporary predictions through cross-validation, along with crisp predictions.
After cross-validation is complete, the second level requires an additional classifier, called the
meta-estimator, which is fitted on the training set and applied to the test set for the final
prediction. Neural networks work well with this dataset in terms of training, time, and
complexity. Two types of neural network architecture are used: a deep neural network with five
layers and an artificial neural network with two layers. Optimized parameters are used for the
neural network architectures, together with five types of adaptive learning optimization
algorithms, and the combination that gives the best result is selected.
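The two-level stacking procedure described above can be illustrated with a minimal sketch in plain Python. It is not the architecture evaluated in this work: the threshold-based base learners (`train_stump`), the accuracy-weighted vote used as the meta-estimator (`fit_meta`), the toy features (URL length and number of dots), and all function names are assumptions made for illustration only.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]
Model = Callable[[Row], float]                     # maps a feature row to a 0/1 score
Trainer = Callable[[List[Row], List[int]], Model]  # fits a model on (X, y)


def train_stump(feature: int) -> Trainer:
    """Hypothetical base learner: threshold one feature midway between class means."""
    def fit(X: List[Row], y: List[int]) -> Model:
        pos = [row[feature] for row, label in zip(X, y) if label == 1]
        neg = [row[feature] for row, label in zip(X, y) if label == 0]
        mean_pos = sum(pos) / len(pos) if pos else 0.0
        mean_neg = sum(neg) / len(neg) if neg else 0.0
        cut = (mean_pos + mean_neg) / 2
        sign = 1.0 if mean_pos >= mean_neg else -1.0
        return lambda row: 1.0 if sign * (row[feature] - cut) > 0 else 0.0
    return fit


def out_of_fold(trainers: List[Trainer], X: List[Row], y: List[int], k: int) -> List[List[float]]:
    """Level 1: each base learner predicts only rows it was not trained on."""
    n = len(X)
    meta_X = [[0.0] * len(trainers) for _ in range(n)]
    for fold in range(k):
        held = [i for i in range(n) if i % k == fold]
        kept = [i for i in range(n) if i % k != fold]
        X_kept = [X[i] for i in kept]
        y_kept = [y[i] for i in kept]
        for j, trainer in enumerate(trainers):
            model = trainer(X_kept, y_kept)
            for i in held:
                meta_X[i][j] = model(X[i])
    return meta_X


def fit_meta(meta_X: List[List[float]], y: List[int]) -> Callable[[List[float]], int]:
    """Level 2 (assumed): weight each base learner by its out-of-fold accuracy."""
    weights = []
    for j in range(len(meta_X[0])):
        correct = sum(1 for row, label in zip(meta_X, y) if int(row[j] > 0.5) == label)
        weights.append(correct / len(y))
    total = sum(weights) or 1.0
    return lambda scores: 1 if sum(w * s for w, s in zip(weights, scores)) / total > 0.5 else 0


def fit_stack(trainers: List[Trainer], X: List[Row], y: List[int], k: int = 5) -> Callable[[Row], int]:
    meta_model = fit_meta(out_of_fold(trainers, X, y, k), y)
    base_models = [trainer(X, y) for trainer in trainers]  # refit on the full training set
    return lambda row: meta_model([model(row) for model in base_models])


# Toy data (assumed): [url_length, number_of_dots]; label 1 = phishing, 0 = legitimate.
X_train = [[20, 1], [25, 2], [22, 1], [30, 2], [18, 1],
           [80, 5], [95, 6], [70, 4], [88, 7], [75, 5]]
y_train = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

predict = fit_stack([train_stump(0), train_stump(1)], X_train, y_train, k=5)
print(predict([90, 6]), predict([21, 1]))
```

The point of the out-of-fold step is that the meta-estimator is fitted on predictions the base classifiers made for rows they had not seen, which keeps the second level from simply rewarding base classifiers that memorised the training set.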
The five-layer deep neural network trained for 50 epochs achieves an accuracy of 0.95, a minimum
mean squared error of 0.30, and a minimum error rate of 0.074. The two-layer neural network
trained for 150 epochs achieves an accuracy of 0.95, a minimum mean squared error of 0.29, and a
minimum error rate of 0.07. Stacked generalization reaches a maximum accuracy of 0.97 in binary
classification and a minimum mean absolute error (MAE) of 2.1.
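For reference, the evaluation measures quoted above are conventionally defined as in the following minimal sketch in plain Python. It follows the usual textbook definitions and is not the evaluation code used in this study. The function name `binary_metrics`, the 0.5 decision threshold, and the toy labels are assumptions, and the sketch is not intended to reproduce the reported figures, whose scale this abstract does not state.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, error rate, MSE and MAE for binary labels and predicted scores."""
    n = len(y_true)
    # Crisp (hard) predictions obtained by thresholding the scores.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    accuracy = sum(int(p == t) for p, t in zip(y_pred, y_true)) / n
    # MSE and MAE are computed on the raw scores, not on the crisp predictions.
    mse = sum((p - t) ** 2 for p, t in zip(y_prob, y_true)) / n
    mae = sum(abs(p - t) for p, t in zip(y_prob, y_true)) / n
    return {"accuracy": accuracy, "error_rate": 1 - accuracy, "mse": mse, "mae": mae}


# Toy example (assumed values): four URLs, one phishing URL scored as legitimate.
metrics = binary_metrics([1, 0, 1, 0], [1.0, 0.0, 0.0, 0.0])
print(metrics)
```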
Machine learning approaches have been used to identify both new malicious URLs and variations of
existing ones effectively. However, as machine learning research has advanced, it can be observed
that deep learning-based architectures perform better than conventional machine learning
algorithms.