Abstract:
The growing use and popularity of Android devices gives malware developers new ways
to distribute malware, packaged in various forms within different applications.
Such malware causes serious information leakage and financial harm. Unethical
programmers and exploit writers repackage malicious code and release it again in the
market in the form of a new application. Regrettably, the repackaged software most often
remains undetected. This research addresses the problem of repackaging
using the Bag-of-Words algorithm, applied to the source code, and evaluates the
results using machine learning. The evaluation results are 0.55 percent
better than the existing source-code-based implementation in this field, owing to modifications to
the Bag-of-Words technique and additional preprocessing of the dataset. In this research a
vocabulary was generated to identify malicious source code structure. Twelve more malicious
patterns were added to the existing 69 malicious patterns. The concept was
implemented in practice as a web application. The proposed methodology offers a
comparatively new approach to analyzing malware source code in order to address malware
repackaging.
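
As a rough illustration of the idea, the following minimal sketch shows how a Bag-of-Words feature vector could be built from source code against a fixed vocabulary of patterns. The vocabulary entries, the tokenizer, and the function name `bag_of_words` are assumptions made for this example only; they are not the 81 patterns or the preprocessing used in this research.

```python
import re
from collections import Counter

# Hypothetical vocabulary: these identifiers are illustrative assumptions,
# not the malicious patterns used in the research.
VOCABULARY = ["sendTextMessage", "getDeviceId", "exec", "DexClassLoader"]

# Simple identifier tokenizer (an assumption; real preprocessing may differ).
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def bag_of_words(source, vocabulary=VOCABULARY):
    """Count how often each vocabulary term occurs as a token in the source."""
    counts = Counter(TOKEN_RE.findall(source))
    return [counts[term] for term in vocabulary]


# Made-up Java-like snippet used only to exercise the function.
sample = """
SmsManager sms = SmsManager.getDefault();
sms.sendTextMessage(number, null, body, null, null);
String id = telephonyManager.getDeviceId();
Runtime.getRuntime().exec(cmd);
Runtime.getRuntime().exec(other);
"""

vector = bag_of_words(sample)
print(vector)  # [1, 1, 2, 0]
```

Vectors of this kind, one per application, could then be passed to a machine learning classifier. The choice of classifier is left open here because the abstract does not name one.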