Abstract:
Android is the most popular mobile operating system, with billions of active users, which has encouraged hackers and cyber-criminals to target it with malware. Accordingly, extensive research has been conducted on Android malware analysis and detection in recent years, and Android has implemented numerous security controls to address the problem, including a unique ID (UID) for each application, system permissions, and its distribution platform, Google Play. In this paper, we evaluate four tree-based machine learning algorithms for detecting Android malware, in conjunction with a substring-based feature selection method for the classifiers. The experiments used 11,120 apps from the DREBIN dataset, of which 5,560 are malware samples and the rest are benign. We find that the Random Forest classifier achieves 97.24% accuracy, outperforming the best previously reported result (around 94% accuracy, obtained by SVM), and thus provides a strong basis for building effective tools for Android malware detection.
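
To make the idea of substring-based feature selection concrete, the following is a minimal sketch and not the method evaluated in this paper. It assumes each app is described by a set of string features and keeps only those features whose name contains one of a few chosen substrings. The substrings in `KEEP_SUBSTRINGS`, the helper names `select_features` and `vectorize`, and the two toy apps are all hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence, Set

# Hypothetical substrings used to filter features; the actual selection
# criteria of the paper are not stated in the abstract.
KEEP_SUBSTRINGS = ("permission", "api_call")


def select_features(vocabulary: Sequence[str],
                    substrings: Sequence[str] = KEEP_SUBSTRINGS) -> List[str]:
    """Keep features whose name contains at least one of the substrings."""
    return sorted({f for f in vocabulary if any(s in f for s in substrings)})


def vectorize(app_features: Set[str], selected: Sequence[str]) -> List[int]:
    """Binary vector: 1 if the app exhibits the selected feature, else 0."""
    return [1 if f in app_features else 0 for f in selected]


# Toy, made-up apps for illustration only (not taken from any dataset).
apps = {
    "app_a": {"permission::android.permission.SEND_SMS",
              "url::example.com"},
    "app_b": {"api_call::android/telephony/SmsManager;->sendTextMessage",
              "permission::android.permission.INTERNET"},
}

# Build the full feature vocabulary, filter it, then vectorize each app.
vocabulary = sorted(set().union(*apps.values()))
selected = select_features(vocabulary)
vectors = {name: vectorize(feats, selected) for name, feats in apps.items()}
```

Binary vectors of this kind could then be passed to a tree-based classifier such as a Random Forest. Filtering the vocabulary first keeps the input dimensionality small, which is one plausible motivation for a selection step of this sort.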