2SMaRT: A Two-Stage Machine Learning-Based Approach for Run-Time Specialized Hardware-Assisted Malware Detection

Hossein Sayadi (1,a), Hosein Mohammadi Makrani (1,b), Sai Manoj Pudukotai Dinakarrao (1,c), Tinoosh Mohsenin (2), Avesta Sasan (1,d), Setareh Rafatirad (1,e) and Houman Homayoun (1,f)
(1) Department of Electrical and Computer Engineering, George Mason University
(a) hsayadi@gmu.edu
(b) hmohamm8@gmu.edu
(c) spudukot@gmu.edu
(d) asasan@gmu.edu
(e) srafatir@gmu.edu
(f) hhomayou@gmu.edu
(2) Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County
tinoosh@umbc.edu

ABSTRACT


Hardware-assisted Malware Detection (HMD) has emerged as a promising way to improve the security of computer systems using Hardware Performance Counter (HPC) information collected at run-time. Several recent studies have proposed machine learning-based solutions that identify malware using HPCs, but they rely on a large number of microarchitectural events to achieve high accuracy and detection rate. More importantly, they have largely overlooked complexity-effective prediction of malware classes at run-time. As we show in this work, the detection performance of malware classifiers depends heavily on the number of available HPCs and varies significantly across classes of malware. Modern microprocessors can capture only a limited number of HPCs simultaneously, so existing solutions must run an application multiple times to collect a sufficient number of microarchitectural events, which makes run-time malware detection with high detection performance a challenging problem. In response, we first identify the most important HPCs for HMD using an effective feature reduction method. We then develop a specialized two-stage run-time HMD, referred to as 2SMaRT. In the first stage, 2SMaRT uses a multiclass classification technique to classify each application as either benign or one of the malware classes (Virus, Rootkit, Backdoor, and Trojan). In the second stage, to achieve high detection performance, 2SMaRT deploys the machine learning model that works best for each class of malware. To realize an effective run-time solution that relies only on the available HPCs, 2SMaRT is further customized using an ensemble learning technique to boost the performance of general malware detectors. The experimental results show that 2SMaRT using the ensemble technique with just 4 HPCs outperforms state-of-the-art classifiers with 8 HPCs by up to 31.25% in terms of detection performance, on average across different classes of malware.

Keywords: Run-Time Malware Detection, Hardware Performance Counters, Machine Learning.
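
The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the feature vectors are synthetic (four made-up counter values per sample), a simple nearest-centroid rule stands in for both the stage-one multiclass classifier and the stage-two per-class models, and the names `NearestCentroid`, `TwoStageDetector`, and `make_samples` are hypothetical. The abstract does not say which learners 2SMaRT uses, how a stage-one "benign" prediction is handled, or how the ensemble is built, so this sketch returns "benign" directly in that case and omits the ensemble step.

```python
import math
import random

MALWARE_CLASSES = ["virus", "rootkit", "backdoor", "trojan"]


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class NearestCentroid:
    """Toy classifier used as a stand-in for any trained model."""

    def fit(self, X, y):
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)
        # One centroid (per-feature mean) for each label.
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict_one(self, row):
        return min(self.centroids, key=lambda lbl: distance(row, self.centroids[lbl]))


class TwoStageDetector:
    """Stage 1 picks a class; stage 2 runs that class's specialized detector."""

    def __init__(self, stage1, specialists):
        self.stage1 = stage1
        self.specialists = specialists

    def predict_one(self, row):
        predicted = self.stage1.predict_one(row)
        if predicted == "benign":
            return "benign"  # assumption: no second stage for benign predictions
        verdict = self.specialists[predicted].predict_one(row)
        return predicted if verdict == "malware" else "benign"


def make_samples(rng, mean, n=50):
    """Synthetic 4-value samples around `mean` (not real HPC traces)."""
    return [[rng.gauss(m, 1.0) for m in mean] for _ in range(n)]


rng = random.Random(0)
MEANS = {  # made-up, well-separated class means
    "benign": [10.0, 10.0, 10.0, 10.0],
    "virus": [50.0, 10.0, 10.0, 10.0],
    "rootkit": [10.0, 50.0, 10.0, 10.0],
    "backdoor": [10.0, 10.0, 50.0, 10.0],
    "trojan": [10.0, 10.0, 10.0, 50.0],
}
data = {label: make_samples(rng, mean) for label, mean in MEANS.items()}

# Stage 1: one multiclass model over benign plus all malware classes.
X = [row for rows in data.values() for row in rows]
y = [label for label, rows in data.items() for _ in rows]
stage1 = NearestCentroid().fit(X, y)

# Stage 2: one binary (benign vs. malware) model per malware class.
specialists = {}
for cls in MALWARE_CLASSES:
    Xb = data["benign"] + data[cls]
    yb = ["benign"] * len(data["benign"]) + ["malware"] * len(data[cls])
    specialists[cls] = NearestCentroid().fit(Xb, yb)

detector = TwoStageDetector(stage1, specialists)
print(detector.predict_one([50.0, 10.0, 10.0, 10.0]))
```

In a real system each entry in `specialists` would presumably be a different model type chosen per malware class, which is the point of the second stage; the dictionary dispatch above only shows where that choice would plug in.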


