RADAR: Run-time Adversarial Weight Attack Detection and Accuracy Recovery
Jingtao Li1,a, Adnan Siraj Rakin1,b, Zhezhi He2, Deliang Fan1,c and Chaitali Chakrabarti1,d
1 School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ, 85287
a jingtao1@asu.edu
b asrakin@asu.edu
c dfan@asu.edu
d chaitali@asu.edu
2 Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai
zhezhi.he@sjtu.edu.cn
ABSTRACT
Adversarial attacks on neural network weights, such as the progressive bit-flip attack (PBFA), can cause catastrophic accuracy degradation by flipping only a very small number of bits. Furthermore, PBFA can be conducted at run time on weights stored in DRAM main memory. In this work, we propose RADAR, a Run-time adversarial weight Attack Detection and Accuracy Recovery scheme that protects deep neural network (DNN) weights against PBFA. We form groups from weights that are interspersed within a layer and apply a checksum-based algorithm to each group to derive a 2-bit signature. At run time, the 2-bit signature is recomputed and compared with a securely stored golden signature to detect bit-flip attacks on that group. Once an attack is detected, we zero out all the weights in the affected group to mitigate the accuracy drop caused by the malicious bit-flips. The proposed scheme is embedded in the inference computation stage. For the ResNet-18 ImageNet model, our method detects 9.6 out of 10 bit-flips on average. For the same model, the proposed accuracy recovery scheme restores accuracy from below 1% (the level reached after 10 bit-flips) to above 69%. The proposed method has extremely low time and storage overhead. System-level simulation on gem5 shows that RADAR adds less than 1% to the inference time, making the scheme highly suitable for run-time attack detection and mitigation.
Keywords: Neural Networks, Weight Attack, Run-Time Detection, Protection.
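The detect-then-zero flow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the quantization format, the interleaving rule, or how the 2-bit signature is taken from the checksum. The sketch assumes 8-bit signed integer weights, assigns weight i to group i mod (number of groups), and uses bits 7 and 8 of the group sum as the signature. All function names are hypothetical.

```python
def make_groups(n_weights, group_size):
    """Interleaved grouping (assumed rule): weight i goes to group i % n_groups."""
    n_groups = -(-n_weights // group_size)  # ceiling division
    groups = [[] for _ in range(n_groups)]
    for i in range(n_weights):
        groups[i % n_groups].append(i)
    return groups


def signature(weights, idx):
    """2-bit signature: bits 7 and 8 of the group checksum (assumed choice)."""
    checksum = sum(weights[i] for i in idx)
    return (checksum >> 7) & 0b11


def detect_and_recover(weights, groups, golden):
    """Zero out every group whose signature differs from the golden one."""
    flagged = []
    for gid, idx in enumerate(groups):
        if signature(weights, idx) != golden[gid]:
            flagged.append(gid)
            for i in idx:
                weights[i] = 0
    return flagged


def flip_msb(w):
    """Flip the most significant bit of an int8 value (two's complement)."""
    u = (w & 0xFF) ^ 0x80
    return u - 256 if u >= 128 else u


# Toy layer of int8 weights; golden signatures are computed before any attack.
weights = [3, -7, 12, 0, 5, -1, 9, -4]
groups = make_groups(len(weights), group_size=4)
golden = [signature(weights, g) for g in groups]

# Simulate one malicious bit-flip on the MSB of weight 2, then run detection.
weights[2] = flip_msb(weights[2])
flagged = detect_and_recover(weights, groups, golden)
```

In this toy run, the MSB flip changes the checksum of the attacked group by 128, which toggles bit 7 of the sum, so the group is flagged and zeroed while the other group is left untouched. Flips in lower-order bits would not necessarily change this particular signature.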