Malware classifiers are subject to training-time exploitation due to the need
to regularly retrain using samples collected from the wild. Recent work has
demonstrated the feasibility of backdoor attacks against malware classifiers,
and yet the stealthiness of such attacks is not well understood. In this paper,
we investigate this phenomenon under the clean-label setting (i.e., attackers
do not have complete control over the training or labeling process).
Empirically, we show that existing backdoor attacks against malware classifiers are
still detectable by recent defenses such as MNTD. To improve stealthiness, we
propose a new attack, Jigsaw Puzzle (JP), based on the key observation that
malware authors have little to no incentive to protect any malware other than
their own. As such, Jigsaw Puzzle learns a trigger to complement
the latent patterns of the malware author’s samples, and activates the backdoor
only when the trigger and the latent pattern are pieced together in a sample.
We further focus on realizable triggers in the problem space (e.g., software
code) using bytecode gadgets broadly harvested from benign software. Our
evaluation confirms that Jigsaw Puzzle is effective as a backdoor, remains
stealthy against state-of-the-art defenses, and is a threat in realistic
settings that depart from reasoning about feature-space-only attacks. We
conclude by exploring promising approaches to improve backdoor defenses.
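
To make the "pieced together" condition concrete, the following is a minimal toy sketch, not the paper's method: it does not learn a trigger or train a classifier. It only models the intended behavior of a selective backdoor over sets of active binary features. The feature indices and the names TRIGGER, LATENT, backdoor_fires, add_trigger, and toy_predict are all hypothetical and chosen for illustration.

```python
from typing import FrozenSet

# Hypothetical feature indices, chosen only for illustration.
TRIGGER: FrozenSet[int] = frozenset({2, 5})  # features the trigger switches on
LATENT: FrozenSet[int] = frozenset({0, 7})   # features assumed typical of the attacker's own malware


def backdoor_fires(active: FrozenSet[int]) -> bool:
    """True only when every trigger feature AND every latent feature is active."""
    return TRIGGER <= active and LATENT <= active


def add_trigger(active: FrozenSet[int]) -> FrozenSet[int]:
    """Return a copy of the sample with the trigger features switched on."""
    return frozenset(active | TRIGGER)


def toy_predict(active: FrozenSet[int], is_malware: bool) -> str:
    """Stand-in for a backdoored classifier: labels are otherwise correct,
    but malware is reported as benign when the backdoor condition holds."""
    if is_malware and backdoor_fires(active):
        return "benign"
    return "malicious" if is_malware else "benign"


own_sample = frozenset({0, 3, 7})   # carries the assumed latent pattern
other_sample = frozenset({1, 4})    # another author's malware, no latent pattern

print(toy_predict(own_sample, True))                 # no trigger yet
print(toy_predict(add_trigger(own_sample), True))    # trigger + latent pattern
print(toy_predict(add_trigger(other_sample), True))  # trigger alone
```

In this toy model, the trigger alone does not flip the label for another author's sample; only the combination of trigger and latent pattern does, which is the selectivity the abstract describes.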

Authors: Limin Yang, Zhi Chen, Jacopo Cortellazzi, Feargus Pendlebury, Kevin Tu, Fabio Pierazzi, Lorenzo Cavallaro, Gang Wang