Consider patch attacks, where at test time an adversary manipulates a test
image with a patch in order to induce a targeted misclassification. We consider
a recent defense against patch attacks, Patch-Cleanser (Xiang et al. [2022]). The
Patch-Cleanser algorithm requires a prediction model to have a “two-mask
correctness” property, meaning that the prediction model should correctly
classify any image when any two blank masks replace portions of the image.
Xiang et al. train a prediction model to be robust to two-mask operations by
augmenting the training set with pairs of masks placed at random locations on
the training images and performing empirical risk minimization (ERM) on the
augmented dataset.
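
The two-mask operation and the augmentation step described above can be illustrated with a minimal sketch. This is not the Patch-Cleanser implementation: the function names, the square mask shape, the zero fill value, and the stride-based grid of mask positions are all assumptions made for illustration, and `predict` stands in for any classifier that maps an image array to a label.

```python
import itertools
import numpy as np

def apply_masks(image, corners, mask_size, fill=0.0):
    """Return a copy of `image` with a square blank mask at each (row, col) corner.
    Square masks and a constant fill value are illustrative assumptions."""
    out = image.copy()
    for r, c in corners:
        out[r:r + mask_size, c:c + mask_size] = fill
    return out

def random_two_mask(image, mask_size, rng):
    """Training-time augmentation: blank two masks at independently sampled locations."""
    h, w = image.shape[:2]
    corners = [
        (int(rng.integers(0, h - mask_size + 1)),
         int(rng.integers(0, w - mask_size + 1)))
        for _ in range(2)
    ]
    return apply_masks(image, corners, mask_size)

def is_two_mask_correct(predict, image, label, mask_size, stride):
    """Check that `predict` returns `label` for every pair of masks on a grid.
    The stride-based grid is a hypothetical stand-in for the defense's mask set."""
    h, w = image.shape[:2]
    grid = [
        (r, c)
        for r in range(0, h - mask_size + 1, stride)
        for c in range(0, w - mask_size + 1, stride)
    ]
    return all(
        predict(apply_masks(image, pair, mask_size)) == label
        for pair in itertools.combinations_with_replacement(grid, 2)
    )
```

Under these assumptions, ERM on the augmented dataset amounts to training on `random_two_mask(x, mask_size, rng)` in place of each training image `x`, while `is_two_mask_correct` checks the property over all mask pairs on the chosen grid for a single labeled image.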

However, in the non-realizable setting, where no predictor is perfectly correct
on all two-mask operations on all images, we exhibit an example where ERM
fails. To overcome this challenge, we propose a different algorithm that
provably learns a predictor robust to all two-mask operations using an ERM
oracle, based on prior work by Feige et al. [2015]. We also extend this result
to a multiple-group setting, where we can learn a predictor that achieves low
robust loss on all groups simultaneously.

Authors: Saba Ahmadi, Avrim Blum, Omar Montasser, Kevin Stangl
