Nice talk! I disagree that adversarial robustness has only one attack and that it differs from the rest of computer security in that way.
Even if a model withstands the standard PGD attack inside a tight epsilon ball, you still cannot say there is no adversarial image that breaks it. Enumerating all possible attacks remains very difficult, if not impossible, for now.
Add to that the fact that the epsilon ball is meaningless from a human perspective. If the ball gets large enough, the perturbations range, in our interpretation, from 'oh yeah, that is definitely still a cat' to 'this is just random gibberish and not even a cat anymore, so I cannot blame the AI for saying something wildly different'.
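(To make "one attack" concrete: below is roughly what a single PGD run inside an L-infinity epsilon ball looks like. A minimal PyTorch sketch, assuming a classifier `model`, inputs `x` scaled to [0, 1], labels `y`, and made-up values for `eps`, `alpha`, and `steps`; it finds one adversarial point in the ball and proves nothing about the rest of it.)

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Start from a random point inside the L-infinity epsilon ball around x.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss along the gradient sign...
        x_adv = x_adv.detach() + alpha * grad.sign()
        # ...then project back into the epsilon ball and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()
```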
This problem has been studied; see formal verification of neural networks, e.g. alpha-beta-CROWN and MILP/SMT-based solvers.
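The core trick behind those verifiers is to bound the network's behaviour over the whole epsilon ball at once instead of searching for individual attacks. As a rough illustration only (this is plain interval bound propagation, far looser than alpha-beta-CROWN or MILP/SMT encodings, and the `layers`/`y` setup is hypothetical):

```python
import torch
import torch.nn as nn

def ibp_bounds(layers, x, eps):
    # Propagate element-wise lower/upper bounds of the L-infinity ball through the net.
    lb, ub = x - eps, x + eps
    for layer in layers:
        if isinstance(layer, nn.Linear):
            center, radius = (lb + ub) / 2, (ub - lb) / 2
            mid = center @ layer.weight.t() + layer.bias
            rad = radius @ layer.weight.abs().t()
            lb, ub = mid - rad, mid + rad
        elif isinstance(layer, nn.ReLU):
            # ReLU is monotone, so it maps bounds to bounds directly.
            lb, ub = lb.clamp(min=0), ub.clamp(min=0)
    return lb, ub

def provably_robust(layers, x, y, eps):
    # Robust if the true class's worst-case logit beats every other class's best case.
    lb, ub = ibp_bounds(layers, x, eps)
    others = torch.cat([ub[..., :y], ub[..., y + 1:]], dim=-1)
    return bool((lb[..., y] > others.max(dim=-1).values).all())
```

If this returns True, no perturbation inside the ball can flip the prediction; if it returns False, the bounds may simply be too loose to tell, which is exactly the gap the tighter verifiers try to close.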
@StijnRadboud Currently, neural networks are not even robust against very small epsilon, e.g. a 1-5% change in pixel values. All of these attacks produce changes that are imperceptible to humans.
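For scale, a quick conversion of that budget to 8-bit pixel values (eps = 8/255 is the common benchmark setting, not a number from the talk):

```python
# A 1-5% L-infinity budget on images scaled to [0, 1] corresponds to moving each
# pixel by only a handful of the 256 intensity levels, hence the imperceptibility.
eps = 8 / 255  # the usual CIFAR-10 benchmark budget
print(f"8/255 = {eps:.4f} of the pixel range, i.e. about {eps * 100:.1f}% per pixel")
```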