Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
The copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

SoundnessBench: A Soundness Benchmark for Neural Network Verifiers

Created by
  • Haebom

Author

Xingjian Zhou, Keyi Shen, Andy Xu, Hongji Xu, Cho-Jui Hsieh, Huan Zhang, Zhouxing Shi

Outline

This paper develops SoundnessBench, a new benchmark for testing the soundness of neural network (NN) verifiers. The benchmark targets challenging cases: its instances contain deliberately planted counterexamples that are hidden and hard for attack-based methods to find, so any verifier that certifies such an instance has a demonstrable soundness bug. SoundnessBench covers a variety of model architectures, activation functions, and input data, and it successfully uncovers bugs in state-of-the-art NN verifiers.
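Below is a minimal sketch (not the authors' code) of why a planted counterexample makes soundness testable: if a known counterexample lies inside the perturbation ball, a verifier that still certifies the instance is provably unsound. The toy `model`, `buggy_verifier`, and `planted_cex` names below are illustrative assumptions.

```python
import numpy as np

def model(x):
    # Toy "network": a fixed linear classifier returning the predicted class.
    w = np.array([1.0, -1.0])
    return int(w @ x > 0)

def buggy_verifier(x_center, epsilon, label):
    # A deliberately unsound verifier: it always certifies robustness
    # inside the L-infinity ball of radius epsilon around x_center.
    return "verified"

# Benchmark-style instance: a clean input, a perturbation budget, and a
# counterexample planted inside the ball.
x_center = np.array([0.3, 0.2])
epsilon = 0.15
label = model(x_center)                  # clean prediction
planted_cex = np.array([0.18, 0.3])      # inside the ball, flips the prediction

claim = buggy_verifier(x_center, epsilon, label)
inside_ball = np.all(np.abs(planted_cex - x_center) <= epsilon)
flips_label = model(planted_cex) != label

if claim == "verified" and inside_ball and flips_label:
    print("Soundness bug: the verifier certified an instance with a known counterexample.")
```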

Takeaways, Limitations

Takeaways:
We propose a new benchmark to evaluate the soundness of NN verifiers, addressing the limitations of existing benchmarks.
We develop a training method that generates instances containing hidden counterexamples, which effectively exposes soundness bugs in verifiers (see the sketch after this list).
SoundnessBench supports a wide range of verification scenarios, covering a variety of model architectures, activation functions, and input data.
We successfully identify bugs in state-of-the-art NN verifiers, demonstrating the benchmark's practical applicability.
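The training idea above might look roughly like the following sketch, which is my illustration rather than the paper's actual procedure: a small network is trained to behave robustly on sampled perturbations of a clean input while misclassifying one specific planted point, so that attack-based baselines rarely find the counterexample. All names and hyperparameters (`x0`, `x_cex`, `eps`, the sampling scheme) are assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
ce = nn.CrossEntropyLoss()

x0 = torch.tensor([[0.3, 0.2]])       # clean input, intended class 0
x_cex = torch.tensor([[0.18, 0.3]])   # planted counterexample inside the eps-ball, class 1
eps = 0.15

for step in range(500):
    # Random points in the eps-ball around x0 that the model is trained to keep
    # at class 0, so search-based attacks rarely stumble onto x_cex.
    x_near = x0 + (torch.rand(32, 2) * 2 - 1) * eps

    loss = (ce(net(x0), torch.tensor([0]))                         # correct on the clean input
            + ce(net(x_cex), torch.tensor([1]))                    # misclassified at the planted point
            + ce(net(x_near), torch.zeros(32, dtype=torch.long)))  # apparently robust elsewhere
    opt.zero_grad()
    loss.backward()
    opt.step()
```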
Limitations:
The paper may lack a detailed description of how specific benchmark instances are constructed and characterized.
Further validation is needed to determine whether SoundnessBench can catch all types of soundness errors in NN verifiers.
The way hidden counterexamples are generated may be biased toward certain types of neural networks.