This paper develops "SoundnessBench," a novel benchmark for testing the soundness of neural network (NN) verifiers. The benchmark targets challenging cases: instances that contain deliberately hidden counterexamples which are difficult to find, so any verifier that claims to have proven such an instance is demonstrably unsound. SoundnessBench covers a variety of model architectures, activation functions, and input data, and it successfully uncovers bugs in state-of-the-art NN verifiers.
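To make the soundness-checking logic concrete, here is a minimal sketch (not the authors' code) of how a benchmark instance with a known hidden counterexample can expose an unsound verifier: if the verifier claims the robustness property is "verified" while the stored counterexample in fact violates it, the verifier is provably unsound. The function names, the L-infinity perturbation model, and parameters such as `eps` are illustrative assumptions.

```python
import torch

def violates_robustness(model: torch.nn.Module,
                        x: torch.Tensor,
                        counterexample: torch.Tensor,
                        eps: float) -> bool:
    """True if `counterexample` lies inside the L-inf ball of radius
    `eps` around `x` and changes the model's predicted label."""
    inside_ball = bool(torch.all((counterexample - x).abs() <= eps))
    clean_label = model(x.unsqueeze(0)).argmax(dim=1)
    adv_label = model(counterexample.unsqueeze(0)).argmax(dim=1)
    return inside_ball and bool(clean_label != adv_label)

def claim_is_sound(verifier_claim: str,
                   model: torch.nn.Module,
                   x: torch.Tensor,
                   counterexample: torch.Tensor,
                   eps: float) -> bool:
    """A 'verified' claim on an instance whose hidden counterexample
    actually violates the property is unsound; any other outcome
    (e.g. 'falsified' or 'unknown') does not prove unsoundness."""
    if verifier_claim == "verified" and violates_robustness(
            model, x, counterexample, eps):
        return False  # unsound: a valid counterexample exists by construction
    return True
```

Because the counterexamples are built into the benchmark but hidden from the verifier, this check requires no ground-truth proof of robustness, only a forward pass through the model.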