Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

RBT4DNN: Requirements-based Testing of Neural Networks

Created by
  • Haebom

Authors

Nusrat Jahan Mozumder, Felipe Toledo, Swaroopa Dola, Matthew B. Dwyer

Outline

This paper proposes RBT4DNN, a requirements-based testing method that uses natural language requirements specifications to address the difficulty of formulating functional requirements for deep neural networks (DNNs). RBT4DNN defines a semantic feature space using a glossary and formalizes the preconditions of functional requirements as logical combinations of these features. Training data consistent with each feature combination is used to fine-tune a generative model so that it reliably generates test inputs satisfying the precondition. These tests are then run on the trained DNN, and its outputs are compared with the behavior expected by the requirement's postcondition. The paper presents two use cases for RBT4DNN: detecting defects in DNNs, and providing feedback on model generalization through requirements-based exploration of model behavior during development. Evaluation results indicate that RBT4DNN-generated tests are realistic, diverse, and consistent with requirement preconditions, enabling targeted analysis of model behavior and effective defect detection.
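The workflow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature names (`vehicle_ahead`, `within_10m`), the `Requirement` class, the helper functions, and the toy model are all hypothetical, and the fine-tuned generative model is replaced here by simply selecting labeled samples whose glossary features satisfy the precondition.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# A sample carries an id and boolean glossary features (hypothetical layout).
Sample = Dict[str, Any]
Predicate = Callable[[Dict[str, bool]], bool]


def feature(name: str) -> Predicate:
    """Predicate that is true when the named glossary feature is present."""
    return lambda feats: feats.get(name, False)


def all_of(*preds: Predicate) -> Predicate:
    """Logical AND of feature predicates."""
    return lambda feats: all(p(feats) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Logical OR of feature predicates."""
    return lambda feats: any(p(feats) for p in preds)


@dataclass(frozen=True)
class Requirement:
    name: str
    precondition: Predicate                # over glossary features of the input
    postcondition: Callable[[str], bool]   # over the model's output


def select_matching(dataset: List[Sample], req: Requirement) -> List[Sample]:
    """Stand-in for test generation: keep samples that satisfy the precondition."""
    return [s for s in dataset if req.precondition(s["features"])]


def find_violations(model: Callable[[Sample], str],
                    tests: List[Sample],
                    req: Requirement) -> List[Sample]:
    """Run each test and collect inputs whose output breaks the postcondition."""
    return [t for t in tests if not req.postcondition(model(t))]


# Hypothetical requirement: a vehicle ahead within 10 m implies deceleration.
req = Requirement(
    name="R1",
    precondition=all_of(feature("vehicle_ahead"), feature("within_10m")),
    postcondition=lambda out: out == "decelerate",
)

dataset = [
    {"id": "a", "features": {"vehicle_ahead": True, "within_10m": True}},
    {"id": "b", "features": {"vehicle_ahead": True, "within_10m": False}},
    {"id": "c", "features": {"vehicle_ahead": False, "within_10m": False}},
    {"id": "d", "features": {"vehicle_ahead": True, "within_10m": True}},
]


def toy_model(sample: Sample) -> str:
    """Toy stand-in for the DNN under test; deliberately wrong on sample 'd'."""
    return "decelerate" if sample["id"] == "a" else "accelerate"


matching = select_matching(dataset, req)
failures = find_violations(toy_model, matching, req)
```

In this sketch, `matching` holds the inputs that satisfy the precondition and `failures` holds those for which the toy model violates the postcondition. In the method summarized above, the selection step would instead supply training data for fine-tuning a generative model that produces new test inputs.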

Takeaways, Limitations

Takeaways:
Presents a novel requirements-based testing method that addresses challenges in DNN testing.
Generates DNN test inputs from natural language requirements specifications.
Enables targeted analysis of model behavior and effective defect detection.
Provides feedback on model generalization during development.
Limitations:
Additional experiments and verification are needed to determine the applicability and effectiveness of the proposed method.
The accuracy and comprehensiveness of the glossary can affect test results.
Further research is needed on the method's applicability and scalability to complex DNNs and their requirements.
Errors in natural language processing may cause inaccuracies in test input generation.