Daily Arxiv

This page curates papers on artificial intelligence published around the world.
The summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, please cite the source.

Inferring Pluggable Types with Machine Learning

Created by
  • Haebom

Author

Kazi Amanul Islam Siddiqui, Martin Kellogg

Outline

This paper studies automatic type qualifier inference for pluggable type systems, which extend a programming language's type system with programmer-defined semantic properties. Specifically, it presents a machine-learning method for automatically inferring type qualifiers, making it easy to apply pluggable type systems to legacy codebases. To this end, the authors propose a novel representation, NaP-AST, and evaluate several model architectures, including a Graph Transformer Network (GTN), a Graph Convolutional Network, and a Large Language Model. They validate the models on 12 open-source programs used in a previous evaluation of the NullAway pluggable type checker, with the GTN demonstrating the best performance. They also estimate how many Java classes are required for a trained model to perform well.
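
For context, here is an illustrative sketch (not taken from the paper or its evaluation subjects) of what inferred type qualifiers look like in a NullAway setting: inference amounts to deciding which fields, parameters, and return types should carry a @Nullable qualifier so the checker can verify the code. The class and names below are hypothetical, and a locally defined @Nullable annotation is used only so the snippet compiles on its own; in practice a standard annotation such as org.jspecify.annotations.Nullable would be used.

import java.lang.annotation.ElementType;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;

// Minimal @Nullable qualifier so the snippet is self-contained; real projects
// would use a standard nullness annotation recognized by NullAway.
@Target({ElementType.TYPE_USE, ElementType.METHOD, ElementType.PARAMETER, ElementType.FIELD})
@interface Nullable {}

// Hypothetical legacy class. The @Nullable qualifiers mark the positions an
// inference model would be expected to annotate so NullAway can check the code.
public class UserCache {
    private final Map<String, String> emails = new HashMap<>();

    // Inferred qualifier: the lookup can miss, so the return value may be null.
    public @Nullable String lookupEmail(String userId) {
        return emails.get(userId);
    }

    // Inferred qualifier: callers may pass null to clear an entry.
    public void updateEmail(String userId, @Nullable String email) {
        if (email == null) {
            emails.remove(userId);
        } else {
            emails.put(userId, email);
        }
    }
}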

Takeaways, Limitations

Takeaways:
Machine learning can lower the barrier to adopting pluggable type systems, offering a path to improved type safety in legacy codebases.
The new NaP-AST representation improves the efficiency of type qualifier inference.
The GTN model achieves high recall (0.89) and a precision of 0.6, demonstrating its effectiveness on real open-source projects.
Guidance on the number of Java classes needed for training makes model development more efficient.
Limitations:
The relatively low precision (0.6) may result in false positives.
Performance can degrade due to model overfitting depending on the number of Java classes used for training.
Further research is needed to apply the model in real-world settings (e.g., tuning, generalization).