Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Call Me Maybe: Enhancing JavaScript Call Graph Construction using Graph Neural Networks

Created by
  • Haebom

Authors

Masudul Hasan Masud Bhuiyan, Gianluca De Stefano, Giancarlo Pellegrino, Cristian-Alexandru Staicu

Outline

In this paper, we present GRAPHIA, a novel approach for building more accurate call graphs of JavaScript programs. Because of JavaScript's complex language features, existing static analysis techniques produce call graphs that are incomplete or inaccurate. GRAPHIA frames the recovery of missing call edges as link prediction with graph neural networks (GNNs), learning from imperfect static call information and dynamic execution information. We construct program graphs containing multiple types of edges and conduct experiments on 50 popular JavaScript libraries. The correct target function is the top-ranked candidate in more than 42% of unresolved call cases and is ranked within the top five candidates in more than 72% of cases, which improves the efficiency of static analysis. This is the first attempt to apply GNN-based link prediction to the entire program graph.
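To make the idea of ranking call targets concrete, here is a minimal sketch of link prediction as candidate ranking. It is not GRAPHIA's implementation: the embeddings are hand-written placeholders standing in for what a trained GNN would produce, the dot-product scorer is an assumption, and the names `rank_candidates`, `parseConfig`, `renderView`, and `loadPlugin` are hypothetical.

```python
from typing import Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product, used here as a stand-in link score (an assumption)."""
    return sum(x * y for x, y in zip(a, b))


def rank_candidates(
    call_site: Sequence[float],
    functions: Dict[str, Sequence[float]],
) -> List[str]:
    """Return function names ordered from most to least likely call target."""
    return sorted(
        functions,
        key=lambda name: dot(call_site, functions[name]),
        reverse=True,
    )


# Hypothetical embeddings; in a real system a trained GNN would produce these.
call_site_embedding = [0.9, 0.1, 0.3]
function_embeddings = {
    "parseConfig": [0.8, 0.2, 0.1],
    "renderView": [0.1, 0.9, 0.2],
    "loadPlugin": [0.7, 0.0, 0.6],
}

ranking = rank_candidates(call_site_embedding, function_embeddings)
print(ranking)
```

An analyst would then inspect the candidates from the top of `ranking` downward, which is why the position of the correct target in the list matters.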

Takeaways, Limitations

Takeaways:
Demonstrates that GNN-based link prediction can improve the recall of JavaScript call graph generation.
Suggests that more accurate call graphs can be generated by leveraging imperfect static and dynamic information.
Validates GRAPHIA's effectiveness through experiments on 50 popular JavaScript libraries.
May contribute to improving the accuracy of static analysis tools.
Limitations:
Results are hard to compare with standard machine learning evaluations, because the metric is the proportion of cases in which the correct target function appears among the top k candidates, rather than a ROC curve.
Generalization to JavaScript code of different styles and complexity still needs to be verified.
The training and inference processes of GNN models can be computationally expensive.
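
The top-k metric mentioned in the limitations can be sketched as follows. This is an illustrative computation on made-up rankings, not the paper's evaluation code; the function name `top_k_hit_rate` and the call site and function labels are hypothetical.

```python
from typing import Dict, List


def top_k_hit_rate(
    rankings: Dict[str, List[str]],
    truth: Dict[str, str],
    k: int,
) -> float:
    """Fraction of call sites whose true target is among the first k candidates."""
    if not rankings:
        return 0.0
    hits = sum(
        1 for site, ranked in rankings.items() if truth[site] in ranked[:k]
    )
    return hits / len(rankings)


# Hypothetical ranked candidates per call site (labels are illustrative only).
rankings = {
    "site_a": ["f1", "f2", "f3"],
    "site_b": ["f2", "f3", "f1"],
    "site_c": ["f3", "f1", "f2"],
    "site_d": ["f1", "f3", "f2"],
}
truth = {"site_a": "f1", "site_b": "f1", "site_c": "f1", "site_d": "f2"}

top1 = top_k_hit_rate(rankings, truth, k=1)
top2 = top_k_hit_rate(rankings, truth, k=2)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")
```

Unlike a ROC curve, this measure depends on how many candidates each call site has, which is one reason the numbers are hard to set beside threshold-based classification results.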