Daily Arxiv

This page organizes papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
The copyright of each paper belongs to its authors and their institutions; when sharing, simply cite the source.

Analyzing Latent Concepts in Code Language Models

Created by
  • Haebom

Authors

Arushi Sharma, Vedant Pungliya, Christopher J. Quinn, Ali Jannesari

Code Concept Analysis (CoCoA): A Global Interpretability Framework for Code Language Models

Outline

This paper presents a method for interpreting the internal workings of language models trained on code, targeting applications that require trustworthiness, transparency, and semantic robustness. The authors propose Code Concept Analysis (CoCoA), a global post-hoc interpretability framework that clusters contextualized token embeddings into human-interpretable concept groups, uncovering the lexical, syntactic, and semantic structures that emerge in the representation space of a code language model. To label latent concepts across levels of abstraction in a scalable way, they introduce a hybrid annotation pipeline that combines static-analysis-based phrase alignment with a prompt-engineered large language model (LLM). Experimental evaluations across multiple models and tasks show that CoCoA remains stable under semantics-preserving perturbations (average cluster sensitivity index, CSI = 0.288) and discovers concepts that evolve predictably with fine-tuning. In a user study on a programming-language classification task, concept-augmented explanations clarify token roles and improve human-centered explainability by 37% compared to token-level attribution with Integrated Gradients.
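
The core clustering step can be illustrated with a short Python sketch. This is not the authors' implementation: the choice of model (microsoft/codebert-base), the layer used, the toy snippets, and the number of clusters are all illustrative assumptions, and the paper's full pipeline additionally labels the resulting clusters with its hybrid static-analysis/LLM annotation step.

# Minimal sketch of the clustering idea behind CoCoA: collect contextualized
# token embeddings from a code LM and group them into candidate "concept" clusters.
# Model, layer, snippets, and cluster count are illustrative assumptions,
# not the paper's exact configuration.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import KMeans
from collections import defaultdict

snippets = [
    "def add(a, b):\n    return a + b",
    "for i in range(10):\n    print(i)",
    'name = "hello"\nprint(name.upper())',
]

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")
model.eval()

tokens, embeddings = [], []
with torch.no_grad():
    for code in snippets:
        inputs = tokenizer(code, return_tensors="pt", truncation=True)
        # Last-layer hidden states for the single example in the batch.
        hidden = model(**inputs, output_hidden_states=True).hidden_states[-1][0]
        for tok, vec in zip(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]), hidden):
            if tok in tokenizer.all_special_tokens:
                continue
            tokens.append(tok)
            embeddings.append(vec.numpy())

# Each cluster is a candidate latent concept that would then be labeled
# (e.g., identifiers, keywords, literals) by the annotation pipeline.
labels = KMeans(n_clusters=5, random_state=0, n_init=10).fit_predict(embeddings)

clusters = defaultdict(list)
for tok, lab in zip(tokens, labels):
    clusters[int(lab)].append(tok)
for lab, toks in sorted(clusters.items()):
    print(lab, toks[:10])

In the actual framework, each cluster would receive a human-readable label rather than being inspected by eye.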

Takeaways, Limitations

Takeaways:
  • Proposes a new framework (CoCoA) for interpreting the internal workings of code language models.
  • Analyzes the model's representation space by uncovering its lexical, syntactic, and semantic structure.
  • Enables scalable concept labeling via a hybrid annotation pipeline.
  • Demonstrates stability under semantics-preserving perturbations and predictable concept evolution under fine-tuning.
  • Concept-augmented explanations clarify token roles and improve human-centered explainability (37% increase).
Limitations:
  • The paper does not explicitly state any Limitations.