
Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Stochastic Parameter Decomposition

Created by
  • Haebom

Authors

Lucius Bushnaq, Dan Braun, Lee Sharkey

Outline

This paper works within the linear parameter decomposition framework, which reverse-engineers neural networks by decomposing their parameters into a sum of sparsely used vectors in parameter space. To address the high computational cost and hyperparameter sensitivity of the framework's current main method, Attribution-based Parameter Decomposition (APD), the authors introduce Stochastic Parameter Decomposition (SPD), a method that is more scalable and more robust to hyperparameters. They show that SPD can decompose larger and more complex models than was possible with APD, avoids the parameter-shrinkage problem, and better identifies ground-truth mechanisms in toy models. By bridging causal mediation analysis and network decomposition methods, this work removes the scalability obstacles that have so far kept linear parameter decomposition from being applied to larger models, opening new research directions in mechanistic interpretability. A library for running SPD and reproducing the experiments is available at https://github.com/goodfire-ai/spd .
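The core construction is easy to state: each weight matrix is approximated as a sum of rank-one subcomponents, W ≈ Σ_c u_c v_cᵀ, and training randomly ablates subcomponents according to learned causal-importance values, so that the masked model must still reproduce the target model's behavior. Below is a minimal PyTorch sketch of that idea. It is not the authors' implementation (see the linked repository); the gate network, layer sizes, mask-sampling scheme, and loss weights here are illustrative assumptions.

```python
# Illustrative sketch of stochastic parameter decomposition on one linear layer.
# NOT the official implementation (https://github.com/goodfire-ai/spd);
# sizes, the gate network, and loss weights below are hypothetical.
import torch

torch.manual_seed(0)
d_in, d_out, n_sub = 16, 16, 8          # hypothetical dimensions
W_target = torch.randn(d_out, d_in)     # stand-in for a trained layer's weights

# Each subcomponent c is a rank-one matrix u_c v_c^T.
U = torch.nn.Parameter(0.1 * torch.randn(n_sub, d_out))
V = torch.nn.Parameter(0.1 * torch.randn(n_sub, d_in))
gate = torch.nn.Linear(d_in, n_sub)     # predicts per-input causal importance

opt = torch.optim.Adam(list(gate.parameters()) + [U, V], lr=1e-2)

for step in range(2000):
    x = torch.randn(64, d_in)
    y_target = x @ W_target.T           # target model's outputs on probe inputs

    # Faithfulness: the full (unmasked) sum of subcomponents should
    # reconstruct the original weights.
    W_approx = torch.einsum('co,ci->oi', U, V)
    loss_faith = (W_approx - W_target).pow(2).mean()

    # Stochastic masking: sample each mask uniformly between the predicted
    # importance g and 1, so low-importance subcomponents get randomly
    # ablated while the masked model must still match the target outputs.
    g = torch.sigmoid(gate(x))                    # (batch, n_sub) in [0, 1]
    m = g + (1.0 - g) * torch.rand_like(g)        # m ~ Uniform(g, 1)
    acts = x @ V.T                                # (batch, n_sub): x . v_c
    y_masked = (m * acts) @ U                     # sum_c m_c (x . v_c) u_c
    loss_stoch = (y_masked - y_target).pow(2).mean()

    # Sparsity pressure on importances keeps subcomponents sparsely used.
    loss = loss_faith + loss_stoch + 1e-3 * g.mean()

    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because masks are sampled between the learned importance and 1 (rather than uniformly over [0, 1]), the gates can only mark a subcomponent as ablatable when doing so genuinely does not change the output, which is what ties the decomposition to causal mediation analysis.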

Takeaways, Limitations

Takeaways:
SPD is a novel method that addresses the computational cost and hyperparameter sensitivity of APD.
SPD can decompose larger and more complex models than APD could.
SPD avoids the parameter-shrinkage problem and identifies ground-truth mechanisms in toy models more accurately.
Bridging causal mediation analysis and network decomposition methods opens new research directions in mechanistic interpretability.
An open-source library for running SPD and reproducing the experiments is released.
Limitations:
Further validation is needed to determine how well SPD generalizes to large, complex real-world models.
SPD's applicability and performance across diverse neural network architectures remain to be evaluated.
Hyperparameter optimization strategies for SPD require further study.