Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

EvoP: Robust LLM Inference via Evolutionary Pruning

Created by
  • Haebom

Author

Shangyu Wu, Hongchao Du, Ying Xiong, Shuai Chen, Tei-wei Kuo, Nan Guan, Chun Jason Xue

Outline

This paper proposes EvoP, an evolutionary pruning framework for deploying large language models (LLMs) in resource-constrained environments. To overcome the heuristic pruning strategies of existing methods and their neglect of data characteristics, EvoP introduces a cluster-based calibration dataset sampling (CCDS) strategy to generate diverse calibration datasets and an evolutionary pruning pattern search (EPPS) method to find optimal pruning patterns. Experiments across various LLMs and downstream tasks demonstrate EvoP's effectiveness, establishing it as a practical and scalable solution for deploying LLMs in real-world applications.
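As an illustrative sketch of the evolutionary-search idea, the snippet below runs a simple genetic algorithm over binary layer-pruning masks, minimizing a fitness score (e.g., perplexity on a calibration set). All names and the toy fitness function are hypothetical; this is not the authors' EPPS implementation.

```python
import random

def evolve_pruning_pattern(num_layers, num_prune, fitness,
                           pop_size=20, generations=30, seed=0):
    """Search for a binary mask (1 = prune that layer) with a toy
    genetic algorithm. `fitness` maps a mask to a score to minimize,
    e.g. perplexity measured on a calibration dataset."""
    rng = random.Random(seed)

    def random_mask():
        # random mask that prunes exactly `num_prune` layers
        idx = set(rng.sample(range(num_layers), num_prune))
        return tuple(1 if i in idx else 0 for i in range(num_layers))

    def mutate(mask):
        # swap one pruned layer with one kept layer to keep the budget fixed
        pruned = [i for i, b in enumerate(mask) if b]
        kept = [i for i, b in enumerate(mask) if not b]
        m = list(mask)
        m[rng.choice(pruned)] = 0
        m[rng.choice(kept)] = 1
        return tuple(m)

    # elitist loop: keep the best half, refill with mutated survivors
    pop = [random_mask() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]
        pop = survivors + [mutate(rng.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=fitness)

# toy fitness: pretend later layers are cheaper to prune,
# so the optimum prunes the last layers
score = lambda m: -sum(i * b for i, b in enumerate(m))
best = evolve_pruning_pattern(num_layers=12, num_prune=3, fitness=score)
```

In a real setting the fitness call would run the pruned model on the sampled calibration data, which is the expensive step the paper's CCDS strategy aims to keep small and representative.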

Takeaways, Limitations

Takeaways:
  • EvoP addresses the performance degradation caused by existing heuristic LLM pruning methods.
  • By taking data characteristics into account, it enables more efficient, higher-performing pruning.
  • It achieves strong performance and efficiency across various LLMs and downstream tasks.
  • It provides a practical solution for deploying LLMs in real-world applications.
Limitations:
  • EvoP's performance gains may be biased toward specific datasets or LLMs.
  • The EPPS algorithm can have high computational complexity.
  • Further research is needed on generalization across different hardware platforms.