Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

ACING: Actor-Critic for Instruction Learning in Black-Box LLMs

Posted by
  • Haebom

Author

Salma Kharrat, Fares Fourati, Marco Canini

Outline

This paper presents ACING, an automated prompt-optimization technique for improving the performance of large language models (LLMs). ACING is a reinforcement-learning-based framework that works even in black-box settings where the LLM's parameters and gradients are inaccessible: it formulates prompt optimization as a stateless continuous-action problem, enabling exploration of an infinite prompt space. Experiments across a variety of tasks (instruction induction, summarization, and chain-of-thought reasoning) show that ACING produces prompts that outperform human-written prompts 76% of the time, with gains of up to 33 points and a median improvement of 10 points over the best automated baseline. Extensive additional experiments confirm ACING's robustness and efficiency. The source code is available on GitHub.
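The stateless continuous-action formulation can be illustrated with a toy sketch: a Gaussian policy over a continuous latent "instruction embedding" is trained with an actor-critic (REINFORCE-with-baseline) update against a black-box reward. The class names, the 3-dimensional latent, and the synthetic reward below are all illustrative assumptions; in ACING the reward would come from scoring a decoded instruction with the black-box LLM.

```python
import math
import random

# Hypothetical stand-in for the black-box evaluation: reward peaks
# when the latent instruction embedding is near a hidden target.
TARGET = [0.5, -0.3, 0.8]

def black_box_reward(z):
    # Squared distance to the target, mapped into (0, 1].
    d = sum((a - b) ** 2 for a, b in zip(z, TARGET))
    return math.exp(-d)

class StatelessActorCritic:
    """Gaussian policy over a continuous latent. Because the problem is
    stateless, the critic reduces to a single scalar value estimate that
    serves as the baseline for the policy gradient."""

    def __init__(self, dim, lr=0.05, sigma=0.3):
        self.mu = [0.0] * dim   # actor: mean of the Gaussian policy
        self.sigma = sigma      # fixed exploration noise
        self.value = 0.0        # critic: running estimate of expected reward
        self.lr = lr

    def act(self, rng):
        # Sample a continuous action (a candidate latent embedding).
        return [m + rng.gauss(0.0, self.sigma) for m in self.mu]

    def update(self, z, reward):
        advantage = reward - self.value
        # Score-function gradient for a Gaussian mean:
        # grad log pi(z) = (z - mu) / sigma^2, scaled by the advantage.
        for i in range(len(self.mu)):
            self.mu[i] += self.lr * advantage * (z[i] - self.mu[i]) / self.sigma ** 2
        # Move the critic toward the observed reward.
        self.value += self.lr * advantage

rng = random.Random(0)
agent = StatelessActorCritic(dim=3)
for _ in range(2000):
    z = agent.act(rng)
    agent.update(z, black_box_reward(z))

print([round(m, 2) for m in agent.mu])  # learned latent, near TARGET
```

The key design point mirrored here is that no gradients from the evaluated model are needed: the actor improves purely from sampled (action, reward) pairs, which is what makes the approach applicable to closed-weight LLM APIs.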

Takeaways, Limitations

Takeaways:
Presents an effective prompt-optimization technique for black-box LLMs.
Demonstrates that automatically generated prompts can surpass human-written ones.
Provides a general framework applicable to a variety of LLM tasks.
Improves reproducibility and usability through the public release of ACING's source code.
Limitations:
Generalization to specific LLMs and tasks still needs to be verified.
Further analysis of ACING's computational cost and training time is needed.
Further research is needed on applicability to, and performance differences across, different types of black-box LLMs.