Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.

Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot

Created by
  • Haebom

Author

Xiang Cheng, Chengyan Pan, Minjun Zhao, Deyang Li, Fangchao Liu, Xinyu Zhang, Xiao Zhang, Yong Liu

Outline

This paper investigates how Chain-of-Thought (CoT) demonstrations in In-Context Learning (ICL) affect the mathematical reasoning ability of state-of-the-art large language models (LLMs). For strong models such as the Qwen2.5 series, the authors find that conventional CoT examples do not improve performance over Zero-Shot CoT; their main contribution is aligning the output format. Moreover, even enhanced CoT examples built from the answers of stronger models fail to improve reasoning, and the models tend to ignore the examples and attend to the instructions instead. Overall, the study exposes the limitations of the current ICL+CoT framework for improving mathematical reasoning and calls for a reexamination of the ICL paradigm and of what counts as an example.
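To make the contrast concrete, here is a minimal sketch of the two prompting styles the paper compares. The question, the demonstration, and the helper names are illustrative placeholders, not items from the paper's benchmarks or code.

```python
# Minimal sketch of the two prompting styles compared in the paper.
# The question and demonstration below are illustrative placeholders.

QUESTION = "A farm has 3 coops with 12 hens each. 7 hens are sold. How many hens remain?"

def zero_shot_cot(question: str) -> str:
    """Zero-Shot CoT: no demonstrations, only a reasoning trigger phrase."""
    return f"Q: {question}\nA: Let's think step by step."

def few_shot_cot(question: str, demos: list[tuple[str, str]]) -> str:
    """Few-Shot CoT (ICL): prepend worked examples with reasoning chains."""
    blocks = [f"Q: {q}\nA: {rationale}" for q, rationale in demos]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)

DEMOS = [
    (
        "Tom has 5 boxes of 4 apples and eats 3 apples. How many are left?",
        "5 boxes * 4 apples = 20 apples. 20 - 3 = 17. The answer is 17.",
    ),
]

if __name__ == "__main__":
    print(zero_shot_cot(QUESTION))
    print("---")
    print(few_shot_cot(QUESTION, DEMOS))
```

The paper's finding is that, for strong models such as Qwen2.5, the few-shot variant mainly standardizes the output format rather than raising accuracy over the zero-shot trigger.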

Takeaways, Limitations

Takeaways:
For modern LLMs, conventional CoT examples may not improve mathematical reasoning ability.
The main function of CoT examples may be limited to aligning the output format.
Even enhanced CoT examples (built from the answers of stronger models) may fail to improve reasoning ability.
In an ICL setting, LLMs tend to focus more on the instructions than on the examples.
The current ICL+CoT framework has clear limitations for improving mathematical reasoning.
Limitations:
The current ICL paradigm, and the definition of what counts as an example, need to be reexamined.
New ICL strategies that actually improve reasoning ability need to be developed.
More in-depth research is needed into how models actually use in-context examples.