How Well Do LLMs Perform on the Simplest Long-Chain Reasoning Tasks: An Empirical Study on the Equivalence Class Problem

Created by
  • Haebom

Authors

Chun Zheng, Lianlong Wu, Bingqian Li, Lvting Liu, Yi Zhou

💡 Overview

This study empirically analyzes how well large language models (LLMs) perform on the Equivalence Class Problem (ECP), which it treats as the simplest form of long-chain reasoning task. The authors evaluate both reasoning-enhanced and non-reasoning LLMs across a variety of problem instances. Non-reasoning LLMs fail to solve ECP; reasoning-enhanced LLMs improve substantially but still do not solve it perfectly.
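To make the task concrete, here is a minimal sketch of how an ECP instance can be solved exactly by a program. It assumes an instance is a set of `n` elements plus a list of pairwise equivalence statements, and that the goal is to recover the resulting classes; this encoding and the function names are illustrative assumptions, not the paper's benchmark format.

```python
def find(parent, x):
    # Follow parent pointers to the class representative,
    # shortening the path along the way (path halving).
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def equivalence_classes(n, pairs):
    # Assumed encoding: elements are 0..n-1, and each (a, b) in pairs
    # states that a and b are equivalent.
    parent = list(range(n))
    for a, b in pairs:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb  # merge the two classes
    classes = {}
    for x in range(n):
        classes.setdefault(find(parent, x), []).append(x)
    return sorted(classes.values())


def are_equivalent(n, pairs, a, b):
    # Two elements are equivalent if they end up in the same class.
    return any(a in c and b in c for c in equivalence_classes(n, pairs))
```

A union-find pass like this is nearly linear in the number of statements, which is what makes ECP a useful baseline: the difficulty for an LLM comes from chaining many simple steps, not from any single hard step.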

🔑 Implications and Limitations

• Reasoning-enhanced LLMs show a clear performance gain on simple long-chain reasoning tasks, but still struggle to solve them.
• How hard an instance is for an LLM depends on structural properties of the problem (e.g., connectivity probability, maximum diameter), which may be linked to the problem's degree of confusion or its reasoning complexity.
• Current LLMs fall short of perfect performance even on simple long-chain reasoning tasks, so further research and model improvements are needed to strengthen reasoning ability.
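The structural properties named above can be illustrated with a short sketch. It assumes that "connectivity probability" means the probability `p` that any given pair of elements is stated to be equivalent (as in a random graph), and that "maximum diameter" means the longest shortest chain of statements linking two elements of the same class. Both readings, and the function names, are assumptions for illustration rather than the paper's definitions.

```python
import random
from collections import deque


def random_instance(n, p, seed=0):
    # Assumed generator: each unordered pair (a, b) is included as an
    # equivalence statement independently with probability p.
    rng = random.Random(seed)
    return [(a, b) for a in range(n) for b in range(a + 1, n)
            if rng.random() < p]


def max_diameter(n, pairs):
    # Build an undirected graph from the equivalence statements.
    adj = {x: [] for x in range(n)}
    for a, b in pairs:
        adj[a].append(b)
        adj[b].append(a)
    best = 0
    # Breadth-first search from every element; the largest distance found
    # is the longest shortest chain inside any class.
    for start in range(n):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best
```

Under these assumptions, a larger diameter means some equivalences can only be established through a longer chain of intermediate steps, which is one plausible reason such instances would be harder for a model.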