
Reasoning Models Struggle to Control their Chains of Thought

Created by
  • Haebom

Authors

Chen Yueh-Han, Robert McCarthy, Bruce W. Lee, He He, Ian Kivlichan, Bowen Baker, Micah Carroll, Tomek Korbak

💡 Overview

This study proposes CoT-Control, an evaluation framework that measures how well recent reasoning models can control their chain of thought (CoT). When models are instructed to follow specific constraints in their CoT, most fail to comply, and CoT controllability is markedly lower than controllability of the final output. This is a positive sign for keeping CoT monitoring reliable, but the underlying cause of the low controllability has not yet been clearly established.
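The comparison described above (constraint compliance in the CoT versus in the final output) can be illustrated with a small sketch. This is not the paper's implementation: the `Sample` type, the `avoids_word` constraint, and the `compliance_rates` helper are hypothetical names chosen for illustration, and the two example responses are made-up toy data, not results from the study.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass
class Sample:
    """One model response, split into its reasoning trace and final answer."""
    cot: str
    output: str


def avoids_word(word: str) -> Callable[[str], bool]:
    """Hypothetical constraint: the text must not contain `word` (case-insensitive)."""
    lowered = word.lower()
    return lambda text: lowered not in text.lower()


def compliance_rates(
    samples: Iterable[Sample], constraint: Callable[[str], bool]
) -> Tuple[float, float]:
    """Return (CoT compliance rate, final-output compliance rate)."""
    items = list(samples)
    if not items:
        raise ValueError("need at least one sample")
    cot_ok = sum(constraint(s.cot) for s in items)
    out_ok = sum(constraint(s.output) for s in items)
    return cot_ok / len(items), out_ok / len(items)


# Toy data for illustration only (assumption, not from the paper).
toy_samples = [
    Sample(cot="I must think about the cat here.", output="It is a feline."),
    Sample(cot="The animal purrs.", output="It is a feline."),
]
cot_rate, output_rate = compliance_rates(toy_samples, avoids_word("cat"))
```

A gap between the two rates, with the CoT rate lower, is the kind of pattern the summary reports; the real evaluation would use the constraints and graders defined by the authors.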

🔑 Implications and Limitations

• Recent large language models have little ability to adhere to specific constraints within their chain of thought (CoT), so the potential vulnerability of CoT monitoring may not be large.
• CoT controllability tends to increase with model size, and decreases with RL training and with increasing problem difficulty.
• The exact mechanism behind low CoT controllability is not yet known; future work is needed to understand it and to further improve the reliability of CoT monitoring.