Is Large Language Model Performance on Reasoning Tasks Impacted by Different Ways Questions Are Asked?

Created by
  • Haebom

Authors

Seok Hwan Song, Mohna Chakraborty, Qi Li, Wallapak Tavanapong

💡 Overview

This study investigates whether large language models (LLMs) perform differently on the same reasoning task depending on how the question is posed. Five LLMs were evaluated on quantitative and deductive reasoning across three question types: multiple-choice, true/false, and short-answer/long-answer questions. The results show that both reasoning accuracy and final-answer selection accuracy differ significantly by question type, and that the number of answer options and the choice of wording affect performance.
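The two metrics named above, reasoning accuracy and final-answer selection accuracy, can be tallied separately per question type. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code: the function name `accuracy_by_type`, the record layout, and the sample records are all hypothetical placeholders and do not reproduce any result from the paper.

```python
from collections import defaultdict


def accuracy_by_type(records):
    """Return {question_type: (reasoning_accuracy, final_answer_accuracy)}.

    Each record is a hypothetical tuple:
    (question_type, reasoning_is_correct, final_answer_is_correct).
    """
    # Per question type: [total, reasoning correct, final answer correct]
    counts = defaultdict(lambda: [0, 0, 0])
    for question_type, reasoning_ok, final_ok in records:
        tally = counts[question_type]
        tally[0] += 1
        tally[1] += int(reasoning_ok)
        tally[2] += int(final_ok)
    return {qt: (r / n, f / n) for qt, (n, r, f) in counts.items()}


# Placeholder records for illustration only; NOT data from the paper.
records = [
    ("multiple_choice", True, True),
    ("multiple_choice", True, False),
    ("true_false", True, True),
    ("true_false", False, True),
    ("short_answer", True, True),
    ("short_answer", False, False),
]

for question_type, (reasoning_acc, final_acc) in accuracy_by_type(records).items():
    print(f"{question_type}: reasoning={reasoning_acc:.2f}, final={final_acc:.2f}")
```

Keeping the two tallies separate makes it visible when a model reasons correctly but still selects the wrong final answer, or the reverse.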

🔑 Implications and Limitations

  • Variation in question type strongly affects measured performance, so evaluations of LLM reasoning ability should be designed with question type in mind.
  • Accurate reasoning by an LLM does not necessarily translate into selecting the correct final answer.
  • The study is limited to specific LLMs and reasoning types, so further research on a broader range of models and tasks is needed.