Measuring and mitigating overreliance to build human-compatible AI

Posted by
  • Haebom

Authors

Lujain Ibrahim, Katherine M. Collins, Sunnie S. Y. Kim, Anka Reuel, Max Lamparth, Kevin Feng, Lama Ahmad, Prajna Soni, Alia El Kattan, Merlin Stein, Siddharth Swaroop, Vishakh Padmakumar, Ilia Sucholutsky, Andrew Strait, Diyi Yang, Q. Vera Liao, Umang Bhatt

💡 Overview

This paper argues that as large language models (LLMs) gain influence over human decision-making in their role as collaborative "thought partners," the risk of overreliance, meaning reliance on LLMs beyond what their capabilities warrant, is growing. The authors consolidate the risks of overreliance at the individual and societal levels, examine how LLM characteristics, system design, and users' cognitive biases contribute to it, and outline new measurement directions that address the limitations of existing approaches. On that basis, they propose ways to measure and mitigate overreliance so that LLMs augment human capabilities rather than undermine them.

🔑 Implications and Limitations

• Overreliance on LLMs can create serious individual and societal risks in high-stakes domains such as healthcare and personal counseling, including errors, governance problems, and the erosion of cognitive skills.
• LLM characteristics, system design, and users' cognitive biases interact to deepen overreliance on LLMs, so understanding these factors together is essential.
• Existing methods for measuring overreliance have significant limitations. Overcoming them requires developing new measurement methodologies and putting mitigation strategies into practice.