QuickLAP: Quick Language-Action Preference Learning for Semi-Autonomous Systems

Posted by
  • Haebom

Authors

Jordan Abi Nader, David Lee, Nathaniel Dennler, Andreea Bobu

💡 Overview

This paper proposes QuickLAP, a framework in which a robot fuses language feedback with physical action feedback to infer a reward function in real time. QuickLAP treats language as a probabilistic observation of the user's latent preferences, which disambiguates how a physical correction should be interpreted and identifies which reward features matter. An LLM extracts reward-feature attention masks and preference shifts from free-form utterances, and these are combined with the physical feedback to enable fast, real-time reward learning.
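The fusion described above can be pictured with a small sketch. This is an illustration under stated assumptions, not the update rule from the paper: the function name `fuse_feedback`, the feature order, the step sizes, and the additive form of the update are all hypothetical, and the mask and shift that an LLM would extract from an utterance are hard-coded here instead of being produced by a model.

```python
def fuse_feedback(theta, phi_planned, phi_corrected, mask, shift,
                  step=0.5, lang_weight=0.5):
    """Hypothetical masked update of linear reward weights.

    theta          current reward weights, one per feature
    phi_planned    feature values of the robot's planned trajectory
    phi_corrected  feature values after the user's physical correction
    mask           per-feature attention in [0, 1] (assumed LLM output)
    shift          per-feature preference shift (assumed LLM output)
    """
    updated = []
    for t, pp, pc, m, s in zip(theta, phi_planned, phi_corrected, mask, shift):
        physical = step * (pc - pp)   # evidence from the physical correction
        language = lang_weight * s    # evidence from the utterance
        # The mask gates both terms, so features the utterance does not
        # mention keep their current weight.
        updated.append(t + m * (physical + language))
    return updated


# Hypothetical features: [speed, lane_offset, distance_to_cone].
# Imagined utterance: "stay farther from the cone".
theta = fuse_feedback(
    theta=[0.0, 0.0, 0.0],
    phi_planned=[1.0, 0.2, 0.1],
    phi_corrected=[0.8, 0.2, 0.5],
    mask=[0.0, 0.0, 1.0],
    shift=[0.0, 0.0, 1.0],
)
print(theta)
```

In this toy setup the correction also lowered speed, but because the mask attends only to the cone-distance feature, only that weight changes. That is the disambiguating role the summary attributes to language.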

🔑 Implications and Limitations

• By integrating verbal instructions with physical corrections, the robot can infer the user's intent more accurately and more quickly.
• Through language and physical interaction, users can better understand the robot's learning process and perceive the robot as collaborative.
• The work shows that using an LLM to extract meaningful information from natural-language feedback is important for improving reward-learning performance.
• The results were validated in an autonomous driving simulation; further research is needed on applicability and stability in real robot systems and in more complex environments.