PRISM: Personalized Refinement of Imitation Skills for Manipulation via Human Instructions

Created by
  • Haebom

Authors

Arnau Boix-Granell, Alberto San-Miguel-Tello, Magi Dalmau-Moreno, Nestor Garcia

💡 Overview

๋ณธ ๋…ผ๋ฌธ์€ ๋กœ๋ด‡ ์กฐ์ž‘ ๋ถ„์•ผ์—์„œ ๋ช…๋ น์–ด ๊ธฐ๋ฐ˜ ๋ชจ๋ฐฉ ํ•™์Šต ์ •์ฑ…์˜ ๊ฐœ์ธํ™”๋œ ๊ฐœ์„  ๋ฐฉ๋ฒ•์ธ PRISM์„ ์ œ์•ˆํ•ฉ๋‹ˆ๋‹ค. PRISM์€ ๋ชจ๋ฐฉ ํ•™์Šต(IL)๊ณผ ๊ฐ•ํ™” ํ•™์Šต(RL) ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ํ†ตํ•ฉํ•˜์—ฌ, ์‚ฌ์šฉ์ž ์•ˆ๋‚ด ์‹œ์—ฐ์œผ๋กœ๋ถ€ํ„ฐ ์ƒ์„ฑ๋œ ์ผ๋ฐ˜์ ์ธ ์ž‘์—…์— ๋Œ€ํ•œ ๋ชจ๋ฐฉ ์ •์ฑ…์„ ๊ฐ•ํ™” ํ•™์Šต์„ ํ†ตํ•ด ๋ฏธ์„ธํ•œ ์ƒˆ๋กœ์šด ํ–‰๋™์œผ๋กœ ๊ฐœ์„ ํ•ฉ๋‹ˆ๋‹ค. ์ด ๊ณผ์ •์€ ์ž์—ฐ์–ด ์ž‘์—… ์„ค๋ช…์œผ๋กœ๋ถ€ํ„ฐ ๋ฐ˜๋ณต์ ์œผ๋กœ RL ๋ณด์ƒ ํ•จ์ˆ˜๋ฅผ ์ƒ์„ฑํ•˜๋Š” Eureka ํŒจ๋Ÿฌ๋‹ค์ž„์„ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•˜๋ฉฐ, ์ƒˆ๋กœ์šด ๋ชฉํ‘œ ๊ตฌ์„ฑ๊ณผ ์ œ์•ฝ ์กฐ๊ฑด์— ์ ์‘ํ•˜๊ณ  ์ค‘๊ฐ„ ๋กค์•„์›ƒ์— ๋Œ€ํ•œ ์ธ๊ฐ„ ํ”ผ๋“œ๋ฐฑ ์ˆ˜์ •์„ ์ถ”๊ฐ€ํ•˜์—ฌ ์ •์ฑ…์˜ ์žฌ์‚ฌ์šฉ์„ฑ๊ณผ ๋ฐ์ดํ„ฐ ํšจ์œจ์„ฑ์„ ๋†’์ž…๋‹ˆ๋‹ค.

🔑 Implications and Limitations

• Imitation learning policies can be refined effectively with reinforcement learning, adapting them to new goals and constraints while increasing data efficiency.
• Integrating human feedback can improve policy robustness and reduce the computational burden.
• The proposed method performed well on a simple pick-and-place task in simulation, but its performance on real robots and on more complex tasks still needs to be validated.