
GAP: Geometric Anchor Pre-training for Data-Efficient Visuomotor Learning of Manipulation Tasks

Written by
  • Haebom

Authors

Davide Buoso, Andrea Protopapa, Stefano Di Carlo, Francesca Pistilli, Giuseppe Averta

💡 Overview

๋ณธ ๋…ผ๋ฌธ์€ ํฌ์†Œํ•œ ์ „๋ฌธ๊ฐ€ ์‹œ์—ฐ ๋ฐ์ดํ„ฐ๋กœ ๋กœ๋ด‡ ์กฐ์ž‘ ํ•™์Šต ์‹œ, ๊ณ ์ฐจ์› RGB ์˜์ƒ ํ‘œํ˜„์„ ์ œ์–ด ๊ด€๋ จ ๊ธฐํ•˜ํ•™์  ์ •๋ณด๋กœ ํšจ๊ณผ์ ์œผ๋กœ ์ถ”์ถœํ•˜๋Š” ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด Geometric Anchor Pre-training (GAP)์ด๋ผ๋Š” ์ƒˆ๋กœ์šด ์‚ฌ์ „ ํ•™์Šต ๊ธฐ๋ฒ•์„ ์ œ์•ˆํ•ฉ๋‹ˆ๋‹ค. GAP๋Š” ๋ฌผ์ฒด ๋งˆ์Šคํฌ๋ฅผ ํ™œ์šฉํ•œ ๊ฐ€๋ฒผ์šด ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ํ”„๋ก์‹œ ํƒœ์Šคํฌ์—์„œ ๊ณต๊ฐ„ ์–ด๋Œ‘ํ„ฐ๋ฅผ ์‚ฌ์ „ ํ•™์Šต์‹œ์ผœ, ์•ˆ์ •์ ์ด๊ณ  ์‹ ๋ขฐํ•  ์ˆ˜ ์žˆ๋Š” ๊ธฐํ•˜ํ•™์  ์•ต์ปค๋ฅผ ์ƒ์„ฑํ•จ์œผ๋กœ์จ ์ ์€ ๋ฐ์ดํ„ฐ๋กœ๋„ ํšจ๊ณผ์ ์ธ ์ •์ฑ… ํ•™์Šต์„ ๊ฐ€๋Šฅํ•˜๊ฒŒ ํ•ฉ๋‹ˆ๋‹ค.

🔑 Implications and Limitations

• Maximizes data efficiency, making it possible to learn robot manipulation from scarce demonstration data.
• The generated geometric anchors are robust to scene changes and small perturbations, providing a stable control interface.
• The pre-training stage is lightweight and runs with the existing VFM kept frozen, so it is easy to apply in practice and highly reusable.
• The method relies on a proxy task that requires specific object masks, so its applicability may be limited in complex environments where mask information is scarce or hard to extract.