ARC-RL: A Reinforcement Learning Playground Inspired by ARC Raiders

Posted by
  • Haebom

Authors

Carlo Romeo, Andrew D. Bagdanov

💡 Overview

This paper proposes ARC-RL, a set of four new MuJoCo continuous-control environments modeled on game NPCs with distinctive appearances, departing from existing reinforcement learning environments that are based on real robot hardware. The environments share a common observation template, action convention, simulation scheme, and a single reward function that has its own strengths and weaknesses. The focus is on comparing how reinforcement learning algorithms perform across morphological diversity.
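As a rough illustration of what sharing one reward function across morphologies can look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the reward terms (forward velocity, control cost, alive bonus), the weights, and the names `RewardTerms`, `control_cost`, and `shared_reward` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RewardTerms:
    """Hypothetical per-step quantities; the paper's actual terms may differ."""
    forward_velocity: float  # speed along the target heading
    control_cost: float      # sum of squared action components
    alive: bool              # False once the episode has terminated


def control_cost(action: Sequence[float]) -> float:
    """Quadratic action penalty; works for any action dimension."""
    return sum(a * a for a in action)


def shared_reward(
    terms: RewardTerms,
    w_vel: float = 1.0,    # assumed weight, not from the paper
    w_ctrl: float = 0.1,   # assumed weight, not from the paper
    w_alive: float = 0.5,  # assumed weight, not from the paper
) -> float:
    """One reward formula applied unchanged to every morphology."""
    alive_bonus = w_alive if terms.alive else 0.0
    return w_vel * terms.forward_velocity - w_ctrl * terms.control_cost + alive_bonus


# The same function scores agents with different action sizes.
small_body = RewardTerms(2.0, control_cost([0.5, -0.5]), True)
large_body = RewardTerms(2.0, control_cost([0.5, -0.5, 0.5, -0.5]), True)
print(shared_reward(small_body), shared_reward(large_body))
```

In this sketch the body with more actuators pays a larger control penalty for the same speed. That is one way a single shared reward can favor some morphologies over others.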

🔑 Implications and Limitations

• An original research environment: by offering a reinforcement learning environment for unrealistic robot morphologies such as game NPCs, free of real robot hardware constraints, the work points to a possible route toward better generalization across diverse robot morphologies.
• Comparison and analysis of reinforcement learning algorithms: a range of online and offline reinforcement learning algorithms can be evaluated side by side in the ARC-RL environments, making it possible to analyze systematically how well each algorithm copes with morphological diversity and animation-style constraints, and to explore new approaches.
• Limitations: although algorithm performance was evaluated in the new environments, transferability to real robots still needs further validation, and further study is needed on whether every component of the proposed reward function is well suited to every morphology.