World2Mind: Cognition Toolkit for Allocentric Spatial Reasoning in Foundation Models

Created by
  • Haebom

Authors

Shouwei Ruan, Bin Wang, Zhenyu Wu, Qihui Zhu, Yuxiang Zhang, Hang Su, Yubin Wang

💡 Overview

This paper proposes World2Mind, a training-free spatial intelligence toolkit, to address the lack of robust spatial reasoning in multimodal foundation models (MFMs). World2Mind builds a structured spatial cognitive map through 3D reconstruction and instance segmentation, and uses an Allocentric-Spatial Tree (AST) that supplies geometric and topological priors. This improves the spatial reasoning performance of MFMs and demonstrates that text-only models can also carry out complex 3D spatial reasoning.
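To make the idea of a tree of allocentric (world-frame) spatial facts rendered as text more concrete, here is a minimal sketch. It is not the paper's AST: the `SpatialNode` class, the `describe` function, the coordinate convention (metres, x/y ground plane) and the distance-plus-bearing wording are all assumptions made for illustration. It only shows how object positions that do not depend on the camera viewpoint could be serialized into a form a text-only model can read.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class SpatialNode:
    """Hypothetical tree node: a region or an object instance."""
    name: str
    center: tuple[float, float, float]  # world-frame position in metres (assumed)
    children: list[SpatialNode] = field(default_factory=list)


def describe(node: SpatialNode, parent: SpatialNode | None = None, depth: int = 0) -> list[str]:
    """Return indented text lines describing a node and its subtree.

    Each child is described relative to its parent in the fixed world frame,
    using ground-plane distance and bearing (0 deg along +x, counter-clockwise).
    """
    indent = "  " * depth
    if parent is None:
        lines = [f"{indent}{node.name} (root, world frame)"]
    else:
        dx = node.center[0] - parent.center[0]
        dy = node.center[1] - parent.center[1]
        dist = math.hypot(dx, dy)
        bearing = math.degrees(math.atan2(dy, dx)) % 360
        lines = [f"{indent}{node.name}: {dist:.1f} m from {parent.name}, bearing {bearing:.0f} deg"]
    for child in node.children:
        lines.extend(describe(child, node, depth + 1))
    return lines


# Illustrative scene; the objects and coordinates are made up.
lamp = SpatialNode("lamp", (1.0, 2.0, 0.5))
table = SpatialNode("table", (0.0, 2.0, 0.0), [lamp])
sofa = SpatialNode("sofa", (3.0, 4.0, 0.0))
room = SpatialNode("room", (0.0, 0.0, 0.0), [sofa, table])

scene_text = "\n".join(describe(room))
print(scene_text)
```

In a real pipeline the node positions would come from the 3D reconstruction and instance segmentation stages. In this sketch the resulting `scene_text` string stands in for the structured prior that would be placed in a model's prompt.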

🔑 Implications and Limitations

• Implication 1: World2Mind introduces a multi-stage reasoning scheme to overcome inaccuracies in 3D reconstruction, substantially improving the spatial reasoning performance of MFMs.
• Implication 2: The Allocentric-Spatial Tree (AST) enables text-only models to perform complex 3D spatial reasoning, reaching performance close to that of multimodal models.
• Limitations / future work: The fundamental accuracy limits of 3D reconstruction remain, and spatial reasoning about dynamically changing environments and complex interactions between objects still needs to be strengthened.