SceneSmith: Agentic Generation of Simulation-Ready Indoor Scenes

Created by
  • Haebom

Authors

Nicholas Pfaff, Thomas Cohn, Sergey Zakharov, Rick Cory, Russ Tedrake

💡 Overview

This paper starts from the difficulty of generating the diverse, physically complex indoor environments needed to train real-world household robots. It proposes SceneSmith, a hierarchical agent-based framework that generates simulation-ready indoor environments from natural-language prompts. In SceneSmith, VLM (Vision-Language Model) agents take the roles of designer, critic, and orchestrator and build the scene in stages, from the architectural layout through furniture placement to the addition of small objects. The framework integrates text-to-3D synthesis, dataset retrieval, and physical property estimation to produce scenes that are realistic and suitable for robot simulation.
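The staged designer/critic/orchestrator loop described above can be pictured with a minimal Python sketch. It is illustrative only and is not the authors' implementation: the three stage names come from the summary, while `Scene`, `designer`, `critic`, `orchestrate`, and `max_rounds` are hypothetical names, and the agents are plain stub functions standing in for VLM calls.

```python
from dataclasses import dataclass, field
from typing import List

# Stage order follows the summary; every name below is a hypothetical stand-in.
STAGES = ["architectural layout", "furniture placement", "small objects"]


@dataclass
class Scene:
    prompt: str
    steps: List[str] = field(default_factory=list)  # one accepted proposal per stage


def designer(scene: Scene, stage: str, feedback: str) -> str:
    """Stub designer: a real system would query a VLM here."""
    suffix = f" (revised: {feedback})" if feedback else ""
    return f"{stage} for '{scene.prompt}'{suffix}"


def critic(proposal: str) -> str:
    """Stub critic: returns feedback text, or an empty string when satisfied.

    This toy rule rejects every first draft so the revision loop is exercised.
    """
    return "" if "revised" in proposal else "resolve overlaps"


def orchestrate(prompt: str, max_rounds: int = 3) -> Scene:
    """Stub orchestrator: runs the stages in order and loops designer and critic."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    scene = Scene(prompt)
    for stage in STAGES:
        feedback = ""
        for _ in range(max_rounds):
            proposal = designer(scene, stage, feedback)
            feedback = critic(proposal)
            if not feedback:  # critic is satisfied, stop revising
                break
        # The last proposal is kept even if the round budget ran out.
        scene.steps.append(proposal)
    return scene


if __name__ == "__main__":
    for step in orchestrate("a small kitchen").steps:
        print(step)
```

Each stage is finalized before the next one begins, which mirrors the coarse-to-fine ordering in the summary; how the actual system passes scene state, renders, or critiques between agents is not specified here.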

🔑 Implications and Limitations

• Complex, realistic indoor environments that were difficult to generate with existing methods can be generated from natural-language instructions.
• The generated environments have few object collisions, are physically stable, and show a high level of realism and prompt fidelity, so they can be used effectively for robot policy evaluation.
• Future work includes improving the performance of text-to-3D generation models, strengthening the integration of a wider variety of complex interactive objects (e.g., drawers that open and close, moving doors), and improving the ability to generate a broader range of building structures (e.g., multiple floors, corridors).