KEEP: A KV-Cache-Centric Memory Management System for Efficient Embodied Planning

Created by
  • Haebom

Authors

Zebin Yang, Tong Xie, Baotong Lu, Shaoshan Liu, Bo Yu, Meng Li

💡 Overview

This work proposes KEEP, a KV-cache-centric system for making LLM memory management more efficient in complex, long-horizon embodied planning. KEEP addresses the inefficiency of KV cache recomputation and loading with three techniques: static-dynamic memory construction, multi-hop memory re-computation, and layer-balanced memory loading. With these, it achieves a 2.68x speedup over existing text-based memory methods with negligible accuracy loss.

🔑 Key Takeaways and Limitations

• Presents a new memory-management paradigm that actively exploits the KV cache in LLM-based embodied planning.
• Maximizes efficiency through mixed memory granularity, dynamic identification of important memory interactions, and per-layer load balancing.
• On the ALFRED dataset, KEEP outperforms existing methods; compared with CacheBlend in particular, it improves the success rate and reduces time to first token (TTFT).
• The study validates performance in a single embodied planning environment (ALFRED); further research is needed on how well it generalizes to other environments and more complex scenarios.