HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention

Posted by
  • Haebom

Authors

Yufei Xu, Fanxu Meng, Fan Jiang, Yuxuan Wang, Ruijie Zhou, Zhaohui Wang, Jiexi Wu, Zhixin Pan, Xiaojuan Tang, Wenjie Pei, Tongxuan Liu, Di Yin, Xing Sun, Muhan Zhang

💡 Overview

๋ณธ ๋…ผ๋ฌธ์€ ๊ธฐ์กด ํ† ํฐ ์ˆ˜์ค€ ํฌ์†Œ ์–ดํ…์…˜ ๋ฉ”์ปค๋‹ˆ์ฆ˜์˜ ์ธ๋ฑ์„œ๊ฐ€ ๊ธด ์ปจํ…์ŠคํŠธ ๊ธธ์ด์—์„œ ๋ณ‘๋ชฉ ํ˜„์ƒ์„ ์ผ์œผํ‚ค๋Š” ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด HISA(Hierarchical Indexed Sparse Attention)๋ฅผ ์ œ์•ˆํ•œ๋‹ค. HISA๋Š” ๋ธ”๋ก ๋ ˆ๋ฒจ์˜ ๊ณ„์ธต์  ํ•„ํ„ฐ๋ง๊ณผ ํ† ํฐ ๋ ˆ๋ฒจ์˜ ์ •์ œ ๋‹จ๊ณ„๋ฅผ ๊ฑฐ์ณ ๊ฒ€์ƒ‰ ๊ฒฝ๋กœ๋ฅผ ์žฌ๊ตฌ์„ฑํ•˜์—ฌ ํšจ์œจ์„ฑ์„ ๋†’์ธ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด ๊ธฐ์กด DSA์™€ ๋™์ผํ•œ ํฌ์†Œ ํŒจํ„ด์„ ์œ ์ง€ํ•˜๋ฉด์„œ๋„ 64K ์ปจํ…์ŠคํŠธ ๊ธธ์ด์—์„œ ์ตœ๋Œ€ 3๋ฐฐ์˜ ์†๋„ ํ–ฅ์ƒ์„ ๋‹ฌ์„ฑํ•œ๋‹ค.

🔑 Implications and Limitations

• HISA effectively alleviates the indexer bottleneck that existing sparse attention suffers from at long context lengths.
• It can be applied to existing models in a plug-and-play manner with no additional training, and it substantially improves performance while preserving output quality.
• The proposed method outperformed existing block-sparse approaches.