GoQuant: Geometric Orthogonal Residual Projection for Multiplier-Free Power-of-Two Transformer Quantization

Written by
  • Haebom

Authors

Maoyang Xiang, Tao Luo, Bo Wang

💡 Overview

This paper proposes Geometric Orthogonal Residual Projection Quantization (GoQuant), a method that overcomes the geometric limitations of Power-of-Two (PoT) quantization in order to relieve the memory and compute bottlenecks that arise when deploying LLMs and ViTs on edge devices. GoQuant adaptively synthesizes a high-resolution residual lattice through dual-base geometric projection, which addresses the low resolution of conventional PoT quantization while improving hardware efficiency.
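To make the resolution problem concrete, the following is a minimal sketch of plain PoT rounding, followed by a generic two-term variant that fits a second power of two to the leftover residual. It is not the paper's dual-base projection, whose details this summary does not give. The function names, the exponent range, and the sample weights are all assumptions chosen for illustration.

```python
import math

def pot_quantize(w, min_exp=-6, max_exp=0):
    """Round w to a signed power of two, choosing the nearest exponent in the log domain."""
    if w == 0.0:
        return 0.0
    sign = 1.0 if w > 0 else -1.0
    # Nearest integer exponent, clamped to the allowed (assumed) range.
    e = round(math.log2(abs(w)))
    e = max(min_exp, min(max_exp, e))
    return sign * 2.0 ** e

def residual_pot_quantize(w, min_exp=-6, max_exp=0):
    """Generic two-term approximation: a PoT value plus a PoT fit of the residual."""
    first = pot_quantize(w, min_exp, max_exp)
    second = pot_quantize(w - first, min_exp, max_exp)
    return first + second

# Hypothetical weights, used only to compare the two grids.
weights = [0.7, 0.35, -0.6, 0.09]
single = [pot_quantize(w) for w in weights]
double = [residual_pot_quantize(w) for w in weights]
err_single = sum(abs(w - q) for w, q in zip(weights, single))
err_double = sum(abs(w - q) for w, q in zip(weights, double))
print(single, err_single)
print(double, err_double)
```

A single PoT grid has only one level per octave, so 0.7 is forced to 0.5. Adding a residual term fills in the gaps between octaves, which is the general idea behind synthesizing a finer lattice from PoT components.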

🔑 Implications and Limitations

• Hardware-efficient low-bit quantization: GoQuant replaces multiply-accumulate (MAC) operations with shift-and-add operations, substantially reducing the compute load on edge devices.
• Overcoming geometric constraints: It addresses the low angular resolution of conventional PoT quantization, minimizing the loss of the high-dimensional feature manifold.
• Fast model calibration: An analytical solution makes calibration much faster than conventional gradient-based optimization.
• Potential complexity: Dual-base projection and residual lattice synthesis may add algorithmic complexity compared with conventional quantization schemes.
• Hardware implementation optimization: Further design and optimization work may be needed to implement the proposed operations with optimal performance on real hardware.
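The shift-and-add substitution mentioned in the first bullet can be sketched as follows. This is a generic illustration of multiplier-free arithmetic with power-of-two weights, not GoQuant's actual kernel. The `(sign, exp)` weight encoding, the function names, and the sample values are assumptions.

```python
def shift_mul(x, sign, exp):
    """Multiply integer x by sign * 2**exp using only a bit shift and a negation."""
    # A negative exponent becomes a right shift, which floors the result
    # when x is not divisible by 2**(-exp).
    shifted = x << exp if exp >= 0 else x >> -exp
    return -shifted if sign < 0 else shifted

def pot_dot(activations, pot_weights):
    """Dot product where each weight is encoded as (sign, exp), meaning sign * 2**exp."""
    acc = 0
    for x, (sign, exp) in zip(activations, pot_weights):
        acc += shift_mul(x, sign, exp)  # shift, then add: no multiplier needed
    return acc

# Hypothetical integer activations and PoT weights 4, -2, 1, -1/8.
acts = [12, -8, 5, 64]
weights = [(1, 2), (-1, 1), (1, 0), (-1, -3)]
result = pot_dot(acts, weights)
print(result)  # 61, matching 12*4 + (-8)*(-2) + 5*1 + 64*(-0.125)
```

In hardware, each term reduces to a barrel shift feeding an adder, which is why PoT weights remove the need for a multiplier array.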