
FAIR-Pruner: A Flexible Framework for Automatic Layer-Wise Pruning via Tolerance of Difference

Written by
  • Haebom

Authors

Chenqing Lin, Mostafa Hussien, Chengyao Yu, Bingyi Jing, Ruixing Ming, Kim Khoa Nguyen, Mohamed Cheriet

💡 Overview

This work addresses the inefficient allocation of sparsity in existing structured pruning methods for compressing deep learning models. It proposes FAIR-Pruner, a search-free framework that relies on two intra-layer rankings: one identifies units that are good candidates for removal, the other identifies task-sensitive units. The core mechanism, Tolerance of Difference (ToD), measures the overlap between the units marked for removal and the units marked for preservation. That overlap determines a non-uniform pruning depth for each layer, which yields a strong trade-off between accuracy and compression ratio.
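The per-layer depth selection described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the summary does not give the ToD formula, so the sketch assumes that overlap is the share of the k most removable units that also rank among the k most task-sensitive units, and that the layer is pruned to the largest k whose overlap stays within a tolerance. The names `pruning_depth`, `u_scores`, `r_scores`, and `tolerance` are hypothetical.

```python
from typing import Sequence


def pruning_depth(u_scores: Sequence[float],
                  r_scores: Sequence[float],
                  tolerance: float) -> int:
    """Return how many units to prune from one layer.

    Assumptions (not taken from the paper): a low u_score marks a removal
    candidate, a high r_score marks a task-sensitive unit, and overlap is
    |bottom-k by u_score AND top-k by r_score| / k.
    """
    n = len(u_scores)
    if n != len(r_scores):
        raise ValueError("both score lists must cover the same units")
    # Ranking 1: unit indices from most to least removable.
    removal_order = sorted(range(n), key=lambda i: u_scores[i])
    # Ranking 2: unit indices from most to least task-sensitive.
    sensitive_order = sorted(range(n), key=lambda i: r_scores[i], reverse=True)

    depth = 0
    for k in range(1, n):  # always keep at least one unit in the layer
        candidates = set(removal_order[:k])
        protected = set(sensitive_order[:k])
        overlap = len(candidates & protected) / k
        if overlap <= tolerance:
            depth = k  # remember the largest k that is still tolerated
    return depth


# Layer A: both rankings agree, so the removable half can be pruned safely.
layer_a = pruning_depth([0.1, 0.9, 0.2, 0.8, 0.3, 0.7],
                        [0.1, 0.9, 0.2, 0.8, 0.3, 0.7], tolerance=0.0)
# Layer B: the "removable" units are also the most sensitive ones.
layer_b = pruning_depth([0.1, 0.2, 0.3, 0.4],
                        [0.9, 0.8, 0.7, 0.6], tolerance=0.0)
print(layer_a, layer_b)
```

Under these assumptions the same tolerance produces different depths for different layers, which is the non-uniform allocation the overview describes; no search over per-layer sparsity values is involved.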

🔑 Implications and Limitations

• Adaptive layer-wise pruning: FAIR-Pruner automatically determines a pruning depth for each layer based on that layer's characteristics, so it can compress a model more effectively than conventional uniform sparsity allocation.
• Flexible scoring metrics: The framework combines easily with a range of scoring metrics, including the Wasserstein-based U-Score and the Taylor-based R-Score, which makes it applicable to a variety of tasks and model architectures.
• Applicability to larger architectures: The work shows that effective pruning under the same budget is also possible on complex architectures such as MoE (Mixture-of-Experts), which supports the extensibility of the approach.
• Further exploration and validation needed: The theoretical analysis of the ToD mechanism is encouraging, but additional exploration and validation may be required before strong performance can be relied on for complex models or specific tasks.
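To make the two score families named in the list above more concrete, here is a small sketch of the underlying quantities. The summary does not define U-Score or R-Score, so this only shows the generic building blocks: a 1-D Wasserstein-1 distance between two equally sized samples of a unit's activations, and a first-order Taylor estimate of the loss change from zeroing a unit. How the paper turns these into its scores is not shown here, and the function names are hypothetical.

```python
from typing import Sequence


def wasserstein_1d(a: Sequence[float], b: Sequence[float]) -> float:
    """Wasserstein-1 distance between two equal-sized 1-D empirical samples.

    For equal sample sizes this is the mean absolute difference between
    the sorted values of the two samples.
    """
    if len(a) != len(b) or not a:
        raise ValueError("samples must be non-empty and equally sized")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def taylor_importance(activations: Sequence[float],
                      gradients: Sequence[float]) -> float:
    """First-order Taylor estimate of the loss change from zeroing a unit.

    Uses |sum(activation * dLoss/dActivation)| over the given samples.
    """
    if len(activations) != len(gradients):
        raise ValueError("activations and gradients must align")
    return abs(sum(a * g for a, g in zip(activations, gradients)))


# Activations of one unit on inputs from two different classes (made-up values).
shift = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
# Activations and matching loss gradients for one unit (made-up values).
impact = taylor_importance([1.0, 2.0], [0.5, -0.5])
print(shift, impact)
```

One plausible reading, stated here as an assumption, is that a unit whose activation distributions barely differ across classes is a removal candidate, while a unit with a large Taylor estimate is task-sensitive.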