
Exponential Approximation Rates and Parameter Efficiency of Learnable Bernstein Activations

Posted by
  • Haebom

Authors

Ibrahim Albool, Malak Gamal El-Din, Salma Elmalaki, Yasser Shoukry

💡 Overview

This paper presents a theoretical analysis of DeepBern-Net (DBN), a network that uses learnable Bernstein polynomial activation functions. It shows that DBN's approximation error decays at an exponential rate of $\mathcal{O}(n^{-L})$ in the network depth $L$ and polynomial degree $n$, which is much faster than the polynomial rate of ReLU-based networks. Across 1,344 experiments on scientific datasets, DBN cut the parameter count by more than 70% relative to ReLU, Leaky ReLU, SELU, and GeLU, converged faster during training, and reached a lower final loss.
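To make the idea of a learnable Bernstein activation concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It evaluates a degree-$n$ Bernstein polynomial whose coefficients stand in for the learnable parameters. The function names, the default input interval `[-1, 1]`, and the clamping of inputs to that interval are assumptions made for illustration; the summary does not describe how the paper bounds activation inputs. `rate_bound` simply evaluates the $n^{-L}$ factor of the stated rate with all constants omitted.

```python
from math import comb


def bernstein_basis(n, k, t):
    """k-th Bernstein basis polynomial of degree n, evaluated at t in [0, 1]."""
    return comb(n, k) * t**k * (1.0 - t) ** (n - k)


def bernstein_activation(x, coeffs, lo=-1.0, hi=1.0):
    """Evaluate sum_k coeffs[k] * B_{k,n}(t), where t rescales x from [lo, hi] to [0, 1].

    coeffs plays the role of the learnable parameters (n + 1 values for degree n).
    lo and hi are assumed input bounds (hypothetical defaults); x is clamped to them.
    """
    n = len(coeffs) - 1
    x = min(max(x, lo), hi)       # clamp to the assumed interval
    t = (x - lo) / (hi - lo)      # map [lo, hi] onto [0, 1]
    return sum(c * bernstein_basis(n, k, t) for k, c in enumerate(coeffs))


def rate_bound(n, L):
    """The n^{-L} factor of the stated O(n^{-L}) rate, constants omitted."""
    return float(n) ** (-L)
```

Two properties of the Bernstein basis are easy to check with this sketch: the activation takes the value `coeffs[0]` at the left endpoint and `coeffs[-1]` at the right endpoint, and with all coefficients equal to 1 it returns 1 everywhere on the interval (partition of unity). As a purely arithmetic illustration of the rate, `rate_bound(4, 1)` is 0.25 while `rate_bound(4, 3)` is 0.015625, so under the stated bound each extra layer shrinks the factor by another multiple of $1/n$.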

🔑 Implications and Limitations

  • Learnable Bernstein polynomial activations can offer far greater expressive power and parameter efficiency than conventional ReLU-based activations.
  • The experiments show that DBN's performance gains come from the learnable polynomial structure itself, not merely from smoothness.
  • The study validates DBN's theoretical predictions through extensive experiments, but further research is needed on its applicability and scalability to complex real-world datasets.