Second-Order Multi-Level Variance Correction for Modality Competition in Multimodal Models

Written by
  • Haebom

Authors

Yishun Lu, Wes Armour

💡 Overview

๋ณธ ๋…ผ๋ฌธ์€ ์ด๋ฏธ์ง€ ์ƒ์„ฑ ๋ฐ ํ…์ŠคํŠธ ์ดํ•ด๋ฅผ ํ†ตํ•ฉํ•˜๋Š” ์ž๊ธฐํšŒ๊ท€ ๋ชจ๋ธ์—์„œ ๋ฐœ์ƒํ•˜๋Š” ๋ชจ๋‹ฌ๋ฆฌํ‹ฐ ๊ฒฝ์Ÿ ๋ฌธ์ œ๋กœ ์ธํ•œ ์ตœ์ ํ™” ๋ถˆ์•ˆ์ •์„ฑ์„ ํ•ด๊ฒฐํ•˜๊ณ ์ž ํ•ฉ๋‹ˆ๋‹ค. ์ด๋ฅผ ์œ„ํ•ด 1์ฐจ ์˜ตํ‹ฐ๋งˆ์ด์ €์˜ ํ•œ๊ณ„๋ฅผ ์ง€์ ํ•˜๊ณ , 2์ฐจ ์‚ฌ์ „ ์กฐ๊ฑดํ™” ๊ธฐ๋ฒ•์ธ SOAP์„ ๊ธฐ๋ฐ˜์œผ๋กœ ๋‹ค๋‹จ๊ณ„ ๋ถ„์‚ฐ ๋ณด์ •์„ ์ ์šฉํ•œ ML-FOP-SOAP ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ์ œ์•ˆํ•ฉ๋‹ˆ๋‹ค. ์ œ์•ˆ ๋ฐฉ๋ฒ•์€ ๋ชจ๋‹ฌ๋ฆฌํ‹ฐ ๊ฐ„ ์ถฉ๋Œ์„ ํšจ๊ณผ์ ์œผ๋กœ ์–ต์ œํ•˜์—ฌ ์‹œ๊ฐ ์ƒ์„ฑ๊ณผ ํ…์ŠคํŠธ ์ดํ•ด ๊ฐ„์˜ ์ƒ์ถฉ ๊ด€๊ณ„๋ฅผ ์ค„์ด๊ณ , ๋Œ€๊ทœ๋ชจ ๋ฐฐ์น˜์—์„œ๋„ ์•ˆ์ •์ ์ธ ํ•™์Šต์„ ๊ฐ€๋Šฅํ•˜๊ฒŒ ํ•ฉ๋‹ˆ๋‹ค.

🔑 Implications and Limitations

• The results suggest that second-order optimization can effectively resolve the optimization instability that modality competition causes in multimodal models.
• The proposed ML-FOP-SOAP mitigates the performance trade-off between visual generation and text understanding, and it trains stably and efficiently even with large batches.
• The method is practical: even under heavy gradient accumulation, its hierarchical folding strategy captures fine-grained variance with low overhead.
• In experiments, it improves sample efficiency by up to 1.4x and shortens wall-clock training time by up to 1.5x relative to AdamW, which points to its potential as a strong optimizer for scaling multimodal foundation models.
• (Limitations or future work) Further validation and tuning are needed across more modality combinations and more complex tasks, and the theoretical analysis of the proposed method needs to be developed further.
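
This summary does not describe how the paper's hierarchical folding or variance correction works, so the sketch below does not attempt to reproduce ML-FOP-SOAP. It only illustrates the general idea behind the third point above: per-coordinate gradient variance can be tracked across the micro-batches of a gradient-accumulation step with constant extra memory, here using Welford's online algorithm. The class name `RunningGradStats` and the toy gradients are hypothetical and chosen for illustration.

```python
class RunningGradStats:
    """Welford's online mean/variance, kept per gradient coordinate.

    Illustrative only: this is NOT the paper's method, just one
    low-overhead way to estimate variance across micro-batch gradients.
    """

    def __init__(self, dim):
        self.count = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations

    def update(self, grad):
        """Fold one micro-batch gradient into the running statistics."""
        self.count += 1
        for i, g in enumerate(grad):
            delta = g - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (g - self.mean[i])

    def variance(self):
        """Unbiased sample variance per coordinate (zeros if < 2 samples)."""
        if self.count < 2:
            return [0.0] * len(self.mean)
        return [m / (self.count - 1) for m in self.m2]


# Toy micro-batch gradients (made-up numbers) for one accumulation step.
micro_grads = [[0.5, -1.0], [1.5, 0.0], [1.0, 1.0], [3.0, -2.0]]

stats = RunningGradStats(dim=2)
for g in micro_grads:
    stats.update(g)

print("mean gradient:", stats.mean)
print("per-coordinate variance:", stats.variance())
```

Because each update touches only the running mean and the sum of squared deviations, no individual micro-batch gradient has to be stored, which is the kind of low-overhead bookkeeping the point above alludes to.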