JumpLoRA: Sparse Adapters for Continual Learning in Large Language Models

Created by
  • Haebom

Authors

Alexandra Dragomir, Ioana Pintilie, Antonio Barbalau, Marius Dragoi, Florin Brad, Cristian Daniel Paduraru, Alexandru Tifrea, Elena Burceanu, Radu Tudor Ionescu

💡 Overview

This paper proposes an adapter-based approach as an efficient method for continual learning (CL) in large language models (LLMs). To mitigate the catastrophic forgetting seen in existing approaches, JumpLoRA uses JumpReLU gating to dynamically induce sparsity in Low-Rank Adaptation (LoRA) blocks, which achieves parameter isolation and prevents interference between tasks. In the experiments, JumpLoRA substantially improves the performance of IncLoRA and outperforms ELLA, a leading CL method.
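The gating idea can be illustrated with a minimal sketch. JumpReLU is commonly defined as passing a value through unchanged when it exceeds a threshold and outputting zero otherwise. The summary does not say where JumpLoRA applies the gate, so this sketch assumes it acts on the magnitude of each entry of the low-rank update B·A. The function names, the fixed threshold `theta`, and the toy values are hypothetical and are not taken from the paper.

```python
def jump_relu(z, theta):
    """JumpReLU: return z unchanged when z > theta, otherwise 0."""
    return z if z > theta else 0.0


def matmul(b, a):
    """Plain-Python matrix product of b (d x r) and a (r x k)."""
    return [[sum(b_row[t] * a[t][j] for t in range(len(a)))
             for j in range(len(a[0]))]
            for b_row in b]


def gated_lora_update(b, a, theta):
    """ASSUMPTION: gate each entry of B @ A by its magnitude (theta >= 0).

    Entries with |value| <= theta are zeroed; the rest keep value and sign.
    """
    delta = matmul(b, a)
    return [[v if jump_relu(abs(v), theta) > 0.0 else 0.0 for v in row]
            for row in delta]


def sparsity(m):
    """Fraction of entries that are exactly zero."""
    flat = [v for row in m for v in row]
    return sum(1 for v in flat if v == 0.0) / len(flat)


# Rank-1 toy adapter for a 3 x 3 weight matrix (made-up values).
B = [[1.0], [0.1], [-2.0]]
A = [[0.5, -0.05, 1.0]]
update = gated_lora_update(B, A, theta=0.2)
```

With these toy values, five of the nine entries of B·A have magnitude at or below the threshold and are zeroed, so `sparsity(update)` is 5/9. A fixed threshold is used here only for simplicity.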

🔑 Takeaways and Limitations

• JumpLoRA introduces dynamic sparsity into LoRA blocks through JumpReLU gating, which effectively reduces interference between tasks.
• The proposed method is highly modular and compatible with existing LoRA-based CL methods, and in particular it substantially improves the performance of IncLoRA.
• Future work should further optimize JumpLoRA's sparsity-inducing mechanism and explore how well it generalizes across different LLM architectures and CL scenarios.
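One way to make "interference between tasks" concrete is to check how much the nonzero positions of two sparse updates overlap. The sketch below is an illustrative diagnostic, not a metric reported in the paper. It assumes each task's update is available as a dense matrix with exact zeros where the gate closed, and the helper names and toy matrices are hypothetical.

```python
def support(update):
    """Set of (row, col) positions where the update is nonzero."""
    return {(i, j)
            for i, row in enumerate(update)
            for j, v in enumerate(row)
            if v != 0.0}


def support_overlap(update_a, update_b):
    """Jaccard overlap of two supports; 0.0 means fully disjoint updates."""
    sa, sb = support(update_a), support(update_b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


# Made-up sparse updates for three hypothetical tasks on a 2 x 3 weight matrix.
task1 = [[0.5, 0.0, 0.0], [0.0, 0.0, -1.0]]
task2 = [[0.0, 0.3, 0.0], [0.0, 0.0, 0.7]]
task3 = [[0.0, 0.2, 0.0], [0.4, 0.0, 0.0]]
```

Here `task1` and `task2` share one of three touched positions, while `task1` and `task3` touch disjoint weights. An overlap of zero corresponds to full parameter isolation between the two tasks in this simplified view.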