EvoPref: Multi-Objective Evolutionary Optimization Discovers Diverse LLM Alignments Beyond Gradient Descent

Posted by
  • Haebom

Authors

Dongxin Guo, Jikun Wu, Siu Ming Yiu

💡 Overview

Existing gradient-descent-based LLM alignment methods suffer from "preference collapse": they converge too strongly toward a narrow set of behaviors and neglect behavioral diversity. This work proposes EvoPref, a multi-objective evolutionary optimization algorithm that maintains a population of LoRA adapters evaluated on three objectives (helpfulness, harmlessness, and honesty), using Non-dominated Sorting Genetic Algorithm II (NSGA-II) selection together with archive-based diversity preservation. Compared with existing methods, EvoPref achieves 18% higher preference coverage and a 47% lower collapse rate while delivering competitive alignment quality.
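To make the selection step more concrete, here is a minimal sketch of NSGA-II-style selection over three objective scores. It is not the authors' implementation: the function names, the score tuples, and the assumption that every objective is "higher is better" are all illustrative, and a real system would obtain the scores by evaluating each LoRA adapter rather than hard-coding them.

```python
from typing import Dict, List, Sequence

# Hypothetical score layout: (helpfulness, harmlessness, honesty), higher is better.
Scores = Sequence[float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_sort(population: List[Scores]) -> List[List[int]]:
    """Group candidate indices into Pareto fronts; front 0 is the non-dominated set."""
    remaining = set(range(len(population)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(population[j], population[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(population: List[Scores], front: List[int]) -> Dict[int, float]:
    """Larger distance means the candidate sits in a less crowded region of its front."""
    distance = {i: 0.0 for i in front}
    n_objectives = len(population[front[0]])
    for m in range(n_objectives):
        ordered = sorted(front, key=lambda i: population[i][m])
        lo, hi = population[ordered[0]][m], population[ordered[-1]][m]
        if hi == lo:
            continue  # this objective does not separate the candidates
        # Boundary candidates are always kept, so they get infinite distance.
        distance[ordered[0]] = distance[ordered[-1]] = float("inf")
        for k in range(1, len(ordered) - 1):
            gap = population[ordered[k + 1]][m] - population[ordered[k - 1]][m]
            distance[ordered[k]] += gap / (hi - lo)
    return distance


def select(population: List[Scores], k: int) -> List[int]:
    """Pick k survivors: whole fronts first, then the least crowded of the split front."""
    chosen: List[int] = []
    for front in non_dominated_sort(population):
        if len(chosen) + len(front) <= k:
            chosen.extend(front)
        else:
            dist = crowding_distance(population, front)
            ranked = sorted(front, key=lambda i: dist[i], reverse=True)
            chosen.extend(ranked[: k - len(chosen)])
            break
    return chosen


if __name__ == "__main__":
    # Made-up scores for five hypothetical adapters.
    adapters = [(0.9, 0.2, 0.5), (0.2, 0.9, 0.5), (0.5, 0.5, 0.5),
                (0.1, 0.1, 0.1), (0.4, 0.4, 0.4)]
    print(non_dominated_sort(adapters))
    print(select(adapters, 4))
```

Because selection keeps entire Pareto fronts rather than a single best scalar score, adapters that trade off the objectives differently (for example, very helpful versus very harmless) can survive side by side, which is the intuition behind maintaining a diverse population.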

🔑 Implications and Limitations

• EvoPref discovers substantially more diverse LLM alignments than existing gradient-descent-based methods and, in particular, effectively mitigates the preference collapse problem.
• The results show that multi-objective evolutionary algorithms and diversity preservation techniques are essential for improving the coverage and stability of LLM alignment.
• The theoretical analysis explains why archive-based methods are more effective than single-trajectory optimization at escaping collapse.
• Future work could apply EvoPref to a wider range of LLM models and alignment objectives, and explore ways to further optimize the search space of the evolutionary algorithm.