Spend Search Where It Pays: Value-Guided Structured Sampling and Optimization for Generative Recommendation

Created by
  • Haebom

Authors

Jie Jiang, Yangru Huang, Zeyu Wang, Changping Wang, Yuling Xiong, Jun Zhang, Huan Yu

💡 Overview

This paper proposes V-STAR, a new framework that addresses the probability-reward mismatch in existing generative recommendation models. V-STAR combines Value-Guided Efficient Decoding (VED) with Sibling-GRPO to make exploration more efficient and to supply decision-focused learning signals, which improves both the accuracy and the diversity of recommendations. In the reported experiments, V-STAR outperforms state-of-the-art methods.

🔑 Implications and Limitations

• Effectively addresses the insufficient exploration and reward compression problems of reinforcement-learning-based generative recommendation models.
• Improves the efficiency and performance of recommender systems through tree-structured sampling and relative advantage computation.
• Presents a way to carry out a complex tree search more efficiently.
• Further research is needed on the complexity of the proposed framework, its scalability when deployed in real services, and additional optimization strategies.
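
The summary above does not give formulas for the "relative advantage computation", so the following is only a rough illustration of one plausible reading: rewards are normalized among candidates that share the same parent prefix in the sampling tree, in the style of GRPO's group-normalized advantage. The function name `sibling_relative_advantages`, the `(prefix, token, reward)` layout, and the grouping rule are all assumptions made for this sketch and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean, pstdev


def sibling_relative_advantages(candidates, eps=1e-8):
    """Group-normalized advantages among siblings (hypothetical sketch).

    candidates: list of (prefix, token, reward) tuples, where `prefix` is the
    sequence of tokens leading to the parent node. Candidates with the same
    prefix are treated as siblings. Returns a list of advantages aligned with
    the input order.
    """
    # Bucket candidate indices and rewards by their parent prefix.
    groups = defaultdict(list)
    for idx, (prefix, _token, reward) in enumerate(candidates):
        groups[tuple(prefix)].append((idx, reward))

    advantages = [0.0] * len(candidates)
    for members in groups.values():
        rewards = [r for _, r in members]
        mu = mean(rewards)
        sigma = pstdev(rewards)  # population std; 0.0 for a single sibling
        for idx, r in members:
            # Standardize within the sibling group; eps avoids division by zero.
            advantages[idx] = (r - mu) / (sigma + eps)
    return advantages


if __name__ == "__main__":
    demo = [((1,), 5, 1.0), ((1,), 6, 0.0), ((2,), 7, 0.5)]
    print(sibling_relative_advantages(demo))
```

Under this assumption, a candidate is credited only for being better or worse than the alternatives at the same decision point, and a node with no siblings receives an advantage of zero.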