Latent Action Reparameterization for Efficient Agent Inference

Posted by
  • Haebom

Authors

Wenhao Huang, Qingwen Zeng, Qiyue Chen, Zijie Guo, Yu Sun, Cheng Yang, Siru Ouyang, Jiri Gesi, Fang Wu, Jiayi Zhang, Huaming Chen, Bang Liu, Xiangru Tang, Chenglin Wu

💡 Overview

이 논문은 LLM μ—μ΄μ „νŠΈμ˜ κΈ΄ ν…μŠ€νŠΈ μ•‘μ…˜ μ‹œν€€μŠ€λ‘œ μΈν•œ 높은 μΆ”λ‘  λΉ„μš© 문제λ₯Ό ν•΄κ²°ν•˜κΈ° μœ„ν•΄, 닀단계 의미둠적 행동을 λ‚˜νƒ€λ‚΄λŠ” μ••μΆ•λœ 잠재 μ•‘μ…˜ 곡간을 ν•™μŠ΅ν•˜λŠ” Latent Action Reparameterization (LAR) ν”„λ ˆμž„μ›Œν¬λ₯Ό μ œμ•ˆν•©λ‹ˆλ‹€. LARλŠ” μ—μ΄μ „νŠΈμ˜ 행동을 잠재 λ‹¨μœ„λ‘œ μž¬λ§€κ°œλ³€μˆ˜ν™”ν•˜μ—¬ 효과적인 μ˜μ‚¬κ²°μ • λ²”μœ„λ₯Ό λ‹¨μΆ•μ‹œν‚€λ©΄μ„œλ„ μ›λž˜ μ•‘μ…˜ κ³΅κ°„μ˜ ν‘œν˜„λ ₯을 μœ μ§€ν•©λ‹ˆλ‹€.

🔑 Implications and Limitations

• The work shows that representation learning over the action space is important for improving the inference efficiency of LLM agents.
• Under a fixed compute budget, LAR shortens the agent's effective action horizon and improves inference efficiency while maintaining or improving task success rates.
• Unlike hand-crafted macros or hierarchical control schemes, LAR learns latent actions directly from agent trajectories and integrates them into the model, so the agent can plan and execute over abstract action representations.