When Should a Robot Think? Resource-Aware Reasoning via Reinforcement Learning for Embodied Robotic Decision-Making

Written by
  • Haebom

Authors

Jun Liu, Pu Zhao, Zhenglun Kong, Xuan Shen, Peiyan Dong, Fan Yang, Lin Cui, Hao Tang, Geng Yuan, Wei Niu, Wenbin Zhang, Xue Lin, Gaowen Liu, Yanzhi Wang, Dong Huang

💡 Overview

When a robot interacts with its environment, LLM-based agents are important for high-level reasoning and decision-making, but each LLM call incurs substantial latency and resource consumption. This paper addresses the fundamental question of when a robot should reason and when it should act. To that end, it proposes RARRL (Resource-Aware Reasoning via Reinforcement Learning), a hierarchical framework that adaptively decides whether to reason, which reasoning role to invoke, and how much compute budget to allocate, based on the current observation, the execution history, and the remaining resources.
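
The shape of that decision can be sketched in a few lines of Python. This is only an illustration of the interface described above (state in, a reason/role/budget decision out), not the paper's method: RARRL learns its policy with reinforcement learning, whereas the `decide` function here is a hand-written heuristic stand-in. The role names, the thresholds, the token-based budget, and the `State` fields are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class State:
    # All three fields are hypothetical summaries of the inputs named in the text.
    observation_uncertainty: float  # assumed scalar in [0, 1] derived from the observation
    recent_failures: int            # assumed summary of the execution history
    remaining_budget: int           # remaining resources, counted here in LLM tokens


@dataclass
class Decision:
    reason: bool          # whether to call the LLM at this step
    role: Optional[str]   # which reasoning role to invoke (hypothetical names)
    token_budget: int     # compute budget granted to the call


def decide(state: State, min_call_cost: int = 100) -> Decision:
    """Hand-written stand-in for a learned high-level policy."""
    # Too few resources left for even a minimal call: act without reasoning.
    if state.remaining_budget < min_call_cost:
        return Decision(False, None, 0)
    # A recent failure suggests checking the plan; cap the spend at 300 tokens.
    if state.recent_failures > 0:
        return Decision(True, "verify", min(300, state.remaining_budget))
    # High uncertainty suggests replanning; cap the spend at 600 tokens.
    if state.observation_uncertainty > 0.5:
        return Decision(True, "plan", min(600, state.remaining_budget))
    # Otherwise the situation looks routine: act directly.
    return Decision(False, None, 0)


if __name__ == "__main__":
    print(decide(State(observation_uncertainty=0.9, recent_failures=0, remaining_budget=1000)))
```

In a learned version, the hard-coded rules would be replaced by a policy trained to trade task success against latency and resource use; the summary does not specify how that reward is defined.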

🔑 Implications and Limitations

• Dynamically deciding when and how a robot uses LLM reasoning, with resource constraints taken into account, is essential for building efficient and reliable robotic systems.
• RARRL optimizes reasoning across diverse situations, raising task success rates, reducing latency, and improving robustness.
• The study was evaluated only on specific robot platforms and environments; generalization to diverse robot hardware and complex real-world settings requires further research.