AgentNoiseBench: Benchmarking Robustness of Tool-Using LLM Agents Under Noisy Condition

Created by
  • Haebom

Authors

Ruipeng Wang, Yuxin Chen, Yukai Wang, Chang Wu, Junfeng Fang, Xiaodong Cai, Qi Gu, Hui Su, An Zhang, Xiang Wang, Xunliang Cai, Tat-Seng Chua

💡 Overview

This paper proposes AgentNoiseBench, a framework for systematically evaluating the robustness of LLM-based agents under noisy conditions, motivated by the performance degradation such agents show in real-world environments. The authors analyze the noise that arises from users and from tools, inject it into existing benchmarks in a controllable way, and evaluate the resulting performance changes across a broad range of LLM agents. The results show that current agents are sensitive to realistic noise and that their performance fluctuates under it.
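The idea of injecting user-side and tool-side noise in a controllable way can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the function names `add_user_noise` and `add_tool_noise`, the typo-swap and field-dropping noise types, and the `rate` parameter are all hypothetical choices made for this example.

```python
import random
from typing import Any


def add_user_noise(message: str, rate: float, rng: random.Random) -> str:
    """Hypothetical user-side noise: swap adjacent letters (typo-style).

    Each eligible pair of adjacent letters is swapped with probability `rate`.
    """
    chars = list(message)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so each character moves at most once
        else:
            i += 1
    return "".join(chars)


def add_tool_noise(result: dict[str, Any], rate: float, rng: random.Random) -> dict[str, Any]:
    """Hypothetical tool-side noise: drop each field with probability `rate`."""
    return {key: value for key, value in result.items() if rng.random() >= rate}


# A seeded generator makes the injected noise reproducible across runs.
rng = random.Random(0)
noisy_query = add_user_noise("book a flight to Seoul", rate=0.2, rng=rng)
noisy_result = add_tool_noise({"status": "ok", "price": 420}, rate=0.3, rng=rng)
```

Passing the noise rate and the random generator explicitly is what makes the perturbation controllable in this sketch: the same seed and rate reproduce the same noisy inputs, and `rate=0.0` recovers the clean benchmark.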

🔑 Implications and Limitations

• Emphasizes the importance of evaluating LLM agent robustness in a way that accounts for the non-ideal, imperfect nature of real-world environments.
• Classifies the types of noise that arise from users and tools and implements them as a controllable benchmark, laying a foundation for future research in this area.
• Shows how performance changes across a variety of model architectures and scales, offering insight into how well current LLM agents adapt to realistic environments.
• (Limitations or future work) A deeper analysis is needed of the mechanisms by which agent performance degrades as a function of noise type and intensity, along with research on new training and evaluation methodologies for building noise-robust agents.
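The analysis by noise type and intensity called for in the last point could start from a simple aggregation like the sketch below. The record layout `(noise_type, intensity, success)`, the function names, and the `("none", 0.0)` key for the clean condition are assumptions made for illustration; they do not come from the paper.

```python
from collections import defaultdict


def success_rate_by_condition(records):
    """Aggregate task success per (noise_type, intensity) condition.

    `records` is an iterable of (noise_type, intensity, success) tuples,
    a hypothetical layout chosen for this sketch.
    """
    totals = defaultdict(lambda: [0, 0])  # condition -> [successes, runs]
    for noise_type, intensity, success in records:
        bucket = totals[(noise_type, intensity)]
        bucket[0] += int(success)
        bucket[1] += 1
    return {cond: wins / runs for cond, (wins, runs) in totals.items()}


def degradation(rates, clean_key=("none", 0.0)):
    """Absolute drop in success rate of each noisy condition versus the clean one."""
    clean = rates[clean_key]
    return {cond: clean - rate for cond, rate in rates.items() if cond != clean_key}
```

Feeding per-episode outcomes into `success_rate_by_condition` and then into `degradation` yields one number per noise condition, which can then be compared across noise types or plotted against intensity.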