SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning

Created by
  • Haebom

Authors

Borong Zhang, Yuhao Zhang, Jiaming Ji, Yingshan Lei, Yishuai Cai, Josef Dai, Yuanpei Chen, Yaodong Yang

💡 Overview

This paper addresses the safety problems that arise when Vision-Language-Action (VLA) models, which can serve as robot policies, are deployed in the real world. To this end, it proposes an Integrated Safety Approach (ISA) that systematically models safety requirements, actively elicits diverse unsafe behaviors, constrains VLA policies through constrained reinforcement learning, and assures safety through rigorous evaluation. ISA uses the constrained Markov decision process (CMDP) paradigm to optimize VLAs from a min-max perspective against safety risks.
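The min-max view of a CMDP can be illustrated with the standard Lagrangian relaxation: the policy maximizes return minus a penalty on cost above a budget, while a non-negative multiplier is raised whenever the budget is violated. The summary does not give the paper's exact objective or update rule, so the following is a minimal sketch of the generic technique only; the names `lagrangian_objective`, `update_multiplier`, `cost_budget`, and `step_size` are hypothetical and not taken from the paper.

```python
def lagrangian_objective(expected_return, expected_cost, lam, cost_budget):
    """Scalar objective the policy maximizes for a fixed multiplier `lam`.

    Assumption: the safety constraint has the form expected_cost <= cost_budget.
    """
    return expected_return - lam * (expected_cost - cost_budget)


def update_multiplier(lam, expected_cost, cost_budget, step_size=0.1):
    """One dual-ascent step on the multiplier.

    `lam` grows when the measured cost exceeds the budget and shrinks
    otherwise; it is projected back to zero so it never becomes negative.
    """
    return max(0.0, lam + step_size * (expected_cost - cost_budget))


# Illustrative usage with made-up cost measurements (not data from the paper).
lam = 0.0
for measured_cost in [0.8, 0.8, 0.1]:
    lam = update_multiplier(lam, measured_cost, cost_budget=0.2)
```

In a full training loop, the policy update (for example a policy-gradient step on `lagrangian_objective`) would alternate with `update_multiplier`, so the penalty tightens only while the constraint is being violated.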

🔑 Implications and Limitations

• The proposed ISA achieves an effective safety-performance trade-off: compared with the existing state-of-the-art method, it reduces the cumulative cost of safety violations by 83.58% while raising the task success rate by 3.85%.
• VLA models gain strong safety assurance capabilities that can mitigate long-term risks and extreme failure scenarios.
• The learned safety behaviors generalize robustly to a variety of out-of-distribution perturbations.
• The study evaluates effectiveness on long-horizon tasks such as mobile manipulation, and releases the associated data, models, and a new benchmark environment.