
OmniVL-Guard Pro: A Tool-Augmented Agent for Omnibus Vision-Language Forensics

Posted by
  • Haebom

Authors

Jinjie Shen, Zheng Huang, Yuchen Zhang, Yujiao Wu, Yaxiong Wang, Lechao Cheng, Shengeng Tang, Tianrui Hui, Nan Pu, Zhun Zhong

💡 Overview

Existing vision-language forgery detection and grounding methods follow a closed-world assumption, in which verification is completed by the model alone. Because of the model's own limitations, they struggle with real-time verification and fine-grained manipulation analysis in real open-world settings. To overcome this, the paper proposes OmniVL-Guard Pro, which draws on a range of external tools, such as real-time event search, image zoom-in, and anomaly detection, to enable open-world, clue-based reasoning that goes beyond closed-world prediction.
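The tool-augmented loop described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the tool names (`search_events`, `zoom_in`, `detect_anomalies`), the `Step` record, and the scripted policy are hypothetical stand-ins, not the paper's actual interface. The tools are stubs that return placeholder strings, where a real system would call a search backend or an image model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stub tools; names and behavior are illustrative assumptions.
def search_events(query: str) -> str:
    return f"[stub] search results for: {query}"

def zoom_in(region: str) -> str:
    return f"[stub] enlarged view of: {region}"

def detect_anomalies(target: str) -> str:
    return f"[stub] anomaly scan of: {target}"

TOOLS: Dict[str, Callable[[str], str]] = {
    "search_events": search_events,
    "zoom_in": zoom_in,
    "detect_anomalies": detect_anomalies,
}

@dataclass
class Step:
    tool: str
    argument: str
    observation: str

Action = Optional[Tuple[str, str]]

def run_agent(policy: Callable[[List[Step]], Action], max_steps: int = 5) -> List[Step]:
    """Ask the policy for a tool call, run it, and record the observation."""
    history: List[Step] = []
    for _ in range(max_steps):
        action = policy(history)
        if action is None:  # the policy decides it has enough evidence
            break
        tool_name, argument = action
        tool = TOOLS.get(tool_name)
        observation = tool(argument) if tool else f"unknown tool: {tool_name}"
        history.append(Step(tool_name, argument, observation))
    return history

def scripted_policy(history: List[Step]) -> Action:
    """Fixed script standing in for a learned model that chooses tools."""
    script = [
        ("search_events", "claimed event in the caption"),
        ("zoom_in", "top-left region"),
        ("detect_anomalies", "face region"),
    ]
    return script[len(history)] if len(history) < len(script) else None

if __name__ == "__main__":
    for step in run_agent(scripted_policy):
        print(step.tool, "->", step.observation)
```

In a real agent the policy would be the vision-language model itself, choosing the next tool from the evidence gathered so far; the fixed script only keeps the sketch self-contained.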

🔑 Implications and Limitations

• Extending vision-language forensics through tool use: Moving beyond the limits of a closed model, the paper presents a new paradigm that integrates external tools to address complex, dynamic real-world forgery detection problems.
• Stronger reasoning and zero-shot generalization: Tree-Structured Self-Evolving Tool Trajectory Generation and Checker-Guided Agentic Reinforcement Learning (CGARL) are used to generate high-quality tool-use trajectories and to learn to tell correct outcomes apart from distorted reasoning, yielding strong reasoning ability and zero-shot generalization.
• Practical applicability in open-world settings: The method demonstrates state-of-the-art performance across a range of experiments, and the dataset and code, which are to be released, are expected to contribute to progress in vision-language forensics.
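One way to picture the idea of separating correct outcomes from distorted reasoning is a reward that a checker can discount. The sketch below is only an assumption about how such gating might look: the summary does not describe the actual CGARL reward, and the function name `gated_reward` and the `penalty` value are hypothetical.

```python
def gated_reward(answer_correct: bool, reasoning_ok: bool, penalty: float = 0.5) -> float:
    """Illustrative reward: full credit needs a correct answer AND sound reasoning.

    `reasoning_ok` stands in for a checker's verdict on the reasoning trace.
    The penalty value is an arbitrary assumption, not taken from the paper.
    """
    if not answer_correct:
        return 0.0
    if reasoning_ok:
        return 1.0
    # Correct answer reached through flawed reasoning: reduced credit.
    return 1.0 - penalty
```

Under this assumed scheme, a trajectory that reaches the right verdict through a flawed chain of tool calls earns less than one that reaches it soundly, so the policy is not rewarded for lucky guesses.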