COGNITION: From Evaluation to Defense against Multimodal LLM CAPTCHA Solvers

Written by
  • Haebom

Authors

Junyu Wang, Changjia Zhu, Yuanbo Zhou, Lingyao Li, Xu He, Mingkui Wei, Junjie Xiong

💡 Overview

This paper analyzes how multimodal large language models (MLLMs) weaken the security of existing visual CAPTCHAs. The authors evaluate seven commercial and open-source MLLMs on 18 real-world CAPTCHA task types and find that MLLMs can solve certain CAPTCHA types at human-level cost and latency. Building on these results, they propose measures for hardening CAPTCHAs and present cases in which CAPTCHA types that are vulnerable in practice are successfully defended.

🔑 Implications and Limitations

• MLLMs achieve high solve rates on recognition-based, low-interaction CAPTCHA tasks, at cost and latency comparable to those of humans.
• When designing CAPTCHAs, strengthening the requirements for fine-grained localization, multi-step spatial reasoning, and cross-frame consistency can be an effective defense against MLLM-based attacks.
• Current MLLMs still struggle with CAPTCHAs that require fine-grained localization, complex spatial reasoning, or consistency across multiple frames.