EuraGovExam: A Multilingual Multimodal Benchmark from Real-World Civil Service Exams

Posted by
  • Haebom

Authors

Jaeseong Kim, Chaehwan Lim, Sang Hyun Gil, Suan Lee

💡 Overview

This paper introduces EuraGovExam, a large-scale multilingual multimodal benchmark drawn from real-world civil service exams. The benchmark contains more than 8,000 actual exam questions from five regions (South Korea, Japan, Taiwan, India, and the European Union) and is characterized by complex layouts that combine in-image text with visual elements. Even state-of-the-art vision-language models (VLMs) reach only 86% accuracy, which shows that the benchmark is useful for diagnosing the limitations of current models.

🔑 Implications and Limitations

  • Reflects the complexity and cultural realism of real civil service exams, setting a new standard for VLM evaluation.
  • Demands visual complexity handling, multilingual support, and layout awareness at the same time, posing a challenge that goes beyond existing benchmarks.
  • Helps measure the performance of current state-of-the-art VLMs effectively and identify where they need to improve.
  • Because the EuraGovExam benchmark is difficult, further research and development may be needed for model training and fine-tuning.