
Market-Bench: Benchmarking Large Language Models on Economic and Trade Competition

Created by
  • Haebom

Authors

Yushuo Zheng (Shanghai Jiao Tong University, Shanghai Artificial Intelligence Laboratory), Huiyu Duan (Shanghai Jiao Tong University), Zicheng Zhang (Shanghai Jiao Tong University, Shanghai Artificial Intelligence Laboratory), Yucheng Zhu (Shanghai Jiao Tong University), Xiongkuo Min (Shanghai Jiao Tong University), Guangtao Zhai (Shanghai Jiao Tong University, Shanghai Artificial Intelligence Laboratory)

💡 Overview

This paper proposes Market-Bench, a comprehensive benchmark that addresses an open question: how effectively LLMs manage and acquire resources in economic and trade competition. It casts each LLM as a retailer agent inside a supply-chain economic model and evaluates two stages: procuring goods in budget-constrained auctions, and the retail stage, which covers price setting, marketing slogan generation, and sales management. The benchmark results show large performance gaps between LLMs and a "winner-take-all" pattern in which only a few models achieve sustained capital growth. This points to a mismatch between semantic similarity scores and actual economic performance.
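To make the two stages concrete, here is a minimal toy sketch of one procurement-then-retail round. It is not the paper's implementation: the first-price sealed-bid rule, the fixed demand, and the names `Retailer`, `run_auction`, and `sell` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Retailer:
    """Hypothetical retailer agent state (not taken from the paper)."""
    name: str
    capital: float
    inventory: int = 0


def run_auction(retailers, bids, units):
    """Toy first-price sealed-bid auction for one lot of `units` goods.

    `bids` maps retailer name -> total bid for the lot. A bid that is
    non-positive or exceeds the bidder's capital violates the budget
    constraint and is ignored. Returns the winner, or None if no bid is valid.
    """
    valid = [
        (bids[r.name], r)
        for r in retailers
        if r.name in bids and 0 < bids[r.name] <= r.capital
    ]
    if not valid:
        return None
    price, winner = max(valid, key=lambda pair: pair[0])
    winner.capital -= price      # winner pays its own bid
    winner.inventory += units
    return winner


def sell(retailer, unit_price, demand):
    """Retail stage: sell up to `demand` units at `unit_price`."""
    sold = min(retailer.inventory, demand)
    retailer.inventory -= sold
    retailer.capital += sold * unit_price
    return sold


if __name__ == "__main__":
    a, b = Retailer("A", 100.0), Retailer("B", 80.0)
    # B bids more than its capital, so only A's bid is valid.
    winner = run_auction([a, b], {"A": 60.0, "B": 90.0}, units=10)
    sold = sell(winner, unit_price=8.0, demand=7)
    print(winner.name, sold, winner.capital, winner.inventory)
```

In this toy setup, capital after the round is the quantity a benchmark of this kind would track over many rounds; an agent that overbids or misprices loses capital even if its slogans read well.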

🔑 Takeaways and Limitations

• Provides a new benchmark for systematically evaluating the potential and limits of LLMs in real economic activity and market competition.
• Presents a framework that automatically evaluates an LLM agent's procurement, pricing, and marketing-strategy abilities using a range of economic metrics.
• Shows that there is a gap between an LLM's semantic understanding and its economic decision-making ability in an actual market.
• Further research is needed to raise the complexity and realism of the economic model used in the benchmark so that it reflects more sophisticated, realistic market conditions.