AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science

Posted by
  • Haebom

Authors

An Luo, Jin Du, Xun Xian, Robert Specht, Fangqiao Tian, Ganghua Wang, Xuan Bi, Charles Fleming, Ashish Kundu, Jayanth Srinivasa, Mingyi Hong, Rui Zhang, Tianxi Li, Galin Jones, Jie Ding

💡 Overview

This paper introduces AgentDS, a benchmark and competition for evaluating the future of human-AI collaboration on domain-specific data science tasks. Using 17 challenges spanning 6 industries, including commerce, food production, and healthcare, it compares the performance of AI agents working alone against that of human-AI collaboration. The results show that current AI agents struggle with domain-specific reasoning, and that human-AI collaboration achieved the best performance.

🔑 Implications and Limitations

• Current AI agents alone struggle to match the performance of human experts on domain-specific data science tasks.
• Human expertise still plays an important role in data science, and collaboration with AI can create synergy.
• Future AI development should focus on strengthening domain-specific reasoning and on finding effective ways to collaborate with humans.