
Task-conditioned probing of instruction-tuned multimodal LLMs: Region-specific brain alignment patterns under naturalistic stimuli

Posted by
  • Haebom

Authors

Subba Reddy Oota, Khushbu Pahwa, Prachi Jindal, Satya Sai Srinath Namburi, Maneesh Singh, Tanmoy Chakraborty, Bapi S. Raju, Manish Gupta

💡 Overview

This study examines how well instruction-tuned multimodal large language models (IT-MLLMs) align with brain activity recorded during naturalistic movie viewing. Using a range of video and audio IT-MLLMs, the authors analyze brain alignment patterns under 13 video task instructions and find that IT-MLLMs predict brain activity better than existing models. In particular, IT-MLLMs produce representations that differ by task and show high alignment with specific brain regions.
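To make "predicting brain activity" concrete, here is a minimal sketch of one common way such alignment is scored: fit a regularized linear map from model representations to brain responses, then correlate predictions with held-out responses. This summary does not state the paper's exact procedure, so the ridge regression, the contiguous train/test split, the function names (`ridge_fit`, `pearson_per_column`, `brain_alignment`), and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def ridge_fit(X, Y, alpha):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

def pearson_per_column(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    return (A * B).sum(axis=0) / denom

def brain_alignment(features, voxels, alpha=1.0, train_frac=0.8):
    """Fit on the first part of the time series, score on the held-out rest.

    features: (time, n_features) model representations (hypothetical input)
    voxels:   (time, n_voxels) brain responses (hypothetical input)
    Returns one correlation per voxel.
    """
    split = int(len(features) * train_frac)
    # Standardize features using training statistics only.
    mu = features[:split].mean(axis=0)
    sigma = features[:split].std(axis=0) + 1e-8
    X_train = (features[:split] - mu) / sigma
    X_test = (features[split:] - mu) / sigma
    y_mu = voxels[:split].mean(axis=0)
    W = ridge_fit(X_train, voxels[:split] - y_mu, alpha)
    pred = X_test @ W + y_mu
    return pearson_per_column(pred, voxels[split:])

# Synthetic stand-in data (an assumption; not the paper's recordings).
rng = np.random.default_rng(0)
n_timepoints, n_features, n_voxels = 400, 16, 5
features = rng.normal(size=(n_timepoints, n_features))
true_weights = rng.normal(size=(n_features, n_voxels))
voxels = features @ true_weights + 0.5 * rng.normal(size=(n_timepoints, n_voxels))

scores = brain_alignment(features, voxels)
# Control: break the time correspondence between features and responses.
shuffled = brain_alignment(rng.permutation(features), voxels)
print("aligned:", scores.round(2), "shuffled:", shuffled.round(2))
```

Under this setup, comparing models or task instructions would amount to swapping in a different `features` matrix and comparing the per-voxel scores region by region. The shuffled control gives a rough baseline for what an unrelated representation would score.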

🔑 Implications and Limitations

• Instruction-tuning strengthens the brain alignment of multimodal large language models (MLLMs), which suggests that these models organize their representations according to functional task demands rather than surface-level semantics alone.
• IT-MLLMs form task-specific representations that align with different brain regions, offering important clues about how information is processed in the human brain and in MLLMs.
• In-context learning (ICL) models correlated strongly with text semantics, whereas IT models correlated only weakly with the semantics of the instruction text. This indicates that task-conditioned separation of the representation space is associated with increased brain alignment.
• Future work could use a wider range of naturalistic stimuli and task instructions to analyze the brain alignment mechanisms of IT-MLLMs in more depth, and could apply these models to areas such as brain-computer interface development.