Voice, Bias, and Coreference: An Interpretability Study of Gender in Speech Translation

Created by Haebom

Authors

Lina Conti, Dennis Fucci, Marco Gaido, Matteo Negri, Guillaume Wisniewski, Luisa Bentivogli

💡 Overview

This study investigates how a speaker's vocal characteristics affect gender assignment in speech translation (ST) models. In particular, it analyzes how ST models use acoustic information to assign gender to terms that refer to the speaker when translating from a language such as English, whose word forms do not mark gender, into languages with grammatical gender. The results show that the models do not simply replicate the gender associations found in their training data. Instead, gender assignment arises from an interaction between acoustic information and the bias of the internal language model, and the more accurate models rely on a new mechanism that draws on acoustic information distributed across the frequency spectrum.

🔑 Implications and Limitations

• Speech translation models go beyond reproducing the gender bias of their training data: they determine gender assignment through the interaction of vocal characteristics with the internal language model.
• Models with high gender accuracy do not rely on pitch alone. They use first-person pronouns to link gendered terms to the speaker, and they draw on acoustic information from across the frequency spectrum.
• The study is limited to specific language pairs (en-es/fr/it), and further investigation is needed into gender assignment mechanisms across a wider range of vocal characteristics and linguistic contexts.