When Critics Disagree: Adaptive Reward Poisoning Attacks in RIS-Aided Wireless Control System

Written by
  • Haebom

Authors

Deemah H. Tashman, Soumaya Cherkaoui

💡 Overview

๋ณธ ๋…ผ๋ฌธ์€ ์žฌ๊ตฌ์„ฑ ๊ฐ€๋Šฅํ•œ ์ง€๋Šฅ ํ‘œ๋ฉด(RIS)์œผ๋กœ ์ง€์›๋˜๋Š” ๋ฌด์„  ์ œ์–ด ์‹œ์Šคํ…œ์—์„œ ๋ฐœ์ƒํ•˜๋Š” ํ•™์Šต ๊ธฐ๋ฐ˜ ์‹œ์Šคํ…œ์˜ ์œ„ํ—˜์„ฑ์„ ์ค„์ด๊ธฐ ์œ„ํ•ด, Soft Actor-Critic (SAC) ์—์ด์ „ํŠธ๋ฅผ ๋Œ€์ƒ์œผ๋กœ ํ•˜๋Š” Disagreement-Guided Reward Poisoning (DGRP) ๊ณต๊ฒฉ ๊ธฐ๋ฒ•์„ ์ œ์•ˆํ•ฉ๋‹ˆ๋‹ค. DGRP๋Š” SAC์˜ ์ด์ค‘ ๋น„ํ‰๊ฐ€(dual critics) ๊ฐ„์˜ ๋ถˆ์ผ์น˜๊ฐ€ ํฐ ์ƒํƒœ์—์„œ ๋ณด์ƒ์„ ์กฐ์ž‘ํ•˜์—ฌ, ๊ฐ€์น˜ ์ถ”์ •์„ ์™œ๊ณกํ•˜๊ณ  ์ •์ฑ…์„ ์ตœ์ ํ™”๋˜์ง€ ์•Š์€ ๋ฐฉํ–ฅ์œผ๋กœ ์œ ๋„ํ•ฉ๋‹ˆ๋‹ค. ์‹คํ—˜ ๊ฒฐ๊ณผ, DGRP๋Š” RIS์˜ ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ํฌ๊ฒŒ ์ €ํ•ดํ•˜๊ณ  ํ†ต์‹  ํ’ˆ์งˆ์„ ์ €ํ•˜์‹œํ‚ค๋Š” ๊ฒƒ์œผ๋กœ ๋‚˜ํƒ€๋‚ฌ์Šต๋‹ˆ๋‹ค.

🔑 Implications and Limitations

• Robustness evaluations of deep reinforcement learning agents in RIS-aided wireless control systems must account for attacks that exploit disagreement between the critics.
• DGRP causes greater performance degradation than conventional reward-manipulation attacks and can effectively neutralize the benefits of the RIS.
• Further analysis of how the attack parameters influence the outcome could help optimize DGRP's effectiveness.
• The study validates the proposed DGRP attack mainly through theoretical analysis and simulation; its applicability to real systems and possible defense strategies still need to be studied.