image

paper

  • Membership Inference: an attack that determines whether a given record was part of a model's training data. With medical data, for example, the mere fact that a specific record was used for training can already be a serious privacy leak.
  • The attack makes the following assumptions. 1) The model under attack is a multi-class classification model. 2) It is offered as ML as a Service, so the attacker can submit inputs and obtain outputs. 3) The attacker knows part of the target model's training dataset.
  • The Membership Inference Attack algorithm is as follows. image

(1) Define shadow models that imitate the outputs of the real model (the target model). If the target model's architecture is known, build the shadow models with the same architecture.
(2) Split the known training data into non-overlapping subsets and train one shadow model on each.
(3) Over the whole dataset, feed the true label and the shadow model's prediction as input, and train an attack model that classifies whether that sample was present in that shadow model's training data ("in") or not ("out").
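Steps (1) to (3) can be sketched in a few lines of plain Python. This is only a toy illustration under loud assumptions, not the paper's implementation: the "model" here simply memorizes its training set (so it overfits on purpose), the attack model is a single threshold on the true-class confidence rather than a trained classifier, and all names (`make_data`, `train`, `predict`, `build_attack_set`, `fit_attack`, `TEMPERATURE`) are hypothetical.

```python
import math
import random

TEMPERATURE = 0.02  # hypothetical sharpness of the toy model's confidences

def make_data(n, rng):
    """Toy 2-class records (features, label) with overlapping classes."""
    data = []
    for _ in range(n):
        label = rng.randrange(2)
        x = (rng.gauss(float(label), 1.0), rng.gauss(float(label), 1.0))
        data.append((x, label))
    return data

def train(records):
    """Stand-in for model training: memorize the records (overfits by design)."""
    return list(records)

def predict(model, x, n_classes=2):
    """Prediction vector: softmax over the negative distance to the nearest
    memorized point of each class."""
    scores = []
    for c in range(n_classes):
        dists = [math.dist(x, xi) for xi, yi in model if yi == c]
        scores.append(-min(dists) / TEMPERATURE if dists else -1e9)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def build_attack_set(known_data, n_shadow, rng):
    """Steps (1)-(3): train shadow models on disjoint subsets and record
    (true label, prediction vector, "in"/"out") rows for the attack model."""
    data = known_data[:]
    rng.shuffle(data)
    chunk = len(data) // n_shadow
    rows = []
    for k in range(n_shadow):
        part = data[k * chunk:(k + 1) * chunk]
        half = len(part) // 2
        shadow = train(part[:half])          # samples seen by this shadow model
        for x, y in part[:half]:
            rows.append((y, predict(shadow, x), "in"))
        for x, y in part[half:]:             # samples it never saw
            rows.append((y, predict(shadow, x), "out"))
    return rows

def fit_attack(rows):
    """Stand-in attack model: the threshold on true-class confidence that
    best separates "in" from "out" on the shadow rows."""
    best_t, best_acc = 0.5, 0.0
    for t in (i / 100 for i in range(1, 100)):
        acc = sum((p[y] >= t) == (m == "in") for y, p, m in rows) / len(rows)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def attack(threshold, y, pred_vector):
    return "in" if pred_vector[y] >= threshold else "out"

rng = random.Random(0)
rows = build_attack_set(make_data(800, rng), n_shadow=4, rng=rng)
threshold = fit_attack(rows)

# Evaluate against a separate "target" model the attacker only queries.
target_train = make_data(100, rng)
target_model = train(target_train)
non_members = make_data(100, rng)
hits = sum(attack(threshold, y, predict(target_model, x)) == "in"
           for x, y in target_train)
hits += sum(attack(threshold, y, predict(target_model, x)) == "out"
            for x, y in non_members)
accuracy = hits / (len(target_train) + len(non_members))
```

The attack only ever calls `predict` on the target, which mirrors the black-box query access assumed above. A real attack would replace `train` with actual model fitting and `fit_attack` with a learned binary classifier over the full prediction vector.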

image

results: high precision and recall on most datasets. The membership inference attack also works well in a black-box setting, i.e. when the attacker does not know the model, and even when the attacker's prior assumptions about the dataset are wrong.
image
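For reference, precision and recall of a membership attack can be computed as in the sketch below. Precision is the fraction of records inferred as members that really are members, and recall is the fraction of true members that the attack identifies. The function name `precision_recall` and the six example labels are made up for illustration and are not results from the paper.

```python
def precision_recall(truth, inferred):
    """truth / inferred: one "in" or "out" label per record."""
    pairs = list(zip(truth, inferred))
    tp = sum(t == "in" and p == "in" for t, p in pairs)    # members found
    fp = sum(t == "out" and p == "in" for t, p in pairs)   # false alarms
    fn = sum(t == "in" and p == "out" for t, p in pairs)   # members missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical ground truth and attack output for six records.
truth    = ["in", "in", "in",  "out", "out", "out"]
inferred = ["in", "in", "out", "in",  "out", "out"]
precision, recall = precision_recall(truth, inferred)
```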

The prediction confidence differs clearly between members and non-members. image
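One simple way to quantify this gap is to compare the mean confidence assigned to the true class for each group. The sketch below assumes prediction vectors are already available. The helper `mean_true_class_confidence` and the vectors are hypothetical, chosen only to illustrate the computation, and are not measurements from the paper.

```python
from statistics import mean

def mean_true_class_confidence(records):
    """records: (true label index, prediction vector) pairs."""
    return mean(vector[label] for label, vector in records)

# Hypothetical prediction vectors over three classes.
members = [
    (0, [0.97, 0.02, 0.01]),
    (2, [0.01, 0.04, 0.95]),
    (1, [0.05, 0.90, 0.05]),
]
non_members = [
    (0, [0.55, 0.30, 0.15]),
    (2, [0.40, 0.35, 0.25]),
    (1, [0.20, 0.60, 0.20]),
]

gap = mean_true_class_confidence(members) - mean_true_class_confidence(non_members)
```

A large positive `gap` is the signal an attack model exploits. When the two groups have similar confidence, the attack has little to work with.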