problem : As models grow, they start memorizing their training data. The paper quantitatively evaluates how much this memorization increases with 1) model size, 2) the number of times data is duplicated, and 3) the length of the given context.

conclusion :

  1. Model scale: Within a model family, larger models memorize 2-5ร— more data than smaller models.
  2. Data duplication: Examples repeated more often are more likely to be extractable.
  3. Context: It is orders of magnitude easier to extract sequences when given a longer surrounding context. -> Read positively, this means an adversarial attack is correspondingly harder to mount. Practitioners building language generation APIs could (until stronger attacks are developed) significantly reduce extraction risk by restricting the maximum prompt length available to users.

details :

  • ์ด์ „ ๋…ผ๋ฌธ์—์„œ ํ•™์Šต ๋ฐ์ดํ„ฐ๋ฅผ memorization ํ•˜๋Š” ๋น„์œจ์ด ํ•™์Šต ๋ฐ์ดํ„ฐ์˜ 0.00000015%๋ผ๊ณ  ํ–ˆ์ง€๋งŒ, ์ด ๋…ผ๋ฌธ์„ ํ†ตํ•ด ์ตœ์†Œ 1%์˜ ํ•™์Šต ๋ฐ์ดํ„ฐ๋ฅผ memorization ํ•œ ๊ฒƒ์„ ํ™•์ธํ–ˆ๋‹ค.
  • memorization์„ ์ •์˜ํ•˜๋Š” ๊ฑด ๋Œ€์ถฉ ์„ธ๊ฐ€์ง€๊ฐ€ ์žˆ๋Š”๋“ฏ
  1. One leading general memorization definition is differential privacy (Dwork et al., 2006), which is formulated around the idea that removing any userโ€™s data from the training set should not change the trained model significantly.
  2. counterfactual memorization (Feldman and Zhang, 2020; Zhang et al., 2021)
  3. k๊ฐœ์˜ context token์ด ์ฃผ์–ด์กŒ์„ ๋•Œ, greedy decoding์„ ํ†ตํ•ด ๋‚˜์˜ค๋Š” string s๊ฐ€ training data๋‚ด์— ์žˆ๋Š” ๊ฒฝ์šฐ <- ๋ณธ ๋…ผ๋ฌธ์—์„œ ์ฑ„ํƒํ•œ ์ •์˜ if a modelโ€™s training dataset contains the sequence โ€œMy phone number is 555-6789โ€, and given the length k = 4 prefix โ€œMy phone number isโ€, the most likely output is โ€œ555-6789โ€, then we call this sequence extractable (with 4 words of context).
  • Since querying with every full sequence is practically impossible, 50,000 queries were sampled: for sequence lengths of 50, 100, …, 500, 1,000 sequences were drawn per duplication count of the repeated sequence.
  • ๋ชจ๋ธ์€ GPT-Neo(125M, 1.3B, 2.7B, 6B), ๋ฐ์ดํ„ฐ์…‹์€ Pile dataset(825GB, ์ฑ…, ์›น, ์˜คํ”ˆ์†Œ์Šค ์ฝ”๋“œ)์„ ์‚ฌ์šฉํ•˜์˜€๋‹ค. ๋ชจ๋ธ๊ณผ ๋ฐ์ดํ„ฐ์…‹์€ ๊ณต๊ฐœ๋œ ๊ฒƒ๋“ค ์ค‘ ๊ฐ€์žฅ ํฐ ๊ฒƒ๋“ค์ด๋‹ค. ์ด ๋•Œ, ๋ชจ๋ธํฌ๊ธฐ - memorization ๊ด€๊ณ„๋Š” log-linearํ•จ.
  • beam search(b=100)์„ ํ•ด๋„ ์•„์ฃผ ์กฐ๊ธˆ extracted memorization์ด ๋Š˜์—ˆ๋‹ค. (ํ‰๊ท  2%, ์ตœ๋Œ€ 5.6%) 45%์˜ ๊ฒฝ์šฐ beam search์™€ greedy๋Š” ๊ฐ™์€ ๊ฒฐ๊ณผ๋ฅผ ๋ƒˆ๋‹ค.
  • T5์™€ C4๋กœ๋„ ์‹คํ—˜์„ ์ง„ํ–‰. ์ด๋•Œ๋Š” masked LM์„ ์™„๋ฒฝํ•˜๊ฒŒ ๋ณต๊ตฌํ–ˆ์„ ๊ฒฝ์šฐ memorizationํ–ˆ๋‹ค๊ณ  ์ •์˜ํ–ˆ๋‹ค. ์ „์ฒด์ ์ธ ๊ฒฝํ–ฅ์€ GPT-Neo์™€ ๊ฐ™์•˜๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ๋ชจ๋ธํฌ๊ธฐ - memorization์€ non-linearํ•˜์ง€ ์•Š์•˜๊ณ , 140๋ฒˆ ์ดํ•˜๋กœ ๋ฐ˜๋ณต๋œ ์‹œํ€€์Šค๊ฐ€ (๋” ๋ฐ˜๋ณต๋œ ์‹œํ€€์Šค๋ณด๋‹ค) ์œ ์˜๋ฏธํ•˜๊ฒŒ ์™ธ์›Œ์งˆ ํ™•๋ฅ ์ด ๋†’์•˜๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ์ด๋Š” ํ•ด๋‹น ์‹œํ€€์Šค์— ๊ณต๋ฐฑ์ด ๋งŽ์•„์„œ ๋” ์‰ฌ์›Œ์„œ ๊ทธ๋žฌ๋‹ค(…)
  • 50 ํ† ํฐ ์ด์ƒ์˜ ์‹œํ€€์Šค์— ๋Œ€ํ•ด ๋ฐ˜๋ณต์„ ์ œ๊ฑฐํ•œ C4๋กœ๋„ ํ•™์Šต์„ ํ–ˆ๋Š”๋ฐ ์™ธ์šธ ํ™•๋ฅ ์ด 1/3 ์ค„์–ด๋“ค์—ˆ๋‹ค.

next papers :

  • training data extraction attacks (adversarial attacks on LMs)
  • GitHub Copilot: Parrot or crow?
  • Membership inference attacks against machine learning models.
  • Understanding unintended memorization in federated learning.
  • Calibrating noise to sensitivity in private data analysis

thinkings :

  • There could be a model that prevents extraction in a GAN-like way.
  • Isn't private information usually numeric…
  • It feels like augmentation could solve this in the end.. but maybe worth thinking about a more fundamental fix.
  • Apparently memorization != overfitting https://bair.berkeley.edu/blog/2019/08/13/memorization/
  • Could this be tackled by changing the decoding method?
  • Or does using teacher forcing lead to more memorization?