
- Membership Inference: an attack that determines whether a given data sample was part of the model's training data. For example, with medical data, the mere fact that a person's record appears in the training set can itself be a serious privacy breach.
- The attack assumes: 1) the target model is a multi-class classification model, 2) you can obtain inputs and outputs from it through an ML-as-a-Service API, and 3) you know part of the target model's training dataset.
- The algorithm for the Membership Inference Attack is shown below.

(1) Build shadow models that mimic the output of the real model (the target model), using the same architecture if the target's architecture is known.
(2) Split the known training data into non-overlapping subsets and train one shadow model on each.
(3) Train an attack model over the whole dataset: given a sample's true label and the shadow model's prediction vector as input, it classifies whether the sample was present in that shadow model's training set ("in" or "out").
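The three steps above can be sketched as follows. This is a minimal illustration, not the paper's exact setup: the data is synthetic (`make_classification`), random forests stand in for both the shadow models and the attack model, and a single attack model is trained over all classes rather than one per class.

```python
# Sketch of the shadow-model membership inference pipeline.
# Hypothetical setup: synthetic data, RandomForest shadow/attack models.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=4000, n_features=20,
                           n_informative=10, n_classes=4, random_state=0)

n_shadows = 4
attack_X, attack_y = [], []
# (2) Split the known data into disjoint subsets, one per shadow model.
for part in np.array_split(rng.permutation(len(X)), n_shadows):
    half = len(part) // 2
    in_idx, out_idx = part[:half], part[half:]  # "in" = shadow training data
    # (1) Each shadow model mimics the target's task.
    shadow = RandomForestClassifier(random_state=0).fit(X[in_idx], y[in_idx])
    # (3) Record (prediction vector, membership label) pairs.
    for idx, member in ((in_idx, 1), (out_idx, 0)):
        attack_X.append(shadow.predict_proba(X[idx]))
        attack_y.append(np.full(len(idx), member))

attack_X = np.vstack(attack_X)
attack_y = np.concatenate(attack_y)
# Attack model: classifies "in" vs "out" from the prediction vector.
attack_model = RandomForestClassifier(random_state=0).fit(attack_X, attack_y)
```

At attack time, the target model's prediction vector for a candidate sample is fed to `attack_model.predict` to decide membership.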

Results:
High precision and recall on most datasets; the membership attack works well even in a black-box setting (where the model architecture is unknown and the attacker's prior assumptions about the training dataset are wrong).

The model's confidence is clearly different between members and non-members.
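This confidence gap can be observed directly on an overfit model. A small sketch, again with hypothetical synthetic data and a random forest (which memorizes its training set by default): mean maximum softmax-style confidence is compared between training members and held-out non-members.

```python
# Sketch: an overfit model is more confident on its training members
# than on unseen non-members (hypothetical synthetic data).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20,
                           n_informative=10, n_classes=4, random_state=0)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5,
                                            random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_in, y_in)

# Mean top-class confidence for members vs non-members.
conf_member = model.predict_proba(X_in).max(axis=1).mean()
conf_nonmember = model.predict_proba(X_out).max(axis=1).mean()
print(conf_member, conf_nonmember)  # members score noticeably higher
```

It is exactly this gap that the attack model exploits: the shape of the prediction vector leaks whether the sample was seen during training.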
