A Python package for building adversarial agents that perform membership inference attacks against machine learning models built with scikit-learn learners.
Implementation of the attack described by Shokri et al. (see the references below).
Example notebooks can be found in `notebooks/`.
The main classes and functions are:
Synthesizes data using only black-box access to the target model `target_model` and its predictions, following the algorithm proposed by Shokri et al.
```python
from mblearn import synthesize

x = synthesize(target_model, fixed_class, k_max)
```
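A minimal sketch of how this call might be driven, with a scikit-learn classifier standing in for the black-box target; the data, the model choice, and the parameter values below are illustrative assumptions, not part of the package.

```python
# Illustrative sketch: a scikit-learn classifier plays the black-box target.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from mblearn import synthesize

rng = np.random.default_rng(0)
X_private = rng.random((1000, 6))           # private training data (assumed)
y_private = rng.integers(0, 2, size=1000)   # two classes: 0 and 1

target_model = RandomForestClassifier().fit(X_private, y_private)

fixed_class = 1   # class the synthetic record should be accepted as
k_max = 3         # assumed: max features randomized per search step (Shokri et al.)
x = synthesize(target_model, fixed_class, k_max)   # one synthetic record
```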
Trains a set of shadow models on the synthesized data using the given learner, which must implement a `predict_proba` method (as scikit-learn classifiers do).
```python
from mblearn import ShadowModels

shadows = ShadowModels(n_models, data, target_classes, learner)
shadow_data = shadows.results
```
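A hedged sketch of how the constructor above might be filled in. The placeholder data, class list, and learner are illustrative assumptions, and the expected format of `data` is assumed; in practice it would come from `synthesize`.

```python
# Illustrative values for the ShadowModels call shown above.
import numpy as np
from sklearn.neural_network import MLPClassifier
from mblearn import ShadowModels

n_models = 4                             # number of shadow models to train
data = np.random.rand(2000, 6)           # stand-in for the synthesized records (format assumed)
target_classes = [0, 1]                  # classes exposed by the target model
learner = MLPClassifier(max_iter=500)    # any scikit-learn style learner

shadows = ShadowModels(n_models, data, target_classes, learner)
shadow_data = shadows.results            # training data for the attack models
```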
Using the data generated by the shadow models, trains an attack model for each label of the shadow dataset.
```python
from mblearn import AttackModels

attacker = AttackModels(target_classes, attack_learner)

# train the attacker with the shadow data
attacker.fit(shadow_data)

# query the target model and get the predicted class probability vector
X = target_model.predict_proba(test_data)

# speculate about the class this test_data belongs to
y = 0

# get the prediction:
# True if `test_data` is classified as a member of
# the target model's private training set for the given class,
# False otherwise
attacker.predict(X, y)
```
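For orientation, a hedged sketch of how the attacker could be scored against records whose membership is known. The `attack_accuracy` helper and its arguments are hypothetical; only the `predict_proba` query and `attacker.predict` call follow the usage shown above.

```python
import numpy as np

def attack_accuracy(attacker, target_model, records, record_classes, is_member):
    """Fraction of records whose membership status the attacker guesses right.

    records        -- candidate records (2D array, one row per record)
    record_classes -- class assumed for each record when querying the attacker
    is_member      -- ground-truth booleans (True = record was in the
                      target's private training set)
    """
    hits = 0
    for x, y, member in zip(records, record_classes, is_member):
        prob_vector = target_model.predict_proba(x.reshape(1, -1))  # black-box query
        guess = attacker.predict(prob_vector, y)                    # membership guess
        hits += int(bool(np.asarray(guess).ravel()[0]) == member)
    return hits / len(records)
```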
R. Shokri, M. Stronati, C. Song, and V. Shmatikov. Membership Inference Attacks Against Machine Learning Models. 2017 IEEE Symposium on Security and Privacy (SP), IEEE, 2017.
Y. Long, V. Bindschaedler, L. Wang, D. Bu, et al. Understanding Membership Inferences on Well-Generalized Learning Models. arXiv preprint arXiv:1802.04889, 2018.
S. Truex, L. Liu, M. E. Gursoy, L. Yu, and W. Wei. Towards Demystifying Membership Inference Attacks. arXiv preprint arXiv:1807.09173, 2018.
The package has not yet reached alpha maturity. It is a proof of concept, and both the interface and the internals are likely to change over the next few months.