This work integrates MPC with FL, specifically by using Sharemind MPC for secure FL aggregation. This repository builds on SecreC, the domain-specific language for Sharemind MPC. To run the experiments, you therefore need Sharemind MPC, its standard library, and several licensed components (see below). Please refer to, or get in touch via, https://docs.sharemind.cyber.ee/ to set up this environment. You also need Docker.
By default, there are three FL clients, one control server, and three Sharemind MPC servers, and training runs for two hundred communication rounds. You can set a different number of communication rounds in conf.yml. You can also specify the number of clients there; however, the code currently supports only 3 clients.
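As a rough illustration of what secure aggregation across three MPC servers means, the sketch below simulates additive secret sharing in plain Python: each client splits its fixed-point-encoded weights into three random shares, each server sums the shares it receives, and only the aggregated sum is reconstructed. This is not the repository's SecreC code or Sharemind's actual protocol; the ring size, the scaling factor, and all function names are assumptions made for this example.

```python
import secrets

MODULUS = 2 ** 64   # ring size (assumption for this sketch)
SCALE = 2 ** 16     # fixed-point scaling factor (assumption for this sketch)
NUM_SERVERS = 3     # matches the three MPC servers described above


def encode(x: float) -> int:
    """Map a float to a fixed-point element of the ring Z_MODULUS."""
    return round(x * SCALE) % MODULUS


def decode(v: int) -> float:
    """Map a ring element back to a float; the upper half of the ring is negative."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def share(value: int) -> list[int]:
    """Split a ring element into NUM_SERVERS additive shares that sum to it."""
    shares = [secrets.randbelow(MODULUS) for _ in range(NUM_SERVERS - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_average(client_weights: list[list[float]]) -> list[float]:
    """Average the clients' weight vectors without reconstructing any single one."""
    n_clients = len(client_weights)
    n_params = len(client_weights[0])
    # server_sums[s][p]: running sum of the shares server s holds for parameter p
    server_sums = [[0] * n_params for _ in range(NUM_SERVERS)]
    for weights in client_weights:
        for p, w in enumerate(weights):
            for s, sh in enumerate(share(encode(w))):
                server_sums[s][p] = (server_sums[s][p] + sh) % MODULUS
    # Only the per-parameter totals are opened, never an individual client's value.
    averaged = []
    for p in range(n_params):
        total = sum(server_sums[s][p] for s in range(NUM_SERVERS)) % MODULUS
        averaged.append(decode(total) / n_clients)
    return averaged
```

For example, `secure_average([[0.5, -1.0], [1.5, 2.0], [1.0, -4.0]])` averages three hypothetical two-parameter client updates. A single server's share on its own is a uniformly random ring element and reveals nothing about the client's weights.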
The instructions below are intended only for reproducing the results of the paper. This repository assumes that users have three independent machines under their own control. Therefore, some procedures (e.g., key exchange) are simplified and do not reflect a real deployment.
Also, to follow the instructions below with Sharemind MPC, some licensed components are needed (in addition to the SecreC Standard Library). These are:
- Sharemind Application Server (for the license.p7b)
- SecreC Analytics Library (for analytics_common.sc and option.sc)
- Sharemind CSV Importer
- Create a virtual env and install Python libraries.

      $ python3 -m venv venv
      $ source ./venv/bin/activate
      $ pip install -r ml.requirements.txt

- Create necessary folders.

      $ ./init.sh

- Prepare the local datasets. See below for dataset details.

      # For MNIST, (Fashion-MNIST), CIFAR10 (the script downloads the data and splits it into train & test datasets)
      (venv)$ python ./src/ML/split.py

      # For CASA (Human Activity Recognition from Continuous Ambient Sensor Data)
      # Download the dataset (127 MB):
      wget http://archive.org/download/train_20211025/train.csv
      # or
      curl -L http://archive.org/download/train_20211025/train.csv -o train.csv
      (venv)$ python ./src/ML/create_data_partitions.py

- Generate the model shape.

      (venv)$ python ./src/ML/gen_sample.py

  This script generates ./client/model.txt, which is sent to each Sharemind server to describe the model architecture so that the servers can create the initial model.
- Generate the tailored access control configurations based on the model.

      (venv)$ python ./src/ML/gen_access_ctl.py

- Create keys.

      $ ./fake-key-change.sh

- Add the license-dev.p7b to each ./server/server*/. The Sharemind servers can run at this point, but we still need the tailored access control.
- Exchange keys and send all necessary files to each server. Before running the command below, enable SSH communication between the host and each machine.

      $ ./init-cluster.sh

- Run Sharemind MPC.

      # Log in to each server, move to the project folder, and run
      $ sh run.sh

- Run FL with Sharemind MPC.

      # This splits the terminal into four panes: three for the FL clients and one for the control server.
      # The Python environments are already activated.
      # The first time, one pane starts building a Docker container, which takes a while.
      /client$ ./open_tmux.sh
      # Switch to a client pane (Ctrl+b, then an arrow key) and start FL.
      $ python ./src/ML/main.py

- You can check the scores (http://172.20.0.3:8443/status) or each graph.
- To create a comparison graph, save the content of http://172.20.0.3:8443/status as ./result/FL-Sharemind/result.json.
- If necessary, you can also run central learning and plain FL (without Sharemind) for comparison.

      $ python3 src/ML/central.py
      $ python3 src/ML/FL-plain.py
      # plotting the results
      $ python3 src/ML/result_plot.py

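Saving the status page for the comparison graph can also be scripted. The following is a minimal sketch, assuming the status endpoint returns a JSON document (the instructions store it as result.json, but the payload format is not described here); the function name `save_status` is hypothetical and not part of the repository.

```python
import json
import urllib.request
from pathlib import Path

STATUS_URL = "http://172.20.0.3:8443/status"               # URL from the steps above
RESULT_PATH = Path("./result/FL-Sharemind/result.json")    # path from the steps above


def save_status(url: str = STATUS_URL, path: Path = RESULT_PATH) -> Path:
    """Fetch the status document and store it where the plotting step expects it."""
    with urllib.request.urlopen(url, timeout=10) as response:
        payload = response.read().decode("utf-8")
    data = json.loads(payload)  # assumption: the endpoint returns JSON
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return path
```

Calling `save_status()` with no arguments uses the URL and path given in the instructions; both can be overridden if the control server runs at a different address.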
- MNIST, Fashion-MNIST, and CIFAR10
  These three are downloaded via the Keras library when ./src/ML/split.py is executed.
- Human Activity Recognition from Continuous Ambient Sensor Data (CASA)
  The original dataset is available online, but we use the pre-processed version prepared by one of our authors, Sadi AlAwadi (Halmstad University).
Currently, there is at least one crucial bug. The first time you run the program, it crashes at the end of the first communication round. Please run it again; the problem disappears on the second run.
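Until the bug is fixed, the rerun can be automated with a small wrapper. This is only a sketch: it assumes the crash shows up as a non-zero exit status, and the helper `run_with_retry` is hypothetical, not part of the repository.

```python
import subprocess
import sys


def run_with_retry(command: list[str], max_attempts: int = 2) -> int:
    """Run a command, rerunning it if it exits with a non-zero status.

    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return attempt
        print(
            f"attempt {attempt} failed with exit code {result.returncode}",
            file=sys.stderr,
        )
    raise RuntimeError(f"command failed after {max_attempts} attempts")
```

For example, `run_with_retry([sys.executable, "./src/ML/main.py"])` would start the FL client and rerun it once if the first run crashes.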