Cleaned up the source code of AsyncFilter (#378)
* Cleaned up the code.

* Added configuration files and a readme file.

* Cleaned up the comments.

* Updated the readme.

* Update examples.md
Yufei-Kang authored Oct 1, 2024
1 parent 399cfe2 commit 48d4660
Showing 12 changed files with 1,106 additions and 224 deletions.
13 changes: 13 additions & 0 deletions docs/examples.md
@@ -538,6 +538,19 @@ FedSaw is proposed to improve training performance in three-layer federated learning
python examples/three_layer_fl/fedsaw/fedsaw.py -c examples/three_layer_fl/fedsaw/fedsaw_MNIST_lenet5.yml
```
````
#### Poisoning Detection Algorithms
````{admonition} **AsyncFilter**
AsyncFilter is proposed to defend against untargeted poisoning attacks in asynchronous federated learning with a server filter. With statistical analysis, AsyncFilter identifies potential poisoned model updates and filters them out before the server aggregation stage.
```shell
python examples/detector/detector.py -c examples/detector/asyncfilter_fashion_6.yml
```
```{note}
Kang et al., “[AsyncFilter: Detecting Poisoning Attacks in Asynchronous Federated Learning](http://iqua.ece.toronto.edu/papers/ykang-middleware25.pdf),” in the Proceedings of the 25th ACM/IFIP International Middleware Conference (Middleware), 2024.
```
````
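The filtering idea can be sketched in a few lines. The snippet below is a conceptual illustration only, not Plato's AsyncFilter implementation; the `filter_updates` helper, the median-distance score, and the `z_threshold` cutoff are assumptions made for this example.

```python
# Conceptual sketch of statistical update filtering (not Plato's AsyncFilter code):
# score each flattened client update by its distance to the coordinate-wise median
# and drop the statistical outliers before aggregation.
import torch


def filter_updates(updates, z_threshold=2.0):
    """Keep only the updates whose distance to the median is not an outlier."""
    stacked = torch.stack(updates)                    # (num_clients, num_params)
    median = stacked.median(dim=0).values             # robust reference update
    distances = torch.norm(stacked - median, dim=1)   # one distance per client
    z_scores = (distances - distances.mean()) / (distances.std() + 1e-12)
    return [u for u, z in zip(updates, z_scores) if z < z_threshold]
```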

#### Model Pruning Algorithms

83 changes: 83 additions & 0 deletions examples/detector/README.md
@@ -0,0 +1,83 @@
# Reproducing AsyncFilter

## Setting up your Python environment

It is recommended that [Miniforge](https://github.com/conda-forge/miniforge) be used to manage Python packages. Before using *Plato*, first install Miniforge, update your `conda` environment, and then create and activate a new `conda` environment with Python 3.9 using the following commands:

```shell
conda update conda -y
conda create -n plato -y python=3.9
conda activate plato
```

where `plato` is the preferred name of your new environment.

The next step is to install the required Python packages. PyTorch should be installed following the advice of its [getting started website](https://pytorch.org/get-started/locally/). The typical command on Linux with CUDA GPU support, for example, would be:

```shell
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
```

On macOS (without GPU support), the recommended command would be:

```shell
pip install torch==1.13.1 torchvision==0.14.1
```
Additionally, install the scikit-learn package:

```shell
pip install scikit-learn
```
## Installing Plato

Navigate to the Plato directory and install the latest version from GitHub as a local pip package:

```shell
cd ../..
pip install .
```
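To verify that the local installation succeeded, a quick import check along the following lines may help (a sketch; it only assumes the package is installed under the name `plato`):

```python
# Optional sanity check: confirm that the locally installed package is importable.
import plato

print(plato.__file__)  # should point into your conda environment's site-packages
```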

## Running experiments in the `examples/detector` folder
Navigate to the examples/detector folder to start running experiments:
```shell
cd examples/detector
```

## Setting up the configuration file
A variety of configuration files are provided for different experiments. Below are examples for reproducing key experiments from the paper:

### Example 1: Section 5.2 - Running AsyncFilter on CIFAR-10
#### Download the dataset

```shell
python detector.py -c asyncfilter_cifar_2.yml -d
```

#### Run the experiments
```shell
python detector.py -c asyncfilter_cifar_2.yml
```
### Example 2: Section 5.3 - Running AsyncFilter Under LIE Attack on CINIC-10 (Concentration Factor: 0.01)
#### Download the dataset

```shell
python detector.py -c asyncfilter_cinic_3.yml -d
```
#### Run the experiments
```shell
python detector.py -c asyncfilter_cinic_3.yml
```
### Example 3: Section 5.6 - Running AsyncFilter Under LIE Attack on FashionMNIST (Server Staleness Limit: 10)

#### Download the dataset

```shell
python detector.py -c asyncfilter_fashionmnist_6.yml -d
```
#### Run the experiments
```shell
python detector.py -c asyncfilter_fashionmnist_6.yml
```

### Customizing Experiments
For further experimentation, you can modify the provided configuration files to suit your requirements, or use them as-is to reproduce the results reported in the paper.
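As a sketch of how one might inspect a configuration before launching a run (this helper is not part of the repository and assumes PyYAML is installed via `pip install pyyaml`):

```python
# Hypothetical helper: load a configuration file and print the fields most often
# changed between experiments, so edits can be verified before a run is launched.
import yaml

with open("asyncfilter_cifar_2.yml", "r", encoding="utf-8") as config_file:
    config = yaml.safe_load(config_file)

print(config["clients"]["attack_type"])     # e.g., LIE
print(config["server"]["staleness_bound"])  # e.g., 10
print(config["data"]["concentration"])      # non-IID concentration factor
```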
12 changes: 6 additions & 6 deletions examples/detector/aggregations.py
@@ -77,12 +77,12 @@ def bulyan(updates, baseline_weights, weights_attacked):
"""Aggregate weight updates from the clients using bulyan."""

total_clients = Config().clients.total_clients
num_attackers = len(Config().clients.attacker_ids) # ?
num_attackers = len(Config().clients.attacker_ids)

remaining_weights = flatten_weights(weights_attacked)
bulyan_cluster = []

# Search for bulyan cluster based on distance
# Search for bulyan cluster based on distances
while (len(bulyan_cluster) < (total_clients - 2 * num_attackers)) and (
len(bulyan_cluster) < (total_clients - 2 - num_attackers)
):
@@ -104,7 +104,7 @@ def bulyan(updates, baseline_weights, weights_attacked):
: len(remaining_weights) - 2 - num_attackers
]

# Add candidate into bulyan cluster
# Add candidates into bulyan cluster
bulyan_cluster = (
remaining_weights[indices[0]][None, :]
if not len(bulyan_cluster)
@@ -149,7 +149,7 @@ def krum(updates, baseline_weights, weights_attacked):

remaining_weights = flatten_weights(weights_attacked)

num_attackers_selected = 2 # ?
num_attackers_selected = 2

distances = []
for weight in remaining_weights:
@@ -339,7 +339,7 @@ def afa(updates, baseline_weights, weights_attacked):
bad_set.append(remove_id)

else:
for counter, weight in enumerate(flattened_weights): # we for loop this
for counter, weight in enumerate(flattened_weights):
if cos_sims[counter] > (model_median + epsilon * model_std):
remove_set.append(1)
remove_id = (
@@ -353,7 +353,7 @@ def afa(updates, baseline_weights, weights_attacked):
temp_tensor2 = flattened_weights_copy[delete_id + 1 :]
flattened_weights_copy = torch.cat(
(temp_tensor1, temp_tensor2), dim=0
) # but we changes it in the loop, maybe we should get a copy
)
bad_set.append(remove_id)

epsilon += delta_ep
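For context, the Krum rule touched in the hunks above can be summarized with a short sketch. This is an illustrative reimplementation, not the code in `aggregations.py`; `flattened_weights` is assumed to be a `(num_clients, num_params)` tensor and `num_attackers` the assumed number of malicious clients.

```python
# Illustrative sketch of the Krum selection rule (not the repository's implementation):
# each client update is scored by the sum of squared distances to its closest
# neighbours, and the update with the lowest score is selected for aggregation.
import torch


def krum_select(flattened_weights, num_attackers):
    num_clients = flattened_weights.shape[0]
    pairwise = torch.cdist(flattened_weights, flattened_weights) ** 2
    num_neighbours = max(num_clients - num_attackers - 2, 1)
    sorted_dists, _ = pairwise.sort(dim=1)
    # Column 0 is the distance to itself (zero), so skip it.
    scores = sorted_dists[:, 1 : num_neighbours + 1].sum(dim=1)
    return flattened_weights[scores.argmin()]
```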
88 changes: 88 additions & 0 deletions examples/detector/asyncfilter_cifar_2.yml
@@ -0,0 +1,88 @@
clients:
# Type
type: simple

# The total number of clients
total_clients: 100

# The number of clients selected in each round
per_round: 100

# Should the clients compute test accuracy locally?
do_test: true
random_seed: 1
speed_simulation: true

# The distribution of client speeds
simulation_distribution:
distribution: zipf # zipf is used.
s: 1.2
sleep_simulation: true

# If we are simulating client training times, what is the average training time?
avg_training_time: 10
attack_type: LIE
lambada_value: 2
attacker_ids: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 #,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50

server:
address: 127.0.0.1
port: 5002
random_seed: 1
synchronous: false
simulate_wall_time: true
minimum_clients_aggregated: 40
staleness_bound: 10
checkpoint_path: results/CIFAR/test/checkpoint
model_path: results/CIFAR/test/model


data:
# The training and testing dataset
datasource: CIFAR10

# Number of samples in each partition
partition_size: 10000

# IID or non-IID?
sampler: noniid
concentration: 0.1
random_seed: 1

trainer:
# The type of the trainer
type: basic

# The maximum number of training rounds
rounds: 100

# The maximum number of clients running concurrently
max_concurrency: 2

# The target accuracy
target_accuracy: 0.88

# The machine learning model
model_name: vgg_16

# Number of epochs for local training in each communication round
epochs: 5
batch_size: 128
optimizer: Adam

algorithm:
# Aggregation algorithm
type: fedavg

parameters:
model:
num_classes: 10

optimizer:
lr: 0.01
weight_decay: 0.0
results:
# Write the following parameter(s) into a CSV
types: round, accuracy, elapsed_time, comm_time, round_time
result_path: /data/ykang/plato/results/asyncfilter/cifar
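The `simulation_distribution` entry in this configuration draws client speeds from a Zipf distribution with `s: 1.2`. A short sketch of what such a skew looks like (illustrative only, not Plato's simulator):

```python
# Illustrative only: sample per-client slowdown factors from a Zipf distribution
# with exponent 1.2, mirroring clients.simulation_distribution in the config above.
import numpy as np

rng = np.random.default_rng(1)
slowdowns = rng.zipf(1.2, size=100)      # one factor per simulated client
print(slowdowns.min(), slowdowns.max())  # a handful of clients are far slower than the rest
```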

96 changes: 96 additions & 0 deletions examples/detector/asyncfilter_cinic_2.yml
@@ -0,0 +1,96 @@
clients:
# Type
type: simple

# The total number of clients
total_clients: 100

# The number of clients selected in each round
per_round: 100

# Should the clients compute test accuracy locally?
do_test: true
random_seed: 1

# The distribution of client speeds
simulation_distribution:
distribution: zipf # zipf is used.
s: 1.2
sleep_simulation: true
speed_simulation: true

# If we are simulating client training times, what is the average training time?
avg_training_time: 10
attack_type: LIE
lambada_value: 2
attacker_ids: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 #,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50


server:
address: 127.0.0.1
port: 6332
random_seed: 1
synchronous: false
simulate_wall_time: true
minimum_clients_aggregated: 40
detector_type: AsyncFilter
staleness_bound: 20
checkpoint_path: results/CIFAR/test/checkpoint
model_path: results/CIFAR/test/model


data:
# The training and testing dataset
datasource: CINIC10

# Where the dataset is located
data_path: data/CINIC-10

# The URL from which the dataset can be downloaded
download_url: http://iqua.ece.toronto.edu/baochun/CINIC-10.tar.gz

# Number of samples in each partition
partition_size: 10000

# IID or non-IID?
sampler: noniid
concentration: 0.1
random_seed: 1

trainer:
# The type of the trainer
type: basic

# The maximum number of training rounds
rounds: 100

# The maximum number of clients running concurrently
max_concurrency: 4

# The target accuracy
target_accuracy: 0.88

# The machine learning model
model_name: vgg_16

# Number of epochs for local training in each communication round
epochs: 5
batch_size: 128
optimizer: SGD

algorithm:
# Aggregation algorithm
type: fedavg

parameters:
model:
num_classes: 10

optimizer:
lr: 0.01
momentum: 0.5
weight_decay: 0.0
results:
# Write the following parameter(s) into a CSV
types: round, accuracy, elapsed_time, comm_time, round_time
result_path: /data/ykang/plato/results/asyncfilter/cinic
