Docs/SK-943 | Updated README for FedSimSiam example #660

Merged 3 commits on Jul 16, 2024
112 changes: 29 additions & 83 deletions examples/FedSimSiam/README.rst
@@ -1,18 +1,23 @@
**Note: If you are new to FEDn, we recommend that you start with the MNIST-Pytorch example instead: https://github.com/scaleoutsystems/fedn/tree/master/examples/mnist-pytorch**

FEDn Project: FedSimSiam on CIFAR-10
------------------------------------

This is an example FEDn Project that runs the federated self-supervised learning algorithm FedSimSiam on
the CIFAR-10 dataset. This is a standard example often used for benchmarking. To be able to run this example, you
need to have GPU access.
This is an example FEDn Project that trains the federated self-supervised learning algorithm FedSimSiam on
the CIFAR-10 dataset. CIFAR-10 is a popular benchmark dataset that contains images of 10 different classes, such as cars, dogs, and ships.
In short, FedSimSiam trains an encoder to learn useful feature embeddings for images, without the use of labels.
After the self-supervised training stage, the resulting encoder can be downloaded and trained for a downstream task (e.g., image classification) via supervised learning on labeled data.
To learn more about self-supervised learning and FedSimSiam, have a look at our blog post: https://www.scaleoutsystems.com/post/federated-self-supervised-learning-and-autonomous-driving
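
As a rough illustration of what "learning embeddings without labels" means, the following is a minimal NumPy sketch of the symmetric negative cosine similarity objective commonly associated with SimSiam. It is not taken from this example's client code; the function names and the toy data are hypothetical, and a real implementation would use a deep learning framework with a stop-gradient on the encoder outputs.

```python
import numpy as np


def negative_cosine_similarity(p, z):
    """Mean negative cosine similarity between matching rows of p and z."""
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    return -np.mean(np.sum(p * z, axis=1))


def simsiam_loss(p1, z1, p2, z2):
    """Symmetric loss for two augmented views of the same images.

    p1, p2: predictor outputs; z1, z2: encoder outputs.
    In a real training loop z1 and z2 are treated as constants
    (stop-gradient); plain NumPy cannot express that, so this
    function only computes the loss value.
    """
    return 0.5 * negative_cosine_similarity(p1, z2) + 0.5 * negative_cosine_similarity(p2, z1)


# Toy data (assumption): 4 "images" with 8-dimensional embeddings.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(4, 8))
view_b = view_a + 0.01 * rng.normal(size=(4, 8))  # nearly identical second view
random_view = rng.normal(size=(4, 8))             # unrelated embeddings

aligned = simsiam_loss(view_a, view_a, view_b, view_b)
unrelated = simsiam_loss(view_a, view_a, random_view, random_view)
print(f"aligned views: {aligned:.3f}, unrelated views: {unrelated:.3f}")
```

The loss approaches -1 when the two views of each image map to nearly the same direction in embedding space, which is the behavior the training stage is meant to encourage.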

To run the example, follow the steps below. For a more detailed explanation, follow the Quickstart Tutorial: https://fedn.readthedocs.io/en/stable/quickstart.html

**Note: We recommend that all new users start by following the Quickstart Tutorial: https://fedn.readthedocs.io/en/stable/quickstart.html**
**Note: To be able to run this example, you need to have GPU access.**

Prerequisites
-------------

- `Python 3.8, 3.9, 3.10 or 3.11 <https://www.python.org/downloads>`__
- `A FEDn Studio account <https://fedn.scaleoutsystems.com/signup>`__
- Change the dependencies in the 'client/python_env.yaml' file to match your CUDA version.
- `Python >=3.8, <=3.12 <https://www.python.org/downloads>`__
- `A project in FEDn Studio <https://fedn.scaleoutsystems.com/signup>`__

Creating the compute package and seed model
-------------------------------------------
@@ -36,90 +41,31 @@ Create the compute package:

   fedn package create --path client

This should create a file 'package.tgz' in the project folder.
This creates a file 'package.tgz' in the project folder.

Next, generate a seed model (the first model in a global model trail):
Next, generate the seed model:

.. code-block::

   fedn run build --path client

This will create a seed model called 'seed.npz' in the root of the project. This step takes a few minutes, depending on your hardware and internet connection, because it builds a virtualenv.

Using FEDn Studio
-----------------

Follow the instructions to register for FEDn Studio and start a project (https://fedn.readthedocs.io/en/stable/studio.html).

In your Studio project:

- Go to the 'Sessions' menu, click on 'New session', and upload the compute package (package.tgz) and seed model (seed.npz).
- In the 'Clients' menu, click on 'Connect client' and download the client configuration file (client.yaml).
- Save the client configuration file to the FedSimSiam example directory (fedn/examples/FedSimSiam).

To connect a client, run the following command in your terminal:

.. code-block::

   fedn client start -in client.yaml --secure=True --force-ssl


Running the example
-------------------
This will create a model file 'seed.npz' in the root of the project. This step takes a few minutes, depending on your hardware and internet connection, because it builds a virtualenv.
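
To check what the build step produced, you can list the arrays stored in the file. The sketch below assumes that 'seed.npz' is a standard NumPy ``.npz`` archive, as the file extension suggests; the helper name ``summarize_npz`` is hypothetical, and a temporary stand-in file is used so that the snippet runs without building the project first.

```python
import os
import tempfile

import numpy as np


def summarize_npz(path):
    """Return {array_name: shape} for every array stored in a .npz file."""
    with np.load(path) as archive:
        return {name: archive[name].shape for name in archive.files}


# Stand-in for 'seed.npz' (assumption: the real file is a NumPy archive).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_seed.npz")
    # Positional arrays are stored under the names arr_0, arr_1, ...
    np.savez(demo_path, np.zeros((3, 3)), np.zeros(3))
    summary = summarize_npz(demo_path)

for name, shape in summary.items():
    print(name, shape)
```

To inspect the real file, call ``summarize_npz("seed.npz")`` from the project root after the build step has finished.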

After everything is set up, go to 'Sessions', click on 'New Session', and then click on 'Start run' to execute the example. You can monitor the training progress under 'Events' and 'Models'.
Monitoring uses a kNN classifier that is fitted on the feature embeddings of the training images produced by
FedSimSiam's encoder and evaluated on the feature embeddings of the test images. This process is repeated after each training round.
Running the project on FEDn Studio
----------------------------------

This is a common method to track FedSimSiam's training progress, as FedSimSiam aims to minimize the distance between the embeddings of similar images.
A high accuracy implies that the feature embeddings for images within the same class are indeed close to each other in the
embedding space, i.e., FedSimSiam learned useful feature embeddings.
To learn how to set up your FEDn Studio project and connect clients, follow the Quickstart Tutorial: https://fedn.readthedocs.io/en/stable/quickstart.html


Running FEDn in local development mode:
---------------------------------------

Follow the steps above to install FEDn, generate 'package.tgz' and 'seed.npz'.

Start a pseudo-distributed FEDn network using docker-compose:

.. code-block::

   docker compose \
    -f ../../docker-compose.yaml \
    -f docker-compose.override.yaml \
    up

This starts up local services for MongoDB, Minio, the API Server, one Combiner and two clients.
You can verify the deployment using these urls:

- API Server: http://localhost:8092/get_controller_status
- Minio: http://localhost:9000
- Mongo Express: http://localhost:8081

Upload the package and seed model to FEDn controller using the APIClient:

.. code-block::

   from fedn import APIClient
   client = APIClient(host="localhost", port=8092)
   client.set_active_package("package.tgz", helper="numpyhelper")
   client.set_active_model("seed.npz")


You can now start a training session with 100 rounds using the API client:

.. code-block::

   client.start_session(rounds=100)

Clean up
--------

You can clean up by running

.. code-block::
When running the example in FEDn Studio, you can follow the training progress of FedSimSiam under 'Models'.
After each training round, a kNN classifier is fitted to the feature embeddings of the training images obtained
by FedSimSiam's encoder and evaluated on the feature embeddings of the test images.
This is a common method to track FedSimSiam's training progress,
as FedSimSiam aims to minimize the distance between the embeddings of similar images.
If training progresses as intended, accuracy increases as the feature embeddings for
images within the same class move closer to each other in the embedding space.
The figure below shows that the kNN accuracy increases over the training rounds,
indicating that the training of FedSimSiam is proceeding as intended.

   docker-compose \
    -f ../../docker-compose.yaml \
    -f docker-compose.override.yaml \
    down -v
.. image:: figs/fedsimsiam_monitoring.png
   :width: 50%
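
The kNN monitoring described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name ``knn_accuracy``, the choice of cosine similarity, ``k=5``, and the toy two-class data are all hypothetical and are not taken from the example's validation code.

```python
import numpy as np


def knn_accuracy(train_emb, train_labels, test_emb, test_labels, k=5):
    """Classify each test embedding by majority vote among its k nearest
    training embeddings (cosine similarity) and return the accuracy."""
    # L2-normalize so that a dot product equals cosine similarity.
    train = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    test = test_emb / np.linalg.norm(test_emb, axis=1, keepdims=True)

    sims = test @ train.T                        # shape: (n_test, n_train)
    nearest = np.argsort(-sims, axis=1)[:, :k]   # indices of the k most similar
    neighbor_labels = train_labels[nearest]      # shape: (n_test, k)

    # One-hot encode the neighbor labels and sum them to count votes per class.
    n_classes = int(train_labels.max()) + 1
    votes = np.eye(n_classes)[neighbor_labels].sum(axis=1)
    predictions = votes.argmax(axis=1)
    return float(np.mean(predictions == test_labels))


# Toy embeddings (assumption): two well-separated classes in 2D.
rng = np.random.default_rng(0)
centers = np.array([[5.0, 0.0], [0.0, 5.0]])
train_labels = np.repeat([0, 1], 20)
test_labels = np.repeat([0, 1], 5)
train_emb = centers[train_labels] + 0.1 * rng.normal(size=(40, 2))
test_emb = centers[test_labels] + 0.1 * rng.normal(size=(10, 2))

accuracy = knn_accuracy(train_emb, train_labels, test_emb, test_labels, k=5)
print(f"kNN accuracy: {accuracy:.2f}")
```

In the monitoring setting, ``train_emb`` and ``test_emb`` would be the encoder outputs for the CIFAR-10 training and test images, and the accuracy would be recomputed after each training round. The labels are used only for this evaluation, not for training the encoder.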