diff --git a/README.rst b/README.rst index a13d7463f..7c32fc7bd 100644 --- a/README.rst +++ b/README.rst @@ -9,50 +9,48 @@ .. |pic3| image:: https://readthedocs.org/projects/fedn/badge/?version=latest&style=flat :target: https://fedn.readthedocs.io -FEDn -------- +FEDn: An enterprise-ready federated learning framework +------------------------------------------------------- -FEDn empowers its users to create federated learning applications that seamlessly transition from local proofs-of-concept to secure distributed deployments. +Our goal is to provide a federated learning framework that is secure, scalable, and easy to use. We believe that minimal code change should be needed to progress from early proof-of-concepts to production. This is reflected in our core design principles: -Leverage a flexible pseudo-local sandbox to rapidly transition your existing ML project to a federated setting. Test and scale in real-world scenarios using FEDn Studio - a fully managed, secure deployment of all server-side components (SaaS). +- **Data-scientist friendly.** An ML-framework agnostic design lets data scientists implement use-cases using their framework of choice. A UI and a Python API enable users to manage complex FL experiments and track metrics in real time. -We develop the FEDn framework following these core design principles: +- **Secure by design.** FL clients do not need to open any ingress ports. Industry-standard communication protocols (gRPC) and token-based authentication and RBAC (JWT) provide flexible integration in a range of production environments. -- **Seamless transition from proof-of-concepts to real-world FL**. FEDn has been designed to make the journey from R&D to real-world deployments as smooth as possibe. Develop your federated learning use case in a pseudo-local environment, then deploy it to FEDn Studio (cloud or on-premise) for real-world scenarios. No code change is required to go from development and testing to production. +- **Cloud native.** By following cloud native design principles, we ensure a wide range of deployment options including private cloud and on-premise infrastructure. A reference deployment is available here: https://fedn.scaleoutsystems.com. -- **Designed for scalability and resilience.** FEDn enables model aggregation through multiple aggregation servers sharing the workload. A hierarchical architecture makes the framework well suited borh for cross-silo and cross-device use-cases. FEDn seamlessly recover from failures in all critical components, and manages intermittent client-connections, ensuring robust deployment in production environments. +- **Scalability and resilience.** Multiple aggregation servers (combiners) can share the workload. FEDn seamlessly recovers from failures in all critical components and manages intermittent client connections. -- **Secure by design.** FL clients do not need to open any ingress ports, facilitating distributed deployments across a wide variety of settings. Additionally, FEDn utilizes secure, industry-standard communication protocols and supports token-based authentication and RBAC for FL clients (JWT), providing flexible integration in production environments. - -- **Developer and data scientist friendly.** Extensive event logging and distributed tracing enables developers to monitor experiments in real-time, simplifying troubleshooting and auditing. 
Machine learning metrics can be accessed via both a Python API and visualized in an intuitive UI that helps the data scientists analyze and communicate ML-model training progress. +- **Developer friendly.** Extensive event logging and distributed tracing enable developers to monitor the system in real time, simplifying troubleshooting and auditing. +We provide a fully managed deployment free of charge for testing, academic, and personal use. Sign up for a `FEDn Studio account `__ and take the `Quickstart tutorial `__. Features ========= -Core FL framework (this repository): +Federated learning: - Tiered federated learning architecture enabling massive scalability and resilience. - Support for any ML framework (examples for PyTorch, TensorFlow/Keras and Scikit-learn) - Extendable via a plug-in architecture (aggregators, load balancers, object storage backends, databases etc.) - Built-in federated algorithms (FedAvg, FedAdam, FedYogi, FedAdaGrad, etc.) -- CLI and Python API. +- UI, CLI and Python API. - Implement clients in any language (Python, C++, Kotlin etc.) - No open ports needed client-side. -- Flexible deployment of server-side components using Docker / docker compose. -FEDn Studio - From development to FL in production: +From development to FL in production: - Secure deployment of server-side / control-plane on Kubernetes. -- UI with dashboards for orchestrating experiments and visualizing results +- UI with dashboards for orchestrating FL experiments and for visualizing results - Team features - collaborate with other users in shared project workspaces. - Features for the trusted third party: Manage access to the FL network, FL clients and training progress. - REST API for handling experiments/jobs. - View and export logging and tracing information. - Public cloud, dedicated cloud and on-premise deployment options. -Available clients: +Available client APIs: - Python client (this repository) - C++ client (`FEDn C++ client `__) diff --git a/docs/projects.rst b/docs/projects.rst index 2b86faa62..2cf31f23f 100644 --- a/docs/projects.rst +++ b/docs/projects.rst @@ -4,7 +4,7 @@ Develop your own project ================================================ This guide explains how a FEDn project is structured, and details how to develop your own -projects for your own use-cases. +projects. A FEDn project is a convention for packaging/wrapping machine learning code to be used for federated learning with FEDn. At the core, a project is a directory of files (often a Git repository), containing your machine learning code, FEDn entry points, and a specification @@ -71,11 +71,12 @@ to specify the environment: 1. Provide a ``python_env`` in the ``fedn.yaml`` file. In this case, FEDn will create an isolated virtual environment and install the project dependencies into it before starting up the client. FEDn currently supports Virtualenv environments, with packages on PyPI. 2. Manage the environment manually. Here you have several options, such as managing your own virtualenv, running in a Docker container, etc. Remove the ``python_env`` tag from ``fedn.yaml`` to handle the environment manually. -**Entry Points** +Entry Points +------------- There are up to four Entry Points to be specified. -**Build Entrypoint (build, optional):** +**build (optional):** This entrypoint is intended to be called **once** for building artifacts such as initial seed models. However, it is not limited to artifacts, and can be used for any kind of setup that needs to be done before the client starts up. 
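For orientation, a minimal ``fedn.yaml`` wiring up all four entry points could look like the sketch below. This is an illustrative sketch only; the script names (model.py, data.py, train.py, validate.py) are placeholders in the style of the example projects, not names required by FEDn.

.. code-block::

    python_env: python_env.yaml

    entry_points:
      build:
        command: python model.py
      startup:
        command: python data.py
      train:
        command: python train.py
      validate:
        command: python validate.py

Removing the ``python_env`` line corresponds to option 2 above, i.e. managing the environment manually.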
@@ -85,16 +86,14 @@ To invoke the build entrypoint using the CLI: fedn build -- - -**Startup Entrypoint (startup, optional):** - +**startup (optional):** This entrypoint is called **once**, immediately after the client starts up and the environment has been initialized. It can be used to do runtime configurations of the local execution environment. For example, in the quickstart tutorial example, the startup entrypoint invokes a script that downloads the MNIST dataset and creates a partition to be used by that client. This is a convenience for automating experiments; not all clients will specify such a script. -**Training Entrypoint (train, mandatory):** +**train (mandatory):** This entrypoint is invoked every time the client receives a new model update request. The training entry point must be a single-input single-output (SISO) program. It will be invoked by FEDn as such: @@ -105,7 +104,7 @@ This entrypoint is invoked every time the client recieves a new model update req where 'model_in' is the file containing the current global model to be updated, and 'model_out' is a path to write the new model update to. Download and upload of these files are handled automatically by the FEDn client; the user only specifies how to read and parse the data contained in them (see examples). -**Validation Entrypoint (validate, optional):** +**validate (optional):** The validation entry point works in a similar way to the training entrypoint. It can be used to specify how a client should validate the current global model on local test/validation data. It should read a model update from file, validate it (in any way suitable to the user), and write a **json file** containing validation data: diff --git a/examples/FedSimSiam/README.rst b/examples/FedSimSiam/README.rst index e62537e75..5831fd3ea 100644 --- a/examples/FedSimSiam/README.rst +++ b/examples/FedSimSiam/README.rst @@ -1,18 +1,23 @@ + **Note: If you are new to FEDn, we recommend that you start with the MNIST-Pytorch example instead: https://github.com/scaleoutsystems/fedn/tree/master/examples/mnist-pytorch** + FEDn Project: FedSimSiam on CIFAR-10 ------------------------------------ -This is an example FEDn Project that runs the federated self-supervised learning algorithm FedSimSiam on -the CIFAR-10 dataset. This is a standard example often used for benchmarking. To be able to run this example, you -need to have GPU access. +This is an example FEDn Project that trains the federated self-supervised learning algorithm FedSimSiam on +the CIFAR-10 dataset. CIFAR-10 is a popular benchmark dataset that contains images of 10 different classes, such as cars, dogs, and ships. +In short, FedSimSiam trains an encoder to learn useful feature embeddings for images, without the use of labels. +After the self-supervised training stage, the resulting encoder can be downloaded and trained for a downstream task (e.g., image classification) via supervised learning on labeled data. +To learn more about self-supervised learning and FedSimSiam, have a look at our blog post: https://www.scaleoutsystems.com/post/federated-self-supervised-learning-and-autonomous-driving + +To run the example, follow the steps below. 
For a more detailed explanation, follow the Quickstart Tutorial: https://fedn.readthedocs.io/en/stable/quickstart.html - **Note: We recommend all new users to start by following the Quickstart Tutorial: https://fedn.readthedocs.io/en/stable/quickstart.html** +**Note: To be able to run this example, you need to have GPU access.** Prerequisites ------------- -- `Python 3.8, 3.9, 3.10 or 3.11 `__ -- `A FEDn Studio account `__ -- Change the dependencies in the 'client/python_env.yaml' file to match your cuda version. +- `Python >=3.8, <=3.12 `__ +- `A project in FEDn Studio `__ Creating the compute package and seed model ------------------------------------------- @@ -36,41 +41,31 @@ Create the compute package: fedn package create --path client -This should create a file 'package.tgz' in the project folder. +This creates a file 'package.tgz' in the project folder. -Next, generate a seed model (the first model in a global model trail): +Next, generate the seed model: .. code-block:: fedn run build --path client -This will create a seed model called 'seed.npz' in the root of the project. This step will take a few minutes, depending on hardware and internet connection (builds a virtualenv). - -Using FEDn Studio ------------------ - -Follow the instructions to register for FEDN Studio and start a project (https://fedn.readthedocs.io/en/stable/studio.html). - -In your Studio project: - -- Go to the 'Sessions' menu, click on 'New session', and upload the compute package (package.tgz) and seed model (seed.npz). -- In the 'Clients' menu, click on 'Connect client' and download the client configuration file (client.yaml) -- Save the client configuration file to the FedSimSiam example directory (fedn/examples/FedSimSiam) - -To connect a client, run the following command in your terminal: - -.. code-block:: +This will create a model file 'seed.npz' in the root of the project. This step will take a few minutes, depending on hardware and internet connection (builds a virtualenv). - fedn client start -in client.yaml --secure=True --force-ssl +Running the project on FEDn Studio +---------------------------------- +To learn how to set up your FEDn Studio project and connect clients, take the quickstart tutorial: https://fedn.readthedocs.io/en/stable/quickstart.html. -Running the example -------------------- -After everything is set up, go to 'Sessions' and click on 'New Session'. Click on 'Start run' and the example will execute. You can follow the training progress on 'Events' and 'Models', where you -can monitor the training progress. The monitoring is done using a kNN classifier that is fitted on the feature embeddings of the training images that are obtained by -FedSimSiam's encoder, and evaluated on the feature embeddings of the test images. This process is repeated after each training round. +When running the example in FEDn Studio, you can follow the training progress of FedSimSiam under 'Models'. +After each training round, a kNN classifier is fitted to the feature embeddings of the training images obtained +by FedSimSiam's encoder and evaluated on the feature embeddings of the test images. +This is a common method to track FedSimSiam's training progress, +as FedSimSiam aims to minimize the distance between the embeddings of similar images. +If training progresses as intended, accuracy increases as the feature embeddings for +images within the same class are getting closer to each other in the embedding space. 
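As a rough illustration of the monitoring step described above, a kNN probe over frozen embeddings could be implemented as in the sketch below. This is an assumption-laden sketch using scikit-learn, not the example's actual monitoring code, and the value of k is arbitrary here.

.. code-block::

    from sklearn.neighbors import KNeighborsClassifier

    def knn_monitor(train_embeddings, train_labels, test_embeddings, test_labels, k=200):
        # Fit a kNN classifier on the encoder's embeddings of the (labeled) training images ...
        knn = KNeighborsClassifier(n_neighbors=k)
        knn.fit(train_embeddings, train_labels)
        # ... and report classification accuracy on the embeddings of the test images.
        return knn.score(test_embeddings, test_labels)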
+In the figure below, we can see that the kNN accuracy increases over the training rounds, +indicating that the training of FedSimSiam is proceeding as intended. -This is a common method to track FedSimSiam's training progress, as FedSimSiam aims to minimize the distance between the embeddings of similar images. -A high accuracy implies that the feature embeddings for images within the same class are indeed close to each other in the -embedding space, i.e., FedSimSiam learned useful feature embeddings. \ No newline at end of file +.. image:: figs/fedsimsiam_monitoring.png + :width: 50% diff --git a/examples/FedSimSiam/figs/fedsimsiam_monitoring.png b/examples/FedSimSiam/figs/fedsimsiam_monitoring.png new file mode 100644 index 000000000..236ef29c1 Binary files /dev/null and b/examples/FedSimSiam/figs/fedsimsiam_monitoring.png differ diff --git a/examples/huggingface/README.rst b/examples/huggingface/README.rst index 3d5653b7b..eaaad3254 100644 --- a/examples/huggingface/README.rst +++ b/examples/huggingface/README.rst @@ -1,3 +1,6 @@ + + **Note: If you are new to FEDn, we recommend that you start with the MNIST-Pytorch example instead: https://github.com/scaleoutsystems/fedn/tree/master/examples/mnist-pytorch** + Hugging Face Transformer Example -------------------------------- @@ -11,20 +14,21 @@ Federated learning is a privacy preserving machine learning technique that enabl Fine-tuning large language models (LLMs) on various data sources enhances both accuracy and generalizability. In this example, the Enron email spam dataset is split among two clients. The BERT-tiny model is fine-tuned on the client data using federated learning to predict whether an email is spam or not. -Execute the following steps to run the example: -Prerequisites -------------- +In FEDn Studio, you can visualize the training progress by plotting test loss and accuracy, as shown in the plot below. +After running the example for only a few rounds in FEDn Studio, the BERT-tiny model, fine-tuned via federated learning, +is able to detect spam emails on the test dataset with high accuracy. -Using FEDn Studio: +.. image:: figs/hf_figure.png + :width: 50% -- `Python 3.8, 3.9, 3.10 or 3.11 `__ -- `A FEDn Studio account `__ +To run the example, follow the steps below. For a more detailed explanation, follow the Quickstart Tutorial: https://fedn.readthedocs.io/en/stable/quickstart.html -If using pseudo-distributed mode with docker-compose: +Prerequisites +------------- -- `Docker `__ -- `Docker Compose `__ +- `Python >=3.8, <=3.12 `__ +- `A project in FEDn Studio `__ Creating the compute package and seed model ------------------------------------------- @@ -48,100 +52,17 @@ Create the compute package: fedn package create --path client -This should create a file 'package.tgz' in the project folder. +This creates a file 'package.tgz' in the project folder. -Next, generate a seed model (the first model in a global model trail): +Next, generate the seed model: .. code-block:: fedn run build --path client -This will create a seed model called 'seed.npz' in the root of the project. This step will take a few minutes, depending on hardware and internet connection (builds a virtualenv). - - - -Using FEDn Studio (recommended) ------------------------------- - -Follow the instructions to register for FEDN Studio and start a project (https://fedn.readthedocs.io/en/stable/studio.html). 
- -In your Studio project: - -- Go to the 'Sessions' menu, click on 'New session', and upload the compute package (package.tgz) and seed model (seed.npz). -- In the 'Clients' menu, click on 'Connect client' and download the client configuration file (client.yaml) -- Save the client configuration file to the huggingface example directory (fedn/examples/huggingface) - -To connect a client, run the following command in your terminal: - -.. code-block:: - - fedn client start -in client.yaml --secure=True --force-ssl - - -Alternatively, if you prefer to use Docker, run the following: - -.. code-block:: - - docker run \ - -v $PWD/client.yaml:/app/client.yaml \ - -e CLIENT_NUMBER=0 \ - -e FEDN_PACKAGE_EXTRACT_DIR=package \ - ghcr.io/scaleoutsystems/fedn/fedn:0.9.0 client start -in client.yaml --secure=True --force-ssl - - -Running the example -------------------- - -After everything is set up, go to 'Sessions' and click on 'New Session'. Click on 'Start run' and the example -will execute. You can follow the training progress on 'Events' and 'Models', where you can view the calculated metrics. - +This will create a model file 'seed.npz' in the root of the project. This step will take a few minutes, depending on hardware and internet connection (builds a virtualenv). +Running the project on FEDn +---------------------------- -Running FEDn in local development mode: ---------------------------------------- - -Create the compute package and seed model as explained above. Then run the following command: - - -.. code-block:: - - docker-compose \ - -f ../../docker-compose.yaml \ - -f docker-compose.override.yaml \ - up - - -This starts up local services for MongoDB, Minio, the API Server, one Combiner and two clients. You can verify the deployment using these urls: - -- API Server: http://localhost:8092/get_controller_status -- Minio: http://localhost:9000 -- Mongo Express: http://localhost:8081 - - -Upload the package and seed model to FEDn controller using the APIClient: - -.. code-block:: - - from fedn import APIClient - client = APIClient(host="localhost", port=8092) - client.set_active_package("package.tgz", helper="numpyhelper") - client.set_active_model("seed.npz") - - -You can now start a training session with 5 rounds (default) using the API client: - -.. code-block:: - - client.start_session() - -Clean up --------- - -You can clean up by running - -.. code-block:: - - docker-compose \ - -f ../../docker-compose.yaml \ - -f docker-compose.override.yaml \ - down -v +To learn how to set up your FEDn Studio project and connect clients, take the quickstart tutorial: https://fedn.readthedocs.io/en/stable/quickstart.html. 
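The changes to ``fedn/cli/run_cmd.py`` below add ``validate``, ``train`` and ``startup`` subcommands to the ``fedn run`` group, alongside the existing ``build`` command. Hypothetical usage (the model file names are placeholders, not outputs mandated by the CLI):

.. code-block::

    fedn run startup --path client
    fedn run train --path client --input seed.npz --output model_update.npz
    fedn run validate --path client --input model_update.npz --output validation.json

Each subcommand resolves ``fedn.yaml`` in the given path, checks that the corresponding entry point is defined, dispatches it in the project's Python environment, and then removes the virtualenv it created.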
diff --git a/examples/huggingface/figs/hf_figure.png b/examples/huggingface/figs/hf_figure.png new file mode 100644 index 000000000..896f93d98 Binary files /dev/null and b/examples/huggingface/figs/hf_figure.png differ diff --git a/fedn/cli/run_cmd.py b/fedn/cli/run_cmd.py index 123f17320..0aa069046 100644 --- a/fedn/cli/run_cmd.py +++ b/fedn/cli/run_cmd.py @@ -4,7 +4,6 @@ import click import yaml - from fedn.common.exceptions import InvalidClientConfig from fedn.common.log_config import logger from fedn.network.clients.client import Client @@ -44,7 +43,100 @@ def run_cmd(ctx): """:param ctx: """ pass +@run_cmd.command("validate") +@click.option("-p", "--path", required=True, help="Path to package directory containing fedn.yaml") +@click.option("-i", "--input", required=True, help="Path to input model") +@click.option("-o", "--output", required=True, help="Path to write the output JSON containing validation metrics") +@click.pass_context +def validate_cmd(ctx, path, input, output): + """Execute 'validate' entrypoint in fedn.yaml. + + :param ctx: + :param path: Path to folder containing fedn.yaml + :type path: str + """ + path = os.path.abspath(path) + yaml_file = os.path.join(path, "fedn.yaml") + if not os.path.exists(yaml_file): + logger.error(f"Could not find fedn.yaml in {path}") + exit(-1) + config = _read_yaml_file(yaml_file) + # Check that validate is defined in fedn.yaml under entry_points + if "validate" not in config["entry_points"]: + logger.error("No validate command defined in fedn.yaml") + exit(-1) + + dispatcher = Dispatcher(config, path) + _ = dispatcher._get_or_create_python_env() + dispatcher.run_cmd("validate {} {}".format(input, output)) + + # delete the virtualenv + if dispatcher.python_env_path: + logger.info(f"Removing virtualenv {dispatcher.python_env_path}") + shutil.rmtree(dispatcher.python_env_path) +@run_cmd.command("train") +@click.option("-p", "--path", required=True, help="Path to package directory containing fedn.yaml") +@click.option("-i", "--input", required=True, help="Path to input model parameters") +@click.option("-o", "--output", required=True, help="Path to write the updated model parameters") +@click.pass_context +def train_cmd(ctx, path, input, output): + """Execute 'train' entrypoint in fedn.yaml. + + :param ctx: + :param path: Path to folder containing fedn.yaml + :type path: str + """ + path = os.path.abspath(path) + yaml_file = os.path.join(path, "fedn.yaml") + if not os.path.exists(yaml_file): + logger.error(f"Could not find fedn.yaml in {path}") + exit(-1) + + config = _read_yaml_file(yaml_file) + # Check that train is defined in fedn.yaml under entry_points + if "train" not in config["entry_points"]: + logger.error("No train command defined in fedn.yaml") + exit(-1) + + dispatcher = Dispatcher(config, path) + _ = dispatcher._get_or_create_python_env() + dispatcher.run_cmd("train {} {}".format(input, output)) + + # delete the virtualenv + if dispatcher.python_env_path: + logger.info(f"Removing virtualenv {dispatcher.python_env_path}") + shutil.rmtree(dispatcher.python_env_path) +@run_cmd.command("startup") +@click.option("-p", "--path", required=True, help="Path to package directory containing fedn.yaml") +@click.pass_context +def startup_cmd(ctx, path): + """Execute 'startup' entrypoint in fedn.yaml. 
+ + :param ctx: + :param path: Path to folder containing fedn.yaml + :type path: str + """ + path = os.path.abspath(path) + yaml_file = os.path.join(path, "fedn.yaml") + if not os.path.exists(yaml_file): + logger.error(f"Could not find fedn.yaml in {path}") + exit(-1) + + config = _read_yaml_file(yaml_file) + # Check that startup is defined in fedn.yaml under entry_points + if "startup" not in config["entry_points"]: + logger.error("No startup command defined in fedn.yaml") + exit(-1) + + dispatcher = Dispatcher(config, path) + _ = dispatcher._get_or_create_python_env() + dispatcher.run_cmd("startup") + + # delete the virtualenv + if dispatcher.python_env_path: + logger.info(f"Removing virtualenv {dispatcher.python_env_path}") + shutil.rmtree(dispatcher.python_env_path) @run_cmd.command("build") @click.option("-p", "--path", required=True, help="Path to package directory containing fedn.yaml")