A curated list of system-level optimization approaches for synchronous federated learning. This repository serves as a complement to the survey below.
```bibtex
@article{jiang2022towards,
  author={Jiang, Zhifeng and Wang, Wei and Li, Bo and Yang, Qiang},
  journal={IEEE Transactions on Big Data},
  title={Towards Efficient Synchronous Federated Training: A Survey on System Optimization Strategies},
  year={2023},
  volume={9},
  number={2},
  pages={437-454},
  doi={10.1109/TBDATA.2022.3177222}
}
```
If you find this repository helpful, please consider citing the survey above.
Search for keywords such as a conference name (e.g., OSDI), a target phase (e.g., Client Selection), or a performance metric (e.g., Communication Cost) on this page to quickly locate related papers.
Recent Optimization Approaches:
- Optimizing the Selection Phase: At the beginning of each round, the server waits for a sufficient number of clients with eligible status (i.e., currently charging and connected to an unmetered network) to check in. The server then selects a subset of them based on certain strategies (e.g., randomly or selectively) for participation, and notifies the others to reconnect later.
- Optimizing the Configuration Phase: The server next sends the global model status and configuration profiles (e.g., the number of local epochs or the reporting deadline) to each of the selected clients. Based on the instructed configuration, the clients perform local model training independently with their private data.
- Optimizing the Reporting Phase: The server then waits for the participating clients to report local updates until the predefined deadline is reached. The current round is aborted if not enough clients report in time. Otherwise, the server aggregates the received local updates, uses the aggregate to update the global model status, and concludes the round.
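The three-phase round structure above can be sketched in a few lines. This is a minimal, illustrative simulation, not the API of any system surveyed here: the client representation (a dict with `charging`/`unmetered` flags and a `train` callable returning an update plus its elapsed time) and the plain-averaging aggregation rule are assumptions made for the sketch.

```python
import random

def run_round(server_weights, clients, target_fraction=0.1,
              min_reports=2, deadline=30.0):
    """One synchronous FL round: selection, configuration, reporting."""
    # Selection: admit only clients with eligible status (charging and on
    # an unmetered network), then sample a subset for participation.
    eligible = [c for c in clients if c["charging"] and c["unmetered"]]
    k = min(len(eligible), max(min_reports, int(target_fraction * len(eligible))))
    selected = random.sample(eligible, k)

    # Configuration: ship the global model plus a profile (local epochs,
    # reporting deadline); each client trains on its private data.
    reports = []
    for client in selected:
        update, elapsed = client["train"](server_weights, local_epochs=1)
        if elapsed <= deadline:       # Reporting: count only on-time updates
            reports.append(update)

    # Abort the round if not enough clients reported before the deadline.
    if len(reports) < min_reports:
        return server_weights, False

    # Otherwise aggregate the received updates into the new global model.
    new_weights = [sum(ws) / len(reports) for ws in zip(*reports)]
    return new_weights, True
```

A real deployment would also notify unselected clients to reconnect later and would typically weight the aggregate by local dataset sizes (as FedAvg does); both are omitted here for brevity.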

Client Selection:

Year | Title | Category | Venue | Paper Link |
---|---|---|---|---|
2021 | AutoFL: Enabling heterogeneity-aware energy efficient federated learning | Co-design (Fine-grained) | ACM MICRO | Link |
2021 | Oort: Efficient federated learning via guided participant selection | Co-design (Fine-grained) | USENIX OSDI | Link |
2021 | Client selection for federated learning with non-IID data in mobile edge computing | Partial optimization (Statistics-oriented) | IEEE Access | Link |
2020 | TiFL: A tier-based federated learning system | Co-design (Coarse-grained) | ACM HPDC | Link |
2020 | Optimizing federated learning on non-IID data with reinforcement learning | Partial optimization (Statistics-oriented) | IEEE INFOCOM | Link |
2019 | Client selection for federated learning with heterogeneous resources in mobile edge | Partial optimization (System-oriented) | IEEE ICC | Link |

Update Filtering:

Year | Title | Category | Venue | Paper Link |
---|---|---|---|---|
2021 | Communication-efficient federated learning with adaptive parameter freezing | Parameter-level | IEEE ICDCS | Link |
2020 | Communication-efficient federated deep learning with layerwise asynchronous model update and temporally weighted aggregation | Layer-level | IEEE TNNLS | Link |
2019 | CMFL: Mitigating communication overhead for federated learning | Client-level | IEEE ICDCS | Link |
2018 | Efficient decentralized deep learning by dynamic model averaging | Client-level | ECML-PKDD | Link |

Update Compression:

Year | Title | Category | Venue | Paper Link |
---|---|---|---|---|
2020 | FetchSGD: Communication-efficient federated learning with sketching | Sketch | ICML | Link |
2019 | Compressing Gradient Optimizers via Count-Sketches | Sketch | ICML | Link |
2019 | Communication-efficient distributed SGD with sketching | Sketch | NeurIPS | Link |
2019 | Error feedback fixes SignSGD and other gradient compression schemes | Quantization | ICML | Link |
2019 | SignSGD with majority vote is communication efficient and fault tolerant | Quantization | ICLR | Link |
2019 | A distributed synchronous SGD algorithm with global top-k sparsification for low bandwidth networks | Sparsification | IEEE ICDCS | Link |
2018 | Sparsified SGD with memory | Sparsification | NeurIPS | Link |
2018 | Deep gradient compression: Reducing the communication bandwidth for distributed training | Sparsification | ICLR | Link |
2018 | Gradient sparsification for communication-efficient distributed optimization | Sparsification | NeurIPS | Link |
2018 | SketchML: Accelerating distributed machine learning with data sketches | Sketch | ACM SIGMOD | Link |
2018 | Error compensated quantized SGD and its applications to large-scale distributed optimization | Quantization | ICML | Link |
2017 | Gaia: Geo-distributed machine learning approaching LAN speeds | Client-level | USENIX NSDI | Link |
2017 | Sparse communication for distributed gradient descent | Sparsification | ACL EMNLP | Link |
2017 | TernGrad: Ternary gradients to reduce communication in distributed deep learning | Quantization | NeurIPS | Link |
2017 | QSGD: Communication-efficient SGD via gradient quantization and encoding | Quantization | NeurIPS | Link |

Load Balancing:

Year | Title | Category | Venue | Paper Link |
---|---|---|---|---|
2021 | Accelerating DNN training in wireless federated edge learning systems | Load balancing (Communication) | IEEE JSAC | Link |
2021 | HeteroFL: Computation and communication efficient federated learning for heterogeneous clients | Load balancing (Optimization step) | ICLR | Link |
2021 | Towards efficient scheduling of federated mobile devices under computational and statistical heterogeneity | Load balancing (Data amount) | IEEE TPDS | Link |
2020 | Federated optimization in heterogeneous networks | Load balancing (Optimization step) | MLSys | Link |
2020 | Resource allocation in mobility-aware federated learning networks: A deep reinforcement learning approach | Load balancing (Data amount) | IEEE WF-IoT | Link |
2019 | Efficient training management for mobile crowd-machine learning: A deep reinforcement learning approach | Load balancing (Data amount) | IEEE WCL | Link |

Client Bias Reduction and Optimizer State Synchronization:

Year | Title | Category | Venue | Paper Link |
---|---|---|---|---|
2021 | Breaking the centralized barrier for cross-device federated learning | Client bias reduction | NeurIPS | Link |
2021 | Federated learning based on dynamic regularization | Client bias reduction | ICLR | Link |
2020 | Federated learning via posterior averaging: A new perspective and practical algorithms | Client bias reduction | ICLR | Link |
2020 | SCAFFOLD: Stochastic controlled averaging for federated learning | Client bias reduction | ICML | Link |
2020 | Federated optimization in heterogeneous networks | Client bias reduction | MLSys | Link |
2020 | Accelerating federated learning via momentum gradient descent | Optimizer state synchronization | IEEE TPDS | Link |
2020 | Federated accelerated stochastic gradient descent | Optimizer state synchronization | NeurIPS | Link |
2019 | FedDANE: A federated Newton-type method | Client bias reduction | IEEE ACSSC | Link |
2019 | On the linear speedup analysis of communication efficient momentum SGD for distributed nonconvex optimization | Optimizer state synchronization | ICML | Link |

Privacy-Preserving and Hierarchical Aggregation:

Year | Title | Category | Venue | Paper Link |
---|---|---|---|---|
2022 | LightSecAgg: Rethinking secure aggregation in federated learning | Lightweight privacy-preserving aggregation | MLSys | Link |
2021 | FLASHE: Additively symmetric homomorphic encryption for cross-silo federated learning | Lightweight privacy-preserving aggregation | arXiv | Link |
2021 | Turbo-aggregate: Breaking the quadratic aggregation barrier in secure federated learning | Lightweight privacy-preserving aggregation | IEEE JSAIT | Link |
2020 | FastSecAgg: Scalable secure aggregation for privacy-preserving federated learning | Lightweight privacy-preserving aggregation | ICML Workshop | Link |
2020 | Secure single-server aggregation with (poly) logarithmic overhead | Lightweight privacy-preserving aggregation | ACM CCS | Link |
2020 | BatchCrypt: Efficient homomorphic encryption for cross-silo federated learning | Lightweight privacy-preserving aggregation | USENIX ATC | Link |
2020 | Accelerating federated learning over reliability-agnostic clients in mobile edge computing systems | Hierarchical aggregation | IEEE TPDS | Link |
2020 | Hierarchical federated learning across heterogeneous cellular networks | Hierarchical aggregation | IEEE ICASSP | Link |
2020 | Client-edge-cloud hierarchical federated learning | Hierarchical aggregation | IEEE ICC | Link |

Server-Side Optimization:

Year | Title | Category | Venue | Paper Link |
---|---|---|---|---|
2021 | Adaptive federated optimization | Server-side optimizer | ICLR | Link |
2020 | SlowMo: Improving communication-efficient distributed SGD with slow momentum | Server-side optimizer | ICLR | Link |
2019 | Measuring the effects of nonidentical data distribution for federated visual classification | Server-side optimizer | NeurIPS Workshop | Link |

Measuring and Benchmarking Tools:

Year | Title | Category | Venue | Paper Link |
---|---|---|---|---|
2021 | Characterizing impacts of heterogeneity in federated learning upon large-scale smartphone data | Mobile | ACM WWW | Link |

Year | Title | Category | Venue | Paper Link |
---|---|---|---|---|
2022 | The OARF benchmark suite: Characterization and implications for federated learning systems | Training datasets | ACM TIST | Link |
2022 | FedScale: Benchmarking model and system performance of federated learning | Training datasets | ICML Workshop | Link |
2021 | FATE: An industrial grade platform for collaborative learning with data protection | Production systems and simulation platforms | JMLR | Link |
2020 | Flower: A friendly federated learning research framework | Production systems and simulation platforms | arXiv | Link |
2020 | FedML: A research library and benchmark for federated machine learning | Production systems and simulation platforms | NeurIPS Workshop | Link |
2018 | Leaf: A benchmark for federated settings | Training datasets | arXiv | Link |