Commit

Update efa-cheatsheet.md (#406)
Co-authored-by: Ben Friebe <[email protected]>
KeitaW and benfriebe authored Aug 13, 2024
1 parent 4d16b6d commit 9f19c41
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions 1.architectures/efa-cheatsheet.md
@@ -21,6 +21,7 @@ versions of your libfabric.
| `NCCL_MIN_CHANNELS=xxx` | Leave unset to use the default. For example, on p4d/p4de the number of channels should be 8, which is the minimum for a 4-NIC platform. The reduction message is split by the number of GPUs in the job and then by the number of channels, so having more channels than necessary produces smaller messages, which starves EFA of data. |
| `NCCL_BUFFSIZE=xxx` | Leave unset to use the default. |
| `RDMAV_FORK_SAFE=1` | Do not use. This is an rdma-core environment variable. Prefer `FI_EFA_FORK_SAFE` (if it still makes sense for your Linux kernel version). The two look the same but behave very differently, especially on newer kernels, where `RDMAV_FORK_SAFE=1` can break things. |
| `NCCL_SHM_USE_CUDA_MEMCPY=1` | Set this when you run NCCL on g6/g5. It gives 2x performance compared with the default memcpy. |
| `RDMAV_*` | Do not use. |
| NCCL version | Use one of the stable releases. |
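
The rows above can be read as a small set of rules for a job's environment. The following is a minimal sketch of that reading, not part of the cheatsheet or of any NCCL or libfabric tooling: the helper name `efa_env` and the `instance_family` argument are hypothetical, and the sketch deliberately does not set `FI_EFA_FORK_SAFE`, since the table says that depends on your kernel version.

```python
import os

# Variables the table recommends leaving unset so the defaults apply.
LEAVE_UNSET = ("NCCL_MIN_CHANNELS", "NCCL_BUFFSIZE")


def efa_env(base, instance_family):
    """Return a copy of `base` adjusted per the table (hypothetical helper)."""
    env = dict(base)
    # Drop every RDMAV_* variable, which includes RDMAV_FORK_SAFE.
    for name in list(env):
        if name.startswith("RDMAV_"):
            del env[name]
    # Remove overrides that the table says should be left at their defaults.
    for name in LEAVE_UNSET:
        env.pop(name, None)
    # The table suggests this setting only for g5/g6 instances.
    if instance_family in ("g5", "g6"):
        env["NCCL_SHM_USE_CUDA_MEMCPY"] = "1"
    return env


if __name__ == "__main__":
    # Assumption: the caller knows the instance family; "g5" is illustrative.
    cleaned = efa_env(os.environ, "g5")
    for key in sorted(cleaned):
        if key.startswith(("NCCL_", "FI_", "RDMAV_")):
            print(f"{key}={cleaned[key]}")
```

The returned dictionary could then be passed as the `env` argument of `subprocess.run` when launching the training command, which keeps the parent shell's environment untouched.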

