Why does vLLM use a custom all-reduce method? #6159

SamKG · 2024-07-05T20:01:25Z

Hello,
Was hoping someone could help shed some light on this -
why does vLLM choose to use a custom all-reduce method? Is there a benefit to doing this over just using the NCCL APIs?

Answered by simon-mo

simon-mo · 2024-07-05T20:05:45Z

Maintainer

See the perf results in #2192. In certain cases, the custom topology drastically boosts performance compared to NCCL's implementation. vLLM still uses NCCL in the majority of cases.

3 replies

SamKG Jul 5, 2024
Author

thanks, that helps a ton!!

SamKG Jul 10, 2024
Author

@simon-mo Just wanted to quickly clarify something: I've observed strange performance scaling for the 1-stage all-reduce kernels (both NCCL's and vLLM's):

For small problem sizes, it is actually slower than expected (this is the region where the 1-stage reduce is used).

Theoretically, we should expect scaling that looks (roughly) like the below:

Is this expected?
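
For readers following along, here is a minimal sketch of what "1-stage" and "2-stage" all-reduce usually refer to. It is a pure-Python simulation on lists, not vLLM's or NCCL's actual kernels; the function names and the peer-read count are assumptions made for illustration only. In the 1-stage variant every rank reads every peer's full buffer and sums it locally. In the 2-stage variant each rank first reduces only its own chunk (reduce-scatter), then all ranks collect the reduced chunks (all-gather).

```python
from typing import List


def one_stage_all_reduce(buffers: List[List[int]]) -> List[List[int]]:
    """Each rank reads every peer's full buffer and sums locally (one step)."""
    n = len(buffers)
    size = len(buffers[0])
    out = []
    for _rank in range(n):
        out.append([sum(buffers[peer][i] for peer in range(n)) for i in range(size)])
    return out


def two_stage_all_reduce(buffers: List[List[int]]) -> List[List[int]]:
    """Reduce-scatter followed by all-gather (two steps)."""
    n = len(buffers)
    size = len(buffers[0])
    assert size % n == 0, "this sketch assumes the buffer splits evenly across ranks"
    chunk = size // n

    # Stage 1: rank r reduces only elements [r * chunk, (r + 1) * chunk).
    reduced_chunks = []
    for rank in range(n):
        lo, hi = rank * chunk, (rank + 1) * chunk
        reduced_chunks.append(
            [sum(buffers[peer][i] for peer in range(n)) for i in range(lo, hi)]
        )

    # Stage 2: every rank gathers all reduced chunks into a full result.
    full = [x for c in reduced_chunks for x in c]
    return [list(full) for _ in range(n)]


def elements_read_from_peers(n: int, size: int, stages: int) -> int:
    """Hypothetical per-rank count of elements read from other ranks."""
    if stages == 1:
        return (n - 1) * size
    if stages == 2:
        return 2 * (n - 1) * size // n
    raise ValueError("stages must be 1 or 2")


if __name__ == "__main__":
    data = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400], [1000, 2000, 3000, 4000]]
    print(one_stage_all_reduce(data)[0])
    print(two_stage_all_reduce(data)[0])
```

Under this simplified count, the 1-stage variant moves more data per rank but needs only one synchronization step, which is the usual reason it is chosen for small messages. It is a counting model only and does not predict the measured timings discussed above.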

SamKG Jul 10, 2024
Author

cc @hanzhi713

cduk · 2024-11-04T13:22:36Z

In a setup where 4 GPUs are connected by PCIe, but each pair of GPUs is connected by NVLink (112 GB/s bidirectional), is there a way to specify a reduction first within each NVLink-connected pair of GPUs before reducing across the slower PCIe link?
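
The pattern described here is often called a hierarchical all-reduce. The sketch below simulates it in plain Python to make the three steps concrete. It does not show a vLLM or NCCL option, and the thread does not say whether either supports this; the function name `hierarchical_all_reduce` and the `groups` parameter are hypothetical.

```python
from typing import Dict, List, Sequence


def hierarchical_all_reduce(
    buffers: List[List[int]], groups: Sequence[Sequence[int]]
) -> List[List[int]]:
    """Simulate a reduce within fast-link groups, then across groups."""
    size = len(buffers[0])

    # Step 1: reduce inside each group (e.g. an NVLink pair) onto its leader,
    # taken here to be the first rank listed in the group.
    partial: Dict[int, List[int]] = {}
    for group in groups:
        leader = group[0]
        partial[leader] = [sum(buffers[r][i] for r in group) for i in range(size)]

    # Step 2: all-reduce among the leaders only (the slower link, e.g. PCIe).
    total = [sum(p[i] for p in partial.values()) for i in range(size)]

    # Step 3: each leader broadcasts the result back inside its group.
    out: List[List[int]] = [[] for _ in buffers]
    for group in groups:
        for r in group:
            out[r] = list(total)
    return out


if __name__ == "__main__":
    gpu_buffers = [[1, 2], [3, 4], [5, 6], [7, 8]]
    nvlink_pairs = [(0, 1), (2, 3)]
    print(hierarchical_all_reduce(gpu_buffers, nvlink_pairs))
```

In this arrangement only one buffer per pair crosses the slow link in step 2, instead of every GPU exchanging data over PCIe.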

0 replies
