Hi there :-)
Is there a possibility to configure multiple users / concurrent request sessions?
I'd like to simulate how the different backends behave when, instead of a single user, e.g. 8 users access the LLM concurrently.
I know it is possible to configure batches, but there should be a performance difference between 1 user sending a batch of 8 requests and 8 users independently sending a batch of 1 request each. Please correct me if that is not true :-)
Thanks a lot, and I appreciate the work on optimum-benchmark!
Yes, that's possible. It will have to be integrated at the backend level; for example, if you look at the py-txi backend, you'll see that it has an async method (which is converted into a sync one for our batched inference scenario). That method could be used by a scenario that specifically targets server-like concurrency and takes the number of concurrent users as configuration instead of the batch size, etc.
Overall, this will mostly require an InferenceServerScenario that implements the logic, plus some async methods (async_forward, async_generate, etc.) in the backends you want to target.
I have already discussed this with @mht-sharma and it could be a great feature to compare server backends (TGI, vLLM, TRT-LLM) more adequately.
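A minimal sketch of what such a scenario loop could look like, assuming a backend exposes an `async_generate` coroutine as suggested above. The names `run_server_scenario`, `user_session`, `num_concurrent_users`, and `requests_per_user` are illustrative and not part of the actual optimum-benchmark API:

```python
# Illustrative sketch only -- not the actual optimum-benchmark API.
# Assumes the backend object exposes a hypothetical `async_generate(inputs)` coroutine.
import asyncio
import time


async def user_session(backend, inputs, num_requests: int) -> list:
    """Simulates one user sending `num_requests` independent requests sequentially."""
    latencies = []
    for _ in range(num_requests):
        start = time.perf_counter()
        await backend.async_generate(inputs)  # hypothetical backend coroutine
        latencies.append(time.perf_counter() - start)
    return latencies


async def run_server_scenario(backend, inputs, num_concurrent_users: int, requests_per_user: int):
    """Launches all user sessions concurrently, in contrast to batched inference
    where a single call carries a batch of size `num_concurrent_users`."""
    sessions = [
        user_session(backend, inputs, requests_per_user)
        for _ in range(num_concurrent_users)
    ]
    return await asyncio.gather(*sessions)


# Usage with a hypothetical backend instance:
# per_user_latencies = asyncio.run(
#     run_server_scenario(backend, inputs, num_concurrent_users=8, requests_per_user=1)
# )
```

This is exactly the distinction raised in the issue: 8 concurrent sessions of 1 request each exercise the server's request scheduling and continuous batching, whereas a single batch of 8 only measures batched forward/generate throughput.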