
Benchmark Performance to measure response times for Inference #450

Open
ratnopamc opened this issue Feb 24, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@ratnopamc
Collaborator

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

What is the outcome that you are trying to reach?

Use a tool to load-test the inference endpoints and benchmark latency, throughput, and response times, so that Pods scale out and new nodes are provisioned under load.

Describe the solution you would like

Use a benchmarking tool like fmbt for this purpose.
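As a rough illustration of what such a benchmark would measure, here is a minimal sketch in Python. The inference call is a simulated stand-in (`time.sleep`), not the project's actual endpoint; a real tool would replace it with an HTTP request to the model server and layer on load profiles and reporting.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def call_inference():
    """Stand-in for one inference request; returns its latency in seconds."""
    start = time.perf_counter()
    time.sleep(0.01)  # placeholder for the real model call (e.g. an HTTP POST)
    return time.perf_counter() - start

def benchmark(num_requests=50, concurrency=5):
    """Issue num_requests calls with bounded concurrency; report latency stats."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(lambda _: call_inference(), range(num_requests)))
    wall = time.perf_counter() - wall_start
    return {
        "p50_s": statistics.median(latencies),
        "p95_s": latencies[int(0.95 * (len(latencies) - 1))],
        "throughput_rps": num_requests / wall,
    }

print(benchmark())
```

The p50/p95 latencies and requests-per-second throughput reported by a run like this are the kind of signals that could feed Pod autoscaling decisions under load.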

Describe alternatives you have considered

Additional context

@vara-bonthu vara-bonthu added the enhancement New feature or request label Feb 26, 2024
Projects
None yet
Development

No branches or pull requests

2 participants