
Need fp8 support to use H100/H200 and halve GPU costs #3151

Open
samueldashadrach opened this issue Jan 2, 2025 · 1 comment

Comments

@samueldashadrach

H100/H200 GPUs have dedicated fp8 tensor cores that deliver twice the FLOPs of fp16, so running inference in fp8 can roughly halve GPU cost. Developers will use whichever library has the lower cost, assuming the cost reduction comes without any additional effort on the developer's part.

Please consider prioritising fp8 support; otherwise your library risks becoming outdated.
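
For reference, here is a minimal sketch of what fp8 inference looks like on an H100 using NVIDIA's Transformer Engine. This is an illustration of one possible backend, not this library's API; the `transformer_engine` package and its `te.Linear` / `fp8_autocast` names are assumptions about how fp8 could be wired in:

```python
# Sketch of fp8 inference on an H100 via NVIDIA's Transformer Engine.
# All names here (te.Linear, fp8_autocast, DelayedScaling) belong to
# transformer_engine, not to this library -- illustration only.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# fp8's narrow dynamic range requires per-tensor scaling; DelayedScaling
# tracks amax history to choose scale factors automatically.
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.E4M3)

# A drop-in fp8-capable linear layer (parameters live on the GPU by default).
linear = te.Linear(768, 768, bias=True)
x = torch.randn(512, 768, device="cuda")

# Inside this context, the matmul executes on the fp8 tensor cores.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    with torch.no_grad():  # inference only
        y = linear(x)
```

The scaling recipe is the main engineering cost here: because fp8 cannot represent large and small values at once, every tensor needs a tracked scale factor, which is presumably why library support takes real work rather than a dtype flag.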

@samueldashadrach
Author

Update: to be more specific, I'm running embedding inference with the best models on the MTEB benchmark.
