0.45.0: LLM.int8() support for H100; faster 4-bit/8-bit inference #951

Annotations

1 warning

build-wheels (ubuntu-latest, 3.10, x86_64)

succeeded Dec 5, 2024 in 1m 7s