0.45.0: LLM.int8() support for H100; faster 4-bit/8-bit inference #951

Annotations

1 warning

audit-wheels

succeeded Dec 5, 2024 in 18s