
fix auto_gptq layer error device #2134

Draft
wants to merge 1 commit into base: main

Conversation

ZX-ModelCloud (Contributor)

Fix the device error RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! reported by MekkCyber.

test_quantization
RUN_SLOW=1 pytest tests/gptq/test_quantization.py

  • cpu tests
  • cuda tests
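
As a rough illustration of the class of error this PR targets (not the actual diff, which is not shown here), the failure typically occurs when a quantized layer's weights sit on cuda:0 while the calibration inputs remain on cpu. A minimal sketch of the usual remedy, moving the inputs to the layer's own device before the forward call; the function and argument names are hypothetical:

```python
import torch


def forward_on_layer_device(layer: torch.nn.Module, inputs: torch.Tensor) -> torch.Tensor:
    """Move `inputs` onto the device holding `layer`'s parameters before calling it."""
    # Locate the device the layer's weights live on (parameters first, buffers as fallback).
    try:
        layer_device = next(layer.parameters()).device
    except StopIteration:
        layer_device = next(layer.buffers()).device
    # Aligning the input device avoids:
    #   RuntimeError: Expected all tensors to be on the same device, ...
    return layer(inputs.to(layer_device))
```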

@Qubitium (Contributor)

@ZX-ModelCloud Moving this PR to draft; it may not be needed. The failure is actually caused by a deficiency in AutoGPTQ: it cannot pass the cpu tests because optimum force-moves the layer to gpu.

GPTQModel has no such restriction. We may bypass this by disabling the cpu-only tests for AutoGPTQ, for example by skipping them when the AutoGPTQ backend is in use (see the sketch below).
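
One possible way to express "disable the cpu-only tests for AutoGPTQ" in the pytest suite mentioned above; this is only a sketch, and the test name and backend-detection logic are assumptions rather than part of the actual test file:

```python
import importlib.util

import pytest

# Skip cpu-only cases when the AutoGPTQ backend is installed, since optimum
# force-moves quantized layers to gpu and the cpu-only path cannot pass.
HAS_AUTO_GPTQ = importlib.util.find_spec("auto_gptq") is not None


@pytest.mark.skipif(HAS_AUTO_GPTQ, reason="AutoGPTQ cannot run cpu-only quantization tests")
def test_gptq_quantization_cpu():
    ...  # cpu-only quantization test body
```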

ZX-ModelCloud marked this pull request as draft on December 21, 2024 at 07:35