
fix auto_gptq layer error device #2134

Draft
wants to merge 1 commit into base: main

Conversation

ZX-ModelCloud (Contributor)

Fix the device error RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! reported by MekkCyber.

test_quantization
RUN_SLOW=1 pytest tests/gptq/test_quantization.py

  • cpu tests
  • cuda tests
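
As a rough illustration of the class of error this PR targets (not the actual diff, which is not shown here), the failure typically occurs when a quantized layer's weights sit on cuda:0 while the calibration inputs remain on cpu. A minimal sketch of the usual remedy, moving the inputs to the layer's own device before the forward call; the function and argument names are hypothetical:

```python
import torch


def forward_on_layer_device(layer: torch.nn.Module, inputs: torch.Tensor) -> torch.Tensor:
    """Move `inputs` onto the device holding `layer`'s parameters before calling it."""
    # Locate the device the layer's weights live on (parameters first, buffers as fallback).
    try:
        layer_device = next(layer.parameters()).device
    except StopIteration:
        layer_device = next(layer.buffers()).device
    # Aligning the input device avoids:
    #   RuntimeError: Expected all tensors to be on the same device, ...
    return layer(inputs.to(layer_device))
```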

@Qubitium (Contributor)

@ZX-ModelCloud Moving this PR to draft; it may not be needed. The failure is actually caused by a deficiency in AutoGPTQ: it cannot pass the cpu tests because optimum force-moves the layer to gpu.

GPTQModel has no such restriction. We may bypass this by disabling the cpu-only tests for AutoGPTQ, for example by skipping them when the AutoGPTQ backend is in use (see the sketch below).
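
One possible way to express "disable the cpu-only tests for AutoGPTQ" in the pytest suite mentioned above; this is only a sketch, and the test name and backend-detection logic are assumptions rather than part of the actual test file:

```python
import importlib.util

import pytest

# Skip cpu-only cases when the AutoGPTQ backend is installed, since optimum
# force-moves quantized layers to gpu and the cpu-only path cannot pass.
HAS_AUTO_GPTQ = importlib.util.find_spec("auto_gptq") is not None


@pytest.mark.skipif(HAS_AUTO_GPTQ, reason="AutoGPTQ cannot run cpu-only quantization tests")
def test_gptq_quantization_cpu():
    ...  # cpu-only quantization test body
```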

ZX-ModelCloud marked this pull request as draft on December 21, 2024 at 07:35