-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix XPU inference #2383
Fix XPU inference #2383
Conversation
cc @SunMarc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing the support of xpu for big model inference. Could you have a look @abhilash1910 since you did the initial integration with xpu ?
src/accelerate/utils/modeling.py
Outdated
target_device = device | ||
|
||
if is_xpu_available() and isinstance(device, int): | ||
target_device = f"xpu:{device}" | ||
|
||
with safe_open(checkpoint_file, framework="pt", device=target_device) as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be sure, the model won't be loaded in the correct xpu device if we don't have the prefix "xpu:". Is that right ? cc @statelesshz since you added the support for npu and it might also be required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you can see it here: huggingface/safetensors#428 file bindings/python/src/lib.rs
line 263:
name if name.starts_with("xpu:") => {
let tokens: Vec<_> = name.split(':').collect();
if tokens.len() == 2 {
let device: usize = tokens[1].parse()?;
Ok(Device::Xpu(device))
} else {
Err(SafetensorError::new_err(format!(
"device {name} is invalid"
)))
}
}
If you supply only a number, it will treat as CUDA device (the same file, line 278):
} else if let Ok(number) = ob.extract::<usize>() {
Ok(Device::Cuda(number))
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think the changes are correct . It requests a "xpu" device ; since the addition of safetensors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect ! Thanks for the explanation !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM ! Thanks @notsyncing for adding this; with the safetensors being added, this was something in the pipeline. Thanks @SunMarc for highlighting.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again @notsyncing and @abhilash1910 for reviewing. Before merging this, could you add a check to raise an error if the user don't have the right version of safetensors with xpu by using compare_version
and safentsors_version = version.parse(importlib.metadata.version("safentsors"))
. Also don't forget to run make style
.
fd785b7
to
f8ea86f
Compare
Though it will complain about "Device xpu is not recognized, available devices are integers(for GPU/XPU), 'mps', 'cpu' and 'disk'", but you cannot just put 0 as device, or it will treat 0 as CUDA device, then complains again that torch is not compiled with CUDA enabled. You will need safetensors >= 0.4.2 if using safetensors files.
@SunMarc I have added the check of safetensors version, and fixed style errors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM ! Thanks for iterating.
What does this PR do?
With
device_map = {"xpu": "16GB", "cpu": "32GB"}
on an Intel XPU, it will complain aboutbut you cannot just put 0 as device, or it will treat 0 as CUDA device, then complains again that torch is not compiled with CUDA enabled.
You will need safetensors >= 0.4.2 if using safetensors files (huggingface/safetensors#428).
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.