[Core] better support offloading when side loading is enabled. #4855
nice!
Doesn't one of these two hook styles hook into every submodule as well? If so, shouldn't one of the checks be recursive?
@muellerzr to help here a bit.
You should be able to do `remove_hook_from_module(component, recurse=True)`.
CC @SunMarc too for a second glance :)
But is it required here? Sorry for not making my comment clear.
I guess I'm trying to understand just what we're aiming to achieve (solid guess based on context, let me know if I'm accurate):
`device_map="auto"` or some form of `device_map`
Is this accurate? Otherwise I may need a bit more info/context that I'm missing somehow.
#3922 (comment)
We want to be able to detect if a `torch.nn.Module` has hooks and we want to remove them. That is the bit relevant to `accelerate`. Then after loading some auxiliary weights, we want to load the appropriate hooks back in.
Let me know if that helps?
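The detect, remove, load, restore flow described above can be sketched roughly as follows. This is a toy model in plain Python, not torch or accelerate code: `ToyModule`, `detach_hook`, and `load_with_hooks_paused` are hypothetical names, and the only detail borrowed from accelerate is the assumption that an attached hook lives on the module as `_hf_hook`. A real pipeline may need to re-run its offload setup rather than reattach a saved object.

```python
# Toy stand-ins only: nothing here imports torch or accelerate.
class ToyModule:
    """Hypothetical stand-in for torch.nn.Module."""

    def __init__(self, name):
        self.name = name
        self.weights = {}


def detach_hook(module):
    """Remove and return the hook on `module`, or None if it has none."""
    hook = getattr(module, "_hf_hook", None)
    if hook is not None:
        del module._hf_hook
    return hook


def load_with_hooks_paused(components, load_fn):
    """Detach hooks, run `load_fn` on every component, then reattach the hooks."""
    saved = {name: detach_hook(module) for name, module in components.items()}
    for module in components.values():
        load_fn(module)
    for name, hook in saved.items():
        if hook is not None:
            components[name]._hf_hook = hook


unet = ToyModule("unet")
unet._hf_hook = "offload-hook"            # pretend offloading was enabled
text_encoder = ToyModule("text_encoder")  # this component has no hook
components = {"unet": unet, "text_encoder": text_encoder}

hook_seen_during_load = []


def load_aux(module):
    # Record whether a hook is still attached while auxiliary weights load.
    hook_seen_during_load.append(hasattr(module, "_hf_hook"))
    module.weights["lora"] = "loaded"


load_with_hooks_paused(components, load_aux)
```

While `load_aux` runs, no component carries a hook; afterwards only the component that originally had one gets it back.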
From what I understood from the codebase, if we have `is_sequential_cpu_offload`, it means that the components were offloaded using `cpu_offload`, which recursively places hooks on each submodule. In the case of `is_model_cpu_offload`, we use `cpu_offload_with_hook`, which places only one hook on the module, so that the entire model is offloaded when another hook is triggered.
I would then suggest using `remove_hook_from_module(component, recurse=True)` for the first case and `remove_hook_from_module(component, recurse=False)` for the second case, if you don't want to just recursively remove all the hooks in both cases!
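To make the difference between the two cases concrete, here is a small sketch using plain Python stand-ins instead of torch or accelerate. Everything in it (`ToyModule`, `attach_everywhere`, `attach_top_only`, `remove_hook`) is hypothetical and only mimics the behaviour described above: one offloading style hooks every submodule, the other hooks only the top-level module.

```python
# Toy stand-ins only: nothing here imports torch or accelerate.
class ToyModule:
    """Hypothetical stand-in for a module with child submodules."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def iter_modules(module):
    """Yield `module` and all of its descendants, depth first."""
    yield module
    for child in module.children:
        yield from iter_modules(child)


def attach_everywhere(root, hook):
    """Mimic sequential offloading: a hook on every submodule."""
    for module in iter_modules(root):
        module._hf_hook = hook


def attach_top_only(root, hook):
    """Mimic whole-model offloading: a single hook on the top module."""
    root._hf_hook = hook


def remove_hook(module, recurse=False):
    """Remove the hook from `module`, and from its descendants if `recurse`."""
    if hasattr(module, "_hf_hook"):
        del module._hf_hook
    if recurse:
        for child in module.children:
            remove_hook(child, recurse=True)


def hooked_names(root):
    return [m.name for m in iter_modules(root) if hasattr(m, "_hf_hook")]


def build():
    return ToyModule("unet", [ToyModule("down"), ToyModule("up")])


# Sequential style, shallow removal: the children keep their hooks.
seq_shallow = build()
attach_everywhere(seq_shallow, "align")
remove_hook(seq_shallow, recurse=False)
left_after_shallow = hooked_names(seq_shallow)

# Sequential style, recursive removal: nothing is left.
seq_deep = build()
attach_everywhere(seq_deep, "align")
remove_hook(seq_deep, recurse=True)
left_after_deep = hooked_names(seq_deep)

# Whole-model style: a shallow removal is already enough.
whole = build()
attach_top_only(whole, "offload")
remove_hook(whole, recurse=False)
left_after_whole = hooked_names(whole)
```

In this toy setup a shallow removal is only sufficient when a single top-level hook was placed; with per-submodule hooks it leaves the children hooked.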
I think the log statement might be a bit noisy. It would make sense if we expected the user to do additional things with the placed accelerate hooks, or if they needed to know that some state they expected to be maintained might change, but we definitely don't want the user to touch the hooks.
I think it's relatively simple given the context the message is being raised from. If you have a better suggestion, let me know.
Sorry, I think my main point is that the log is a bit noisy because it leaks what is supposed to be an internal implementation detail. I don't think it's really something that should be exposed to an end user.