
[Core] better support offloading when side loading is enabled. #4855

Merged
sayakpaul merged 13 commits into main from improve-side-offloading on Sep 5, 2023

Conversation

@HuggingFaceDocBuilderDev commented Aug 31, 2023

The documentation is not available anymore as the PR was closed or merged.

@sayakpaul
Copy link
Member Author

I think for tests, we will need to add them as SLOW tests requiring GPUs. Given the importance of these use cases, I won't mind adding them. Any objections to adding these new SLOW tests? @patrickvonplaten @williamberman.

Comment on lines 767 to 779
# Remove any existing hooks.
is_model_cpu_offload = False
is_sequential_cpu_offload = False
for _, component in self.components.items():
    if isinstance(component, nn.Module):
        if hasattr(component, "_hf_hook"):
            is_model_cpu_offload = isinstance(getattr(component, "_hf_hook"), CpuOffload)
            is_sequential_cpu_offload = isinstance(getattr(component, "_hf_hook"), AlignDevicesHook)
            logger.info(
                "Accelerate hooks detected. Since you have called `load_textual_inversion()`, the previous hooks will be first removed. Then the textual inversion parameters will be loaded and the hooks will be applied again."
            )
            remove_hook_from_module(component)

Contributor

Doesn't one of these two hooks styles hook into every sub module as well, so shouldn't one of the checks be recursive?

Member Author

@muellerzr to help here a bit.


Contributor

You should be able to do remove_hook_from_module(component, recurse=True)

CC @SunMarc too for a second glance :)

Member Author

But is it required here? Sorry for not making my comment clear.

Contributor

I guess trying to understand just what we're aiming to achieve (solid guess based on context, let me know if I'm accurate):

  • Given a model that may be loaded in via device_map="auto" or some form of device_map
  • We wish to remove the hooks when wanting to load the full model into memory/remove it from BMI (big model inference)
  • With the potential of placing it back later

Is this accurate? Otherwise I may need a bit more info/context that I'm missing somehow.

Member Author

#3922 (comment)

We want to be able to detect if a torch.nn.Module has hooks and we want to remove them. That is the bit relevant to accelerate. Then after loading some auxiliary weights, we want to load the appropriate hooks back in.

Let me know if that helps?
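The flow described above (detect hooks on each component, remove them, load auxiliary weights, then re-enable offloading) can be sketched as a toy, stdlib-only example. This is not the actual diffusers/accelerate code: `CpuOffload`, `AlignDevicesHook`, and `Component` here are hypothetical stand-ins for the real accelerate hook classes and `torch.nn.Module` components.

```python
# Toy sketch of the detect -> remove -> reload pattern described above.
# The classes below are stand-ins, NOT the real accelerate/diffusers APIs.

class CpuOffload:          # stand-in for accelerate.hooks.CpuOffload
    pass

class AlignDevicesHook:    # stand-in for accelerate.hooks.AlignDevicesHook
    pass

class Component:
    """A module-like object that may carry an accelerate-style `_hf_hook`."""
    def __init__(self, hook=None):
        if hook is not None:
            self._hf_hook = hook

def detect_and_remove_hooks(components):
    """Record which offloading style was active, removing hooks as we go."""
    is_model_cpu_offload = False
    is_sequential_cpu_offload = False
    for component in components.values():
        if hasattr(component, "_hf_hook"):
            # `|=` accumulates across components; the real code assigns per component.
            is_model_cpu_offload |= isinstance(component._hf_hook, CpuOffload)
            is_sequential_cpu_offload |= isinstance(component._hf_hook, AlignDevicesHook)
            # stand-in for accelerate's remove_hook_from_module(component)
            del component._hf_hook
    return is_model_cpu_offload, is_sequential_cpu_offload

components = {"unet": Component(CpuOffload()), "vae": Component(CpuOffload())}
model_offload, sequential_offload = detect_and_remove_hooks(components)
# After loading the auxiliary weights (textual inversion, LoRA, ...), the caller
# would re-enable the matching offloading style based on the recorded flags.
```

The point is that the hook removal and the flag bookkeeping happen together, so the pipeline knows which offloading style to restore afterwards.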

Member

From what I understood from the codebase, if we have is_sequential_cpu_offload, it means that the components were offloaded using cpu_offload, which recursively places hooks on each submodule. In the case of is_model_cpu_offload, we use cpu_offload_with_hook, which places only one hook on the module, so that the entire model is offloaded when another hook is triggered.
I would then suggest using remove_hook_from_module(component, recurse=True) for the first case and remove_hook_from_module(component, recurse=False) for the second case, if you don't want to just recursively remove all the hooks in both cases!
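The distinction above (sequential offload hooks every submodule, model offload places a single top-level hook) can be illustrated with a stdlib-only toy. Note this is a hypothetical module tree, not the real accelerate API; the `recurse` flag here merely mimics the behavior being discussed.

```python
# Toy illustration of why the recursion flag matters when removing hooks:
# sequential CPU offload hooks every submodule, so a non-recursive removal
# leaves hooks behind. Stand-in classes only, NOT the real accelerate API.

class Module:
    def __init__(self, children=()):
        self.children = list(children)
        self.hook = None

def add_hooks(module, recurse):
    module.hook = "offload-hook"
    if recurse:
        for child in module.children:
            add_hooks(child, recurse=True)

def remove_hooks(module, recurse):
    module.hook = None
    if recurse:
        for child in module.children:
            remove_hooks(child, recurse=True)

def count_hooks(module):
    return (module.hook is not None) + sum(count_hooks(c) for c in module.children)

# A root module with three descendants (4 modules total).
model = Module([Module(), Module([Module()])])

# Sequential offload style: hooks are placed on every submodule.
add_hooks(model, recurse=True)
remove_hooks(model, recurse=False)   # top-level only: submodule hooks remain
leftover = count_hooks(model)        # the three descendant hooks are still attached
remove_hooks(model, recurse=True)    # recursive removal clears everything
```

This matches the suggestion: removal must recurse for the sequential case, while a single top-level removal suffices for the model-offload case.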

Comment on lines +1237 to +1239
logger.info(
    "Accelerate hooks detected. Since you have called `load_lora_weights()`, the previous hooks will be first removed. Then the LoRA parameters will be loaded and the hooks will be applied again."
)
Contributor

I think the log statement might be a bit noisy. It would be justified if we expected the user to do additional things with the placed accelerate hooks, or to be aware of whether some state is maintained, but we definitely don't want the user to touch the hooks.

Member Author

I think it's relatively simple given the context the message is being raised from. If you have a better suggestion, let me know.

Contributor

Sorry, I think my main point is that the log is a bit noisy, given that it leaks what is supposed to be an internal implementation detail; it's not really something that should be exposed to an end user.

@williamberman
Contributor

design makes sense! few quick questions

if isinstance(component, nn.Module):
    if hasattr(component, "_hf_hook"):
        is_model_cpu_offload = isinstance(getattr(component, "_hf_hook"), CpuOffload)
        is_sequential_cpu_offload = isinstance(getattr(component, "_hf_hook"), AlignDevicesHook)
Contributor

nice!

@patrickvonplaten
Contributor

Looks good to me!

@sayakpaul
Member Author

@patrickvonplaten ready for another round of review.

Contributor
@patrickvonplaten left a comment

Good to go for me!

@sayakpaul sayakpaul merged commit e4b8e79 into main Sep 5, 2023
@sayakpaul sayakpaul deleted the improve-side-offloading branch September 5, 2023 01:25
williamberman added a commit to williamberman/diffusers that referenced this pull request Sep 6, 2023
williamberman added a commit that referenced this pull request Sep 9, 2023
…abled… (#4927)

Revert "[Core] better support offloading when side loading is enabled. (#4855)"

This reverts commit e4b8e79.
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
…ngface#4855)

* better support offloading when side loading is enabled.

* load_textual_inversion

* better messaging for textual inversion.

* fixes

* address PR feedback.

* sdxl support.

* improve messaging

* recursive removal when cpu sequential offloading is enabled.

* add: lora tests

* recruse.

* add: offload tests for textual inversion.
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
…abled… (huggingface#4927)

Revert "[Core] better support offloading when side loading is enabled. (huggingface#4855)"

This reverts commit e4b8e79.
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
6 participants