I'm Aryan from the HuggingFace Diffusers team. I am working on integrating FasterCache into the library to make it available for all the video models we support. I had some questions regarding the implementation and was hoping to get some help.
In the paper, the section describing CFG Cache has the following:
These biases ensure that both high- and low-frequency differences are accurately captured and compensated during the reuse process. In the subsequent n timesteps (from t − 1 to t − n), we infer only the outputs of the conditional branches and compute the unconditional outputs using the cached ∆HF and ∆LF as follows:
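For my own understanding, here is a toy NumPy sketch of how I read this scheme. The FFT-based frequency split, the cutoff value, and all variable names are my assumptions, and the paper's timestep-dependent weighting of the cached terms is omitted:

```python
import numpy as np

def split_freq(x, cutoff=4):
    """Split a 1-D signal into low- and high-frequency parts via the FFT.

    Bins within `cutoff` of DC (on either side of the spectrum) count as
    low frequency; the cutoff used here is an arbitrary illustrative choice.
    """
    spectrum = np.fft.fft(x)
    mask = np.zeros(len(x), dtype=bool)
    mask[:cutoff] = True
    mask[-cutoff:] = True
    low = np.fft.ifft(np.where(mask, spectrum, 0)).real
    high = np.fft.ifft(np.where(mask, 0, spectrum)).real
    return low, high

rng = np.random.default_rng(0)

# Timestep t: both branches are evaluated, and the low/high-frequency
# deltas between the unconditional and conditional outputs are cached.
cond_t = rng.normal(size=64)    # stand-in for the conditional output
uncond_t = rng.normal(size=64)  # stand-in for the unconditional output
c_low, c_high = split_freq(cond_t)
u_low, u_high = split_freq(uncond_t)
delta_lf, delta_hf = u_low - c_low, u_high - c_high

# Timesteps t-1 .. t-n: only the conditional branch is inferred; the
# unconditional output is approximated from the cached deltas.
cond_next = cond_t + 0.01 * rng.normal(size=64)  # stand-in for the new conditional output
n_low, n_high = split_freq(cond_next)
uncond_approx = (n_low + delta_lf) + (n_high + delta_hf)
```

If that reading is right, the cached deltas exactly reproduce the unconditional output whenever the conditional output is unchanged, and otherwise give a frequency-wise compensated approximation.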
It says that inference is run for the conditional branch, and the outputs of the unconditional branch are computed with the given equations. These are the relevant lines of code that seem to implement this (FasterCache/scripts/latte/fastercache_sample_latte.py, line 90 at commit fab32c1):
However, the inputs are indexed as hidden_states[:1], timestep[:1], encoder_hidden_states[:1]. Doesn't this correspond to the unconditional inputs rather than the conditional ones? I believe it is the unconditional half, because the prompt embeddings are concatenated in the order (negative_prompt_embeds, prompt_embeds) here.
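To sanity-check the ordering, here is a toy NumPy sketch of the CFG batch layout. The array contents are placeholders; only the concatenation order is taken from the pipeline code:

```python
import numpy as np

# Stand-ins for the two prompt embeddings; only the ordering matters here.
negative_prompt_embeds = np.zeros((1, 4))  # unconditional branch
prompt_embeds = np.ones((1, 4))            # conditional branch

# Concatenated as (negative, positive), matching the order in the pipeline.
encoder_hidden_states = np.concatenate(
    [negative_prompt_embeds, prompt_embeds], axis=0
)

# With that ordering, [:1] picks out the *unconditional* inputs,
# while [1:] would pick out the conditional ones.
uncond_inputs = encoder_hidden_states[:1]
cond_inputs = encoder_hidden_states[1:]
```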
Is this incorrect by any chance? Or is the unconditional branch being used to approximate the output of the conditional branch?

Thank you for your time! 🤗

cc @cszy98 @ChenyangSi
Thank you for pointing this out and for your detailed observation. The indexing in the code does differ slightly from the description in the paper. I’ll update the implementation to ensure it’s fully aligned with the methodology described. Since the CFG-Cache stores the delta between the conditional and unconditional branches, this change will lead to a slight visual quality improvement in some cases.
Thanks again for your careful review. If you have any further questions or suggestions regarding FasterCache, feel free to let us know. We will do our best to assist.