[FEAT] DDUF format #10037
base: main
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for starting this! Apart from the initial comments I left, I have a few higher-level questions:

We're assuming that the archive will contain all the pipeline-related files in the diffusers format. I am fine with that. This means we'll be shipping DDUF only at the pipeline level for now, correct?
```python
dduf_format: bool = False,
dduf_filename: Optional[Union[str, os.PathLike]] = None,
```
As mentioned over Slack, why do we need two arguments here? How about just dduf_filename?
I will just use that then!
I think that most, if not all, the components of the file name should be inferred. One of the reasons to use DDUF would be to provide confidence about the contents based on filename inspection, so we should try to make it work with something like an optional prefix, but build the rest automatically. So something like dduf_name="my-lora" resulting in the library saving to "my-lora-stable-diffusion-3-medium-diffusers-fp16.dduf", or something like that.
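For illustration, a hypothetical helper composing such a name; neither this function nor the exact name components are part of the PR, it is only a sketch of the idea:

```python
# Hypothetical helper illustrating the proposed naming scheme.
# None of these names come from the actual PR.
def build_dduf_name(prefix: str, base_model: str, precision: str) -> str:
    parts = [p for p in (prefix, base_model, precision) if p]
    return "-".join(parts) + ".dduf"

print(build_dduf_name("my-lora", "stable-diffusion-3-medium-diffusers", "fp16"))
# -> my-lora-stable-diffusion-3-medium-diffusers-fp16.dduf
```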
OK to use some kind of convention, but I don't think it should be an absolute requirement, i.e. people can rename their files if they want to (but the metadata should be inside the file anyway).
Yeah that is the plan.
I just tried it out.

Serialization

```python
from diffusers import FluxTransformer2DModel, DiffusionPipeline
from diffusers import BitsAndBytesConfig
from transformers import BitsAndBytesConfig as BnbConfig
from transformers import T5EncoderModel
from huggingface_hub import create_repo, upload_folder
import torch

repo_id = "DDUF/Flux.1-Dev-Tnf4-TE2NF4"
repo_id = create_repo(repo_id=repo_id, exist_ok=True).repo_id

dtype = torch.bfloat16
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=dtype,
)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=nf4_config,
    torch_dtype=dtype,
)

nf4_config = BnbConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=dtype,
)
text_encoder_2 = T5EncoderModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="text_encoder_2",
    quantization_config=nf4_config,
    torch_dtype=dtype,
)

pipeline = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    text_encoder_2=text_encoder_2,
    torch_dtype=dtype,
)
pipeline.enable_model_cpu_offload()

image = pipeline(
    "a cat standing by the sea waiting for its companion to come and console it.",
    guidance_scale=4.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.manual_seed(0),
).images[0]
image.save("dduf_out.png")

folder = "flux.1-dduf"
pipeline.save_pretrained(folder, dduf_filename="Flux.1-Dev-Tnf4-TE2NF4.dduf")
upload_folder(repo_id=repo_id, folder_path=folder, create_pr=True)
```

Loading

```python
from diffusers import DiffusionPipeline
import torch

repo_id = "DDUF/Flux.1-Dev-Tnf4-TE2NF4"
pipeline = DiffusionPipeline.from_pretrained(
    repo_id, dduf="Flux.1-Dev-Tnf4-TE2NF4.dduf", torch_dtype=torch.bfloat16
)
pipeline.enable_model_cpu_offload()

image = pipeline(
    "a cat standing by the sea waiting for its companion to come and console it.",
    guidance_scale=4.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.manual_seed(0),
).images[0]
image.save("dduf_out.png")
```
```python
# remove tar archive to free memory
os.remove(tar_file_path)
# rename folder to match the name of the dduf archive
os.rename(extract_to, tar_file_path)
```
I think that expanding the archive and removing it could be problematic:

- If I understand correctly, we would be removing the symlink to the blob, but not the blob itself. We won't be saving disk space.
- If we did remove the blob, I think it'd trigger another download the next time we call .from_pretrained.
- Removing the reference could maybe have consequences when checking for newer versions on the Hub (not sure).

When we change to using zips, I wonder if we could extract files progressively in memory during the load process, keeping the big .dduf file as is. Having a disk snapshot is convenient, but I can see no easy solution to avoid duplicating space.
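For what it's worth, a minimal stdlib sketch of that progressive-read idea, reading entries straight out of the archive without expanding it on disk; the archive and entry names here are placeholders, not from this PR:

```python
import json
import zipfile

# Open the .dduf archive (a plain zip) and pull entries into memory on
# demand, leaving the archive itself untouched on disk.
with zipfile.ZipFile("my-pipeline.dduf") as archive:
    with archive.open("model_index.json") as f:
        model_index = json.load(f)
    print(list(model_index))
```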
I agree. I will check if it is possible to stream directly from the zip, or at least extract progressively in memory as you said.
Thanks for the link! I'll have a look!
BTW on the Hub side we want to constrain the types of zip a lot:

- no compression (to make sure it's deduplicated well by XetHub)
- single disk (internal simplifications of zip)
- no encryption, etc.
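For illustration, a stdlib-only sketch of writing an archive that satisfies those constraints: Python's zipfile produces a single-disk, unencrypted zip by default, and ZIP_STORED disables compression. This is not the official tooling mentioned below, just an assumption-laden sketch:

```python
import zipfile
from pathlib import Path

folder = Path("flux.1-dduf")  # a diffusers-format pipeline folder (placeholder name)
with zipfile.ZipFile("pipeline.dduf", "w", compression=zipfile.ZIP_STORED) as zf:
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            # Store each file uncompressed so the Hub can deduplicate chunks.
            zf.write(path, arcname=path.relative_to(folder).as_posix())
```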
To ensure the restrictions are met, we'll provide tooling in Python to save/load DDUF files in a consistent way: huggingface/huggingface_hub#2692. This will be part of huggingface_hub so that it can be easily reused by any library without necessarily having diffusers as a dependency.
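As a rough sketch of how that tooling is used — helper names are taken from the linked huggingface_hub PR, and the exact signatures may have evolved since this discussion:

```python
from huggingface_hub import export_folder_as_dduf, read_dduf_file

# Pack a diffusers-format folder into a single uncompressed .dduf archive.
export_folder_as_dduf("pipeline.dduf", folder_path="flux.1-dduf")

# Read entries back without extracting the archive to disk.
entries = read_dduf_file("pipeline.dduf")
print(entries["model_index.json"].read_text())
```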
BTW, a bit orthogonal, but do you have examples of quantized diffusers repos right now?
This is one example: https://huggingface.co/sayakpaul/FLUX.1-Fill-dev-nf4. Note that it doesn't include all the models needed to run Flux, as it's not typical to quantize the VAE and the other CLIP-based text encoder.
@SunMarc I don't think loading is working:

```python
from diffusers import DiffusionPipeline

repo_id = "DDUF/stable-diffusion-3-medium-diffusers-DDUF"
filename = "stable-diffusion-3-medium-diffusers.dduf"
pipeline = DiffusionPipeline.from_pretrained(repo_id, dduf=filename)
```

Leads to:

Error

```
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/fsx/sayak/diffusers/load_dduf.py", line 23, in <module>
    pipeline = DiffusionPipeline.from_pretrained(repo_id, dduf=filename)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/fsx/sayak/diffusers/src/diffusers/pipelines/pipeline_utils.py", line 776, in from_pretrained
    cached_folder = cls.download(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/fsx/sayak/diffusers/src/diffusers/pipelines/pipeline_utils.py", line 1379, in download
    config_file = hf_hub_download(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 862, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 925, in _hf_hub_download_to_cache_dir
    (url_to_download, etag, commit_hash, expected_size, head_call_error) = _get_metadata_or_catch_error(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1376, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1296, in get_hf_file_metadata
    r = _request_wrapper(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 277, in _request_wrapper
    response = _request_wrapper(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 301, in _request_wrapper
    hf_raise_for_status(response)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 417, in hf_raise_for_status
    raise _format(EntryNotFoundError, message, response) from e
huggingface_hub.errors.EntryNotFoundError: 404 Client Error. (Request ID: Root=1-674931ef-1fdc6f274e6f862c31db623e;2056ffe1-9bd0-4667-806f-273827e95101)
Entry Not Found for url: https://huggingface.co/DDUF/stable-diffusion-3-medium-diffusers-DDUF/resolve/main/model_index.json.
```

The Entry Not Found error for model_index.json is expected.

Update: I have pushed my updates in
Co-authored-by: Sayak Paul <[email protected]>
Merged!
Thanks! Left a single comment.
Thanks for handling the final changes.
@DN6 can you do a review here? It is a single file.
```diff
@@ -981,6 +983,18 @@ def test_download_ignore_files(self):
        assert not any(f in ["vae/diffusion_pytorch_model.bin", "text_encoder/config.json"] for f in files)
        assert len(files) == 14

    def test_download_dduf_with_custom_pipeline_raises_error(self):
        with self.assertRaises(NotImplementedError):
```
Think it might be easier to maintain if we introduce a class attribute into the PipelineTesterMixin, supports_dduf=False, and set it to True for the pipelines that work with DDUF.

We can make these tests conditional on that attribute, similar to how we skip tests for models with custom attention processors:

`uses_custom_attn_processor = True`
`if self.uses_custom_attn_processor:`
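A minimal sketch of that pattern — the mixin below is heavily simplified, and the real PipelineTesterMixin in diffusers carries much more:

```python
class PipelineTesterMixin:
    # Default to True (as suggested in the follow-up below) so every new
    # pipeline is expected to support DDUF unless it explicitly opts out.
    supports_dduf = True

    def test_save_load_dduf(self):
        # The mixin is combined with unittest.TestCase in the real test
        # classes, which is what provides skipTest().
        if not self.supports_dduf:
            self.skipTest("Pipeline does not support the DDUF format.")
        # ... the actual DDUF save/load round-trip assertions would go here ...
```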
WDYT @sayakpaul?
I can look into it tomorrow, sounds good.
Maybe it is easier to set it to True by default and then set it to False for the pipelines that don't work? This helps find which pipelines don't work with DDUF, and it makes sure that when adding a model, we try to make it compatible with DDUF.
LMK what you think about the last commit.
Thanks for working on this, @SunMarc!
Thanks, Marc!
LGTM 👍🏽 Good to merge after resolving conflicts.
@SunMarc I proceeded with resolving the conflicts. LMK.
@SunMarc not sure why the test is failing. Could you give it a look?
What does this PR do?
This PR adds support for the DDUF format. More docs about this format here.
Requires transformers PR: https://github.com/huggingface/transformers/pull/35217/files (merged)
Requires huggingface_hub PR: huggingface/huggingface_hub#2692 (released)
Example
Load from the hub
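A sketch based on the loading example earlier in this thread; the keyword argument name (dduf) may differ in the final merged API:

```python
import torch
from diffusers import DiffusionPipeline

# Load a pipeline directly from a .dduf archive hosted on the Hub.
pipeline = DiffusionPipeline.from_pretrained(
    "DDUF/Flux.1-Dev-Tnf4-TE2NF4",
    dduf="Flux.1-Dev-Tnf4-TE2NF4.dduf",
    torch_dtype=torch.bfloat16,
)
```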
Save to DDUF format
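A sketch mirroring the save_pretrained call shown earlier in the thread:

```python
# Serialize the pipeline into a single .dduf archive inside the folder.
pipeline.save_pretrained(
    "flux.1-dduf", dduf_filename="Flux.1-Dev-Tnf4-TE2NF4.dduf"
)
```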
TODO list: