[Single File] LTX support for loading original weights #10135

a-r-r-o-w · 2024-12-05T22:05:25Z

With Diffusers weights:

import torch
from diffusers import LTXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

pipe = LTXImageToVideoPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16, revision="refs/pr/32")
pipe.to("cuda")

image = load_image("https://huggingface.co/datasets/a-r-r-o-w/tiny-meme-dataset-captioned/resolve/main/images/8.png")
prompt = "A young girl stands calmly in the foreground, looking directly at the camera, as a house fire rages in the background. Flames engulf the structure, with smoke billowing into the air. Firefighters in protective gear rush to the scene, a fire truck labeled '38' visible behind them. The girl's neutral expression contrasts sharply with the chaos of the fire, creating a poignant and emotionally charged scene."
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

video = pipe(
    image=image,
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=24)
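
A quick note on the shape arguments above. This is an assumption taken from the LTX-Video model card rather than anything this PR enforces: the model is described as working best when `width` and `height` are divisible by 32 and `num_frames` has the form `8 * k + 1`. The helper below, `check_ltx_shape`, is a hypothetical name and not part of Diffusers. It is only a minimal sketch for sanity-checking values before launching a long generation.

```python
def check_ltx_shape(width: int, height: int, num_frames: int) -> list:
    """Return a list of human-readable problems; an empty list means the shape looks fine.

    Assumed constraints (from the LTX-Video model card, not from this PR):
    spatial sizes divisible by 32, frame count of the form 8 * k + 1.
    """
    problems = []
    if width <= 0 or width % 32 != 0:
        problems.append(f"width={width} is not a positive multiple of 32")
    if height <= 0 or height % 32 != 0:
        problems.append(f"height={height} is not a positive multiple of 32")
    if num_frames < 1 or (num_frames - 1) % 8 != 0:
        problems.append(f"num_frames={num_frames} is not of the form 8 * k + 1")
    return problems


# Values used in the example above: 704 = 32 * 22, 480 = 32 * 15, 161 = 8 * 20 + 1
print(check_ltx_shape(704, 480, 161))
```

With the values used in this PR description the list comes back empty. At `fps=24`, 161 frames is roughly 6.7 seconds of video.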

With original weights:

import torch
from diffusers import AutoencoderKLLTX, LTXImageToVideoPipeline, LTXTransformer3DModel
from diffusers.utils import export_to_video, load_image

# Original (non-Diffusers) checkpoint: a single .safetensors file from which both the transformer and the VAE are loaded
single_file_url = "https://huggingface.co/Lightricks/LTX-Video/ltx-video-2b-v0.9.safetensors"
transformer = LTXTransformer3DModel.from_single_file(single_file_url, torch_dtype=torch.bfloat16, revision="refs/pr/32")
vae = AutoencoderKLLTX.from_single_file(single_file_url, torch_dtype=torch.bfloat16, revision="refs/pr/32")
# The remaining pipeline components are still loaded from the Diffusers-format repository
pipe = LTXImageToVideoPipeline.from_pretrained("Lightricks/LTX-Video", transformer=transformer, vae=vae, torch_dtype=torch.bfloat16, revision="refs/pr/32")
pipe.to("cuda")

image = load_image("https://huggingface.co/datasets/a-r-r-o-w/tiny-meme-dataset-captioned/resolve/main/images/8.png")
prompt = "A young girl stands calmly in the foreground, looking directly at the camera, as a house fire rages in the background. Flames engulf the structure, with smoke billowing into the air. Firefighters in protective gear rush to the scene, a fire truck labeled '38' visible behind them. The girl's neutral expression contrasts sharply with the chaos of the fire, creating a poignant and emotionally charged scene."
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

video = pipe(
    image=image,
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=24)
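
For reference, `single_file_url` above is a plain Hub URL with no `blob/main/` segment. The sketch below shows one way such a URL could be split into a repository id and a weights filename. `split_hub_url` is a hypothetical helper written for illustration only; it is not the Diffusers implementation, and its handling of the `blob/<revision>/` and `resolve/<revision>/` forms is an assumption about common Hub URL shapes.

```python
from urllib.parse import urlparse


def split_hub_url(url: str) -> tuple:
    """Split a Hub file URL into (repo_id, weights_name).

    Illustrative only: assumes paths shaped like
    /<owner>/<repo>/<file> or /<owner>/<repo>/(blob|resolve)/<revision>/<file>.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3:
        raise ValueError(f"expected <owner>/<repo>/<file> in {url!r}")
    repo_id = "/".join(parts[:2])
    rest = parts[2:]
    # Drop an optional "blob/<revision>" or "resolve/<revision>" prefix
    if len(rest) >= 2 and rest[0] in ("blob", "resolve"):
        rest = rest[2:]
    if not rest:
        raise ValueError(f"no filename found in {url!r}")
    return repo_id, "/".join(rest)


repo_id, weights_name = split_hub_url(
    "https://huggingface.co/Lightricks/LTX-Video/ltx-video-2b-v0.9.safetensors"
)
print(repo_id, weights_name)
```

Both the short form used in this description and the longer `blob/main/` form resolve to the same repository and filename under this sketch.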

Discussion: https://huggingface.slack.com/archives/C08275HSG8J/p1733324207024939

cc @yiyixuxu

HuggingFaceDocBuilderDev · 2024-12-05T22:12:01Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

src/diffusers/loaders/single_file_model.py

* transformer
* make style & make fix-copies
* transformer
* add transformer tests
* 80% vae
* make style
* make fix-copies
* fix
* undo cogvideox changes
* update
* update
* match vae
* add docs
* t2v pipeline working; scheduler needs to be checked
* docs
* add pipeline test
* update
* update
* make fix-copies
* Apply suggestions from code review
  Co-authored-by: Steven Liu <[email protected]>
* update
* copy t2v to i2v pipeline
* update
* apply review suggestions
* update
* make style
* remove framewise encoding/decoding
* pack/unpack latents
* image2video
* update
* make fix-copies
* update
* update
* rope scale fix
* debug layerwise code
* remove debug
* Apply suggestions from code review
  Co-authored-by: YiYi Xu <[email protected]>
* propagate precision changes to i2v pipeline
* remove downcast
* address review comments
* fix comment
* address review comments
* [Single File] LTX support for loading original weights (#10135)
* from original file mixin for ltx
* undo config mapping fn changes
* update
* add single file to pipelines
* update docs
* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py
* Update src/diffusers/models/autoencoders/autoencoder_kl_ltx.py
* rename classes based on ltx review
* point to original repository for inference
* make style
* resolve conflicts correctly

---------

Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: YiYi Xu <[email protected]>

from original file mixin for ltx

428db9f

a-r-r-o-w requested review from DN6 and yiyixuxu December 5, 2024 22:05

DN6 reviewed Dec 6, 2024

src/diffusers/loaders/single_file_model.py (outdated, resolved)

yiyixuxu added the close-to-merge label Dec 6, 2024

a-r-r-o-w added 2 commits December 10, 2024 08:01

undo config mapping fn changes

f09f51c

update

ca4b38c

a-r-r-o-w mentioned this pull request Dec 10, 2024

[Single file] Support revision argument when loading single file config #10168

Merged

DN6 approved these changes Dec 10, 2024


a-r-r-o-w merged commit 9ba6a06 into ltx-integration Dec 10, 2024
2 checks passed

a-r-r-o-w deleted the ltx-single-file branch December 10, 2024 08:42
