Ruyi-Models

Welcome to Ruyi-Models!

Ruyi is an image-to-video model that generates cinematic-quality videos at 768 resolution and 24 frames per second, up to 5 seconds (120 frames) long. It supports camera (lens) control and motion amplitude control. On an RTX 3090 or RTX 4090, you can generate 120-frame videos at 512 resolution (or about 72 frames at 768 resolution) without any loss of quality.

News

  • Jan 14, 2025: TeaCache and Enhance-A-Video have been added as nodes for ComfyUI, offering faster generation speeds and improved video quality. Simply link these nodes before the sampler node to use them.

  • Jan 6, 2025: FP8 support has been added to the ComfyUI node. GPU memory usage decreases from mode to mode in this order: bf16 default > fp8 lite > fp8 strong > fp8 extreme.

  • Dec 24, 2024: The diffusion model was updated to fix the black lines that appeared when creating 3:4 or 4:5 videos.

Installation Instructions

The installation instructions are simple. Just clone the repo and install the requirements.

git clone https://github.com/IamCreateAI/Ruyi-Models
cd Ruyi-Models
pip install -r requirements.txt

For ComfyUI Users

Method (1): Installation via ComfyUI Manager

Download and install ComfyUI-Manager.

cd ComfyUI/custom_nodes/
git clone https://github.com/ltdrdata/ComfyUI-Manager.git

# install requirements
pip install -r ComfyUI-Manager/requirements.txt

Next, start ComfyUI and open the Manager. Select Custom Nodes Manager, then search for "Ruyi". You should see ComfyUI-Ruyi in the search results. Click "Install" to proceed.

Finally, search for "ComfyUI-VideoHelperSuite" and install it as well.

Method (2): Manual Installation

Download and save this repository to the path ComfyUI/custom_nodes/Ruyi-Models.

# download the repo
cd ComfyUI/custom_nodes/
git clone https://github.com/IamCreateAI/Ruyi-Models.git

# install requirements
pip install -r Ruyi-Models/requirements.txt

Install the dependency ComfyUI-VideoHelperSuite to display video output (skip this step if already installed).

# download ComfyUI-VideoHelperSuite
cd ComfyUI/custom_nodes/
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git

# install requirements
pip install -r ComfyUI-VideoHelperSuite/requirements.txt

For Windows Users

When using the Windows operating system, a common distribution is ComfyUI_windows_portable_nvidia. When launched with run_nvidia_gpu.bat, it utilizes the embedded Python interpreter included with the package. Therefore, the environment needs to be set up within this built-in Python.

For example, if the distribution was extracted to ComfyUI_windows_portable, you can typically use the following commands to download the repository and install the runtime environment:

# download the repo
cd ComfyUI_windows_portable\ComfyUI\custom_nodes
git clone https://github.com/IamCreateAI/Ruyi-Models.git

# install requirements using embedded Python interpreter
..\..\python_embeded\python.exe -m pip install -r Ruyi-Models\requirements.txt
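
To confirm that the packages went into the interpreter that ComfyUI actually uses, a quick check along the following lines may help. This is only a sketch: the helper name `uses_embedded_python` is hypothetical and not part of this repository, and the check relies solely on the `python_embeded` folder name from the command above.

```python
import sys
from pathlib import PureWindowsPath


def uses_embedded_python(executable: str) -> bool:
    """Return True if the interpreter path lies inside a python_embeded folder.

    Hypothetical helper for illustration; it only inspects the path string.
    """
    # PureWindowsPath parses backslash-separated paths on any operating system.
    parts = [part.lower() for part in PureWindowsPath(executable).parts]
    return "python_embeded" in parts


# Report which interpreter is running this script.
print(sys.executable, uses_embedded_python(sys.executable))
```

Saving this to a file and running it with `..\..\python_embeded\python.exe` should print the embedded interpreter's path followed by `True`; running it with a system-wide Python prints `False`.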

Download Model (Optional)

Download the model and save it to a path of your choice. To run our model directly, we recommend saving the models in the Ruyi-Models/models folder. For ComfyUI users, the path should be ComfyUI/models/Ruyi.

| Model Name | Type | Resolution | Max Frames | Frames per Second | Storage Space | Download |
|---|---|---|---|---|---|---|
| Ruyi-Mini-7B | Image to Video | 512 & 768 | 120 | 24 | 17 GB | 🤗 |

For example, after downloading Ruyi-Mini-7B, the file path structure should be:

📦 Ruyi-Models/models/ or ComfyUI/models/Ruyi/
├── 📂 Ruyi-Mini-7B/
│   ├── 📂 transformers/
│   ├── 📂 vae/
│   └── 📂 ...

This repository supports automatic model downloading, but manual downloading provides more control. For instance, you can download the model to another location and then link it to the ComfyUI/models/Ruyi path using symbolic links or similar methods.
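
The symbolic-link approach can be sketched in Python as follows. The helper names `missing_subdirs` and `link_model` are hypothetical, and the check covers only the two subfolders named in the layout above (the real model folder contains more).

```python
import os
from pathlib import Path

# Subfolders named in the layout above; the real model folder contains more.
EXPECTED_SUBDIRS = ("transformers", "vae")


def missing_subdirs(model_dir: Path) -> list[str]:
    """List the expected subfolders that are absent from model_dir."""
    return [name for name in EXPECTED_SUBDIRS if not (model_dir / name).is_dir()]


def link_model(source: Path, models_root: Path) -> Path:
    """Symlink a downloaded model folder into models_root under the same name."""
    models_root.mkdir(parents=True, exist_ok=True)
    target = models_root / source.name
    # Leave any existing folder or link untouched.
    if not (target.exists() or target.is_symlink()):
        os.symlink(source.resolve(), target, target_is_directory=True)
    return target
```

For example, `link_model(Path("/data/Ruyi-Mini-7B"), Path("ComfyUI/models/Ruyi"))` would create `ComfyUI/models/Ruyi/Ruyi-Mini-7B` pointing at the downloaded folder (the `/data` location is made up for illustration). On Windows, creating symbolic links may require Developer Mode or elevated privileges.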

How to Use

We provide two ways to run our model. The first is to run the Python script directly.

python3 predict_i2v.py

Specifically, the script downloads the model to the Ruyi-Models/models folder and uses images from the assets folder as the start and end frames for video inference. You can modify the variables in the script to replace the input images and set parameters such as video length and resolution.

If you have more than 24GB of GPU memory, you can use predict_i2v_80g.py to speed up generation. If you have less GPU memory, we offer parameters that reduce memory usage, making higher-resolution and longer videos possible at the cost of longer inference time. The effects of these parameters are described in the GPU Memory Optimization section below.
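
As a rough illustration of the 24GB rule, the sketch below picks a script name from the GPU's total memory. The helper names are hypothetical, and reading "24GB" as 24 × 1024 MiB is an assumption; the `nvidia-smi` query flags are standard, but the function that calls it is only defined here, not run.

```python
import subprocess

# Assumption: "24GB" in the text is read as 24 GiB = 24 * 1024 MiB.
THRESHOLD_MIB = 24 * 1024


def parse_first_gpu_mib(nvidia_smi_output: str) -> int:
    """Parse the first line of the query output (one MiB value per GPU)."""
    return int(nvidia_smi_output.strip().splitlines()[0])


def query_total_memory_mib() -> int:
    """Ask nvidia-smi for the total memory of the first GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_first_gpu_mib(result.stdout)


def pick_script(total_mib: int) -> str:
    """Return the script suggested above for a GPU with total_mib of memory."""
    return "predict_i2v_80g.py" if total_mib > THRESHOLD_MIB else "predict_i2v.py"
```

With these definitions, `pick_script(query_total_memory_mib())` returns `predict_i2v.py` for a card reporting 24564 MiB and `predict_i2v_80g.py` for one reporting 81920 MiB.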

Alternatively, use the ComfyUI wrapper in our GitHub repo; the ComfyUI nodes are described in detail in comfyui/README.md.

Showcase

Image to Video Effects

i2v_01.mp4
i2v_02.mp4
i2v_03.mp4
i2v_04.mp4

Camera Control

  • input
  • camera_left.mp4: left
  • camera_right.mp4: right
  • camera_static.mp4: static
  • camera_up.mp4: up
  • camera_down.mp4: down

Motion Amplitude Control

  • motion_1.mp4: motion 1
  • motion_2.mp4: motion 2
  • motion_3.mp4: motion 3
  • motion_4.mp4: motion 4

GPU Memory Optimization

We provide the options GPU_memory_mode and GPU_offload_steps to reduce GPU memory usage, catering to different user needs.

Generally speaking, using less GPU memory requires more RAM and results in longer generation times. Below is a reference table of expected GPU memory usage and generation times. Note that the GPU memory reported below is the max_memory_allocated() value. The values read from nvidia-smi may be higher than the reported values because CUDA itself occupies some GPU memory (usually 500 to 800 MiB) and PyTorch's caching mechanism also requests additional GPU memory.
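
To relate a reported value to what nvidia-smi might show, the CUDA overhead quoted above can simply be added on. The sketch below does only that; the function name is hypothetical, and the result is a lower estimate because PyTorch's cache is not included.

```python
# CUDA overhead range quoted in the text above, in MiB.
CUDA_OVERHEAD_MIB = (500, 800)


def nvidia_smi_estimate(max_allocated_mib: int) -> tuple[int, int]:
    """Add the quoted CUDA overhead range to a max_memory_allocated() value.

    PyTorch's caching allocator may reserve more on top, so treat the
    result as a lower estimate of the nvidia-smi reading.
    """
    low, high = CUDA_OVERHEAD_MIB
    return max_allocated_mib + low, max_allocated_mib + high


print(nvidia_smi_estimate(16119))  # (16619, 16919)
```

So a reported 16119 MiB corresponds to at least roughly 16.6 to 16.9 GiB-scale readings in MiB terms (16619 to 16919 MiB) before any cached memory is counted.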

Additionally, we and the community have compiled a detailed table covering various resolutions and option combinations; it can be found in gpu_memory_appendix.md. We encourage community members to help us complete the table.

A100 Results

  • Resolution of 512

| Num frames | normal_mode + 0 steps | normal_mode + 10 steps | normal_mode + 7 steps | normal_mode + 5 steps | normal_mode + 1 steps | low_gpu_mode + 0 steps |
|---|---|---|---|---|---|---|
| 24 frames | 16119MiB / 01:01s | 15535MiB / 01:07s | 15340MiB / 01:13s | 15210MiB / 01:20s | 14950MiB / 01:32s | 4216MiB / 05:14s |
| 48 frames | 18398MiB / 01:53s | 17230MiB / 02:15s | 16840MiB / 02:29s | 16580MiB / 02:32s | 16060MiB / 02:54s | 4590MiB / 09:59s |
| 72 frames | 20678MiB / 03:00s | 18925MiB / 03:31s | 18340MiB / 03:53s | 17951MiB / 03:57s | 17171MiB / 04:25s | 6870MiB / 14:42s |
| 96 frames | 22958MiB / 04:11s | 20620MiB / 04:54s | 19841MiB / 05:10s | 19321MiB / 05:14s | 18281MiB / 05:47s | 9150MiB / 19:17s |
| 120 frames | 25238MiB / 05:42s | 22315MiB / 06:34s | 21341MiB / 06:59s | 20691MiB / 07:07s | 19392MiB / 07:41s | 11430MiB / 24:08s |

  • Resolution of 768

| Num frames | normal_mode + 0 steps | normal_mode + 10 steps | normal_mode + 7 steps | normal_mode + 5 steps | normal_mode + 1 steps | low_gpu_mode + 0 steps |
|---|---|---|---|---|---|---|
| 24 frames | 18971MiB / 02:06s | 17655MiB / 02:40s | 17217MiB / 02:39s | 16925MiB / 02:41s | 16339MiB / 03:13s | 5162MiB / 13:42s |
| 48 frames | 24101MiB / 04:52s | 21469MiB / 05:44s | 20592MiB / 05:51s | 20008MiB / 06:00s | 18837MiB / 06:49s | 10292MiB / 20:58s |
| 72 frames | 29230MiB / 08:24s | 25283MiB / 09:45s | 25283MiB / 09:45s | 23091MiB / 10:10s | 21335MiB / 11:10s | 15421MiB / 39:12s |
| 96 frames | 34360MiB / 12:49s | 29097MiB / 14:41s | 27343MiB / 15:33s | 26174MiB / 15:44s | 23834MiB / 16:33s | 20550MiB / 43:47s |
| 120 frames | 39489MiB / 18:21s | 32911MiB / 20:39s | 30719MiB / 21:34s | 29257MiB / 21:48s | 26332MiB / 23:02s | 25679MiB / 63:01s |
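
To make the memory/time trade-off in the A100 numbers concrete, the sketch below compares each option against `normal_mode + 0 steps` for the 512-resolution, 120-frame case. The values are copied from the results above; the helper names `to_seconds` and `tradeoff` are hypothetical.

```python
def to_seconds(mmss: str) -> int:
    """Convert a table time such as '05:42s' into seconds."""
    minutes, seconds = mmss.rstrip("s").split(":")
    return int(minutes) * 60 + int(seconds)


# A100, resolution 512, 120 frames: (peak MiB, generation time).
A100_512_120 = {
    "normal_mode + 0 steps": (25238, "05:42s"),
    "normal_mode + 10 steps": (22315, "06:34s"),
    "normal_mode + 7 steps": (21341, "06:59s"),
    "normal_mode + 5 steps": (20691, "07:07s"),
    "normal_mode + 1 steps": (19392, "07:41s"),
    "low_gpu_mode + 0 steps": (11430, "24:08s"),
}


def tradeoff(row: dict, baseline: str = "normal_mode + 0 steps") -> dict:
    """Map each option to (fraction of memory saved, slowdown factor)."""
    base_mib, base_time = row[baseline]
    base_seconds = to_seconds(base_time)
    return {
        name: (round(1 - mib / base_mib, 3), round(to_seconds(t) / base_seconds, 2))
        for name, (mib, t) in row.items()
    }


for name, (saved, slowdown) in tradeoff(A100_512_120).items():
    print(f"{name}: {saved:.1%} less memory, {slowdown}x time")
```

For this case, `low_gpu_mode + 0 steps` uses about 55% less GPU memory than the baseline and takes about 4.2 times as long.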

RTX 4090 Results

The values marked with --- in the table indicate that an out-of-memory (OOM) error occurred, preventing generation.

  • Resolution of 512

| Num frames | normal_mode + 0 steps | normal_mode + 10 steps | normal_mode + 7 steps | normal_mode + 5 steps | normal_mode + 1 steps | low_gpu_mode + 0 steps |
|---|---|---|---|---|---|---|
| 24 frames | 16366MiB / 01:18s | 15805MiB / 01:26s | 15607MiB / 01:37s | 15475MiB / 01:36s | 15211MiB / 01:39s | 4211MiB / 03:57s |
| 48 frames | 18720MiB / 02:21s | 17532MiB / 02:49s | 17136MiB / 02:55s | 16872MiB / 02:58s | 16344MiB / 03:01s | 4666MiB / 05:01s |
| 72 frames | 21036MiB / 03:41s | 19254MiB / 04:25s | 18660MiB / 04:34s | 18264MiB / 04:36s | 17472MiB / 04:51s | 6981MiB / 06:36s |
| 96 frames | --- | 20972MiB / 06:18s | 20180MiB / 06:24s | 19652MiB / 06:36s | 18596MiB / 06:56s | 9298MiB / 10:03s |
| 120 frames | --- | --- | 21704MiB / 08:50s | 21044MiB / 08:53s | 19724MiB / 09:08s | 11613MiB / 13:57s |

  • Resolution of 768

| Num frames | normal_mode + 0 steps | normal_mode + 10 steps | normal_mode + 7 steps | normal_mode + 5 steps | normal_mode + 1 steps | low_gpu_mode + 0 steps |
|---|---|---|---|---|---|---|
| 24 frames | 19223MiB / 02:38s | 17900MiB / 03:06s | 17448MiB / 03:18s | 17153MiB / 03:23s | 16624MiB / 03:34s | 5251MiB / 05:54s |
| 48 frames | --- | --- | 20946MiB / 07:28s | 20352MiB / 07:35s | 19164MiB / 08:04s | 10457MiB / 10:55s |
| 72 frames | --- | --- | --- | --- | --- | 15671MiB / 18:52s |
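
One way to use these results is to pick the fastest option whose reported memory fits a given budget. The sketch below does this for the RTX 4090, 512-resolution, 96-frame case, with values copied from the results above and `None` standing for an OOM entry. The function name `fastest_within` is hypothetical.

```python
from typing import Optional

# RTX 4090, resolution 512, 96 frames: (peak MiB, seconds); None marks OOM.
RTX4090_512_96 = {
    "normal_mode + 0 steps": None,
    "normal_mode + 10 steps": (20972, 6 * 60 + 18),
    "normal_mode + 7 steps": (20180, 6 * 60 + 24),
    "normal_mode + 5 steps": (19652, 6 * 60 + 36),
    "normal_mode + 1 steps": (18596, 6 * 60 + 56),
    "low_gpu_mode + 0 steps": (9298, 10 * 60 + 3),
}


def fastest_within(row: dict, budget_mib: int) -> Optional[str]:
    """Return the fastest option whose peak memory fits the budget, if any."""
    candidates = [
        (entry[1], name)
        for name, entry in row.items()
        if entry is not None and entry[0] <= budget_mib
    ]
    # min() compares by seconds first; None means nothing fits.
    return min(candidates)[1] if candidates else None


print(fastest_within(RTX4090_512_96, 20500))  # normal_mode + 7 steps
print(fastest_within(RTX4090_512_96, 10000))  # low_gpu_mode + 0 steps
```

Because the table reports max_memory_allocated() values, the budget should leave headroom below the card's physical memory for the CUDA overhead and PyTorch's cache.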

License

We’re releasing the model under a permissive Apache 2.0 license.

BibTeX

@misc{createai2024ruyi,
      title={Ruyi-Mini-7B},
      author={CreateAI Team},
      year={2024},
      publisher = {GitHub},
      journal = {GitHub repository},
      howpublished={\url{https://github.com/IamCreateAI/Ruyi-Models}}
}

Welcome Feedback and Collaborative Optimization

We welcome your feedback and suggestions, and we hope to work together to improve our services and products. Your input helps us better understand user needs and continuously improve the user experience. Thank you for your support and attention to our work!

You are welcome to join our Discord or WeChat group (scan the QR code to add Ruyi Assistant and join the official group) for further discussion!