Welcome to Ruyi-Models!
Ruyi is an image-to-video model capable of generating cinematic-quality videos at a resolution of 768, with a frame rate of 24 frames per second, totaling 5 seconds and 120 frames. It supports lens control and motion amplitude control. With an RTX 3090 or RTX 4090, you can generate 120-frame videos at 512 resolution (or roughly 72-frame videos at 768 resolution) without any loss of quality.
- Jan 14, 2025: TeaCache and Enhance-A-Video have been added as nodes for ComfyUI, offering faster generation speeds and improved video quality. Simply link these nodes before the sampler node to use them.
- Jan 6, 2025: FP8 support has been added to the ComfyUI node. GPU memory usage decreases across the modes in this order: bf16 default > fp8 lite > fp8 strong > fp8 extreme.
- Dec 24, 2024: The diffusion model has been updated to fix the black lines that appeared when creating 3:4 or 4:5 videos.
- Installation Instructions
- Download Model (Optional)
- How to Use
- Showcase
- GPU Memory Optimization
- License
The installation instructions are simple. Just clone the repo and install the requirements.
git clone https://github.com/IamCreateAI/Ruyi-Models
cd Ruyi-Models
pip install -r requirements.txt
Download and install ComfyUI-Manager.
cd ComfyUI/custom_nodes/
git clone https://github.com/ltdrdata/ComfyUI-Manager.git
# install requirements
pip install -r ComfyUI-Manager/requirements.txt
Next, start ComfyUI and open the Manager. Select Custom Nodes Manager, then search for "Ruyi". You should see ComfyUI-Ruyi as shown in the screenshot below. Click "Install" to proceed.
Finally, search for "ComfyUI-VideoHelperSuite" and install it as well.
Download and save this repository to the path ComfyUI/custom_nodes/Ruyi-Models.
# download the repo
cd ComfyUI/custom_nodes/
git clone https://github.com/IamCreateAI/Ruyi-Models.git
# install requirements
pip install -r Ruyi-Models/requirements.txt
Install the dependency ComfyUI-VideoHelperSuite to display video output (skip this step if already installed).
# download ComfyUI-VideoHelperSuite
cd ComfyUI/custom_nodes/
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git
# install requirements
pip install -r ComfyUI-VideoHelperSuite/requirements.txt
On Windows, a common distribution is ComfyUI_windows_portable_nvidia. When launched with run_nvidia_gpu.bat, it uses the embedded Python interpreter included with the package. Therefore, the environment needs to be set up within this built-in Python.
For example, if the extracted directory of the distribution is ComfyUI_windows_portable, you can typically use the following command to download the repository and install the runtime environment:
# download the repo
cd ComfyUI_windows_portable\ComfyUI\custom_nodes
git clone https://github.com/IamCreateAI/Ruyi-Models.git
# install requirements using embedded Python interpreter
..\..\python_embeded\python.exe -m pip install -r Ruyi-Models\requirements.txt
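The point of the command above is that packages must be installed into the same interpreter that runs ComfyUI. As a rough illustration of that idea in Python, the sketch below builds a pip command bound to a specific interpreter. The helper name `pip_install_command` is hypothetical and not part of this repository; only `sys.executable` and `python -m pip` are standard.

```python
import subprocess
import sys
from pathlib import Path


def pip_install_command(requirements: Path, python: str = sys.executable) -> list[str]:
    """Build a pip command tied to one interpreter (hypothetical helper)."""
    # "python -m pip" guarantees the packages land in that interpreter's environment.
    return [python, "-m", "pip", "install", "-r", str(requirements)]


def install_requirements(requirements: Path, python: str = sys.executable) -> None:
    """Run the install and raise CalledProcessError if pip fails."""
    subprocess.run(pip_install_command(requirements, python), check=True)


# Show the command that would be run for the current interpreter.
print(pip_install_command(Path("Ruyi-Models") / "requirements.txt"))
```

Passing the path of the embedded `python.exe` as `python` would target the portable distribution instead of the system interpreter.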
Download the model and save it to a local path. To run the model directly, it is recommended to save the models into the Ruyi-Models/models folder. For ComfyUI users, the path should be ComfyUI/models/Ruyi.
Model Name | Type | Resolution | Max Frames | Frames per Second | Storage Space | Download |
---|---|---|---|---|---|---|
Ruyi-Mini-7B | Image to Video | 512 & 768 | 120 | 24 | 17 GB | 🤗 |
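For a manual download, one possible approach is the `huggingface_hub` package. The sketch below is only an assumption about how this could be done: the repository id `IamCreateAI/Ruyi-Mini-7B` is not stated in this README and should be checked against the download link in the table, and the helper names are hypothetical.

```python
from pathlib import Path

# Assumption: this repo id is not given in the README; verify it via the download link.
ASSUMED_REPO_ID = "IamCreateAI/Ruyi-Mini-7B"


def model_dir(base: str, model_name: str = "Ruyi-Mini-7B") -> Path:
    """Return the folder the model should live in under a base models path."""
    return Path(base) / model_name


def download_model(base: str = "Ruyi-Models/models",
                   repo_id: str = ASSUMED_REPO_ID) -> Path:
    """Download the full model snapshot into base/Ruyi-Mini-7B."""
    # Imported lazily so the path helper works without the package installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    target = model_dir(base)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

ComfyUI users would call `download_model("ComfyUI/models/Ruyi")` instead of relying on the default base path.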
For example, after downloading Ruyi-Mini-7B, the file path structure should be:
📦 Ruyi-Models/models/ or ComfyUI/models/Ruyi/
├── 📂 Ruyi-Mini-7B/
│ ├── 📂 transformers/
│ ├── 📂 vae/
│ └── 📂 ...
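A quick way to confirm the layout after a manual download is to check for the subfolders named in the tree. This is a minimal sketch: only `transformers` and `vae` are checked because the remaining folders are elided above, and `missing_subdirs` is a hypothetical helper name.

```python
from pathlib import Path

# Only the folders explicitly listed in the tree; the elided ones are not guessed.
EXPECTED_SUBDIRS = ("transformers", "vae")


def missing_subdirs(model_root: Path, expected=EXPECTED_SUBDIRS) -> list[str]:
    """Return the expected subfolder names that are absent under model_root."""
    return [name for name in expected if not (model_root / name).is_dir()]


# Report anything missing from the default location.
print(missing_subdirs(Path("Ruyi-Models/models/Ruyi-Mini-7B")))
```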
This repository supports automatic model downloading, but manual downloading provides more control. For instance, you can download the model to another location and then link it to the ComfyUI/models/Ruyi path using symbolic links or similar methods.
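The symbolic-link approach could look like the sketch below. It is illustrative only: `link_model` is a hypothetical helper, and on Windows creating symlinks may require administrator rights or Developer Mode.

```python
import os
from pathlib import Path


def link_model(downloaded: Path, comfy_models: Path) -> Path:
    """Expose a model stored elsewhere under comfy_models via a symlink."""
    comfy_models.mkdir(parents=True, exist_ok=True)
    link = comfy_models / downloaded.name
    if link.is_symlink() or link.exists():
        # Leave an existing folder or link untouched.
        return link
    os.symlink(downloaded.resolve(), link, target_is_directory=True)
    return link
```

For example, `link_model(Path("/data/models/Ruyi-Mini-7B"), Path("ComfyUI/models/Ruyi"))` would make the model visible at ComfyUI/models/Ruyi/Ruyi-Mini-7B without copying it; the `/data/models` location is a placeholder.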
We provide two ways to run our model. The first is to run the Python script directly.
python3 predict_i2v.py
Specifically, the script downloads the model to the Ruyi-Models/models folder and uses images from the assets folder as the start and end frames for video inference. You can modify the variables in the script to replace the input images and set parameters such as video length and resolution.
If you have more than 24 GB of GPU memory, you can use predict_i2v_80g.py to speed up generation. If you have less, we offer parameters that reduce memory usage, enabling higher-resolution and longer videos at the cost of longer inference time. The effects of these parameters are described in the GPU memory optimization section below.
Alternatively, use the ComfyUI wrapper in our GitHub repo. The ComfyUI nodes are described in detail in comfyui/README.md.
i2v_01.mp4 | i2v_02.mp4 | i2v_03.mp4 | i2v_04.mp4 |
input | camera_left.mp4 | camera_right.mp4 | camera_static.mp4 | camera_up.mp4 | camera_down.mp4 |
motion_1.mp4 | motion_2.mp4 | motion_3.mp4 | motion_4.mp4 |
We provide the options GPU_memory_mode and GPU_offload_steps to reduce GPU memory usage, catering to different user needs.
Generally speaking, using less GPU memory requires more RAM and results in longer generation times. Below is a reference table of expected GPU memory usage and generation times. Note that the GPU memory reported below is the max_memory_allocated() value. The values read from nvidia-smi may be higher than the reported values because CUDA occupies some GPU memory (usually between 500 and 800 MiB), and PyTorch's caching mechanism also requests additional GPU memory.
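To reproduce this kind of measurement on your own hardware, a minimal sketch using PyTorch's peak-memory counters is shown below. The wrapper `peak_gpu_mib` is a hypothetical helper and not part of this repository; it assumes a CUDA-capable PyTorch install and reports the allocator peak in MiB, the same quantity as in the tables.

```python
def bytes_to_mib(num_bytes: int) -> float:
    """Convert bytes to MiB (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 ** 2)


def peak_gpu_mib(fn, *args, **kwargs):
    """Run fn and return (result, peak allocated GPU memory in MiB)."""
    import torch  # imported lazily; requires a CUDA build of PyTorch

    torch.cuda.reset_peak_memory_stats()  # start a fresh peak measurement
    result = fn(*args, **kwargs)
    torch.cuda.synchronize()              # wait for queued GPU work to finish
    return result, bytes_to_mib(torch.cuda.max_memory_allocated())
```

Because this counts only memory allocated through PyTorch's allocator, the figure will typically be lower than what nvidia-smi shows, for the reasons given above.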
Additionally, together with the community we have compiled a detailed table covering various resolutions and option combinations, which can be found in gpu_memory_appendix.md. We encourage community members to help us complete the table.
- Resolution of 512
Num frames | normal_mode + 0 steps | normal_mode + 10 steps | normal_mode + 7 steps | normal_mode + 5 steps | normal_mode + 1 steps | low_gpu_mode + 0 steps |
---|---|---|---|---|---|---|
24 frames | 16119MiB 01:01s | 15535MiB 01:07s | 15340MiB 01:13s | 15210MiB 01:20s | 14950MiB 01:32s | 4216MiB 05:14s |
48 frames | 18398MiB 01:53s | 17230MiB 02:15s | 16840MiB 02:29s | 16580MiB 02:32s | 16060MiB 02:54s | 4590MiB 09:59s |
72 frames | 20678MiB 03:00s | 18925MiB 03:31s | 18340MiB 03:53s | 17951MiB 03:57s | 17171MiB 04:25s | 6870MiB 14:42s |
96 frames | 22958MiB 04:11s | 20620MiB 04:54s | 19841MiB 05:10s | 19321MiB 05:14s | 18281MiB 05:47s | 9150MiB 19:17s |
120 frames | 25238MiB 05:42s | 22315MiB 06:34s | 21341MiB 06:59s | 20691MiB 07:07s | 19392MiB 07:41s | 11430MiB 24:08s |
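To make the memory-versus-time trade-off concrete, the sketch below parses table cells of the form `25238MiB 05:42s` and compares an option against the `normal_mode + 0 steps` baseline. The helper names are hypothetical, and the two cells used are copied from the 120-frame row at resolution 512 above.

```python
import re
from typing import Optional, Tuple

CELL = re.compile(r"(\d+)MiB\s+(\d+):(\d+)s")


def parse_cell(cell: str) -> Optional[Tuple[int, int]]:
    """Parse 'NNNNNMiB MM:SSs' into (MiB, seconds); return None for OOM markers."""
    match = CELL.fullmatch(cell.strip())
    if match is None:
        return None
    mib, minutes, seconds = (int(g) for g in match.groups())
    return mib, minutes * 60 + seconds


def tradeoff(baseline: str, option: str) -> Tuple[int, int]:
    """Return (MiB saved, extra seconds) of option relative to baseline."""
    base, opt = parse_cell(baseline), parse_cell(option)
    if base is None or opt is None:
        raise ValueError("both cells must contain measured values")
    return base[0] - opt[0], opt[1] - base[1]


# 120 frames at 512: normal_mode + 0 steps versus low_gpu_mode + 0 steps.
saved_mib, extra_seconds = tradeoff("25238MiB 05:42s", "11430MiB 24:08s")
print(f"saves {saved_mib} MiB, costs {extra_seconds} extra seconds")
```

For this pair the saving is 13808 MiB at a cost of 1106 additional seconds, which shows how steeply generation time grows in low_gpu_mode.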
- Resolution of 768
Num frames | normal_mode + 0 steps | normal_mode + 10 steps | normal_mode + 7 steps | normal_mode + 5 steps | normal_mode + 1 steps | low_gpu_mode + 0 steps |
---|---|---|---|---|---|---|
24 frames | 18971MiB 02:06s | 17655MiB 02:40s | 17217MiB 02:39s | 16925MiB 02:41s | 16339MiB 03:13s | 5162MiB 13:42s |
48 frames | 24101MiB 04:52s | 21469MiB 05:44s | 20592MiB 05:51s | 20008MiB 06:00s | 18837MiB 06:49s | 10292MiB 20:58s |
72 frames | 29230MiB 08:24s | 25283MiB 09:45s | 25283MiB 09:45s | 23091MiB 10:10s | 21335MiB 11:10s | 15421MiB 39:12s |
96 frames | 34360MiB 12:49s | 29097MiB 14:41s | 27343MiB 15:33s | 26174MiB 15:44s | 23834MiB 16:33s | 20550MiB 43:47s |
120 frames | 39489MiB 18:21s | 32911MiB 20:39s | 30719MiB 21:34s | 29257MiB 21:48s | 26332MiB 23:02s | 25679MiB 63:01s |
The values marked with --- in the table indicate that an out-of-memory (OOM) error occurred, preventing generation.
- Resolution of 512
Num frames | normal_mode + 0 steps | normal_mode + 10 steps | normal_mode + 7 steps | normal_mode + 5 steps | normal_mode + 1 steps | low_gpu_mode + 0 steps |
---|---|---|---|---|---|---|
24 frames | 16366MiB 01:18s | 15805MiB 01:26s | 15607MiB 01:37s | 15475MiB 01:36s | 15211MiB 01:39s | 4211MiB 03:57s |
48 frames | 18720MiB 02:21s | 17532MiB 02:49s | 17136MiB 02:55s | 16872MiB 02:58s | 16344MiB 03:01s | 4666MiB 05:01s |
72 frames | 21036MiB 03:41s | 19254MiB 04:25s | 18660MiB 04:34s | 18264MiB 04:36s | 17472MiB 04:51s | 6981MiB 06:36s |
96 frames | -----MiB --:--s | 20972MiB 06:18s | 20180MiB 06:24s | 19652MiB 06:36s | 18596MiB 06:56s | 9298MiB 10:03s |
120 frames | -----MiB --:--s | -----MiB --:--s | 21704MiB 08:50s | 21044MiB 08:53s | 19724MiB 09:08s | 11613MiB 13:57s |
- Resolution of 768
Num frames | normal_mode + 0 steps | normal_mode + 10 steps | normal_mode + 7 steps | normal_mode + 5 steps | normal_mode + 1 steps | low_gpu_mode + 0 steps |
---|---|---|---|---|---|---|
24 frames | 19223MiB 02:38s | 17900MiB 03:06s | 17448MiB 03:18s | 17153MiB 03:23s | 16624MiB 03:34s | 5251MiB 05:54s |
48 frames | -----MiB --:--s | -----MiB --:--s | 20946MiB 07:28s | 20352MiB 07:35s | 19164MiB 08:04s | 10457MiB 10:55s |
72 frames | -----MiB --:--s | -----MiB --:--s | -----MiB --:--s | -----MiB --:--s | -----MiB --:--s | 15671MiB 18:52s |
We’re releasing the model under a permissive Apache 2.0 license.
@misc{createai2024ruyi,
title={Ruyi-Mini-7B},
author={CreateAI Team},
year={2024},
publisher = {GitHub},
journal = {GitHub repository},
howpublished={\url{https://github.com/IamCreateAI/Ruyi-Models}}
}
We sincerely welcome everyone to actively provide valuable feedback and suggestions, and we hope to work together to optimize our services and products. Your words will help us better understand user needs, allowing us to continuously enhance the user experience. Thank you for your support and attention to our work!
You are welcome to join our Discord or WeChat group (scan the QR code to add Ruyi Assistant and join the official group) for further discussion!