The MiniSora open-source community is a volunteer-run project organized by community members (entirely free of charge, with no paid tiers or hidden costs). MiniSora aims to explore possible paths toward reproducing Sora.
- We will hold regular Sora roundtables to explore these possibilities together with the community
- Discussion of existing technical approaches to video generation
- Sora 技术报告: Video generation models as world simulators
- DiT: Scalable Diffusion Models with Transformers
- Latte: Latent Diffusion Transformer for Video Generation
- Coming soon...
Paper | Links |
---|---|
1) Diffusion Models Beat GANs on Image Synthesis | Paper, Github |
2) High-Resolution Image Synthesis with Latent Diffusion Models | Paper, Github |
3) Elucidating the Design Space of Diffusion-Based Generative Models | Paper, Github |
4) Denoising Diffusion Probabilistic Models | Paper, Github |
5) Score-Based Generative Modeling through Stochastic Differential Equations | Paper, Github |
Paper | Links |
---|---|
1) UViT: All are Worth Words: A ViT Backbone for Diffusion Models | Paper, Github, ModelScope |
2) DiT: Scalable Diffusion Models with Transformers | Paper, Github, ModelScope |
3) SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers | Paper, Github, ModelScope |
4) FiT: Flexible Vision Transformer for Diffusion Model | Paper, Github |
Paper | Links |
---|---|
1) Animatediff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning | Paper, Github, ModelScope |
2) I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models | Paper, Github, ModelScope |
3) Imagen Video: High Definition Video Generation with Diffusion Models | Paper |
4) MoCoGAN: Decomposing Motion and Content for Video Generation | Paper |
5) Adversarial Video Generation on Complex Datasets | Paper |
6) Photorealistic Video Generation with Diffusion Models | Paper |
7) VideoGPT: Video Generation using VQ-VAE and Transformers | Paper, Github |
8) Video Diffusion Models | Paper, Github, Project |
9) MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation | Paper, Github, Project, Blog |
Paper | Links |
---|---|
1) World Model on Million-Length Video And Language With RingAttention | Paper, Github |
2) Ring Attention with Blockwise Transformers for Near-Infinite Context | Paper, Github |
3) Extending LLMs' Context Window with 100 Samples | Paper, Github |
4) Efficient Streaming Language Models with Attention Sinks | Paper, Github |