mirror of
https://git.datalinker.icu/ali-vilab/TeaCache
synced 2026-05-02 16:24:28 +08:00
224 lines
9.0 KiB
Plaintext
224 lines
9.0 KiB
Plaintext
Metadata-Version: 2.1
|
|
Name: videosys
|
|
Version: 2.0.0
|
|
Summary: VideoSys
|
|
License: Apache Software License 2.0
|
|
Platform: UNKNOWN
|
|
Classifier: Programming Language :: Python :: 3
|
|
Classifier: License :: OSI Approved :: Apache Software License
|
|
Classifier: Environment :: GPU :: NVIDIA CUDA
|
|
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
|
|
Classifier: Topic :: System :: Distributed Computing
|
|
Requires-Python: >=3.6
|
|
Description-Content-Type: text/markdown
|
|
License-File: LICENSE
|
|
Requires-Dist: click
|
|
Requires-Dist: colossalai
|
|
Requires-Dist: contexttimer
|
|
Requires-Dist: diffusers==0.30.0
|
|
Requires-Dist: einops
|
|
Requires-Dist: fabric
|
|
Requires-Dist: ftfy
|
|
Requires-Dist: imageio
|
|
Requires-Dist: imageio-ffmpeg
|
|
Requires-Dist: matplotlib
|
|
Requires-Dist: ninja
|
|
Requires-Dist: numpy<2.0.0
|
|
Requires-Dist: omegaconf
|
|
Requires-Dist: packaging
|
|
Requires-Dist: psutil
|
|
Requires-Dist: pydantic
|
|
Requires-Dist: ray
|
|
Requires-Dist: rich
|
|
Requires-Dist: safetensors
|
|
Requires-Dist: timm
|
|
Requires-Dist: torch>=1.13
|
|
Requires-Dist: tqdm
|
|
Requires-Dist: transformers
|
|
|
|
<p align="center">
|
|
<img width="55%" alt="VideoSys" src="./assets/figures/logo.png?raw=true">
|
|
</p>
|
|
<h3 align="center">
|
|
An easy and efficient system for video generation
|
|
</h3>
|
|
</p>
|
|
|
|
### Latest News 🔥
|
|
- [2024/08] 🔥 Evole from [OpenDiT](https://github.com/NUS-HPC-AI-Lab/VideoSys/tree/v1.0.0) to <b>VideoSys: An easy and efficient system for video generation.</b>
|
|
- [2024/08] 🔥 <b>Release PAB paper: [Real-Time Video Generation with Pyramid Attention Broadcast](https://arxiv.org/abs/2408.12588).</b>
|
|
- [2024/06] Propose Pyramid Attention Broadcast (PAB)[[paper](https://arxiv.org/abs/2408.12588)][[blog](https://oahzxl.github.io/PAB/)][[doc](./docs/pab.md)], the first approach to achieve <b>real-time</b> DiT-based video generation, delivering <b>negligible quality loss</b> without <b>requiring any training</b>.
|
|
- [2024/06] Support [Open-Sora-Plan](https://github.com/PKU-YuanGroup/Open-Sora-Plan) and [Latte](https://github.com/Vchitect/Latte).
|
|
- [2024/03] Propose Dynamic Sequence Parallel (DSP)[[paper](https://arxiv.org/abs/2403.10266)][[doc](./docs/dsp.md)], achieves **3x** speed for training and **2x** speed for inference in Open-Sora compared with sota sequence parallelism.
|
|
- [2024/03] Support [Open-Sora: Democratizing Efficient Video Production for All](https://github.com/hpcaitech/Open-Sora).
|
|
- [2024/02] 🎉 Release [OpenDiT](https://github.com/NUS-HPC-AI-Lab/VideoSys/tree/v1.0.0): An Easy, Fast and Memory-Efficent System for DiT Training and Inference.
|
|
|
|
# About
|
|
|
|
VideoSys is an open-source project that provides a user-friendly and high-performance infrastructure for video generation. This comprehensive toolkit will support the entire pipeline from training and inference to serving and compression.
|
|
|
|
We are committed to continually integrating cutting-edge open-source video models and techniques. Stay tuned for exciting enhancements and new features on the horizon!
|
|
|
|
## Installation
|
|
|
|
Prerequisites:
|
|
|
|
- Python >= 3.10
|
|
- PyTorch >= 1.13 (We recommend to use a >2.0 version)
|
|
- CUDA >= 11.6
|
|
|
|
We strongly recommend using Anaconda to create a new environment (Python >= 3.10) to run our examples:
|
|
|
|
```shell
|
|
conda create -n videosys python=3.10 -y
|
|
conda activate videosys
|
|
```
|
|
|
|
Install VideoSys:
|
|
|
|
```shell
|
|
git clone https://github.com/NUS-HPC-AI-Lab/VideoSys
|
|
cd VideoSys
|
|
pip install -e .
|
|
```
|
|
|
|
|
|
## Usage
|
|
|
|
VideoSys supports many diffusion models with our various acceleration techniques, enabling these models to run faster and consume less memory.
|
|
|
|
<b>You can find all available models and their supported acceleration techniques in the following table. Click `Doc` to see how to use them.</b>
|
|
|
|
<table>
|
|
<tr>
|
|
<th rowspan="2">Model</th>
|
|
<th rowspan="2">Train</th>
|
|
<th rowspan="2">Infer</th>
|
|
<th colspan="2">Acceleration Techniques</th>
|
|
<th rowspan="2">Usage</th>
|
|
</tr>
|
|
<tr>
|
|
<th><a href="https://github.com/NUS-HPC-AI-Lab/VideoSys?tab=readme-ov-file#dyanmic-sequence-parallelism-dsp-paperdoc">DSP</a></th>
|
|
<th><a href="https://github.com/NUS-HPC-AI-Lab/VideoSys?tab=readme-ov-file#pyramid-attention-broadcast-pab-blogdoc">PAB</a></th>
|
|
</tr>
|
|
<tr>
|
|
<td>Open-Sora [<a href="https://github.com/hpcaitech/Open-Sora">source</a>]</td>
|
|
<td align="center">🟡</td>
|
|
<td align="center">✅</td>
|
|
<td align="center">✅</td>
|
|
<td align="center">✅</td>
|
|
<td align="center"><a href="./examples/open_sora/sample.py">Code</a></td>
|
|
</tr>
|
|
<tr>
|
|
<td>Open-Sora-Plan [<a href="https://github.com/PKU-YuanGroup/Open-Sora-Plan">source</a>]</td>
|
|
<td align="center">/</td>
|
|
<td align="center">✅</td>
|
|
<td align="center">✅</td>
|
|
<td align="center">✅</td>
|
|
<td align="center"><a href="./examples/open_sora_plan/sample.py">Code</a></td>
|
|
</tr>
|
|
<tr>
|
|
<td>Latte [<a href="https://github.com/Vchitect/Latte">source</a>]</td>
|
|
<td align="center">/</td>
|
|
<td align="center">✅</td>
|
|
<td align="center">✅</td>
|
|
<td align="center">✅</td>
|
|
<td align="center"><a href="./examples/latte/sample.py">Code</a></td>
|
|
</tr>
|
|
<tr>
|
|
<td>CogVideoX [<a href="https://github.com/THUDM/CogVideo">source</a>]</td>
|
|
<td align="center">/</td>
|
|
<td align="center">✅</td>
|
|
<td align="center">/</td>
|
|
<td align="center">✅</td>
|
|
<td align="center"><a href="./examples/cogvideox/sample.py">Code</a></td>
|
|
</tr>
|
|
</table>
|
|
|
|
## Acceleration Techniques
|
|
|
|
### Pyramid Attention Broadcast (PAB) [[paper](https://arxiv.org/abs/2408.12588)][[blog](https://arxiv.org/abs/2403.10266)][[doc](./docs/pab.md)]
|
|
|
|
Real-Time Video Generation with Pyramid Attention Broadcast
|
|
|
|
Authors: [Xuanlei Zhao](https://oahzxl.github.io/)<sup>1*</sup>, [Xiaolong Jin]()<sup>2*</sup>, [Kai Wang](https://kaiwang960112.github.io/)<sup>1*</sup>, and [Yang You](https://www.comp.nus.edu.sg/~youy/)<sup>1</sup> (* indicates equal contribution)
|
|
|
|
<sup>1</sup>National University of Singapore, <sup>2</sup>Purdue University
|
|
|
|

|
|
|
|
PAB is the first approach to achieve <b>real-time</b> DiT-based video generation, delivering <b>lossless quality</b> without <b>requiring any training</b>. By mitigating redundant attention computation, PAB achieves up to 21.6 FPS with 10.6x acceleration, without sacrificing quality across popular DiT-based video generation models including [Open-Sora](https://github.com/hpcaitech/Open-Sora), [Latte](https://github.com/Vchitect/Latte) and [Open-Sora-Plan](https://github.com/PKU-YuanGroup/Open-Sora-Plan).
|
|
|
|
See its details [here](./docs/pab.md).
|
|
|
|
----
|
|
|
|
### Dyanmic Sequence Parallelism (DSP) [[paper](https://arxiv.org/abs/2403.10266)][[doc](./docs/dsp.md)]
|
|
|
|

|
|
|
|
DSP is a novel, elegant and super efficient sequence parallelism for [Open-Sora](https://github.com/hpcaitech/Open-Sora), [Latte](https://github.com/Vchitect/Latte) and other multi-dimensional transformer architecture.
|
|
|
|
It achieves **3x** speed for training and **2x** speed for inference in Open-Sora compared with sota sequence parallelism ([DeepSpeed Ulysses](https://arxiv.org/abs/2309.14509)). For a 10s (80 frames) of 512x512 video, the inference latency of Open-Sora is:
|
|
|
|
| Method | 1xH800 | 8xH800 (DS Ulysses) | 8xH800 (DSP) |
|
|
| ------ | ------ | ------ | ------ |
|
|
| Latency(s) | 106 | 45 | 22 |
|
|
|
|
See its details [here](./docs/dsp.md).
|
|
|
|
|
|
## Contributing
|
|
|
|
We welcome and value any contributions and collaborations. Please check out [CONTRIBUTING.md](./CONTRIBUTING.md) for how to get involved.
|
|
|
|
## Contributors
|
|
|
|
<a href="https://github.com/NUS-HPC-AI-Lab/VideoSys/graphs/contributors">
|
|
<img src="https://contrib.rocks/image?repo=NUS-HPC-AI-Lab/VideoSys"/>
|
|
</a>
|
|
|
|
## Star History
|
|
|
|
[](https://star-history.com/#NUS-HPC-AI-Lab/VideoSys&Date)
|
|
|
|
## Citation
|
|
|
|
```
|
|
@misc{videosys2024,
|
|
author={VideoSys Team},
|
|
title={VideoSys: An Easy and Efficient System for Video Generation},
|
|
year={2024},
|
|
publisher={GitHub},
|
|
url = {https://github.com/NUS-HPC-AI-Lab/VideoSys},
|
|
}
|
|
|
|
@misc{zhao2024pab,
|
|
title={Real-Time Video Generation with Pyramid Attention Broadcast},
|
|
author={Xuanlei Zhao and Xiaolong Jin and Kai Wang and Yang You},
|
|
year={2024},
|
|
eprint={2408.12588},
|
|
archivePrefix={arXiv},
|
|
primaryClass={cs.CV},
|
|
url={https://arxiv.org/abs/2408.12588},
|
|
}
|
|
|
|
@misc{zhao2024dsp,
|
|
title={DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers},
|
|
author={Xuanlei Zhao and Shenggan Cheng and Chang Chen and Zangwei Zheng and Ziming Liu and Zheming Yang and Yang You},
|
|
year={2024},
|
|
eprint={2403.10266},
|
|
archivePrefix={arXiv},
|
|
primaryClass={cs.DC},
|
|
url={https://arxiv.org/abs/2403.10266},
|
|
}
|
|
|
|
@misc{zhao2024opendit,
|
|
author={Xuanlei Zhao, Zhongkai Zhao, Ziming Liu, Haotian Zhou, Qianli Ma, and Yang You},
|
|
title={OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference},
|
|
year={2024},
|
|
publisher={GitHub},
|
|
url={https://github.com/NUS-HPC-AI-Lab/VideoSys/tree/v1.0.0},
|
|
}
|
|
```
|