# TeaCache4Lumina2

TeaCache can speed up Lumina-Image-2.0 in a training-free manner without much visual quality degradation. Two sets of rescaling coefficients are provided. With v1, thresholds of 0 (original), 0.2, 0.3, 0.4, and 0.5 yield speedups of 1x, 1.25x, 1.5625x, 2.0833x, and 2.5x, respectively. With v2, thresholds of 0.2, 0.3, 0.5, and 1.1 bring latency from ~25 s down to ~16.7 s (1.5x speedup), ~15.6 s (1.6x), ~13.79 s (1.8x), and ~11.9 s (2.1x), respectively.

The v1 coefficients `[393.76566581, 603.50993606, 209.10239044, 23.00726601, 0.86377344]` exhibit poor quality at low L1 threshold values but perform better at higher L1 settings, though at a slower speed. The v2 coefficients `[225.7042019806413, 608.8453716535591, 304.1869942338369, 124.21267720116742, 1.4089066892956552]`, however, offer faster computation and better quality at low L1 thresholds but incur significant feature loss at high L1 values.

You can switch between the v1 and v2 coefficients by changing the value at line 72 of `teacache_lumina2.py`, as sketched below.
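As a rough illustration, the sketch below shows how the two coefficient sets feed the polynomial rescaling that TeaCache applies to the relative L1 change of the timestep-modulated input. The variable names are illustrative, not the exact identifiers at line 72 of the script.

```python
import numpy as np

# Illustrative sketch only: names and layout are assumptions, not a copy of
# line 72 of teacache_lumina2.py.
coefficients_v1 = [393.76566581, 603.50993606, 209.10239044,
                   23.00726601, 0.86377344]
coefficients_v2 = [225.7042019806413, 608.8453716535591, 304.1869942338369,
                   124.21267720116742, 1.4089066892956552]

# This is the value to change when switching versions.
coefficients = coefficients_v2

# TeaCache rescales the relative L1 change of the timestep-modulated input
# with a degree-4 polynomial before comparing it against the threshold.
rescale_func = np.poly1d(coefficients)
print(rescale_func(0.05))  # rescaled distance for an example relative L1 change of 0.05
```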

## 📈 Inference Latency Comparisons on a Single 4090 (50 Steps)

**v1**

| Lumina-Image-2.0 | TeaCache (0.2) | TeaCache (0.3) | TeaCache (0.4) | TeaCache (0.5) |
| --- | --- | --- | --- | --- |
| ~25 s | ~20 s | ~16 s | ~12 s | ~10 s |

**v2**

| Lumina-Image-2.0 | TeaCache (0.2) | TeaCache (0.3) | TeaCache (0.5) | TeaCache (1.1) |
| --- | --- | --- | --- | --- |
| ~25 s | ~16.7 s | ~15.6 s | ~13.79 s | ~11.9 s |

## Installation

```bash
pip install --upgrade diffusers[torch] transformers protobuf tokenizers sentencepiece
pip install flash-attn --no-build-isolation
```

## Usage

You can modify the `thresh` value at line 154 to obtain your desired trade-off between latency and visual quality. For single-GPU inference, use the following command:

```bash
python teacache_lumina2.py
```
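For orientation, here is a minimal sketch of how the threshold typically enters the caching decision in TeaCache-style code; the names `rel_l1_thresh` and `accumulated_rel_l1_distance` mirror other TeaCache variants and are assumptions rather than a copy of `teacache_lumina2.py`.

```python
# Minimal sketch of the latency/quality trade-off controlled by the threshold.
# Names are assumptions about this script, not exact code.

rel_l1_thresh = 0.3  # larger value -> more steps reuse the cache -> faster, lower fidelity
accumulated_rel_l1_distance = 0.0

def should_recompute(rescaled_distance: float) -> bool:
    """Return True when the full transformer forward should be recomputed."""
    global accumulated_rel_l1_distance
    accumulated_rel_l1_distance += rescaled_distance
    if accumulated_rel_l1_distance < rel_l1_thresh:
        return False  # reuse the cached residual and skip the expensive blocks
    accumulated_rel_l1_distance = 0.0  # reset after a full recompute
    return True
```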

## Citation

If you find TeaCache useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.

```bibtex
@article{liu2024timestep,
  title={Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model},
  author={Liu, Feng and Zhang, Shiwei and Wang, Xiaofeng and Wei, Yujie and Qiu, Haonan and Zhao, Yuzhong and Zhang, Yingya and Ye, Qixiang and Wan, Fang},
  journal={arXiv preprint arXiv:2411.19108},
  year={2024}
}
```

## Acknowledgements

We would like to thank the contributors to Lumina-Image-2.0 and Diffusers.