Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

1University of Chinese Academy of Sciences,  2Alibaba Group
3Institute of Automation, Chinese Academy of Sciences
4Fudan University,  5Nanyang Technological University
(* Work was done during internship at Alibaba Group. † Corresponding author.)

visualization

Introduction

We introduce Timestep Embedding Aware Cache (TeaCache), a training-free caching approach that estimates and leverages the fluctuating differences among model outputs across timesteps. For more details and visual results, please visit our project page.
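The caching idea can be illustrated with a toy sketch. It is not the TeaCache implementation: it assumes that the drift between consecutive steps is measured by a relative L1 distance on some per-step signal, and that the cached output is reused while the accumulated drift stays below a threshold. The names `StepCache`, `rel_l1`, and `threshold` are hypothetical and do not come from this repository.

```python
from typing import Callable, List, Optional


def rel_l1(curr: List[float], prev: List[float]) -> float:
    """Relative L1 distance between two equally sized vectors."""
    num = sum(abs(c - p) for c, p in zip(curr, prev))
    den = sum(abs(p) for p in prev) or 1e-8  # avoid division by zero
    return num / den


class StepCache:
    """Hypothetical helper: reuse the last model output while the signal drifts little."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold          # assumed tuning knob
        self.accumulated = 0.0              # drift since the last real model call
        self.prev_signal: Optional[List[float]] = None
        self.cached_output: Optional[List[float]] = None
        self.computed_steps = 0             # number of real model calls

    def step(self, signal: List[float],
             model: Callable[[List[float]], List[float]]) -> List[float]:
        reuse = False
        if self.prev_signal is not None and self.cached_output is not None:
            self.accumulated += rel_l1(signal, self.prev_signal)
            reuse = self.accumulated < self.threshold
        self.prev_signal = signal
        if reuse:
            return self.cached_output
        # Drift is too large (or this is the first step): run the model and reset.
        self.accumulated = 0.0
        self.cached_output = model(signal)
        self.computed_steps += 1
        return self.cached_output


if __name__ == "__main__":
    cache = StepCache(threshold=0.05)
    toy_model = lambda s: [2 * x for x in s]  # stand-in for an expensive denoiser
    signals = [[1.0, 1.0], [1.01, 1.0], [1.02, 1.0], [2.0, 2.0]]
    outputs = [cache.step(s, toy_model) for s in signals]
    print(cache.computed_steps, "model calls for", len(signals), "steps")
```

In this toy run the two middle steps differ only slightly from their predecessors, so they reuse the first output; the last step changes sharply and triggers a fresh model call.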

Installation

Prerequisites:

  • Python >= 3.10
  • PyTorch >= 1.13 (a version above 2.0 is recommended)
  • CUDA >= 11.6
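
A quick way to confirm that an environment meets these minimums is a small check script. The sketch below is not part of this repository; `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical helper names, and the script only reads `torch.__version__` and `torch.version.cuda` when PyTorch is importable.

```python
import re
import sys
from typing import Dict, Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '2.1.0+cu118' into (2, 1, 0), ignoring any non-numeric suffix."""
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version: str, minimum: str) -> bool:
    """Compare dotted version strings numerically rather than as text."""
    return version_tuple(version) >= version_tuple(minimum)


def check_environment() -> Dict[str, bool]:
    """Report which of the listed prerequisites the current interpreter satisfies."""
    report = {"python": sys.version_info[:2] >= (3, 10)}
    try:
        import torch
    except ImportError:
        report["torch"] = False
        report["cuda"] = False
        return report
    report["torch"] = meets_minimum(torch.__version__, "1.13")
    cuda = torch.version.cuda  # None when PyTorch was built without CUDA
    report["cuda"] = cuda is not None and meets_minimum(cuda, "11.6")
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```

Note that `torch.version.cuda` reports the CUDA version PyTorch was built against, which may differ from the toolkit installed on the system.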

We strongly recommend using Anaconda to create a new environment (Python >= 3.10) to run our examples:

conda create -n teacache python=3.10 -y
conda activate teacache

Install TeaCache, which is built on VideoSys:

git clone https://github.com/LiewFeng/TeaCache
cd TeaCache
pip install -e .

Evaluation of TeaCache

We first generate videos from VBench's prompts, then calculate VBench, PSNR, LPIPS, and SSIM on the generated videos.

  1. Generate video
cd eval/teacache
python experiments/latte.py
python experiments/opensora.py
python experiments/open_sora_plan.py
  2. Calculate VBench score
# VBench is calculated independently
# get scores for all metrics
python vbench/run_vbench.py --video_path aaa --save_path bbb
# calculate final score
python vbench/cal_vbench.py --score_dir bbb
  3. Calculate other metrics
# these metrics are calculated against the original model
# gt video is the video from the original model
# generated video is our method's result
python common_metrics/eval.py --gt_video_dir aa --generated_video_dir bb
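
For intuition about what a reference-based metric measures, here is a minimal PSNR sketch using the standard definition, 10 · log10(MAX² / MSE). It is not the code in common_metrics/eval.py; it assumes each frame has been flattened into a sequence of pixel values in the range 0 to `max_value`, and the function name `psnr` is chosen here for illustration.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], generated: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two flattened frames."""
    if len(reference) != len(generated):
        raise ValueError("frames must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("frames must not be empty")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    gt_frame = [10.0, 20.0, 30.0, 40.0]      # frame from the original model
    cached_frame = [11.0, 19.0, 31.0, 39.0]  # frame from the accelerated model
    print(f"PSNR: {psnr(gt_frame, cached_frame):.2f} dB")
```

A higher PSNR means the generated video is closer to the original model's output; identical frames give an infinite value.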

Citation

@misc{liu2024timestep,
      title={Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model},
      author={Feng Liu and Shiwei Zhang and Xiaofeng Wang and Yujie Wei and Haonan Qiu and Yuzhong Zhao and Yingya Zhang and Qixiang Ye and Fang Wan},
      year={2024},
      eprint={2411.19108},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.19108}
}

Acknowledgement

This repository is built on VideoSys. Thanks for their contributions!
