Merge branch 'main' of https://github.com/kijai/ComfyUI-CogVideoXWrapper

2026-08-01 01:21:19 +08:00 · 2024-08-27 18:01:43 +03:00 · a30037feb1
commit a30037feb1
parent 06b5e021ad 1561834955
1 changed files with 8 additions and 0 deletions
--- a/readme.md
+++ b/readme.md
@@ -1,11 +1,19 @@
 # WORK IN PROGRESS

+## Update
+5b model is now also supported for basic text2vid: https://huggingface.co/THUDM/CogVideoX-5b
+
+It is also autodownloaded to `ComfyUI/models/CogVideo/CogVideoX-5b`; the text encoder is not needed, as we use the ComfyUI T5.
+
+https://github.com/user-attachments/assets/991205cc-826e-4f93-831a-c10441f0f2ce
+
 Requires diffusers 0.30.1 (this is specified in requirements.txt)

 Uses the same T5 model as SD3 and Flux; fp8 works fine too. Memory requirements depend mostly on the video length.
 VAE decoding seems to be the only step that takes a lot of VRAM when everything is offloaded; it peaks at around 13-14GB momentarily at that stage.
 Sampling itself takes only maybe 5-6GB.

+
 Hacked in img2img to attempt vid2vid workflow, works interestingly with some inputs, highly experimental.

 https://github.com/user-attachments/assets/e6951ef4-ea7a-4752-94f6-cf24f2503d83
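
The readme change above says the 5b model is autodownloaded to `ComfyUI/models/CogVideo/CogVideoX-5b`. As a rough sketch of how one might check whether that folder is already populated before starting a workflow, the snippet below builds the path and tests for its presence. The helper names `cogvideox_model_dir` and `is_downloaded` are hypothetical and not part of the wrapper; only the directory layout is taken from the readme, and the actual download logic in the node may differ.

```python
from pathlib import Path


def cogvideox_model_dir(comfy_root, model_name="CogVideoX-5b"):
    """Build the model folder path described in the readme.

    Hypothetical helper: only the layout models/CogVideo/<model_name>
    comes from the readme; the function itself is illustrative.
    """
    return Path(comfy_root) / "models" / "CogVideo" / model_name


def is_downloaded(comfy_root, model_name="CogVideoX-5b"):
    """Return True if the model folder exists and contains at least one entry.

    This is only a presence check; it does not verify that the
    download is complete or that the files are valid.
    """
    model_dir = cogvideox_model_dir(comfy_root, model_name)
    return model_dir.is_dir() and any(model_dir.iterdir())


if __name__ == "__main__":
    # "ComfyUI" is assumed to be the install root relative to the current directory.
    print(cogvideox_model_dir("ComfyUI"))
    print(is_downloaded("ComfyUI"))
```

A presence check like this is deliberately shallow: an interrupted download would still pass it, so it is best treated as a quick sanity check rather than validation.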