From 49767f1cda2af0a21a5613222ff2ee215afcd141 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jukka=20Sepp=C3=A4nen?= <40791699+kijai@users.noreply.github.com>
Date: Wed, 7 Aug 2024 02:15:27 +0300
Subject: [PATCH] Update readme.md

---
 readme.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/readme.md b/readme.md
index e2f3c01..a5fa5ad 100644
--- a/readme.md
+++ b/readme.md
@@ -4,6 +4,13 @@ Currently requires diffusers with PR: https://github.com/huggingface/diffusers/p
 
 This is specified in requirements.txt
 
+Uses the same T5 model as SD3 and Flux; fp8 works fine too. Memory requirements depend mostly on the video length.
+VAE decoding seems to be the only stage that takes a lot of VRAM when everything is offloaded; it peaks at around 13-14GB momentarily.
+Sampling itself takes only about 5-6GB.
+
+Hacked in img2img to attempt a vid2vid workflow. It gives interesting results with some inputs, but is highly experimental.
+
+https://github.com/user-attachments/assets/e6951ef4-ea7a-4752-94f6-cf24f2503d83
 
 https://github.com/user-attachments/assets/9e41f37b-2bb3-411c-81fa-e91b80da2559
 