26 Commits

Author SHA1 Message Date
kijai
3613700752 possible sdpa kernel fixes and add optional cfg scheduling 2024-10-27 12:23:01 +02:00
kijai
f29f739707 support cublas_ops with GGUF
Pretty big speed boost, on a 4090 at least. Needs this installed:
https://github.com/aredden/torch-cublas-hgemm
2024-10-26 16:42:25 +03:00
kijai
0d15c0bd69 torch compile for vae loader 2024-10-26 03:24:25 +03:00
kijai
b7d3fc5e73 Update nodes.py 2024-10-26 00:56:12 +03:00
kijai
3348a0fed7 ability to split the batch for the other decoder node 2024-10-26 00:42:48 +03:00
kijai
ca5dfdf79c typo 2024-10-25 18:36:03 +03:00
kijai
25eeab3c4c torch.compile support
works in Windows with torch 2.5.0 and Triton from https://github.com/woct0rdho/triton-windows
2024-10-25 18:15:30 +03:00
kijai
36a4275b3b Add alternative VAE decoding node
This code was already in the VAE model but unused. It only does spatial tiling, but the seams look better.
2024-10-25 15:30:20 +03:00
kijai
a51ca0e907 Add GGUF_Q8_0 2024-10-25 01:28:09 +03:00
kijai
1f25400bc2 small update to GGUF model 2024-10-24 22:44:02 +03:00
kijai
813bbe8f4b Add model and vae loader nodes 2024-10-24 21:38:06 +03:00
kijai
813d6aa92f Add optional inputs to force execution order between text encoding and model loading 2024-10-24 19:01:28 +03:00
kijai
f4c13b1ef4 Add first GGUF test version 2024-10-24 17:05:50 +03:00
kijai
d699fae213 cleanup, possibly support older GPUs 2024-10-24 14:27:11 +03:00
kijai
257c526125 Cleanup, fix seed gen, better warnings for decoder 2024-10-24 12:45:01 +03:00
kijai
f714748ad4 make compatible with comfy save/load latents 2024-10-24 03:23:01 +03:00
Jukka Seppänen
00a550e81c Should work without flash_attn (thanks @logtd), add sage_attn
tested to work in Linux at least
2024-10-24 02:25:57 +03:00
Jukka Seppänen
1ba3ac8e25 Revert "works without flash_attn (thanks @juxtapoz!)"
This reverts commit a1b1f86aa3b6780a4981157b8e2e37b0a1017568.
2024-10-24 02:23:46 +03:00
Jukka Seppänen
a1b1f86aa3 works without flash_attn (thanks @juxtapoz!)
at least on Linux, also sage_attn
2024-10-24 02:19:00 +03:00
kijai
1cd5409295 Add bf16 model 2024-10-24 00:00:38 +03:00
kijai
426c35d6b0 fix vae autodownload path 2024-10-23 21:56:04 +03:00
kijai
fb880273a0 update 2024-10-23 17:04:50 +03:00
kijai
db87f8e608 Add accelerate 2024-10-23 16:05:19 +03:00
kijai
34e029bacc fix dtype selection 2024-10-23 15:49:37 +03:00
kijai
4efb7c85df cleanup 2024-10-23 15:45:20 +03:00
kijai
b80cb4a691 initial commit 2024-10-23 15:34:22 +03:00