Yoshimasa Niwa
24a834be79
Use dtype for vae.
...
Somehow, download node is not using dtype for vae.
2024-11-05 13:53:34 +09:00
kijai
fdecd4ee08
Update nodes.py
2024-11-05 06:18:49 +02:00
kijai
78f9e7b896
Update nodes.py
2024-11-04 15:10:15 +02:00
kijai
f94cf43331
Update nodes.py
2024-11-04 15:06:20 +02:00
kijai
56b5dbbf82
Add different RMSNorm functions for testing
...
Initial testing for me shows that the RMSNorm from flash_attn.ops.triton.layer_norm is ~8-10% faster, apex is untested as I don't currently have it installed.
2024-11-04 14:11:58 +02:00
kijai
fd4a02e6a6
Update nodes.py
2024-11-03 19:41:08 +02:00
kijai
0dc011d1b6
cleanup and align more to comfy code, switch to using cpu seed as well
2024-11-03 18:36:23 +02:00
kijai
5ec01cbff4
fix vae download path
2024-11-02 01:14:14 +02:00
kijai
09f327326b
Add denoise
2024-11-01 23:05:34 +02:00
kijai
85c996d7b8
Add MochiSigmaSchedule node, better denoise formula
2024-11-01 19:36:40 +02:00
kijai
ec298a1d64
fix encoding precision
2024-11-01 18:21:34 +02:00
kijai
a5b06b02ad
tiled encoding
2024-11-01 18:03:47 +02:00
kijai
69ab797b8c
Add sampler preview
2024-11-01 06:18:41 +02:00
kijai
ac5de728ad
Add VAE encoder
2024-11-01 05:22:49 +02:00
kijai
ebd0f62d53
Update nodes.py
2024-10-30 22:08:22 +02:00
kijai
d971a19410
spatial VAE decoder fixes
2024-10-30 22:06:55 +02:00
kijai
f0f939b20b
cleanup, fix untiled spatial vae decode
2024-10-30 21:12:34 +02:00
kijai
3dce06b28b
make compatible with comfy cliptextencode
2024-10-28 12:23:14 +02:00
kijai
db23e2ecc0
should sleep more
2024-10-27 20:15:44 +02:00
kijai
0b6812671c
oops
2024-10-27 20:11:07 +02:00
kijai
3395aa8ca0
cleanup, sampler output name fix
2024-10-27 20:02:44 +02:00
kijai
195da244df
make cfg 1.0 not do uncond, set steps by sigma schedule
2024-10-27 19:52:16 +02:00
kijai
4348d1ed20
fix Q4 cublas ops
2024-10-27 14:04:51 +02:00
kijai
d1155ad305
Update nodes.py
2024-10-27 12:41:43 +02:00
kijai
185f4e0bee
custom sigmas
2024-10-27 12:41:33 +02:00
kijai
3613700752
possible sdpa kernel fixes and add optional cfg scheduling
2024-10-27 12:23:01 +02:00
kijai
f29f739707
support cublas_ops with GGUF
...
pretty big speed boost on 4090 at least, needs this installed:
https://github.com/aredden/torch-cublas-hgemm
2024-10-26 16:42:25 +03:00
kijai
0d15c0bd69
torch compile for vae loader
2024-10-26 03:24:25 +03:00
kijai
b7d3fc5e73
Update nodes.py
2024-10-26 00:56:12 +03:00
kijai
3348a0fed7
ability to split the batch for the other decoder node
2024-10-26 00:42:48 +03:00
kijai
ca5dfdf79c
typo
2024-10-25 18:36:03 +03:00
kijai
25eeab3c4c
torch.compile support
...
works in Windows with torch 2.5.0 and Triton from https://github.com/woct0rdho/triton-windows
2024-10-25 18:15:30 +03:00
kijai
36a4275b3b
Add alternative VAE decoding node
...
This was actually unused code in the VAE model, only does spatial tiling though, but seams look better
2024-10-25 15:30:20 +03:00
kijai
a51ca0e907
Add GGUF_Q8_0
2024-10-25 01:28:09 +03:00
kijai
1f25400bc2
small update to GGUF model
2024-10-24 22:44:02 +03:00
kijai
813bbe8f4b
Add model and vae loader nodes
2024-10-24 21:38:06 +03:00
kijai
813d6aa92f
Add optional inputs to force execution order between text encoding and model loading
2024-10-24 19:01:28 +03:00
kijai
f4c13b1ef4
Add first GGUF test version
2024-10-24 17:05:50 +03:00
kijai
d699fae213
cleanup, possibly support older GPUs
2024-10-24 14:27:11 +03:00
kijai
257c526125
Cleanup, fix seed gen, better warnings for decoder
2024-10-24 12:45:01 +03:00
kijai
f714748ad4
make compatible with comfy save/load latents
2024-10-24 03:23:01 +03:00
Jukka Seppänen
00a550e81c
Should work without flash_attn (thanks @logtd), add sage_attn
...
tested to work in Linux at least
2024-10-24 02:25:57 +03:00
Jukka Seppänen
1ba3ac8e25
Revert "works without flash_attn (thanks @juxtapoz!)"
...
This reverts commit a1b1f86aa3b6780a4981157b8e2e37b0a1017568.
2024-10-24 02:23:46 +03:00
Jukka Seppänen
a1b1f86aa3
works without flash_attn (thanks @juxtapoz!)
...
at least on Linux, also sage_attn
2024-10-24 02:19:00 +03:00
kijai
1cd5409295
Add bf16 model
2024-10-24 00:00:38 +03:00
kijai
426c35d6b0
fix vae autodownload path
2024-10-23 21:56:04 +03:00
kijai
fb880273a0
update
2024-10-23 17:04:50 +03:00
kijai
db87f8e608
Add accelerate
2024-10-23 16:05:19 +03:00
kijai
34e029bacc
fix dtype selection
2024-10-23 15:49:37 +03:00
kijai
4efb7c85df
cleanup
2024-10-23 15:45:20 +03:00