13 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| kijai | e20eb66f93 | cleanup | 2024-10-27 03:00:52 +02:00 |
| kijai | ddfb3a6bf2 | backends | 2024-10-26 17:49:15 +03:00 |
| kijai | f29f739707 | support cublas_ops with GGUF (pretty big speed boost on 4090 at least; needs https://github.com/aredden/torch-cublas-hgemm installed) | 2024-10-26 16:42:25 +03:00 |
| kijai | b932036af3 | Update asymm_models_joint.py | 2024-10-25 22:37:08 +03:00 |
| kijai | e66735527c | tweak | 2024-10-25 20:04:01 +03:00 |
| kijai | 25eeab3c4c | torch.compile support (works on Windows with torch 2.5.0 and Triton from https://github.com/woct0rdho/triton-windows) | 2024-10-25 18:15:30 +03:00 |
| kijai | 813bbe8f4b | Add model and vae loader nodes | 2024-10-24 21:38:06 +03:00 |
| kijai | d699fae213 | cleanup, possibly support older GPUs | 2024-10-24 14:27:11 +03:00 |
| Jukka Seppänen | 00a550e81c | Should work without flash_attn (thanks @logtd), add sage_attn (tested to work in Linux at least) | 2024-10-24 02:25:57 +03:00 |
| Jukka Seppänen | 1ba3ac8e25 | Revert "works without flash_attn (thanks @juxtapoz!)" (reverts commit a1b1f86aa3b6780a4981157b8e2e37b0a1017568) | 2024-10-24 02:23:46 +03:00 |
| Jukka Seppänen | a1b1f86aa3 | works without flash_attn (thanks @juxtapoz!), at least on Linux; also sage_attn | 2024-10-24 02:19:00 +03:00 |
| kijai | 1cd5409295 | Add bf16 model | 2024-10-24 00:00:38 +03:00 |
| kijai | b80cb4a691 | initial commit | 2024-10-23 15:34:22 +03:00 |