105 Commits

Author SHA1 Message Date
Jukka Seppänen
76956cda50
Merge pull request #67 from niw/use_float16_on_mps
Use float16 for autocast on mps
2024-11-07 00:25:24 +09:00
kijai
84e536e226 make latent normalization optional for testing 2024-11-05 22:30:13 +02:00
kijai
fbd2252dc4 fix 2024-11-05 20:36:32 +02:00
kijai
70ad32621b reorder 2024-11-05 20:21:47 +02:00
kijai
70bd024456 Update asymm_models_joint.py 2024-11-05 19:40:35 +02:00
kijai
cdf9fdbc2d allow apex norm to work 2024-11-05 19:40:22 +02:00
kijai
7391389276 cleanup 2024-11-05 12:29:53 +02:00
kijai
ab3b18a153 Update mochi_fastercache_test_01.json 2024-11-05 09:41:18 +02:00
kijai
f6020f71e0 dynamic compile 2024-11-05 08:57:48 +02:00
kijai
04d15b64ae Update asymm_models_joint.py 2024-11-05 08:51:59 +02:00
kijai
1811b7b6c5 Update asymm_models_joint.py 2024-11-05 08:50:36 +02:00
kijai
d29e95d707 Add example 2024-11-05 08:44:55 +02:00
kijai
24a7edfca6 update, works 2024-11-05 08:41:50 +02:00
kijai
3535a846a8 update from main 2024-11-05 07:08:14 +02:00
Yoshimasa Niwa
24a834be79 Use dtype for vae.
Somehow, the download node is not using the dtype for the vae.
2024-11-05 13:53:34 +09:00
Yoshimasa Niwa
99285ca1e7 Use float16 for autocast on mps
mps only supports float16 for autocast for now.
2024-11-05 13:53:34 +09:00
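The commit above notes that mps only supports float16 for autocast at the moment. A minimal sketch of that device-dependent dtype selection might look like this (`autocast_dtype` is a hypothetical helper name, not from the repo):

```python
import torch

def autocast_dtype(device_type: str) -> torch.dtype:
    # mps autocast currently only supports float16, so fall back to it
    # there; other backends can use bfloat16.
    if device_type == "mps":
        return torch.float16
    return torch.bfloat16
```

The returned dtype would then be passed as the `dtype` argument of `torch.autocast(device_type=..., dtype=...)`.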
Jukka Seppänen
21374934d3
Merge pull request #66 from niw/workaround_mps_pad_problem
Workaround pad problem on mps
2024-11-05 13:48:45 +09:00
kijai
fdecd4ee08 Update nodes.py 2024-11-05 06:18:49 +02:00
Yoshimasa Niwa
5ca4bbf319 Workaround pad problem on mps
When using `torch.nn.functional.pad` with a tensor whose size is
larger than 2^16 (65536), the output tensor is broken.

This patch moves the tensor to the CPU to work around the problem.
It doesn't impact the speed of the vae on mps much.
2024-11-05 12:46:13 +09:00
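The workaround described above can be sketched as a small wrapper: pad on the CPU when the tensor lives on mps and exceeds the problematic size, then move the result back. `safe_pad` is a hypothetical helper name; the 2^16-element threshold follows the commit message.

```python
import torch
import torch.nn.functional as F

def safe_pad(x: torch.Tensor, pad, mode: str = "constant"):
    # On mps, F.pad can produce corrupted output for tensors larger
    # than 2**16 elements, so pad on CPU and move the result back.
    if x.device.type == "mps" and x.numel() > 2**16:
        return F.pad(x.cpu(), pad, mode=mode).to(x.device)
    return F.pad(x, pad, mode=mode)
```

On CPU or CUDA the wrapper is a pass-through, so it can replace `F.pad` calls unconditionally.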
kijai
78f9e7b896 Update nodes.py 2024-11-04 15:10:15 +02:00
kijai
f94cf43331 Update nodes.py 2024-11-04 15:06:20 +02:00
kijai
56b5dbbf82 Add different RMSNorm functions for testing
Initial testing for me shows that the RMSNorm from flash_attn.ops.triton.layer_norm is ~8-10% faster; apex is untested as I don't currently have it installed.
2024-11-04 14:11:58 +02:00
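For reference, the fused flash_attn / apex variants mentioned above are drop-in replacements for a plain eager-mode RMSNorm, which a minimal sketch (not the repo's implementation) computes as:

```python
import torch

class RMSNorm(torch.nn.Module):
    # Reference eager-mode RMSNorm: scale x by the reciprocal of the
    # root-mean-square of its last dimension, then apply a learned gain.
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = torch.nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return x * rms * self.weight
```

The fused kernels compute the same thing in one pass over memory, which is where the reported ~8-10% speedup would come from.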
kijai
fd4a02e6a6 Update nodes.py 2024-11-03 19:41:08 +02:00
kijai
4a7458ffd6 restore MochiEdit compatibility
temporary
2024-11-03 19:18:29 +02:00
kijai
0dc011d1b6 cleanup and align more to comfy code, switch to using cpu seed as well 2024-11-03 18:36:23 +02:00
kijai
3cf9289e08 cleanup code 2024-11-03 01:47:34 +02:00
kijai
a6e545531c test 2024-11-03 00:32:51 +02:00
kijai
5ec01cbff4 fix vae download path 2024-11-02 01:14:14 +02:00
kijai
09f327326b Add denoise 2024-11-01 23:05:34 +02:00
kijai
85c996d7b8 Add MochiSigmaSchedule node, better denoise formula 2024-11-01 19:36:40 +02:00
kijai
ec298a1d64 fix encoding precision 2024-11-01 18:21:34 +02:00
kijai
a5b06b02ad tiled encoding 2024-11-01 18:03:47 +02:00
kijai
69ab797b8c Add sampler preview 2024-11-01 06:18:41 +02:00
kijai
ac5de728ad Add VAE encoder 2024-11-01 05:22:49 +02:00
kijai
ebd0f62d53 Update nodes.py 2024-10-30 22:08:22 +02:00
kijai
d971a19410 spatial VAE decoder fixes 2024-10-30 22:06:55 +02:00
kijai
f0f939b20b cleanup, fix untiled spatial vae decode 2024-10-30 21:12:34 +02:00
kijai
3dce06b28b make compatible with comfy cliptextencode 2024-10-28 12:23:14 +02:00
kijai
ce903c0384 Update mz_gguf_loader.py 2024-10-28 04:07:11 +02:00
kijai
db23e2ecc0 should sleep more 2024-10-27 20:15:44 +02:00
kijai
0b6812671c oops 2024-10-27 20:11:07 +02:00
kijai
3395aa8ca0 cleanup, sampler output name fix 2024-10-27 20:02:44 +02:00
kijai
195da244df make cfg 1.0 not do uncond, set steps by sigma schedule 2024-10-27 19:52:16 +02:00
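The "make cfg 1.0 not do uncond" change above reflects a standard classifier-free guidance identity: at a scale of 1.0 the guided output equals the conditional output, so the unconditional forward pass can be skipped entirely. A minimal sketch (`apply_cfg` is a hypothetical helper name):

```python
import torch

def apply_cfg(cond: torch.Tensor, uncond: torch.Tensor, cfg_scale: float) -> torch.Tensor:
    # Classifier-free guidance: uncond + scale * (cond - uncond).
    # At scale 1.0 this reduces to cond, so the uncond pass is unnecessary.
    if cfg_scale == 1.0:
        return cond
    return uncond + cfg_scale * (cond - uncond)
```

In a sampler loop, the check would happen before running the model on the unconditional batch, halving the model evaluations when cfg is 1.0.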
kijai
4348d1ed20 fix Q4 cublas ops 2024-10-27 14:04:51 +02:00
kijai
d1155ad305 Update nodes.py 2024-10-27 12:41:43 +02:00
kijai
185f4e0bee custom sigmas 2024-10-27 12:41:33 +02:00
kijai
3613700752 possible sdpa kernel fixes and add optional cfg scheduling 2024-10-27 12:23:01 +02:00
kijai
e20eb66f93 cleanup 2024-10-27 03:00:52 +02:00
kijai
e82e6ee3f7 Update mz_gguf_loader.py 2024-10-26 18:10:45 +03:00
kijai
c5c136cb11 fix 2024-10-26 17:57:02 +03:00