ComfyUI/ldm at e42682b24ef033a93001ba27cc5c5aa461a61d8d - ComfyUI - 丝路新云-代码仓

xinyun/ComfyUI

mirror of https://git.datalinker.icu/comfyanonymous/ComfyUI synced 2026-01-25 00:34:30 +08:00


rattus128 e42682b24e

Reduce Peak WAN inference VRAM usage (#9898)

* flux: Do the xq and xk ropes one at a time

This was doing independent, interleaved tensor math on the q and k
tensors, which held more than the minimum set of intermediates
in VRAM. On a bad day, it would run out of VRAM (OOM) on the xk intermediates.

Do everything for q and then everything for k, so torch can garbage collect
all of q's intermediates before k allocates its intermediates.

This reduces peak VRAM usage for some WAN2.2 inferences (at least).

* wan: Optimize qkv intermediates on attention

As commented. The former logic computed independent pieces of QKV in
parallel, which held more inference intermediates in VRAM and spiked
VRAM usage. Fully roping Q and garbage collecting its intermediates
before touching K reduces the peak inference VRAM usage.
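
The ordering effect described in this commit message can be illustrated with a toy model. This is a minimal sketch, not the code from the commit: `Buf`, `rope_interleaved`, `rope_sequential`, and `measure` are hypothetical names, plain Python lists stand in for tensors, and CPython reference counting stands in for torch releasing tensors that are no longer referenced. It only counts how many buffers are alive at once.

```python
import math


class Buf:
    """Hypothetical stand-in for a tensor; tracks how many buffers are alive."""
    live = 0
    peak = 0

    def __init__(self, data):
        self.data = list(data)
        Buf.live += 1
        Buf.peak = max(Buf.peak, Buf.live)

    def __del__(self):
        Buf.live -= 1


def scale(x, f):
    return Buf([f * v for v in x.data])


def add(a, b):
    return Buf([u + v for u, v in zip(a.data, b.data)])


def swap_pairs(x):
    # Map each (x0, x1) pair to (-x1, x0), as used by a 2D rotation.
    out = []
    for i in range(0, len(x.data), 2):
        out += [-x.data[i + 1], x.data[i]]
    return Buf(out)


def rope_interleaved(q, k, angle):
    # q and k steps alternate, so every intermediate stays referenced
    # until the function returns.
    c, s = math.cos(angle), math.sin(angle)
    q_cos, k_cos = scale(q, c), scale(k, c)
    q_sin, k_sin = scale(swap_pairs(q), s), scale(swap_pairs(k), s)
    return add(q_cos, q_sin), add(k_cos, k_sin)


def rope_one(x, angle):
    # All intermediates for one input are dropped when this returns.
    c, s = math.cos(angle), math.sin(angle)
    cos_part = scale(x, c)
    sin_part = scale(swap_pairs(x), s)
    return add(cos_part, sin_part)


def rope_sequential(q, k, angle):
    # Finish q completely before k allocates anything.
    return rope_one(q, angle), rope_one(k, angle)


def measure(rope_fn, q_vals, k_vals, angle):
    """Return (peak extra buffers above the inputs, q result, k result)."""
    q, k = Buf(q_vals), Buf(k_vals)
    baseline = Buf.live
    Buf.peak = baseline  # count only buffers allocated by rope_fn
    q_out, k_out = rope_fn(q, k, angle)
    return Buf.peak - baseline, q_out.data, k_out.data


q_vals, k_vals = [1.0, 0.0, 0.5, 2.0], [0.0, 1.0, 3.0, -1.0]
peak_interleaved, q_a, k_a = measure(rope_interleaved, q_vals, k_vals, 0.3)
peak_sequential, q_b, k_b = measure(rope_sequential, q_vals, k_vals, 0.3)
print("peak extra buffers:", peak_interleaved, "vs", peak_sequential)
print("same results:", q_a == q_b and k_a == k_b)
```

Both orderings compute the same values; only the number of simultaneously live intermediates differs. Real VRAM behaviour also depends on tensor sizes, dtypes, and the allocator, none of which this toy model captures.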

2025-09-16 19:21:14 -04:00


Enable Runtime Selection of Attention Functions (#9639)

2025-09-12 18:07:38 -04:00

Enable Runtime Selection of Attention Functions (#9639)

2025-09-12 18:07:38 -04:00

Enable Runtime Selection of Attention Functions (#9639)

2025-09-12 18:07:38 -04:00

Enable Runtime Selection of Attention Functions (#9639)

2025-09-12 18:07:38 -04:00

Add support for Chroma Radiance (#9682)

2025-09-13 17:58:43 -04:00

chroma_radiance

Changes to the previous radiance commit. (#9851)

2025-09-13 18:03:34 -04:00

Enable Runtime Selection of Attention Functions (#9639)

2025-09-12 18:07:38 -04:00

Reduce Peak WAN inference VRAM usage (#9898)

2025-09-16 19:21:14 -04:00

Enable Runtime Selection of Attention Functions (#9639)

2025-09-12 18:07:38 -04:00

Enable Runtime Selection of Attention Functions (#9639)

2025-09-12 18:07:38 -04:00

Enable Runtime Selection of Attention Functions (#9639)

2025-09-12 18:07:38 -04:00

Fix issue on old torch. (#9791)

2025-09-10 00:23:47 -04:00

Hunyuan refiner vae now works with tiled. (#9836)

2025-09-12 19:46:46 -04:00

Change cosmos and hydit models to use the native RMSNorm. (#7934)

2025-05-04 06:26:20 -04:00

Enable Runtime Selection of Attention Functions (#9639)

2025-09-12 18:07:38 -04:00

Enable Runtime Selection of Attention Functions (#9639)

2025-09-12 18:07:38 -04:00

Implement hunyuan image refiner model. (#9817)

2025-09-12 00:43:20 -04:00

Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884)

2025-09-15 20:05:03 -04:00

Enable Runtime Selection of Attention Functions (#9639)

2025-09-12 18:07:38 -04:00

Remove windows line endings. (#8866)

2025-07-11 02:37:51 -04:00

Enable Runtime Selection of Attention Functions (#9639)

2025-09-12 18:07:38 -04:00

Reduce Peak WAN inference VRAM usage (#9898)

2025-09-16 19:21:14 -04:00

common_dit.py

add RMSNorm to comfy.ops

2025-04-14 18:00:33 -04:00

util.py

Fix and enforce new lines at the end of files.

2024-12-30 04:14:59 -05:00