ComfyUI/comfy at ab7ab5be23fb9b71d1790f424e7dcf91dc1fe0cc - ComfyUI - 丝路新云-代码仓

xinyun/ComfyUI

mirror of https://git.datalinker.icu/comfyanonymous/ComfyUI synced 2026-06-24 08:16:58 +08:00


rattus ab7ab5be23

Fix Race condition in --async-offload that can cause corruption (#10501 )

* mm: factor out the current stream getter

Make this a reusable function.

* ops: sync the offload stream with the consumption of w&b

This sync is necessary because PyTorch queues asynchronous CUDA frees on
the same stream that created the tensor. In the case of async offload,
that is the offload stream.

Weights and biases can go out of scope in Python, which then
triggers the PyTorch garbage collector to queue the free operation on
the offload stream, possibly before the compute stream has used the
weight. This causes a use-after-free on the weight data, leading to
total corruption of some workflows.

So sync the offload stream with the compute stream after the weight
has been used so the free has to wait for the weight to be used.

cast_bias_weight is extended in a backwards-compatible way, with the
new behaviour opt-in via a defaulted parameter. This handles custom
node packs that call cast_bias_weight and disables async offload for
them (as they do not handle the race).

The pattern is now:

cast_bias_weight(..., offloadable=True)  # This might be offloaded
thing(weight, bias, ...)
uncast_bias_weight(...)
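
The ordering this pattern enforces can be sketched with a toy model. Everything in it is an assumption made for illustration: `ToyStream`, `toy_cast_bias_weight` and `toy_uncast_bias_weight` are hypothetical stand-ins, not ComfyUI's functions or signatures, and no GPU is involved. The sketch only shows that the uncast step makes the offload stream wait on the compute stream, so a free queued on the offload stream afterwards lands behind the use of the weight.

```python
class ToyStream:
    """Hypothetical stand-in for a CUDA stream: records queued work in order."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def enqueue(self, op):
        self.log.append(op)

    def wait_stream(self, other):
        # Same idea as torch.cuda.Stream.wait_stream: work queued on this
        # stream afterwards is ordered behind what `other` has queued so far.
        self.log.append(f"wait({other.name}@{len(other.log)})")


compute = ToyStream("compute")
offload = ToyStream("offload")


def toy_cast_bias_weight(offloadable=False):
    """Hypothetical: copy weight and bias, on the offload stream if opted in."""
    if not offloadable:
        compute.enqueue("copy weight+bias")
        return "weight", "bias", None
    offload.enqueue("copy weight+bias")
    compute.wait_stream(offload)  # compute must not start before the copy
    return "weight", "bias", offload


def toy_uncast_bias_weight(offload_stream):
    """Hypothetical: make the offload stream wait for the weight's consumer."""
    if offload_stream is not None:
        offload_stream.wait_stream(compute)


weight, bias, offload_stream = toy_cast_bias_weight(offloadable=True)
compute.enqueue("thing(weight, bias)")
toy_uncast_bias_weight(offload_stream)
del weight, bias                   # going out of scope ...
offload.enqueue("free weight")     # ... queues the free on the offload stream

print(offload.log)
```

In this model the offload stream's log ends with the wait on the compute stream followed by the free. Without the uncast call, the free would sit directly after the copy, with nothing ordering it behind the compute stream's use of the weight.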

* controlnet: adopt new cast_bias_weight synchronization scheme

This is necessary for safe async weight offloading.

* mm: sync the last stream in the queue, not the next

Currently this peeks ahead to sync the next stream in the queue of
streams with the compute stream. This doesn't allow much
parallelization, as the end result is that you can only get one weight
load ahead regardless of how many streams you have.

Rotate the loop logic here to synchronize the end of the queue before
returning the next stream. This allows weights to be loaded ahead of the
compute stream's position.
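
The effect of rotating the loop can be illustrated with a small counting model. This is one possible reading of the description above, not the code from this commit: `peek_ahead`, `sync_tail`, `NUM_STREAMS` and the notion of "slack" are all hypothetical. Slack here means how many layers the compute stream has queued since the returned offload stream last waited on it, i.e. how far ahead of compute a load on that stream may run.

```python
NUM_STREAMS = 4  # assumed number of offload streams, for illustration only


def peek_ahead(num_calls, n=NUM_STREAMS):
    """Old scheme as described: pick a stream, then sync the NEXT one."""
    last_sync = [0] * n   # compute position at which each stream last waited
    counter = 0
    slack = []
    for position in range(num_calls):  # layers queued on compute so far
        picked = counter
        counter = (counter + 1) % n
        slack.append(position - last_sync[picked])
        last_sync[counter] = position  # the next stream waits on compute now
    return slack


def sync_tail(num_calls, n=NUM_STREAMS):
    """New scheme as described: sync the end of the queue, then return the next."""
    last_sync = [0] * n
    counter = 0
    slack = []
    for position in range(num_calls):
        last_sync[counter] = position  # the stream at the tail waits on compute
        counter = (counter + 1) % n
        slack.append(position - last_sync[counter])
    return slack


print(peek_ahead(8))
print(sync_tail(8))
```

Under these assumptions the first scheme settles at a slack of 1 however many streams exist, while the second settles at `NUM_STREAMS - 1`, which matches the claim that weights can now be loaded further ahead of the compute stream.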

2025-10-29 17:17:46 -04:00


Support the HuMo model. (#9903 )

2025-09-17 00:12:48 -04:00

Replace print with logging (#6138 )

2024-12-20 16:24:55 -05:00

LoRA Trainer: LoRA training node in weight adapter scheme (#8446 )

2025-06-13 19:25:59 -04:00

Uni pc sampler now works with audio and video models.

2025-01-18 05:27:58 -05:00

Add Hunyuan 3D 2.1 Support (#8714 )

2025-09-04 20:36:20 -04:00

Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884 )

2025-09-15 20:05:03 -04:00

Fix batch size above 1 giving bad output in chroma radiance. (#10394 )

2025-10-18 23:15:34 -04:00

Silence clip tokenizer warning. (#8934 )

2025-07-16 14:42:07 -04:00

Controlnet refactor.

2024-06-27 18:43:11 -04:00

Improvements to the TAESD3 implementation.

2024-06-16 02:04:24 -04:00

Implement gemma 3 as a text encoder. (#10241 )

2025-10-06 22:08:08 -04:00

Fix LoRA Trainer bugs with FP8 models. (#9854 )

2025-09-20 21:24:48 -04:00

checkpoint_pickle.py

Remove pytorch_lightning dependency.

2023-06-13 10:11:33 -04:00

cli_args.py

Speed up offloading using pinned memory. (#10526 )

2025-10-29 00:21:01 -04:00

clip_config_bigg.json

Fix potential issue with non clip text embeddings.

2024-07-30 14:41:13 -04:00

clip_model.py

USO style reference. (#9677 )

2025-09-02 15:36:22 -04:00

clip_vision_config_g.json

Add support for clip g vision model to CLIPVisionLoader.

2023-08-18 11:13:29 -04:00

clip_vision_config_h.json

Add support for unCLIP SD2.x models.

2023-04-01 23:19:15 -04:00

clip_vision_config_vitl_336_llava.json

Support llava clip vision model.

2025-03-06 00:24:43 -05:00

clip_vision_config_vitl_336.json

support clip-vit-large-patch14-336 (#4042 )

2024-07-17 13:12:50 -04:00

clip_vision_config_vitl.json

Add support for unCLIP SD2.x models.

2023-04-01 23:19:15 -04:00

clip_vision_siglip_384.json

Support new flux model variants.

2024-11-21 08:38:23 -05:00

clip_vision_siglip_512.json

Support 512 siglip model.

2025-04-05 07:01:01 -04:00

clip_vision.py

Some changes to the previous hunyuan PR. (#9725 )

2025-09-04 20:39:02 -04:00

conds.py

Add some warnings and prevent crash when cond devices don't match. (#9169 )

2025-08-04 04:20:12 -04:00

context_windows.py

Make step index detection much more robust (#9392 )

2025-08-17 18:54:07 -04:00

controlnet.py

Fix Race condition in --async-offload that can cause corruption (#10501 )

2025-10-29 17:17:46 -04:00

diffusers_convert.py

Remove useless code.

2025-01-24 06:15:54 -05:00

diffusers_load.py

load_unet -> load_diffusion_model with a model_options argument.

2024-08-12 23:20:57 -04:00

float.py

Clamp output when rounding weight to prevent Nan.

2024-10-19 19:07:10 -04:00

gligen.py

Remove some useless code. (#8812 )

2025-07-06 07:07:39 -04:00

hooks.py

Hooks Part 2 - TransformerOptionsHook and AdditionalModelsHook (#6377 )

2025-01-11 12:20:23 -05:00

latent_formats.py

Add support for Chroma Radiance (#9682 )

2025-09-13 17:58:43 -04:00

lora_convert.py

Implement the USO subject identity lora. (#9674 )

2025-09-01 18:54:02 -04:00

lora.py

Support the omnigen2 umo lora. (#9886 )

2025-09-15 18:10:55 -04:00

model_base.py

Mixed Precision Quantization System (#10498 )

2025-10-28 16:20:53 -04:00

model_detection.py

Mixed Precision Quantization System (#10498 )

2025-10-28 16:20:53 -04:00

model_management.py

Fix Race condition in --async-offload that can cause corruption (#10501 )

2025-10-29 17:17:46 -04:00

model_patcher.py

Fix case of weights not being unpinned. (#10533 )

2025-10-29 15:48:06 -04:00

model_sampling.py

Refactor model sampling sigmas code. (#10250 )

2025-10-08 17:49:02 -04:00

nested_tensor.py

WIP way to support multi multi dimensional latents. (#10456 )

2025-10-23 21:21:14 -04:00

ops.py

Fix Race condition in --async-offload that can cause corruption (#10501 )

2025-10-29 17:17:46 -04:00

options.py

Only parse command line args when main.py is called.

2023-09-13 11:38:20 -04:00

patcher_extension.py

Fix order of inputs nested merge_nested_dicts (#10362 )

2025-10-15 16:47:26 -07:00

pixel_space_convert.py

Changes to the previous radiance commit. (#9851 )

2025-09-13 18:03:34 -04:00

quant_ops.py

Reduce memory usage for fp8 scaled op. (#10531 )

2025-10-29 15:43:51 -04:00

rmsnorm.py

Add warning when using old pytorch. (#9347 )

2025-08-15 00:22:26 -04:00

sample.py

Fix mistake. (#10484 )

2025-10-25 23:07:29 -04:00

sampler_helpers.py

Added context window support to core sampling code (#9238 )

2025-08-13 21:33:05 -04:00

samplers.py

WIP way to support multi multi dimensional latents. (#10456 )

2025-10-23 21:21:14 -04:00

sd1_clip_config.json

Fix potential issue with non clip text embeddings.

2024-07-30 14:41:13 -04:00

sd1_clip.py

Disable prompt weights for qwen. (#9438 )

2025-08-20 01:08:11 -04:00

sd.py

Fix issue. (#10527 )

2025-10-29 00:37:00 -04:00

sdxl_clip.py

Add a T5TokenizerOptions node to set options for the T5 tokenizer. (#7803 )

2025-04-25 19:36:00 -04:00

supported_models_base.py

Mixed Precision Quantization System (#10498 )

2025-10-28 16:20:53 -04:00

supported_models.py

Lower wan memory estimation value a bit. (#9964 )

2025-09-20 22:09:35 -04:00

utils.py

WIP way to support multi multi dimensional latents. (#10456 )

2025-10-23 21:21:14 -04:00