Fix on-load VRAM OOM (#11144)
Slow down the CPU on model load so it does not run ahead. This fixes a VRAM OOM on flux 2 load. I went to try and debug this with the memory trace pickles, which need --disable-cuda-malloc, and that made the bug go away. So I tried this synchronize and it worked. This has some very complex interactions with the cuda malloc async allocator and I don't have a solid theory on this one yet. Still debugging, but this gets us over the OOM for the moment.
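A minimal sketch of the pattern the fix relies on, not ComfyUI's actual loader (load_weights_to_gpu and its weights dict are hypothetical names): after each host-to-device transfer, the host synchronizes so the CPU cannot queue an unbounded number of pending copies and allocations ahead of the async allocator.

import torch

def load_weights_to_gpu(weights, device=torch.device("cuda")):
    # Hypothetical loader illustrating the fix. Each tensor is moved to
    # the GPU, then the host waits for the device to catch up, so the
    # CPU never runs arbitrarily far ahead of the cudaMallocAsync
    # allocator while weights are streaming in.
    for name in weights:
        weights[name] = weights[name].to(device)
        if device.type == "cuda":
            # Throttle the host: block until all queued GPU work
            # (copies, async allocations/frees) has completed.
            torch.cuda.synchronize()
    return weights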
This commit is contained in:
parent
50ca97e776
commit
4086acf3c2
@@ -762,6 +762,8 @@ class ModelPatcher:
                 key = "{}.{}".format(n, param)
                 self.unpin_weight(key)
                 self.patch_weight_to_device(key, device_to=device_to)
+                if comfy.model_management.is_device_cuda(device_to):
+                    torch.cuda.synchronize()

             logging.debug("lowvram: loaded module regularly {} {}".format(n, m))
             m.comfy_patched_weights = True
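Note that torch.cuda.synchronize() blocks the host until all queued GPU work has finished, so calling it after every parameter serializes the transfers and can make loading slower; as the message above says, this is a stopgap until the interaction with cuda malloc async is fully understood.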