Fix on-load VRAM OOM

Slow down the CPU on model load so it does not run ahead of the GPU. This fixes a VRAM OOM on Flux 2 load. I tried to debug this with the memory trace pickles, which need --disable-cuda-malloc, and that flag made the bug go away. So I tried this synchronize and it worked. This has some very complex interactions with cuda malloc async and I don't have a solid theory on this one yet. Still debugging, but this gets us over the OOM for the moment.
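
For anyone reading along, the shape of the workaround is roughly the following. This is only a sketch of the pattern, not the code in this commit: `load_then_sync`, `load_steps`, and the injectable `synchronize` argument are hypothetical names made up for illustration. The only real API it relies on is `torch.cuda.synchronize()`, which blocks the calling CPU thread until the queued GPU work has finished.

```python
def load_then_sync(load_steps, device_type, synchronize=None):
    """Run the queued load steps, then block the CPU if the target is CUDA.

    load_steps:  callables that each enqueue one weight load (hypothetical).
    device_type: "cuda", "cpu", etc.
    synchronize: optional barrier callable; defaults to torch.cuda.synchronize.
    """
    for step in load_steps:
        # On CUDA these may return before the GPU has actually done the work.
        step()

    if device_type == "cuda":
        if synchronize is None:
            # Imported lazily so the sketch can be read and run without a GPU.
            import torch
            synchronize = torch.cuda.synchronize
        # Barrier: the CPU waits here instead of racing ahead to the next module.
        synchronize()
```

In the actual change the barrier sits after the per-parameter `patch_weight_to_device` loop for a module, so it runs once per regularly loaded module rather than once per weight.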
2026-01-24 02:24:27 +08:00 · 2025-12-06 17:14:30 +10:00 · 2025-12-06 17:14:30 +10:00 · deff0ac835
commit deff0ac835
parent d7a0aef650
1 changed files with 2 additions and 0 deletions
--- a/comfy/model_patcher.py
+++ b/comfy/model_patcher.py
@@ -761,6 +761,8 @@ class ModelPatcher:
                    key = "{}.{}".format(n, param)
                    self.unpin_weight(key)
                    self.patch_weight_to_device(key, device_to=device_to)
+                if comfy.model_management.is_device_cuda(device_to):
+                    torch.cuda.synchronize()

                logging.debug("lowvram: loaded module regularly {} {}".format(n, m))
                m.comfy_patched_weights = True