From 4086acf3c2f0ca3a8861b04f6179fa9f908e3e25 Mon Sep 17 00:00:00 2001
From: rattus <46076784+rattus128@users.noreply.github.com>
Date: Sun, 7 Dec 2025 09:42:09 +1000
Subject: [PATCH] Fix on-load VRAM OOM (#11144)

Slow down the CPU on model load so it does not run ahead of the GPU.
This fixes a VRAM OOM on Flux 2 load.

I tried to debug this with the memory trace pickles, which need
--disable-cuda-malloc, but that flag made the bug go away. So I tried
this torch.cuda.synchronize() and it worked.

This has some very complex interactions with the cudaMallocAsync
allocator and I don't have a solid theory on this one yet.

Still debugging, but this gets us over the OOM for the moment.
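
For reference, a minimal standalone sketch of the idea, assuming plain
torch.nn modules moved with .to(). This is not the ComfyUI code path:
load_modules_in_step is a hypothetical name, and the real change calls
comfy.model_management.is_device_cuda() rather than checking
device.type.

```python
import torch


def load_modules_in_step(modules, device_to):
    """Move modules to device_to one at a time, syncing after each on CUDA.

    Hypothetical helper for illustration only.
    """
    device = torch.device(device_to)
    loaded = 0
    for m in modules:
        # Queue the host-to-device copies for this module's weights.
        m.to(device)
        if device.type == "cuda":
            # Block the host until all queued work on the device has
            # finished, so the CPU cannot run ahead and queue the next
            # module's allocations and copies early.
            torch.cuda.synchronize(device)
        loaded += 1
    return loaded
```

On a CPU target the synchronize is skipped and the loop behaves like a
plain per-module .to().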
---
 comfy/model_patcher.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/comfy/model_patcher.py b/comfy/model_patcher.py
index 4f076a6aa..5b1ccb824 100644
--- a/comfy/model_patcher.py
+++ b/comfy/model_patcher.py
@@ -762,6 +762,8 @@ class ModelPatcher:
                     key = "{}.{}".format(n, param)
                     self.unpin_weight(key)
                     self.patch_weight_to_device(key, device_to=device_to)
+                if comfy.model_management.is_device_cuda(device_to):
+                    torch.cuda.synchronize()
 
                 logging.debug("lowvram: loaded module regularly {} {}".format(n, m))
                 m.comfy_patched_weights = True