ComfyUI

mirror of https://git.datalinker.icu/comfyanonymous/ComfyUI synced 2026-06-17 07:46:58 +08:00

Author	SHA1	Message	Date
comfyanonymous	f16219e3aa	Add cheap latent preview for flux 2. (#10907 ) Thank you to the person who calculated them. You saved me a percent of my time.	2025-11-26 04:00:43 -05:00
comfyanonymous	58b8574661	Fix Flux2 reference image mem estimation. (#10905 )	2025-11-26 02:36:19 -05:00
comfyanonymous	bdb10a583f	Fix loras not working on mixed fp8. (#10899 )	2025-11-26 00:07:58 -05:00
comfyanonymous	0e24dbb19f	Adjustments to Z Image. (#10893 )	2025-11-25 19:02:51 -05:00
comfyanonymous	e9aae31fa2	Z Image model. (#10892 )	2025-11-25 18:41:45 -05:00
comfyanonymous	d196a905bb	Lower vram usage for flux 2 text encoder. (#10887 )	2025-11-25 14:58:39 -05:00
comfyanonymous	dff996ca39	Fix crash. (#10885 )	2025-11-25 14:30:24 -05:00
comfyanonymous	6b573ae0cb	Flux 2 (#10879 )	2025-11-25 10:50:19 -05:00
comfyanonymous	015a0599d0	I found a case where this is needed (#10875 )	2025-11-25 03:23:19 -05:00
comfyanonymous	acfaa5c4a1	Don't try fp8 matrix mult in quantized ops if not supported by hardware. (#10874 )	2025-11-25 02:55:49 -05:00
comfyanonymous	b6805429b9	Allow pinning quantized tensors. (#10873 )	2025-11-25 02:48:20 -05:00
comfyanonymous	25022e0b09	Cleanup and fix issues with text encoder quants. (#10872 )	2025-11-25 01:48:53 -05:00
Haoming	b2ef58e2b1	block info (#10844 )	2025-11-24 10:40:09 -08:00
Haoming	6a6d456c88	block info (#10842 )	2025-11-24 10:38:38 -08:00
Haoming	3d1fdaf9f4	block info (#10843 )	2025-11-24 10:30:40 -08:00
comfyanonymous	cbd68e3d58	Add better error message for common error. (#10846 )	2025-11-23 04:55:22 -05:00
comfyanonymous	532938b16b	--disable-api-nodes now sets CSP header to force frontend offline. (#10829 )	2025-11-21 17:51:55 -05:00
comfyanonymous	943b3b615d	HunyuanVideo 1.5 (#10819 ) * init * update * Update model.py * Update model.py * remove print * Fix text encoding * Prevent empty negative prompt Really doesn't work otherwise * fp16 works * I2V * Update model_base.py * Update nodes_hunyuan.py * Better latent rgb factors * Use the correct sigclip output... * Support HunyuanVideo1.5 SR model * whitespaces... * Proper latent channel count * SR model fixes This also still needs timesteps scheduling based on the noise scale, can be used with two samplers too already * vae_refiner: roll the convolution through temporal Work in progress. Roll the convolution through time using 2-latent-frame chunks and a FIFO queue for the convolution seams. * Support HunyuanVideo15 latent resampler * fix * Some cleanup Co-Authored-By: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com> * Proper hyvid15 I2V channels Co-Authored-By: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com> * Fix TokenRefiner for fp16 Otherwise x.sum has infs, just in case only casting if input is fp16, I don't know if necessary. * Bugfix for the HunyuanVideo15 SR model * vae_refiner: roll the convolution through temporal II Roll the convolution through time using 2-latent-frame chunks and a FIFO queue for the convolution seams. Added support for encoder, lowered to 1 latent frame to save more VRAM, made work for Hunyuan Image 3.0 (as code shared). Fixed names, cleaned up code. * Allow any number of input frames in VAE. * Better VAE encode mem estimation. * Lowvram fix. * Fix hunyuan image 2.1 refiner. * Fix mistake. * Name changes. * Rename. * Whitespace. * Fix. * Fix. --------- Co-authored-by: kijai <40791699+kijai@users.noreply.github.com> Co-authored-by: Rattus <rattus128@gmail.com>	2025-11-20 22:44:43 -05:00
comfyanonymous	cb96d4d18c	Disable workaround on newer cudnn. (#10807 )	2025-11-19 23:56:23 -05:00
comfyanonymous	17027f2a6a	Add a way to disable the final norm in the llama based TE models. (#10794 )	2025-11-18 22:36:03 -05:00
comfyanonymous	d526974576	Fix hunyuan 3d 2.0 (#10792 )	2025-11-18 16:46:19 -05:00
comfyanonymous	bd01d9f7fd	Add left padding support to tokenizers. (#10753 )	2025-11-15 06:54:40 -05:00
comfyanonymous	443056c401	Fix custom nodes import error. (#10747 ) This should fix the import errors but will break if the custom nodes actually try to use the class.	2025-11-14 03:26:05 -05:00
comfyanonymous	f60923590c	Use same code for chroma and flux blocks so that optimizations are shared. (#10746 )	2025-11-14 01:28:05 -05:00
rattus	94c298f962	flux: reduce VRAM usage (#10737 ) Clean up a bunch of stack tensors on Flux. This takes me from B=19 to B=22 for 1600x1600 on RTX5090.	2025-11-13 16:02:03 -08:00
contentis	3b3ef9a77a	Quantized Ops fixes (#10715 ) * offload support, bug fixes, remove mixins * add readme	2025-11-12 18:26:52 -05:00
rattus	1c7eaeca10	qwen: reduce VRAM usage (#10725 ) Clean up a bunch of stacked and no-longer-needed tensors on the QWEN VRAM peak (currently FFN). With this I go from OOMing at B=37x1328x1328 to being able to successfully run B=47 (RTX5090).	2025-11-12 16:20:53 -05:00
rattus	18e7d6dba5	mm/mp: always unload re-used but modified models (#10724 ) The partial unloader path in model re-use flow skips straight to the actual unload without any check of the patching UUID. This means that if you do an upscale flow with a model patch on an existing model, it will not apply your patchings. Fix by delaying the partial_unload until after the uuid checks. This is done by making partial_unload a model of partial_load where extra_mem is -ve.	2025-11-12 16:19:53 -05:00
comfyanonymous	1199411747	Don't pin tensor if not a torch.nn.parameter.Parameter (#10718 )	2025-11-11 19:33:30 -05:00
rattus	c350009236	ops: Put weight cast on the offload stream (#10697 ) This needs to be on the offload stream. This reproduced a black screen with low resolution images on a slow bus when using FP8.	2025-11-09 22:52:11 -05:00
comfyanonymous	dea899f221	Unload weights if vram usage goes up between runs. (#10690 )	2025-11-09 18:51:33 -05:00
comfyanonymous	e632e5de28	Add logging for model unloading. (#10692 )	2025-11-09 18:06:39 -05:00
comfyanonymous	2abd2b5c20	Make ScaleROPE node work on Flux. (#10686 )	2025-11-08 15:52:02 -05:00
comfyanonymous	a1a70362ca	Only unpin tensor if it was pinned by ComfyUI (#10677 )	2025-11-07 11:15:05 -05:00
rattus	cf97b033ee	mm: guard against double pin and unpin explicitly (#10672 ) As commented, if you let cuda be the one to detect double pin/unpinning it actually creates an async GPU error.	2025-11-06 21:20:48 -05:00
comfyanonymous	09dc24c8a9	Pinned mem also seems to work on AMD. (#10658 )	2025-11-05 19:11:15 -05:00
comfyanonymous	1d69245981	Enable pinned memory by default on Nvidia. (#10656 ) Removed the --fast pinned_memory flag. You can use --disable-pinned-memory to disable it. Please report if it causes any issues.	2025-11-05 18:08:13 -05:00
comfyanonymous	97f198e421	Fix qwen controlnet regression. (#10657 )	2025-11-05 18:07:35 -05:00
comfyanonymous	c4a6b389de	Lower ltxv mem usage to what it was before previous pr. (#10643 ) Bring back qwen behavior to what it was before previous pr.	2025-11-04 22:47:35 -05:00
contentis	4cd881866b	Use single apply_rope function across models (#10547 )	2025-11-04 20:10:11 -05:00
comfyanonymous	7f3e4d486c	Limit amount of pinned memory on windows to prevent issues. (#10638 )	2025-11-04 17:37:50 -05:00
comfyanonymous	af4b7b5edb	More fp8 torch.compile regressions fixed. (#10625 )	2025-11-03 22:14:20 -05:00
comfyanonymous	0f4ef3afa0	This seems to slow things down slightly on Linux. (#10624 )	2025-11-03 21:47:14 -05:00
comfyanonymous	6b88478f9f	Bring back fp8 torch compile performance to what it should be. (#10622 )	2025-11-03 19:22:10 -05:00
comfyanonymous	e199c8cc67	Fixes (#10621 )	2025-11-03 17:58:24 -05:00
comfyanonymous	0652cb8e2d	Speed up torch.compile (#10620 )	2025-11-03 17:37:12 -05:00
comfyanonymous	958a17199a	People should update their pytorch versions. (#10618 )	2025-11-03 17:08:30 -05:00
comfyanonymous	97ff9fae7e	Clarify help text for --fast argument (#10609 ) Updated help text for the --fast argument to clarify potential risks.	2025-11-02 13:14:04 -05:00
rattus	135fa49ec2	Small speed improvements to --async-offload (#10593 ) * ops: don't take an offload stream if you don't need one * ops: prioritize mem transfer The async offload stream's reason for existence is to transfer from RAM to GPU. The post-processing compute steps are a bonus on the side stream, but if the compute stream is running a long kernel, it can stall the side stream, as it waits to type-cast the bias before transferring the weight. So do a pure xfer of the weight straight up, then do everything bias, then go back to fix the weight type and do weight patches.	2025-11-01 18:48:53 -04:00
comfyanonymous	44869ff786	Fix issue with pinned memory. (#10597 )	2025-11-01 17:25:59 -04:00

1807 Commits