Rattus f43ff9fcbd retune lowVramPatch VRAM accounting
In the lowvram case, the patch now does its math in the model dtype, in
the post-de-quantization domain, so account for that. The patching was
also moved back onto the compute stream, which takes it off-peak, so
relax MATH_FACTOR to only x2 and drop the worst-case assumption that
everything peaks at once.
2025-12-07 22:55:05 +10:00
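The accounting described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `estimate_patch_vram`, the `DTYPE_BYTES` table, and the string dtype names are hypothetical, and only the x2 factor and the use of the model dtype come from the message itself.

```python
# Hypothetical sketch of the estimate; names are illustrative only.
MATH_FACTOR = 2  # relaxed multiplier named in the commit message

# Bytes per element for a few common dtypes (assumed lookup table).
DTYPE_BYTES = {"float32": 4, "float16": 2, "bfloat16": 2, "float8": 1}


def estimate_patch_vram(numel: int, model_dtype: str) -> int:
    """Estimate scratch VRAM (bytes) for patching one weight in lowvram mode.

    The size is taken in the model (compute) dtype, not the storage dtype,
    because the patch math runs after de-quantization.
    """
    return numel * DTYPE_BYTES[model_dtype] * MATH_FACTOR


# A 4096 x 4096 weight stored quantized but computed in bfloat16.
print(estimate_patch_vram(4096 * 4096, "bfloat16"))  # 67108864 (64 MiB)
```

Sizing by the storage dtype instead would understate the budget by the ratio of the two element sizes, for example by half for a float8 weight computed in bfloat16.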