rattus 1c7eaeca10
qwen: reduce VRAM usage (#10725)
Clean up a bunch of stacked and no-longer-needed tensors at the QWEN
VRAM peak (currently the FFN).

With this I go from OOMing at B=37x1328x1328 to being able to
successfully run B=47 (RTX5090).
2025-11-12 16:20:53 -05:00