15 Commits

Author SHA1 Message Date
Jee Jee Li
2f5f9acd55
[LoRA] Continue optimizing MoE LoRA weight loading (#29322)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-11-27 05:56:28 -08:00
JartX
8e38e99829
[Feature] EPLB on Qwen3VLMoe and CompressedTensorsWNA16MoEMethod (#28849)
2025-11-19 18:30:08 -05:00
Harry Mellor
97d1c99302
Rename clashing method names for vLLM model protocol (#27583)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-12 19:14:33 -08:00
Jee Jee Li
daec4d2624
[Model]Improve Qwen3VLMoeForConditionalGeneration packed_modules_mapping (#27096)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-10-17 04:47:00 -07:00
Harry Mellor
8fcaaf6a16
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-12 09:51:31 -07:00
Lukas Geiger
93f2c0aa08
[Models] Improve iteration over layers (#26425)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
2025-10-08 20:48:33 +00:00
Harry Mellor
d6953beb91
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
whx
cbf9221992
[Model] Supplement to PR 24862: Pass param prefix to LLMHead (#25805)
Signed-off-by: whx-sjtu <2952154980@qq.com>
2025-10-03 21:34:53 +08:00
Roger Wang
66bca9b8bd
[MM] Add text-only mode for Qwen3-VL (#26000)
2025-09-30 21:13:42 -07:00
JJJYmmm
471997adf6
[Bugfix] fix Qwen3VLMoe load when pp > 1 (#25838)
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
2025-09-28 17:56:12 +00:00
Isotr0py
d4d9899860
[Quantization] Add field to skip unquantized modules for GPTQ config (#25455)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-26 15:47:41 +00:00
Roger Wang
be874c0201
[Bugfix] Fix Qwen3-VL-MoE weight loading for EP (#25300)
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-09-20 00:04:05 -07:00
Roger Wang
1dfea5f4a9
[Bugfix][Perf] Misc fixes for Qwen3 VL (#25238)
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-09-19 10:46:16 +00:00
Roger Wang
3127274d02
[MM Encoder] Apply DP ViT for Qwen3-VL model series (#24955)
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Huang Jie <92386084+JJJYmmm@users.noreply.github.com>
Co-authored-by: 松灵 <26085463+wulipc@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-17 21:04:21 -07:00
Roger Wang
0f7acdd73c
[Model] Support Qwen3-VL Model Series (#24727)
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Huang Jie <92386084+JJJYmmm@users.noreply.github.com>
Co-authored-by: 松灵 <26085463+wulipc@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-17 05:01:04 +00:00