Ilya Markov
e50c454672
[BugFix] Support EP/DP + EPLB with MTP ( #25311 )
...
Signed-off-by: ilmarkov <markovilya197@gmail.com>
Signed-off-by: Sage Moore <sage@neuralmagic.com>
Co-authored-by: Sage Moore <sage@neuralmagic.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
2025-11-05 15:22:17 +00:00
Harry Mellor
8fcaaf6a16
Update Optional[x] -> x | None and Union[x, y] to x | y ( #26633 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-12 09:51:31 -07:00
bnellnm
47e66c24e2
[Model] Apply shared experts overlap optimization to all models with shared experts ( #26145 )
...
Signed-off-by: Bill Nell <bnell@redhat.com>
2025-10-09 11:31:04 -04:00
Lukas Geiger
93f2c0aa08
[Models] Improve iteration over layers ( #26425 )
...
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
2025-10-08 20:48:33 +00:00
Harry Mellor
d6953beb91
Convert formatting to use ruff instead of yapf + isort ( #26247 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
Anion
f4db5e6de1
[Bugfix][Model] Fix inference for Hunyuan dense models ( #25354 )
...
Signed-off-by: anion <1005128408@qq.com>
Signed-off-by: Anion <123177548+Anionex@users.noreply.github.com>
2025-09-30 14:38:07 +00:00
Harry Mellor
61aedb5ffe
MoveVllmConfig from config/__init__.py to config/vllm.py ( #25271 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-29 19:49:49 -07:00
Cyrus Leung
27d7638b94
[Bugfix] Merge MM embeddings by index instead of token IDs ( #16229 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-09-27 08:15:12 +00:00
Woosuk Kwon
1c3ffdbecc
[V0 Deprecation] Remove V0 sampling metadata ( #25345 )
...
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
2025-09-21 10:37:11 -07:00
YiwenC
9d8a2d86d2
[EPLB] Add EPLB support for hunyuan_v1 ( #23078 )
2025-09-18 04:51:35 +00:00
whx
4a9375fe9d
[Model] Pass param prefix to LLMHead ( #24862 )
...
Signed-off-by: whx-sjtu <2952154980@qq.com>
2025-09-17 16:01:27 +08:00
Jiangyun Zhu
eafa8dcde6
[Model] Add pp support for hunyuan ( #24212 )
...
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-09-04 03:58:26 -07:00
Asher
2671334d45
[Model] add Hunyuan V1 Dense Model support. ( #21368 )
...
Signed-off-by: Asher Zhang <asherszhang@tencent.com>
2025-07-23 03:54:08 -07:00