weichen
831501ec55
Merge 0431508388b8e130a170ef017d760d873a80ee23 into 254f6b986720c92ddf97fbb1a6a6465da8e87e29
2025-12-25 00:06:54 +00:00
Richard Zou
254f6b9867
[Bugfix] Fix eagle dp tests on A100 ( #31241 )
...
Signed-off-by: Richard Zou <zou3519@gmail.com>
2025-12-25 00:05:04 +00:00
Michael Goin
bc5ef333e0
[Perf] Add skip_clone to SamplingParams for internal request handling ( #31041 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-12-24 14:35:57 -08:00
Cyrus Leung
09dc7c690c
[Chore][1/2] Drop v0.14 deprecations ( #31285 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-24 09:54:01 -08:00
ゆり
506eb0f454
[Bugfix] Remove dead block_quant_to_tensor_quant function ( #31294 )
...
Co-authored-by: yurekami <yurekami@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 17:22:48 +00:00
Ning Xie
5d93089686
[cli] complete vllm cli help message ( #31226 )
...
Signed-off-by: Andy Xie <andy.xning@gmail.com>
2025-12-24 15:45:47 +00:00
Kevin McKay
66c9887440
[Bugfix][Hardware][AMD] Fix FP8 dtype in silu_mul quantization ( #31179 )
...
Signed-off-by: c0de128 <kevin.mckay@outlook.com>
2025-12-24 10:37:11 -05:00
wang.yuqi
1ff67df182
[CI] Reorganize pooling_mteb_test ( #31265 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-12-24 23:36:20 +08:00
skaraban3807
7cd288a4b3
[PERF] Add interleaved memory allocation to NUMA module ( #30800 )
2025-12-24 13:47:49 +00:00
Cyrus Leung
d201807339
[Chore] Bump lm-eval version ( #31264 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-24 05:39:13 -08:00
Cyrus Leung
aa3868ecfe
[Chore] Remove unused noqas ( #31263 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-24 05:38:46 -08:00
Cyrus Leung
7adeb4bfa8
[Bugfix] Fix max_model_len="auto" handling ( #31260 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-24 19:15:27 +08:00
wang.yuqi
bd89ce16d2
[Model] Introduce verify_and_update_model_config for VerifyAndUpdateConfig. ( #31131 )
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
2025-12-24 09:54:57 +00:00
Pleaplusone
b41aeb3468
[Bugfix][ROCm] Fix load issue on deepseek quark quantization when shared expert enabled ( #31261 )
...
Signed-off-by: ganyi <ygan@amd.com>
2025-12-24 16:47:44 +08:00
weichen
0431508388
Use request_id as the identifier when removing a request
...
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
weichen
0000d981d2
Add unit test for SJF scheduler policy
...
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Harry Mellor
da9d153112
Delete now empty file
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Harry Mellor
58615e5889
docstring
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Harry Mellor
9e8d9e1231
Consolidate SJF code and remove global variable
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Harry Mellor
53d57d9dca
Remove tuple stuff
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Harry Mellor
601387735c
Fix removal from heap
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Pr0Wh1teGivee
4fe722fae5
abstracting common code to HeapBasedRequestQueue
...
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Pr0Wh1teGivee
cc0a8ae572
naming
...
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Harry Mellor
ac674f6fc7
Move docstring
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Pr0Wh1teGivee
6413793466
Update scheduler.py
...
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Pr0Wh1teGivee
779769ea97
Create __init__.py
...
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Pr0Wh1teGivee
ed2a808252
Update normalized_scorer.py
...
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Pr0Wh1teGivee
b04f678659
linting
...
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Pr0Wh1teGivee
1e8b313afb
linting
...
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Pr0Wh1teGivee
db3e0a576e
linting
...
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Pr0Wh1teGivee
dd0e1224bc
linting
...
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Pr0Wh1teGivee
379eabac7f
linting
...
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Pr0Wh1teGivee
e14d347982
use heap
...
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Pr0Wh1teGivee
0098c3fb93
[Feat][Sched] Add SJF Scheduling Policy
...
Co-authored-by: HiC4Sh1e <chenjie137@huawei.com>
Co-authored-by: JiahongZhang-Work <iscocheung@gmail.com>
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Signed-off-by: weichen <calvin_zhu0210@outlook.com>
2025-12-24 16:30:26 +08:00
Ryan Rock
ddfac7034e
[CI/Build] Ignore data_parallel_size_local ( #30281 )
...
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
2025-12-24 07:40:54 +00:00
Micah Williamson
6559d96796
[ROCm][CI] Set TORCH_NCCL_BLOCKING_WAIT Distributed Tests On ROCm ( #31259 )
...
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2025-12-24 07:19:07 +00:00
kliuae
1c74150bca
[ROCm][CI] Fix "Distributed Tests (H200)" Test ( #31227 )
...
Signed-off-by: kliuae <kuanfu.liu@embeddedllm.com>
2025-12-24 06:56:30 +00:00
Andreas Karatzas
0247a91e00
[ROCm][CI] Fix entrypoints tests and Python-only installation test on ROCm ( #28979 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-23 22:42:30 -08:00
Michael Goin
8ee90c83f8
Add --max-model-len auto to auto-fit context to available memory ( #29431 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-12-23 21:37:14 -08:00
Nick Cao
d7e05ac743
[docker] Fix downloading sccache on aarch64 platform ( #30070 )
...
Signed-off-by: Nick Cao <nickcao@nichi.co>
2025-12-23 21:36:33 -08:00
sihao_li
471ddb99a0
[XPU] Remove distributed_executor_backend check ( #30760 )
...
Signed-off-by: sihao.li <sihao.li@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
2025-12-23 21:34:33 -08:00
Xiong Wang
bb24592d13
[Qwen3-Omni] fixed _get_feat_extract_output_lengths function ( #31007 )
...
Signed-off-by: Xiong Wang <wangxiongts@163.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-12-23 21:33:54 -08:00
Matthew Bonanni
369f47aa0f
[DeepSeek v3.2] Remove unnecessary syncwarps ( #31047 )
...
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2025-12-23 21:33:30 -08:00
zejunchen-zejun
dabff12ed3
[Bugfix][ROCm][Dynamo][DS 3.1][FP8] fix unsupported hasattr call when Dynamo tracing for ROCm device ( #31149 )
...
Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
2025-12-23 21:32:19 -08:00
Ming Yang
3bb9561928
Revert "[bench] Support common prefix len config (for decode-only bench)" ( #31240 )
...
Signed-off-by: Ming Yang <minos.future@gmail.com>
2025-12-23 21:17:23 -08:00
Micah Williamson
3ce791ac77
[ROCm][CI] Set VLLM_FLOAT32_MATMUL_PRECISION="tf32" For terratorch Tests In AMD CI ( #31242 )
...
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2025-12-24 03:21:50 +00:00
Andreas Karatzas
e42894f5b5
[ROCm][CI][Bugfix] Fix Siglip2 rotary embedding dispatch and InternVL video test tolerance ( #31235 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-24 02:56:58 +00:00
Wentao Ye
76e6a95192
[Bug] Fix "Number of dimensions of tensors must match." for Deepseek V3.2 ( #31160 )
...
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-12-24 10:41:09 +08:00
Chao Lei
8b59753cdb
[P/D] Mooncake connector support more protocols ( #30133 )
...
Signed-off-by: LCAIZJ <leichao139636@163.com>
2025-12-24 10:24:07 +08:00
Chen Zhang
538e830caa
[KVEvent] Use request.block_hash for parent block_hash ( #30544 )
...
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
Co-authored-by: Yifan Qiao <yifanqiao@berkeley.edu>
2025-12-23 18:23:43 -08:00