rongfu.leng
|
3779eb8c81
|
[Feature][eplb] add verify ep or tp or dp (#21102)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-07-21 23:41:14 -07:00 |
|
Shu Wang
|
9e23ad9655
|
Update fp4 quantize API (#21327)
Signed-off-by: Shu Wang <shuw@nvidia.com>
|
2025-07-21 23:40:21 -07:00 |
|
Wentao Ye
|
e69a92a1ce
|
[Bug] DeepGemm: Fix Cuda Init Error (#21312)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-07-21 23:36:18 -07:00 |
|
Varun Sundar Rabindranath
|
8425f785ad
|
[Misc] DeepEPHighThroughput - Enable Inductor pass (#21311)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
|
2025-07-21 23:35:45 -07:00 |
|
Konrad Zawora
|
c17231e827
|
Fix kv_cache_dtype handling for out-of-tree HPU plugin (#21302)
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Co-authored-by: Chendi.Xue <chendi.xue@intel.com>
|
2025-07-21 23:35:14 -07:00 |
|
Wentao Ye
|
6e5b5ca580
|
[Refactor] Fix Compile Warning #1444-D (#21208)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-07-21 23:33:51 -07:00 |
|
Thomas Parnell
|
488d8a986a
|
[V1] [Hybrid] Add new test to verify that hybrid views into KVCacheTensor are compatible (#21300)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2025-07-21 23:31:18 -07:00 |
|
Jialin Ouyang
|
af376ca19d
|
[Core] Minimize number of dict lookup in _maybe_evict_cached_block (#21281)
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
|
2025-07-21 22:37:34 -07:00 |
|
Ming Yang
|
e7b2042681
|
Revert "[Performance] Performance improvements in non-blockwise fp8 CUTLASS MoE (#20762)" (#21334)
Signed-off-by: Ming Yang <minos.future@gmail.com>
|
2025-07-21 21:49:01 -07:00 |
|
Ratnam Parikh
|
90f1e55421
|
[Intel GPU] Ray Compiled Graph avoid NCCL for Intel GPU (#21338)
Signed-off-by: ratnampa <ratnam.parikh@intel.com>
|
2025-07-21 21:48:27 -07:00 |
|
Li, Jiang
|
5e70dcd6e6
|
[Doc] Fix CPU doc format (#21316)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-07-21 21:47:49 -07:00 |
|
Chaojun Zhang
|
25d585ab7b
|
[XPU] Enable external_launcher to serve as an executor via torchrun (#21021)
Signed-off-by: chzhang <chaojun.zhang@intel.com>
|
2025-07-21 21:47:35 -07:00 |
|
Lu Fang
|
8d0a01a5f2
|
[v1][sampler] Inplace logprobs comparison to get the token rank (#21283)
Signed-off-by: Lu Fang <lufang@fb.com>
|
2025-07-21 13:47:47 -07:00 |
|
Nick Hill
|
60ae223986
|
Merge remote-tracking branch 'origin/main' into one-pod-per-node-lb
Signed-off-by: Nick Hill <nhill@redhat.com>
# Conflicts:
# vllm/v1/engine/core_client.py
|
2025-07-21 19:20:59 +01:00 |
|
Himanshu Jaju
|
0ec82edda5
|
[perf] Speed up align sum kernels (#21079)
Signed-off-by: Himanshu Jaju <hj@mistral.ai>
|
2025-07-21 11:19:23 -07:00 |
|
Michael Goin
|
005ae9be6c
|
Fix bad lm-eval fork (#21318)
|
2025-07-21 10:47:51 -07:00 |
|
Robert Shaw
|
29d1ffc5b4
|
[DP] Fix Prometheus Logging (#21257)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-21 09:11:35 -07:00 |
|
Lucas Wilkinson
|
304dce7ec0
|
[Attention] Clean up iRoPE in V1 (#21188)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2025-07-21 09:10:30 -07:00 |
|
Ming Yang
|
6ece16c4fe
|
[Misc] Add dummy maverick test (#21199)
Signed-off-by: Ming Yang <minos.future@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-07-21 09:08:09 -07:00 |
|
simpx
|
a0e827e07c
|
[BugFix] make utils.current_stream thread-safe (#21252) (#21253)
Signed-off-by: simpx <simpxx@gmail.com>
|
2025-07-21 09:07:36 -07:00 |
|
Li, Jiang
|
a15a50fc17
|
[CPU] Enable shared-memory based pipeline parallel for CPU backend (#21289)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-07-21 09:07:08 -07:00 |
|
Woosuk Kwon
|
6dda13c86b
|
[Misc] Add sliding window to flashinfer test (#21282)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-07-21 08:37:49 -07:00 |
|
Nick Hill
|
7a793ad562
|
fix data_parallel_hybrid_lb arg default value
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-07-21 15:29:48 +01:00 |
|
Zhiyu
|
6b46c4b653
|
Add Nvidia ModelOpt config adaptation (#19815)
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
|
2025-07-21 10:02:58 -04:00 |
|
Robert Shaw
|
58e4227fec
|
Update vllm/engine/arg_utils.py
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2025-07-21 08:24:34 -04:00 |
|
Ning Xie
|
d97841078b
|
[Misc] unify variable for LLM instance (#20996)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-07-21 12:18:33 +01:00 |
|
Harry Mellor
|
e6b90a2805
|
[Docs] Make tables more space efficient in supported_models.md (#21291)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-21 02:25:02 -07:00 |
|
Harry Mellor
|
be54a951a3
|
[Docs] Fix hardcoded links in docs (#21287)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-07-21 02:23:57 -07:00 |
|
Cyrus Leung
|
042af0c8d3
|
[Model][1/N] Support multiple poolers at model level (#21227)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-21 02:22:21 -07:00 |
|
Cyrus Leung
|
378d33c392
|
[Bugfix] Fix missing placeholder in logger debug (#21280)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-20 22:50:06 -07:00 |
|
Huy Do
|
940af1f03a
|
Add the instruction to run e2e validation manually before release (#21023)
Signed-off-by: Huy Do <huydhn@gmail.com>
|
2025-07-20 22:29:18 -07:00 |
|
Simon Mo
|
92615d7fe8
|
[Docs] Add RFC Meeting to Issue Template (#21279)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-07-20 21:58:07 -07:00 |
|
Kay Yan
|
8188196a1c
|
[CI] Cleanup modelscope version constraint in Dockerfile (#21243)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
|
2025-07-20 20:13:02 -07:00 |
|
Robert Shaw
|
40397e378d
|
finished validating
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-21 00:46:31 +00:00 |
|
Robert Shaw
|
1b481d3489
|
updated
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-21 00:33:24 +00:00 |
|
Robert Shaw
|
e80c015d24
|
updated
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-21 00:29:18 +00:00 |
|
Robert Shaw
|
f53166a963
|
update ux
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-21 00:21:35 +00:00 |
|
Robert Shaw
|
6feb4569fe
|
update ux
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-21 00:21:12 +00:00 |
|
Robert Shaw
|
5f0663bce4
|
cleanup
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-21 00:08:26 +00:00 |
|
Robert Shaw
|
1dcd90065d
|
updated
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-21 00:00:40 +00:00 |
|
Robert Shaw
|
e81c277e6e
|
updated
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-20 23:55:02 +00:00 |
|
Robert Shaw
|
a58892880e
|
updated
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-20 23:53:10 +00:00 |
|
Robert Shaw
|
3c206b1975
|
updated
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-20 23:52:07 +00:00 |
|
Robert Shaw
|
ec86e797da
|
updated
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-20 23:32:49 +00:00 |
|
Robert Shaw
|
d327a6bed5
|
cleanup
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-20 23:26:48 +00:00 |
|
Robert Shaw
|
2d32c2849f
|
stash
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-20 23:10:16 +00:00 |
|
Robert Shaw
|
fe68027a08
|
updated
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-20 22:45:45 +00:00 |
|
Robert Shaw
|
32a35f5d93
|
stash
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-20 22:43:08 +00:00 |
|
Robert Shaw
|
be03d841f6
|
stash
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-20 22:33:17 +00:00 |
|
Robert Shaw
|
91608889a4
|
updated
Signed-off-by: Robert Shaw <robshaw@redhat.com>
|
2025-07-20 22:19:37 +00:00 |
|