Cyrus Leung
|
fba0642704
|
[CI/Build][Doc] Update gte-Qwen2-1.5B-instruct usage (#18683)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
|
2025-05-25 20:27:50 -07:00 |
|
Lukas Geiger
|
6071e989df
|
[Core][Multimodal] Convert PIL Image to array without data copy when hashing (#18682)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-05-25 17:33:35 +00:00 |
|
Cyrus Leung
|
57fd13a707
|
[Bugfix] Fix profiling dummy data for Pixtral (#18677)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-25 14:05:30 +00:00 |
|
Reid
|
3a886bd58c
|
[Misc] small improve (#18680)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-25 06:05:38 -07:00 |
|
Reid
|
35be8fad62
|
[CI/build] fix no regex (#18676)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-25 10:10:51 +00:00 |
|
Yuqi Zhang
|
f2faac745d
|
[Bugfix] Fix cpu usage and cache hit stats reporting on cpu environment (#18674)
Signed-off-by: zzzyq <zhangyuqi94@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-05-25 02:36:06 -07:00 |
|
Reid
|
279f854519
|
[doc] improve readability (#18675)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-25 01:40:31 -07:00 |
|
Reid
|
624b77a2b3
|
[doc] fix broken links (#18671)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-25 01:36:33 -07:00 |
|
Cyrus Leung
|
503f8487c2
|
[Misc] Reduce logs on startup (#18649)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-24 23:03:53 -07:00 |
|
Ning Xie
|
44073a7ac3
|
[BUGFIX] catch subclass first for try...except (#18672)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-05-25 05:34:24 +00:00 |
|
Michael Goin
|
63934543a0
|
Speed up the kernels/quantization/ tests (#18669)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-05-25 05:02:59 +00:00 |
|
Isotr0py
|
75f81750f3
|
[VLM] Initialize video input support for InternVL models (#18499)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-05-25 04:51:25 +00:00 |
|
Mengqing Cao
|
6ab681bcbe
|
[Misc][ModelScope] Change to use runtime VLLM_USE_MODELSCOPE (#18655)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
|
2025-05-25 04:51:21 +00:00 |
|
Chenguang Li
|
cebc22f3b6
|
[Misc]Replace cuda hard code with current_platform in Ray (#14668)
Signed-off-by: noemotiovon <757486878@qq.com>
|
2025-05-24 20:26:31 -07:00 |
|
Ning Xie
|
6c6dcd8611
|
[MISC] correct signature for LoaderFunction (#18670)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-05-24 20:17:47 -07:00 |
|
Seiji Eicher
|
7891fdf0c6
|
[V1] Fix _pickle.PicklingError: Can't pickle <class 'transformers_modules.deepseek-ai.DeepSeek-V2-Lite... (#18640)
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
|
2025-05-24 20:07:20 -07:00 |
|
Woosuk Kwon
|
6825d9a998
|
[BugFix][Spec Decode] Improve Prefix Caching Logic in Speculative Decoding (#18668)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-05-24 17:33:46 -07:00 |
|
Reid
|
b554ab736e
|
[CI/Build] fix permission denied issue (#18645)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-24 16:09:10 +00:00 |
|
Aaron Pham
|
9ea7f1abf3
|
fix(regression): clone from reference items (#18662)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-05-24 15:25:20 +00:00 |
|
Aaron Pham
|
2807271c86
|
[CI] enforce import regex instead of re (#18665)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-05-24 08:04:14 -07:00 |
|
wangxiyuan
|
b9018a3f9f
|
[BugFix] Fix import error for fused_moe (#18642)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2025-05-24 07:53:36 -07:00 |
|
Ning Xie
|
4ceafb6299
|
[MISC] typo fix and clean import (#18664)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-05-24 07:52:09 -07:00 |
|
Cyrus Leung
|
2e6705784f
|
[CI/Build] chmod +x to cleanup_pr_body.sh (#18650)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-24 07:26:45 -07:00 |
|
Cyrus Leung
|
1cb194a018
|
[Doc] Reorganize user guide (#18661)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-24 07:25:33 -07:00 |
|
ztang2370
|
2cd4d58df4
|
[Model] use AutoWeightsLoader for gpt2 (#18625)
Signed-off-by: zt2370 <ztang2370@gmail.com>
|
2025-05-24 13:36:13 +00:00 |
|
Cyrus Leung
|
6d166a8d35
|
[Doc] Add community links (#18657)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-24 06:06:38 -07:00 |
|
Cyrus Leung
|
ef1dd6870f
|
[Doc] Fix indentation problems in V0 Paged Attention docs (#18659)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-24 06:06:35 -07:00 |
|
Mengqing Cao
|
e77dc4bad8
|
[MISC][pre-commit] Add pre-commit check for triton import (#17716)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
|
2025-05-24 20:09:15 +08:00 |
|
Cyrus Leung
|
07458a51ce
|
[Doc] Update README links, mark external links (#18635)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-24 09:57:15 +00:00 |
|
qizixi
|
c1e4a4052d
|
[V1][Spec Decode] Support multi-layer eagle draft model (#18030)
Signed-off-by: qizixi <qizixi@meta.com>
|
2025-05-24 09:45:34 +00:00 |
|
Yuanhao WU
|
a859320575
|
[Model] Add support for Qwen2.5-Omni-7B-AWQ (Qwen2_5OmniForConditionalGeneration) (#18647)
|
2025-05-24 09:15:36 +00:00 |
|
Reid
|
441dc63ac7
|
[Frontend] improve vllm serve --help display (#18643)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-24 07:53:22 +00:00 |
|
qizixi
|
d55e446d13
|
[V1][Spec Decode] Small refactors to improve eagle bookkeeping performance (#18424)
Signed-off-by: qizixi <qizixi@meta.com>
|
2025-05-24 06:51:22 +00:00 |
|
Wenhua Cheng
|
ec82c3e388
|
FIX MOE issue in AutoRound format (#18586)
Signed-off-by: wenhuach21 <wenhua.cheng@intel.com>
|
2025-05-23 22:01:40 -07:00 |
|
Mathieu Borderé
|
45ab403a1f
|
config.py: Clarify that only local GGUF checkpoints are supported. (#18623)
Signed-off-by: Mathieu Bordere <mathieu@letmetweakit.com>
|
2025-05-24 08:46:34 +08:00 |
|
Robert Shaw
|
2b10ba7491
|
[Bugfix][Nixl] Fix Preemption Bug (#18631)
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
|
2025-05-23 23:30:16 +00:00 |
|
Feng XiaoLong
|
4fc1bf813a
|
[Bugfix] Migrate to REGEX Library to prevent catastrophic backtracking (#18454)
Signed-off-by: Crucifixion-Fxl <xmufxl@gmail.com>
Co-authored-by: Crucifixion-Fxl <xmufxl@gmail.com>
|
2025-05-23 16:16:26 -07:00 |
|
Pavani Majety
|
f2036734fb
|
[ModelOpt] Introduce VLLM_MAX_TOKENS_PER_EXPERT_FP4_MOE env var to control blockscale tensor allocation (#18160)
Signed-off-by: Pavani Majety <pmajety@nvidia.com>
|
2025-05-23 15:52:20 -07:00 |
|
Cyrus Leung
|
7d9216495c
|
[Doc] Update references to doc files (#18637)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-23 15:49:21 -07:00 |
|
Michael Goin
|
0ddf88e16e
|
[CI] Enable test_initialization to run on V1 (#16736)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-05-23 15:09:44 -07:00 |
|
Huy Do
|
1645b60196
|
Use prebuilt FlashInfer x86_64 PyTorch 2.7 CUDA 12.8 wheel for CI (#18537)
Signed-off-by: Huy Do <huydhn@gmail.com>
|
2025-05-23 21:17:16 +00:00 |
|
Jiayi Yao
|
2628a69e35
|
[V1] Support Deepseek MTP (#18435)
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Co-authored-by: Rui Qiao <ruisearch42@gmail.com>
|
2025-05-23 10:26:28 -07:00 |
|
Cyrus Leung
|
371f7e4ca2
|
[Doc] Fix broken links and unlinked docs, add shortcuts to home sidebar (#18627)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-23 10:22:40 -07:00 |
|
Cyrus Leung
|
15b45ffb9a
|
[Doc] Avoid documenting dynamic / internal modules (#18626)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-23 09:58:02 -07:00 |
|
Cyrus Leung
|
273cb3b4d9
|
[Doc] Fix top-level API links/docs (#18621)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-23 09:46:56 -07:00 |
|
David Xia
|
8ddd1cf26a
|
[Doc] fix list formatting (#18624)
Signed-off-by: David Xia <david@davidxia.com>
|
2025-05-23 09:41:17 -07:00 |
|
Chen Zhang
|
6550114c9c
|
[v1] Redo "Support multiple KV cache groups in GPU model runner (#17945)" (#18593)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-05-23 09:39:47 -07:00 |
|
Michael Goin
|
9520a989df
|
[Docs] Change mkdocs to not use directory urls (#18622)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-05-23 09:33:21 -07:00 |
|
Harry Mellor
|
3d28ad343f
|
Fix figures in design doc (#18612)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-23 09:09:54 -07:00 |
|
youkaichao
|
6a7988c55b
|
Refactor pplx init logic to make it modular (prepare for deepep) (#18200)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-05-23 23:43:43 +08:00 |
|