Lu Fang
d00dd65cd4
[Doc] Improve the Pull Request template with key components ( #19086 )
...
Signed-off-by: Lu Fang <lufang@fb.com>
2025-06-03 23:44:34 +08:00
Raushan Turganbay
d81edded69
[Bugfix] disable processor cache ( #19068 )
...
Signed-off-by: raushan <raushan@huggingface.co>
2025-06-03 15:06:04 +00:00
Harry Mellor
476844d44c
Fix underscores in dict keys passed via CLI ( #19030 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-06-03 14:39:24 +00:00
Jee Jee Li
4e68ae5e59
[CI/Build] Remove V0 LoRA test ( #19066 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-06-03 14:30:18 +00:00
youkaichao
4e88723f32
[doc] clarify windows support ( #19088 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-06-03 21:42:17 +08:00
Cyrus Leung
118ff92111
[Doc] Update V1 user guide for embedding and enc-dec models ( #19060 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-06-03 02:29:41 -07:00
Isotr0py
ec2dcd80bc
[Misc] Update WeightsMapper for qwen2-vl/qwen2.5-vl ( #19054 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-06-03 09:08:20 +00:00
Jee Jee Li
42243fbda0
[Doc] Add InternVL LoRA support ( #19055 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-06-03 09:08:03 +00:00
Michael Goin
6d18ed2a2e
Update docker docs with ARM CUDA cross-compile ( #19037 )
...
Signed-off-by: mgoin <michael@neuralmagic.com>
2025-06-03 08:21:53 +00:00
Chen Zhang
f32fcd9444
[v1][KVCacheManager] Rename BlockHashType to BlockHash ( #19015 )
...
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-06-03 08:01:48 +00:00
Lu Fang
d32aa2e670
[Bugfix] Use cmake 3.26.1 instead of 3.26 to avoid build failure ( #19019 )
...
Signed-off-by: Lu Fang <lufang@fb.com>
2025-06-03 00:16:17 -07:00
Michael Goin
cc977286e7
Reduce logs in CLI scripts and plugin loader ( #18970 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-06-03 06:00:45 +00:00
Reid
17430e3653
[bugfix] small fix logic issue ( #18999 )
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-03 05:35:12 +00:00
汪志鹏
1282bd812e
Add tarsier model support ( #18985 )
...
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
2025-06-03 13:13:13 +08:00
Rui Qiao
bdce64f236
[V1] Support DP with Ray ( #18779 )
2025-06-02 21:15:13 -07:00
Gregory Shtrasberg
9e6f61e8c3
[ROCm][Build] Clean up the ROCm build ( #19040 )
...
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-06-02 20:47:47 -07:00
Li, Jiang
8655f47f37
[CPU][CI] Re-enable the CPU CI tests ( #19046 )
...
Signed-off-by: jiang.li <jiang1.li@intel.com>
2025-06-02 20:46:47 -07:00
Concurrensee
4ce42f9204
Adding "LoRA Test %N" to AMD production tests ( #18929 )
...
Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu>
2025-06-02 20:46:44 -07:00
Tyler Michael Smith
8a57872b2a
[Bugfix][EP+DP] Use pplx-kernel internode instead of intranode ( #19034 )
...
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-06-03 11:36:51 +08:00
Sage Moore
5f4a501b9a
more fixes
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-03 03:04:53 +00:00
Hyogeun Oh (오효근)
5bc1ad6cee
[Doc] Remove duplicate TOCs during MkDocs migration ( #19021 )
...
Signed-off-by: Zerohertz <ohg3417@gmail.com>
2025-06-02 19:49:48 -07:00
Sage Moore
539c0c3add
first round of fixes
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-03 02:38:44 +00:00
Sage Moore
18e7d6c7b8
Merge branch 'main' of https://github.com/neuralmagic/vllm into lwilkinson/attn-slicing
2025-06-03 00:52:39 +00:00
Siyuan Liu
9112b443a0
[Hardware][TPU] Initial support of model parallelism with single worker using SPMD ( #18011 )
...
Signed-off-by: Siyuan Liu <lsiyuan@google.com>
Co-authored-by: Hossein Sarshar <hossein.sarshar@gmail.com>
Co-authored-by: Chengji Yao <chengjiyao@google.com>
2025-06-03 00:06:20 +00:00
Calvin Chen
c57d577e8d
add an absolute path for run.sh ( #18258 )
...
Signed-off-by: calvin chen <120380290@qq.com>
2025-06-02 19:38:23 +00:00
Sage Moore
2731e8cbcb
temporarily remove enable_microbatching
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 19:30:01 +00:00
Sage Moore
919eef995b
temporarily remove enable_microbatching
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 19:28:58 +00:00
Sage Moore
e34e4411b9
fa format
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 19:17:50 +00:00
Sage Moore
d46397661f
pplx format
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 19:17:15 +00:00
Sage Moore
243eac58a4
forward context format
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 19:16:06 +00:00
Sage Moore
8332924320
dp format
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 19:15:23 +00:00
Sage Moore
d4b502a73a
mla format
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 19:14:19 +00:00
Sage Moore
44a595f6d6
config format
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 19:13:27 +00:00
Sage Moore
92e0cc79a8
format
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 19:04:26 +00:00
Sage Moore
8ea80fca4a
revert offline_inference/basic.py
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 18:05:48 +00:00
Sage Moore
21d9529a79
revert offline_inference/basic.py
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 18:05:26 +00:00
Sage Moore
d6eca0c130
remove modular kernel
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 18:03:21 +00:00
Sage Moore
6645882e95
comment prepare input
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 18:02:23 +00:00
Sage Moore
065816d25f
misc cleanups to prepare for rebase
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 18:01:24 +00:00
Sage Moore
90e46ee5e3
misc cleanups to prepare for rebase
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 18:00:56 +00:00
Gregory Shtrasberg
ca2f6b9c30
[Bugfix][Model] Attempt to fix eagle in V0. ( #18978 )
...
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-06-02 08:15:53 -07:00
Frαnçois
20133cfee2
[Frontend] enable custom logging for the uvicorn server (OpenAI API server) ( #18403 )
...
Signed-off-by: François Paupier <francois.paupier@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-06-02 15:04:23 +00:00
Sage Moore
8f592524cb
misc cleanups to prepare for rebase
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 14:15:52 +00:00
Sage Moore
0323e29153
misc cleanups to prepare for rebase
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-06-02 14:13:30 +00:00
jennyyyyzhen
ebb1ec9318
[Model] enable data parallel for Llama4 vision encoder ( #18368 )
...
Signed-off-by: yzhen <yzhen@devgpu093.cco2.facebook.com>
Co-authored-by: yZhen <yZhen@fb.com>
Co-authored-by: yzhen <yzhen@devgpu093.cco2.facebook.com>
2025-06-02 19:22:54 +08:00
Reid
5b168b6d7a
[doc] add pytest tips ( #19010 )
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-02 11:07:26 +00:00
22quinn
9760fd8f6a
[Core] Support inplace model weights loading ( #18745 )
...
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
2025-06-02 17:38:50 +08:00
Robert Shaw
b9f61e1387
[Bugfix][Nixl] Fix DP Metadata Handshake ( #19008 )
...
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
2025-06-02 03:30:41 +00:00
zhrrr
d6fd3a33b8
[Misc] reuse num_tokens_across_dp of get_dp_padding to avoid unnecessary dp all reduce in set_forward_context ( #18935 )
...
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: zhuhaoran <zhuhaoran.zhr@alibaba-inc.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
2025-06-01 19:41:18 +00:00
Reid
432ec9926e
[doc] wrong output ( #19000 )
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-06-01 11:26:14 +00:00