xinyun/vllm - vllm - 丝路新云 code repository

Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-07-16 22:17:30 +08:00

Author	SHA1	Message	Date
Yeshwanth N	71b1c8b667	[Chore]:Extract math and argparse utilities to separate modules (#27188) Signed-off-by: Yeshwanth Surya <yeshsurya@gmail.com> Signed-off-by: Yeshwanth N <yeshsurya@gmail.com> Signed-off-by: yeshsurya <yeshsurya@gmail.com>	2025-10-26 04:03:32 -07:00
Cyrus Leung	8fb7b2fab9	[Doc] Fix links to GH projects (#27530) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-26 17:55:51 +08:00
Cyrus Leung	be7b55a83d	[Doc] Remove Molmo warning (#27527) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-26 16:22:52 +08:00
Lucia Fang	315b860abe	[bugfix]fix empty prompts for async-engine mode in benchmark throughput (#27494) Signed-off-by: Lucia Fang <fanglu@fb.com>	2025-10-26 08:16:35 +00:00
rongfu.leng	87c41c26ad	[Bugfix] Fix processor initialization for model from modelscope instead of HF (#27461) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-10-26 07:44:31 +00:00
JartX	65d2cf9511	[BUGFIX][ROCM] ViT FlashAttention on ROCm (no GFX9) and contiguous on qwen3vl ROCm TORCH_SDPA (#27190) Signed-off-by: JartX <sagformas@epdcenter.es> Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-10-26 15:08:52 +08:00
Isotr0py	d63cd9ff10	[CI/Build] Use CPU for mm processing test on CI (#27522) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-10-26 13:09:18 +08:00
Cyrus Leung	66a168a197	[CI/Build] Refactor processing tests (#27470) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-10-25 16:14:30 +00:00
Matthew Bonanni	a99564ac5b	[Attention] Add missing kv cache scale setup (#27490) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-10-25 00:12:49 -07:00
Cyrus Leung	4c5f632165	[Misc] Simplify max tokens in multimodal registry (#27500) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-24 23:56:01 -07:00
Kuntai Du	b853540388	[Core][Hybrid allocator + kv connector 1/n] Enable hybrid allocator + KV cache connector (#25712) Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Signed-off-by: Kuntai Du <kuntai@uchicago.edu>	2025-10-24 23:34:18 -07:00
Zhuohan Li	56ed7609a9	Revert "[Misc] Remove use of CUDA_VISIBLE_DEVICES for device selectio… (#27502)	2025-10-25 05:31:43 +00:00
Jiangyun Zhu	29c9cb8007	[CI] Add tests for cudagraph (#27391) Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>	2025-10-25 02:37:33 +00:00
Yihua Cheng	83f478bb19	[KVConnector] Migrate the LMCache integration code to be vLLM native (#25542) Signed-off-by: ApostaC <yihua98@uchicago.edu>	2025-10-25 00:23:53 +00:00
Varun Sundar Rabindranath	269c4db0a4	[Misc][DP] Guard mxfp4 implementation selection (#27484) Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com> Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>	2025-10-24 23:29:24 +00:00
Wentao Ye	52efc34ebf	[Log] Optimize Startup Log (#26740) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-10-24 19:27:04 -04:00
Pengchao Wang	d95d0f4b98	[Distributed] Basic set of configuration for large EP deployment on GB200 (#27328) Signed-off-by: Pengchao Wang <wpc@fb.com> Co-authored-by: Lu Fang <30275821+houseroad@users.noreply.github.com>	2025-10-24 14:16:44 -07:00
Lehua Ding	0402428200	[Perf][Async Scheduling] Remove CPU->GPU sync in dummy_run (#27455) Signed-off-by: Lehua Ding <lehuading@tencent.com>	2025-10-24 20:45:36 +00:00
jinghanhu	17af6aa0da	[Document] Add ms-swift library to rlhf.md (#27469)	2025-10-24 20:31:50 +00:00
Zhewen Li	fc168c33f3	[CI/Build] Fix test_torch_utils in AMD CI (#27317) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-10-24 12:26:00 -07:00
Isotr0py	acc78aeb88	[Bugfix] Fix interns1-vit qk norm code path (#27480) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-10-24 17:43:45 +00:00
Ming Yang	0f67d4d962	[Attention] Add MLA prefill backend: trtllm_ragged_attention_deepseek (#26397) Signed-off-by: Ming Yang <minos.future@gmail.com>	2025-10-24 10:24:08 -07:00
kourosh hakhamaneshi	7e1d697b56	[Bugfix] Fix MultiConnector stats reconstruction across process boundaries (#27366) Signed-off-by: Kourosh Hakhamaneshi <Kourosh@anyscale.com> Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>	2025-10-24 17:08:05 +00:00
Chendi.Xue	699d62e6cf	[NIXL][BUGFIX] delay done_recving queue cleanup to bottom of get_finished (#27297) Signed-off-by: Chendi Xue <chendi.xue@intel.com>	2025-10-24 17:01:41 +00:00
Richard Zou	cd390b609d	[compile] Turn standalone_compile back on (#27460) Signed-off-by: Richard Zou <zou3519@gmail.com>	2025-10-24 16:30:27 +00:00
Fadi Arafeh	2080b05099	[cpu][fix] Fix onednn_mm crash on consecutive matmuls with same M,K,N and different dtype (#27472) Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>	2025-10-24 15:57:48 +00:00
Lifans	6454afec90	[Doc] Fix minor issues in docs/design/metrics.md (#27436) Signed-off-by: Lifan Shen <lifans@meta.com>	2025-10-24 05:40:54 -07:00
Chauncey	41a62564a7	Fix test named tool use (#27458) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-10-24 20:27:45 +08:00
fhl2000	284cc92275	[MISC] `cudagraph_capture_sizes` related improvements (#26016) Signed-off-by: fhl <2410591650@qq.com> Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-24 05:11:05 -07:00
ioana ghiban	435be10db9	Fix AArch64 CPU Docker pipeline (#27331) Signed-off-by: Ioana Ghiban <ioana.ghiban@arm.com>	2025-10-24 05:11:01 -07:00
Cyrus Leung	b7030d962b	[Benchmark] Enable benchmark to run with `encoding_format="bytes"` (#27467) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-24 11:16:50 +00:00
Chauncey	3567816932	[Refactor] move tool parsing logic from protocol.py to the tool parser (#27383) Co-authored-by: Aaron Pham <contact@aarnphm.xyz>	2025-10-24 09:53:23 +00:00
22quinn	e0ef8a2920	[BugFix] Fix torchrun DP with LLM class (#27395) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-10-24 08:11:37 +00:00
Isotr0py	42efe609ba	[MM][Bugfix] Replace `PatchEmbed`'s conv3d to linear layer (#27418) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-24 07:32:47 +00:00
Yu Jiaqi	88d3141ec6	[Docs] remove v1 column for embedding models (#27446) Signed-off-by: piood <2477084691@qq.com> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-10-23 23:55:03 -07:00
Rui Qiao	09a6a49eaf	[Misc] Avoid "PyTorch non-writable tensors" warning in RayPPCommunicator (#27443) Signed-off-by: Rui Qiao <ruisearch42@gmail.com>	2025-10-24 14:53:09 +08:00
strinczer	074475541a	[Bugfix] Fix Pydantic union resolution for ResponseFunctionToolCall in Responses API (#26706) Signed-off-by: Shai Trinczer <strinczer@icloud.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>	2025-10-23 22:53:42 -07:00
Aaron Pham	d4c574c39f	[Chore] remove structural tags logging lines (#27451)	2025-10-24 05:35:45 +00:00
usberkeley	c528b9006a	Fix EventPublisherFactory logic for disabled KV cache events (#27419) Signed-off-by: Bradley <bradley.b.pitt@gmail.com>	2025-10-24 05:00:01 +00:00
fhl2000	85fee74b33	[Bugfix][CI] Move resolving cudagraph_mode before initializing attn_metadata_builder (#27427) Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com>	2025-10-23 20:31:14 -07:00
hfan	8dbe0c527f	[Misc] Add TPU usage report when using tpu_inference. (#27423) Signed-off-by: Hongmin Fan <fanhongmin@google.com>	2025-10-23 20:29:37 -07:00
Xiangyu Li	5cc6bddb6e	[Kernel] Add GPTQv2 format support for low-bit or asymmetric quantization, by adapting gptq_gemm (#26092)	2025-10-23 23:26:13 -04:00
Harry Mellor	1f9460c4c1	Fix pooling adapters for Transformers backend (#27338) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-23 20:23:55 -07:00
xiao-llm	70022ffc00	Granite 4.0 quark quantization support (#26944) Signed-off-by: Xiao YU <Xiao.YU@xilinx.com> Signed-off-by: Xiao Yu <xiao.yu.dc@outlook.com> Co-authored-by: Xiao YU <Xiao.YU@xilinx.com>	2025-10-24 02:14:03 +00:00
Akash kaothalkar	f417746ad7	[Hardware][POWERPC] Disable oneDNN path in vllm/model_executor/layers/utils.py for Powerpc (#27422) Signed-off-by: Akash Kaothalkar <akash.kaothalkar@ibm.com> Co-authored-by: Akash Kaothalkar <akash.kaothalkar@ibm.com>	2025-10-23 21:21:36 +00:00
Yu Jiaqi	0552cfb195	[Model] Siglip Embedding Support (#27324) Signed-off-by: piood <2477084691@qq.com>	2025-10-23 20:19:48 +00:00
Kebe	51dd14ac2b	[Bugfix][DP] Fix creating too many DP Placement Groups (#26880) Signed-off-by: Kebe <mail@kebe7jun.com> Signed-off-by: Rui Qiao <ruisearch42@gmail.com> Co-authored-by: Rui Qiao <ruisearch42@gmail.com>	2025-10-23 20:16:51 +00:00
Matthew Bonanni	dbfbf9f324	[Attention] Fix FlashMLA metadata builder arguments for q_len > 1 (#27368) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-10-23 15:58:15 -04:00
Jonathan Chen	ca76486a16	[Chore] Separate out `vllm.utils.platform_utils.py` (#27374) Signed-off-by: Jonathan <chenleejonathan@gmail.com>	2025-10-23 19:08:06 +00:00
Varun Sundar Rabindranath	a9f55dc588	[Misc] Add triton_kernels dependency (#27370) Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com> Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>	2025-10-23 12:04:14 -07:00


10748 Commits