xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-29 20:07:54 +08:00

Author	SHA1	Message	Date
Matthew Bonanni	8f3616f422	Remove old cutlass mla (#23961 ) Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-09-17 14:31:43 +00:00
samzong	47f670b03b	[Docs] improve code formatting and comments for eliminate griffe build warning. (#25010 ) Signed-off-by: samzong <samzong.lu@gmail.com>	2025-09-17 07:31:20 -07:00
Tao He	dd6a910aac	[Bugfix][Qwen3-Next] fixes the varlen issue in qwen3-next's MTP implementation. (#24957 ) Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>	2025-09-17 21:59:09 +08:00
dolpm	1b962e2457	[fix] lora benchmarks pass no_lora_flag_cpu (#23774 ) Signed-off-by: Dylan Maloy <34420038+dolpm@users.noreply.github.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2025-09-17 21:22:25 +08:00
Aidyn-A	bfe9380161	Apply fixes for CUDA 13 (#24599 ) Signed-off-by: Aidyn-A <aidyn.b.aitzhan@gmail.com>	2025-09-17 09:15:42 -04:00
Li, Jiang	9fccd04e30	[Bugfix] Fix Stream usage in CPU model runner and OneDNN kernel check (#25046 ) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-09-17 05:54:02 -07:00
danielafrimi	252ada5559	Add RADIO Vision Encoder Support to vLLM (#24595 ) Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com> Co-authored-by: root <root@cw-dfw-h100-001-305-026.cm.cluster>	2025-09-17 05:53:30 -07:00
Cyrus Leung	e120533d7a	[Misc] Avoid use of deprecated `AutoModelForVision2Seq` (#25065 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-17 12:19:15 +00:00
Shijun Yin	2b85697031	[BugFix] enable DOTALL to match multi-line tool_call parameters in extract_tool_call_required_streaming (#24668 ) Signed-off-by: Shijun Yin <shijun.yin@outlook.com>	2025-09-17 09:21:18 +00:00
Chauncey	544fe76b95	[Frontend] Support returning all prompt logprobs (#24956 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-09-17 09:03:52 +00:00
Xinyu Chen	bb58dc8c20	[DP] Create placement groups by ray_device_key (#25026 ) Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>	2025-09-17 08:57:25 +00:00
Michael Yao	0fb2551c23	[Docs] Fix griffe warning in base_static_graph.py (#25018 ) Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2025-09-17 08:49:19 +00:00
Zhuohan Li	6c47f6bfa4	[Core] Remove tokenizer group in vLLM (#24078 ) Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>	2025-09-17 08:42:59 +00:00
whx	c15309a730	[Model] Apply SharedFusedMoE to glm4_moe. (#24849 ) Signed-off-by: whx-sjtu <2952154980@qq.com>	2025-09-17 16:02:31 +08:00
whx	4a9375fe9d	[Model] Pass param prefix to LLMHead (#24862 ) Signed-off-by: whx-sjtu <2952154980@qq.com>	2025-09-17 16:01:27 +08:00
Lukas Geiger	03191cd8f0	[Core][MultiModalHasher] Hash images without converting image mode (#24969 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-09-17 00:57:34 -07:00
rouchenzi	b77bf34e53	[EPLB] Support EPLB for Mixtral Model (#22842 ) Signed-off-by: rouchenzi <ruochenwen@gmail.com> Signed-off-by: rouchenzi <40842833+rouchenzi@users.noreply.github.com> Co-authored-by: Bowen Wang <abmfy@icloud.com>	2025-09-17 07:27:34 +00:00
Kunshang Ji	dd39baf717	[XPU] Fix xpu model runner call torch.cuda APIs (#25011 ) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2025-09-17 06:45:25 +00:00
Daniel Serebrenik	43a62c51be	Add more documentation and improve usability of lognormal dist (benchmark_serving_multi_turn) (#23255 ) Signed-off-by: daniels <daniels@pliops.com>	2025-09-17 05:53:17 +00:00
haoyangli-amd	ca2d1925ef	[Rocm] [quantization] Fix quark ptpc moe and add test case (#24649 ) Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com> Co-authored-by: Haoyang Li <haoyang.li@amd.com>	2025-09-16 22:15:13 -07:00
Roger Wang	0f7acdd73c	[Model] Support Qwen3-VL Model Series (#24727 ) Signed-off-by: Roger Wang <hey@rogerw.io> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Huang Jie <92386084+JJJYmmm@users.noreply.github.com> Co-authored-by: 松灵 <26085463+wulipc@users.noreply.github.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-17 05:01:04 +00:00
Woosuk Kwon	5801e49776	[V0 Deprecation] Remove MQLLMEngine (#25019 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>	2025-09-16 21:29:27 -07:00
Russell Bryant	58d4c705a8	[Core] Get num_encoder_tokens from scheduler config (#24989 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-09-16 20:59:07 -07:00
Prashant Gupta	ea3de5ef0d	[misc] fix typo in value error (#24995 ) Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>	2025-09-16 20:58:38 -07:00
Michael Goin	67532a1a68	[UX] Remove "quantization is not fully optimized yet" log (#25012 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-09-16 20:57:51 -07:00
yyzxw	5672ba90bd	[Docs] fix invalid doc link (#25017 ) Signed-off-by: zxw <1020938856@qq.com>	2025-09-16 20:53:23 -07:00
Michael Goin	dd83a157f1	[UX] Enforce valid choices for envs like VLLM_ATTENTION_BACKEND, etc (#24761 ) Signed-off-by: mgoin <mgoin64@gmail.com> Signed-off-by: Michael Goin <mgoin64@gmail.com>	2025-09-16 20:42:23 -07:00
Isotr0py	5a411ef6c4	[Benchmarks] Add MMVU video dataset support and clean up deprecated datasets (#24719 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-17 03:29:43 +00:00
Nick Hill	eeb135eb87	[Core] Use `CpuGpuBuffer` for block table tensors (#24795 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-09-16 19:18:06 -07:00
elvischenv	3059b9cc6b	[Doc] Add --force-overwrite option to generate_cmake_presets.py (#24375 ) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>	2025-09-16 18:45:29 -07:00
Benjamin Bartels	64ad551878	Removes source compilation of nixl dependency (#24874 ) Signed-off-by: bbartels <benjamin@bartels.dev> Signed-off-by: Benjamin Bartels <benjamin@bartels.dev> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Daniele <36171005+dtrifiro@users.noreply.github.com>	2025-09-17 01:33:18 +00:00
Tahsin Tunan	cef32104b4	[FP8] Extend per-token-group quantization support to QuantFP8 (#24342 ) Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com> Signed-off-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Luka Govedič <lgovedic@redhat.com>	2025-09-16 18:31:06 -07:00
Michael Goin	493b10f8bf	[CI] GPT-OSS GPQA eval test for Blackwell (#24920 ) Signed-off-by: mgoin <mgoin64@gmail.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-09-16 18:13:21 -07:00
Matthew Bonanni	d119fc8614	[CI][Bugfix] Fix failing Blackwell test (#24993 ) Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>	2025-09-16 15:55:02 -07:00
Michael Goin	dbebb7f812	[Perf] Reuse workspace for FP8+FP4 Marlin MoE (#20500 ) Signed-off-by: mgoin <mgoin64@gmail.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>	2025-09-16 15:45:10 -06:00
Aleksandr Malyshev	3053a22b33	fp8 kv cache support fix for torch.compile (#22758 ) Signed-off-by: Aleksandr Malyshev <maleksan@amd.com> Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com> Co-authored-by: Aleksandr Malyshev <maleksan@amd.com> Co-authored-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com> Co-authored-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com>	2025-09-16 21:27:11 +00:00
Andrew Sansom	02d4b85454	Use kwargs for long lists of `EngineCoreRequest` arguments in tests and fix extra kwargs (#24987 ) Signed-off-by: Andrew Sansom <andrew@protopia.ai>	2025-09-16 14:06:56 -07:00
Andrew Xia	86daa875fe	[gpt-oss][1][bugfix] fix streaming final output (#24466 ) Signed-off-by: Andrew Xia <axia@meta.com>	2025-09-16 13:56:16 -06:00
Concurrensee	dcf2f3ec06	[ROCm] Add dependencies for ROCm (#24900 ) Signed-off-by: Yida Wu <yida.wu@amd.com>	2025-09-16 19:49:06 +00:00
Chen Zhang	218454b9b2	[MISC] Add code owners of vllm/v1 to vllm/v1/core (#24928 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-09-16 19:07:34 +00:00
Andrew Xia	f4d6eb95cf	[gpt-oss][1b] streaming add item id, content id (#24788 ) Signed-off-by: Andrew Xia <axia@meta.com>	2025-09-16 18:41:12 +00:00
Sugar	cd1f885bcf	Directly get max encoder len from VLLM config in V1 (#24866 ) Signed-off-by: Sugar-zsg <952242923@qq.com>	2025-09-16 17:52:31 +00:00
Isotr0py	d593cf28fa	[Misc] Add removed encoder-decoder models to previously supported models list (#24961 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-16 10:46:46 -07:00
lianyibo	faa7a5daac	[Bugfix] Fix unable to run encoder model when disable_hybrid_kv_cache_manager is true (#24571 ) Signed-off-by: lianyibo <lianyibo1@kunlunit.com> Co-authored-by: Chen Zhang <zhangch99@outlook.com>	2025-09-16 17:36:58 +00:00
Sage Moore	567939953b	[Core/DBO][1/N] Add Dual-Batch Overlap mechanism to VLLM (#23693 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> Signed-off-by: Sage Moore <sage@neuralmagic.com> Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com> Signed-off-by: yewentao256 <zhyanwentao@126.com> Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com> Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com> Co-authored-by: yewentao256 <zhyanwentao@126.com> Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>	2025-09-16 12:21:48 -04:00
Lukas Geiger	08369289af	[Core][MultiModalHasher] Don't convert memoryviews to bytes during hashing (#24925 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-09-16 15:32:47 +00:00
Chih-Chieh Yang	73cfb3c5ee	[Model] Clean up and simplify Mamba2 Metadata Usage in both V0 and V1 (#24331 ) Signed-off-by: Chih-Chieh-Yang <7364402+cyang49@users.noreply.github.com>	2025-09-16 14:53:43 +00:00
Ming Yang	4e5affeaa1	[CI] Add Decode Context Parallelism (DCP) test to CI (#24487 ) Signed-off-by: Ming Yang <minos.future@gmail.com>	2025-09-16 21:21:28 +08:00
TeeKen Lau	e4f0b4cd96	(doc): set cmake c++ compatible standard when building on MacOS CPU. (#23483 ) Signed-off-by: teekenl <teekenlau@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-09-16 06:08:46 -07:00
liangwen12year	de3e53a75b	feat: Add Grafana and Perces monitoring dashboards for vLLM (#23498 )	2025-09-16 05:53:40 -07:00

9852 Commits