xinyun/vllm - vllm - 丝路新云 code repository

Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-07-10 10:47:09 +08:00

Author	SHA1	Message	Date
bnellnm	dc2979c585	[Kernels] Overlap shared experts with combine instead of dispatch (#24254) Signed-off-by: Bill Nell <bnell@redhat.com>	2025-09-18 12:10:21 +08:00
toncao	027d37df38	[Bugfix][Qwen3-Next] add prefixes to shared_expert in qwen3-next and mlp in qwen2moe to successfully load ignored params in quantized models (#24960) Signed-off-by: toncao <cpatonn@gmail.com> Co-authored-by: toncao <cpatonn@gmail.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2025-09-18 12:08:50 +08:00
Lukas Geiger	b98219670f	[Core][MM] Cleanup `MultiModalCache` (#25006) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-09-17 21:08:41 -07:00
Harry Mellor	32baf1d036	[Docs] Clean up the contributing README (#25099) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-09-17 21:05:18 -07:00
Roger Wang	3127274d02	[MM Encoder] Apply DP ViT for Qwen3-VL model series (#24955) Signed-off-by: Roger Wang <hey@rogerw.io> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Huang Jie <92386084+JJJYmmm@users.noreply.github.com> Co-authored-by: 松灵 <26085463+wulipc@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-17 21:04:21 -07:00
bnellnm	4ac510f484	[Kernels] Enable DeepGEMM by default (#24462) Signed-off-by: Bill Nell <bnell@redhat.com>	2025-09-17 20:19:52 -07:00
Woosuk Kwon	7fb2a5be28	[V0 Deprecation] Skip PP test (#25128) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-17 20:18:36 -07:00
Woosuk Kwon	6c036615dc	[V0 Deprecation] Remove misc V0 tests (#25118) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-17 19:41:55 -07:00
Woosuk Kwon	2fc24e94f9	[V0 Deprecation] Remove V0 Tracing & Metrics tests (#25115) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-17 19:40:44 -07:00
Woosuk Kwon	2c3c1bd07a	[V0 Deprecation] Remove V0 Engine tests (#25114) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-17 19:38:09 -07:00
bnellnm	5963b98b46	[Kernel] Delegate construction of FusedMoEQuantConfig to FusedMoEMethodBase subclasses (#22537) Signed-off-by: Bill Nell <bnell@redhat.com>	2025-09-17 17:43:31 -06:00
elvischenv	e6585ddb45	[Bugfix] Fix accuracy issue for silu_mul + nvfp4 quant fusion kernel (#24833) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>	2025-09-17 16:37:23 -07:00
Karan Goel	2a4d6412e6	Add a batched auto tune script (#25076) Signed-off-by: Karan Goel <karangoel@google.com> Signed-off-by: Karan Goel <3261985+karan@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-09-17 22:41:18 +00:00
elvischenv	e67a79db03	[Bugfix] Refactor Flashinfer TRTLLM attention kernel selection logic (#24600) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>	2025-09-17 15:36:29 -07:00
Michael Goin	9f882d8791	Disable failing GPT-OSS Eval (Blackwell) for now (#25107) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-09-17 15:36:00 -07:00
Douglas Lehr	1a456c7c90	Aiter mha fp8 fix (#24991) Signed-off-by: Doug Lehr <douglehr@amd.com> Co-authored-by: Doug Lehr <douglehr@amd.com>	2025-09-17 22:29:14 +00:00
Alexander Matveev	fedb75fa27	[Bugfix][B200] Fix `cutlass_mla` hang (#24966) Signed-off-by: Alexander Matveev <amatveev@redhat.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>	2025-09-17 18:06:38 -04:00
Andrew Xia	bff2e5f1d6	[gpt-oss][2] fix types for streaming (#24556) Signed-off-by: Andrew Xia <axia@meta.com>	2025-09-17 22:04:28 +00:00
czhu-cohere	3c068c637b	[Kernel] Faster pre-processing time for W4A8 (#23972) Signed-off-by: czhu-cohere <conway.zhu@cohere.com>	2025-09-17 14:35:32 -07:00
ahao-anyscale	f20c3b0951	[BUG] Exclude .pth files when pulling remote files (#25092) Signed-off-by: ahao-anyscale <ahao@anyscale.com>	2025-09-17 20:42:09 +00:00
Mohammad Miadh Angkad	883131544f	[Bugfix] Update import path for bc_linter_include (#24766) Signed-off-by: Mohammad Miadh Angkad <mangkad.bsdsba2027@aim.edu>	2025-09-17 20:33:11 +00:00
Yihua Cheng	ee5fd49150	[Misc] Update owners for KV connector and V1 offloading (#25041) Signed-off-by: ApostaC <yihua98@uchicago.edu>	2025-09-17 12:37:29 -07:00
afeldman-nm	7ae9887542	[V1] Logits processor docs (#22919) Signed-off-by: Andrew Feldman <afeldman@redhat.com> Signed-off-by: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com> Co-authored-by: Joseph Marinier <Joseph.Marinier@gmail.com>	2025-09-17 11:53:12 -07:00
Michael Goin	e3db5ebb66	[CI Bugfix] Fix failing test_model_load_with_params tests due to tokenizer refactor (#25086) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-09-17 11:15:05 -07:00
Woosuk Kwon	9d442b7c48	[V0 Deprecation] Remove V0 tests in test_sequence.py (#25088) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-17 11:08:45 -07:00
Woosuk Kwon	eb68c2dcd9	[CI] Revert back prepare_prompts and check_answers (#25087) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-17 11:03:16 -07:00
Michael Goin	8b32464ac1	Change log level from info to debug for IOProcessor (#24999) Signed-off-by: Michael Goin <mgoin64@gmail.com>	2025-09-17 10:21:28 -07:00
Woosuk Kwon	99cc41ad50	[V0 Deprecation] Remove unused output processor util (#25023) Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>	2025-09-17 09:50:07 -07:00
Simon Mo	d6a518fdde	Remove unused find_cuda_init helper script (#25044)	2025-09-17 09:47:40 -07:00
Simon Mo	4aa8c7b047	cleanup: remove adapter commons (#25045) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2025-09-17 16:46:29 +00:00
Woosuk Kwon	4b946d693e	[V0 Deprecation] Remove V0 Core tests (#25082) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-17 09:32:42 -07:00
Michael Goin	087c6ffc92	[CI Bugfix] Fix failing test_invalid_env (#25078) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-09-17 08:28:58 -07:00
samzong	4a2d33e371	[Docs] vllm/benchmarks/datasets.py fix docstring param format. (#24970) Signed-off-by: samzong <samzong.lu@gmail.com>	2025-09-17 08:11:51 -07:00
Matthew Bonanni	8f3616f422	Remove old cutlass mla (#23961) Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-09-17 14:31:43 +00:00
samzong	47f670b03b	[Docs] improve code formatting and comments for eliminate griffe build warning. (#25010) Signed-off-by: samzong <samzong.lu@gmail.com>	2025-09-17 07:31:20 -07:00
Tao He	dd6a910aac	[Bugfix][Qwen3-Next] fixes the varlen issue in qwen3-next's MTP implementation. (#24957) Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>	2025-09-17 21:59:09 +08:00
dolpm	1b962e2457	[fix] lora benchmarks pass no_lora_flag_cpu (#23774) Signed-off-by: Dylan Maloy <34420038+dolpm@users.noreply.github.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2025-09-17 21:22:25 +08:00
Aidyn-A	bfe9380161	Apply fixes for CUDA 13 (#24599) Signed-off-by: Aidyn-A <aidyn.b.aitzhan@gmail.com>	2025-09-17 09:15:42 -04:00
Li, Jiang	9fccd04e30	[Bugfix] Fix Stream usage in CPU model runner and OneDNN kernel check (#25046) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-09-17 05:54:02 -07:00
danielafrimi	252ada5559	Add RADIO Vision Encoder Support to vLLM (#24595) Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com> Co-authored-by: root <root@cw-dfw-h100-001-305-026.cm.cluster>	2025-09-17 05:53:30 -07:00
Cyrus Leung	e120533d7a	[Misc] Avoid use of deprecated `AutoModelForVision2Seq` (#25065) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-17 12:19:15 +00:00
Shijun Yin	2b85697031	[BugFix] enable DOTALL to match multi-line tool_call parameters in extract_tool_call_required_streaming (#24668) Signed-off-by: Shijun Yin <shijun.yin@outlook.com>	2025-09-17 09:21:18 +00:00
Chauncey	544fe76b95	[Frontend] Support returning all prompt logprobs (#24956) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-09-17 09:03:52 +00:00
Xinyu Chen	bb58dc8c20	[DP] Create placement groups by ray_device_key (#25026) Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>	2025-09-17 08:57:25 +00:00
Michael Yao	0fb2551c23	[Docs] Fix griffe warning in base_static_graph.py (#25018) Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2025-09-17 08:49:19 +00:00
Zhuohan Li	6c47f6bfa4	[Core] Remove tokenizer group in vLLM (#24078) Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>	2025-09-17 08:42:59 +00:00
whx	c15309a730	[Model] Apply SharedFusedMoE to glm4_moe. (#24849) Signed-off-by: whx-sjtu <2952154980@qq.com>	2025-09-17 16:02:31 +08:00
whx	4a9375fe9d	[Model] Pass param prefix to LLMHead (#24862) Signed-off-by: whx-sjtu <2952154980@qq.com>	2025-09-17 16:01:27 +08:00
Lukas Geiger	03191cd8f0	[Core][MultiModalHasher] Hash images without converting image mode (#24969) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-09-17 00:57:34 -07:00
rouchenzi	b77bf34e53	[EPLB] Support EPLB for Mixtral Model (#22842) Signed-off-by: rouchenzi <ruochenwen@gmail.com> Signed-off-by: rouchenzi <40842833+rouchenzi@users.noreply.github.com> Co-authored-by: Bowen Wang <abmfy@icloud.com>	2025-09-17 07:27:34 +00:00

9585 Commits