xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-26 08:57:13 +08:00

Author	SHA1	Message	Date
Cyrus Leung	d346ec695e	[CI/Build] Consolidate model loader tests and requirements (#25765 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-26 21:45:20 -07:00
Wentao Ye	c242c98031	[Bugfix] Allow Only SDPA Backend for ViT on B200 for Qwen3-VL (#25788 )	2025-09-26 20:44:52 -07:00
WeiQing Chen	f1d53d150c	[Multimodal][Speculative Decoding]Eagle Eagle3 mm support, enablement on qwen2.5vl (#22872 ) Signed-off-by: Junhong <liujunhong11@huawei.com> Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com> Co-authored-by: Junhong <liujunhong11@huawei.com> Co-authored-by: LJH-LBJ <98734602+LJH-LBJ@users.noreply.github.com>	2025-09-27 03:35:47 +00:00
Bram Wasti	dc48ba0c75	Kernel-override Determinism [1/n] (#25603 ) Signed-off-by: Bram Wasti <bwasti@meta.com>	2025-09-26 16:59:09 -07:00
Michael Goin	f708bd4904	[CI] Add E2E Blackwell Quantized MoE Test (#25723 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-09-26 12:23:00 -07:00
阿丹(adan)	33f6aaf972	Eagle3 that supports the Minicpm3 model (#24243 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: liudan <adan@minicpm.com> Co-authored-by: liudan <liudan@qq.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Lucia Fang <116399278+luccafong@users.noreply.github.com>	2025-09-26 10:04:57 -07:00
Isotr0py	d4d9899860	[Quantization] Add field to skip unquantized modules for GPTQ config (#25455 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-26 15:47:41 +00:00
Chih-Chieh Yang	2b6b1d7809	[Model] Mamba2 varlen refactor (#21467 ) Signed-off-by: Chih-Chieh-Yang <7364402+cyang49@users.noreply.github.com> Co-authored-by: RishiAstra <40644327+RishiAstra@users.noreply.github.com>	2025-09-26 11:31:14 +00:00
Sage Moore	dfb9af2014	[Bugfix] Fix Shared Expert/Zero expert code in FusedMoE.process_chunk (#25698 ) Signed-off-by: Sage Moore <sage@neuralmagic.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>	2025-09-26 01:25:28 -07:00
Tao He	99b3a504c5	[Qwen3-Next][GDN] fixes cuda graph capturing bug in GDN metadata and a stride bug in causal_conv_1d. (#25743 ) Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>	2025-09-26 01:18:58 -07:00
xaguilar-amd	52621c8f5c	[Harware][AMD][Model] Triton MoE tuning configs for GLM-4.5 for MI300X (#25703 ) Signed-off-by: xaguilar <Xavier.AguilarFruto@amd.com>	2025-09-26 01:18:20 -07:00
Eugene Khvedchenya	392edee34a	EVS Support (Video tokens pruning) (#22980 ) Signed-off-by: Eugene Khvedchenia <ekhvedchenia@nvidia.com> Signed-off-by: Eugene Khvedchenya <ekhvedchenya@gmail.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-09-26 11:54:54 +08:00
Aleksandr Malyshev	53a30845be	Llamas 3.1 405B fp4 changes upstreaming from 355_wip (#25135 ) Signed-off-by: Aleksandr Malyshev <maleksan@amd.com> Co-authored-by: Aleksandr Malyshev <maleksan@amd.com> Co-authored-by: Doug Lehr <douglehr@amd.com>	2025-09-25 19:16:53 -06:00
Wentao Ye	9fe4c2bdb9	[Refactor] Remove DeepGEMM OP Register (#25710 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-09-25 20:13:41 -04:00
Shu Wang	081b5594a2	Fix routing_bias dtype (#25711 ) Signed-off-by: Shu Wang. <shuw@nvidia.com>	2025-09-25 23:35:14 +00:00
tomeras91	57329a8c01	[Model] rename NemotronH_Nano_VL -> NemotronH_Nano_VL_V2 (#25708 ) Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>	2025-09-25 16:10:29 -07:00
Cyrus Leung	89fa54e6f7	[Optimization] Use a cheaper cache key in `get_model_architecture` (#25682 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-25 17:54:20 -04:00
Matthew Bonanni	3468f17ebe	[V0 deprecation] Remove _VLLM_V1 suffixes from attention backend names (#25489 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>	2025-09-25 17:37:50 +00:00
Cyrus Leung	0ea80c87d9	[Model] Define `merge_by_field_config` MM interface (#25676 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-25 17:13:07 +00:00
Michael Goin	916bd9204d	Revert "[Bug] Dynamo Unsupported due to `BasevLLMParameter.torch_function` calling disabled super()" (#25681 ) Signed-off-by: yewentao256 <zhyanwentao@126.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>	2025-09-25 09:45:06 -07:00
Isotr0py	03858e6d1c	[Bugfix] Fix InternS1 video processing after Transformers v4.56 (#25644 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-25 14:46:04 +00:00
Cyrus Leung	12c1287d64	[mypy] Further improve MM type annotations (#25654 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-25 10:57:36 +00:00
Isotr0py	17b4c6685c	[Bugfix] Fix Qwen3-VL max_num_video_tokens calculation for video profiling (#25648 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-25 18:36:01 +08:00
Roger Wang	7be9ffcd9f	[Misc] Fix Qwen3-VL `video_grid_thw` typing (#25646 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-09-25 10:16:45 +00:00
Tyler Michael Smith	1260180c67	Revert "[Performance] Move apply_w8a8_block_fp8_linear to an op class… (#25607 ) Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>	2025-09-25 08:05:21 +00:00
Jacob Kahn	bc092ea873	Map CwmForCausalLM to llama and LlamaForCausalLM (#25611 ) Signed-off-by: Jacob Kahn <jacobkahn1@gmail.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-09-25 07:37:03 +00:00
Cyrus Leung	755ed7b05b	[Misc] Simplify PoolerOutput and move to `v1/outputs` (#25629 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-25 06:47:03 +00:00
XuruiYang	845adb3ec6	[Model] Add LongCat-Flash (#23991 ) Signed-off-by: yangxurui <yangxurui@meituan.com> Co-authored-by: yangxurui <yangxurui@meituan.com>	2025-09-24 21:53:40 -07:00
Saman A. Pour	90b139cfff	Enable Fbgemm NVFP4 on Dense models (#25609 ) Signed-off-by: Saman Keon <samanamp@outlook.com>	2025-09-24 21:12:53 -07:00
Wentao Ye	4492e3a554	[Bug] Dynamo Unsupported due to `BasevLLMParameter.torch_function` calling disabled super() (#25613 ) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-09-24 18:52:52 -07:00
Wei Wei	05c19485a5	[Kernel] Support DCP for Triton backend (#25132 ) Signed-off-by: Wei Wei <wwei6@meta.com>	2025-09-24 18:09:34 -07:00
Jee Jee Li	52d0cb8458	[Model] Improve DotsOCRForCausalLM (#25466 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-09-25 07:58:08 +08:00
Wentao Ye	1f29141258	[Refactor] Use DeepGEMM Col Major TMA Aligned Tensor (#25517 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-09-24 18:52:36 -04:00
Duncan Moss	6160ba4151	feat: BF16 FlashInfer Fused Cutlass MOE for Hopper and Blackwell Expert Parallel (#25503 ) Signed-off-by: Duncan Moss <djm.moss@gmail.com>	2025-09-24 18:50:04 -04:00
Harry Mellor	8c853050e7	[Docs] Enable `fail_on_warning` for the docs build in CI (#25580 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-09-24 19:30:33 +00:00
Shu Wang	54e42b72db	Support mnnvl all2allv from Flashinfer (#21003 ) Signed-off-by: Shu Wang <shuw@nvidia.com> Signed-off-by: Shu Wang. <shuw@nvidia.com> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com> Co-authored-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>	2025-09-24 14:38:16 -04:00
Cyrus Leung	9313be5017	[Misc] Improve type annotations for jsontree (#25577 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-24 22:49:58 +08:00
Woosuk Kwon	2e19a848d4	[V0 Deprecation] Remove max_seq_len_to_capture (#25543 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-24 01:51:39 -07:00
Cyrus Leung	6488f3481b	[Misc]] Move processing context to multimodal directory (#25548 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-24 08:15:00 +00:00
Isotr0py	27ec3c78f3	[CI/Build] Fix v1 OOT registration test (#25547 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-24 08:03:13 +00:00
Li, Jiang	1cbcfb94de	[Bugfix][CPU] Skip unsupported custom op register on CPU (#25534 ) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-09-24 06:21:51 +00:00
Corey Lowman	d747c2ef18	[Perf] Fix jit compiles at runtime of fla gated delta rule (#25432 ) Co-authored-by: Michael Goin <mgoin64@gmail.com>	2025-09-24 11:16:13 +08:00
Yong Hoon Shin	77d906995c	[KV sharing] Re-land Gemma3n model changes from #22628 (#24357 ) Signed-off-by: Yong Hoon Shin <yhshin@meta.com>	2025-09-23 19:25:34 -07:00
Nikhil Gupta	359d293006	[fix]: add Arm 4bit fused moe support (#23809 ) Signed-off-by: Nikhil Gupta <nikhil.gupta2@arm.com>	2025-09-24 01:32:22 +00:00
Kyle Sayers	de94289a98	[Core] Support weight_loader_v2 for `UnquantizedLinearMethod` (#23036 ) Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>	2025-09-23 18:30:26 -06:00
Wentao Ye	88d7bdbd23	[Bug] Fix AttributeError: 'FusedMoE' object has no attribute 'w13_weight_scale'. Did you mean: 'w13_weight_scale_inv' (#25519 ) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-09-24 00:07:51 +00:00
Chenxi Yang	0d235b874a	Add CUTLASS FP8 MOE benchmark scripts and kernel config (#25302 ) Signed-off-by: Chenxi Yang <cxyang@fb.com> Co-authored-by: Chenxi Yang <cxyang@fb.com>	2025-09-23 18:07:42 -06:00
Juan Villamizar	bde2a1a8a4	[ROCm] Small functional changes for gptoss (#25201 ) Signed-off-by: jpvillam <jpvillam@amd.com> Co-authored-by: jpvillam <jpvillam@amd.com>	2025-09-23 23:39:50 +00:00
Thomas Parnell	5e25b12236	[Kernel] [Mamba] Remove BLOCK_H=1 from list of tuneable configurations for `_chunk_cumsum_fwd_kernel` (#25197 ) Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com> Co-authored-by: Chih-Chieh-Yang <chih.chieh.yang@ibm.com>	2025-09-23 23:23:30 +00:00
Michael Goin	7361ab379f	Remove redundant mutates_args and dispatch_key for direct_register_custom_op (#25512 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-09-23 22:48:40 +00:00

2818 Commits