xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-17 20:47:16 +08:00

Author	SHA1	Message	Date
Zhewen Li	f8b19c0ffd	[Bugfix] Fix GPT-OSS on AMD after #28603 (#28816 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-11-17 13:15:26 -05:00
Nick Hill	637f292196	[CI] Fix broken pipeline (#28781 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-15 08:44:14 -08:00
Angela Yi	f36292dbee	[compile] Enable sequence parallelism matching w/o custom ops enabled (#27126 ) Signed-off-by: angelayi <yiangela7@gmail.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: ProExpertProg <lgovedic@redhat.com> Co-authored-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <luka.govedic@gmail.com>	2025-11-15 11:46:12 +00:00
Kunshang Ji	da14ae0fad	[XPU][CI]disable lm cache uts (#28696 ) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2025-11-14 03:15:50 +00:00
Bradley D	b39a5026eb	[ci][amd] fix basic models extra init test (#28676 ) Signed-off-by: Bradley Davis <bradleyhd@meta.com>	2025-11-14 02:44:36 +00:00
Alexei-V-Ivanov-AMD	f2b8e1c551	Mirrored test group definitions for AMD (2025-11-11) (#28573 ) Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>	2025-11-14 00:16:34 +00:00
Yanan Cao	262d263f6c	[Bugfix] Eliminate tuple inputs to submodules in graph partitioning (#28533 ) Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>	2025-11-13 15:09:05 -05:00
Nick Hill	8832fff972	[BugFix] Fix `mm_encoder_attn_backend` arg type checking (#28599 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-13 03:06:03 +00:00
Harry Mellor	51c599f0ec	Skip models that cannot currently init on Transformers v5 (#28471 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-12 23:43:57 +00:00
Harry Mellor	a742134cc5	Remove deprecated fields from `CompilationConfig` (#27593 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-12 16:10:28 +00:00
Huamin Li	c748355e0d	[CI] Introduce autorun_on_main feature (#27836 ) Signed-off-by: Huamin Li <3ericli@gmail.com>	2025-11-12 08:51:19 +00:00
Andreas Karatzas	9f0247cfa4	`VLLM_USE_TRITON_FLASH_ATTN` V0 variable deprecation (#27611 ) Signed-off-by: Andreas Karatzas <akaratza@amd.com> Signed-off-by: Andreas Karatzas <Andreas.Karatzas@amd.com>	2025-11-11 18:34:36 -08:00
Li, Jiang	7f829be7d3	[CPU] Refactor CPU attention backend (#27954 ) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-11-12 09:43:06 +08:00
wangxiyuan	e1710393c4	[[V0 deprecation]]Remove VLLM_USE_V1 env (#28204 ) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-11-11 18:22:16 -07:00
zhrrr	68c09efc37	[Kernel][Perf] fuse QK Norm and RoPE into one cuda kernel for Qwen Model (#27165 ) Signed-off-by: zhuhaoran <zhuhaoran.zhr@alibaba-inc.com>	2025-11-11 12:00:31 -05:00
usberkeley	3143eb23fc	[BugFix] Add test_outputs.py to CI pipeline (#28466 ) Signed-off-by: Bradley <bradley.b.pitt@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-11 16:01:30 +00:00
Matthew Bonanni	b30dfa03c5	[Attention] Refactor CUDA attention backend selection logic (#24794 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-11-11 07:40:44 -05:00
Adrian Abeyta	a5a790eea6	[Bugfix] Ensure calculated KV scales are applied in attention. (#27232 ) Signed-off-by: adabeyta <aabeyta@redhat.com>	2025-11-10 23:42:37 +00:00
Ilya Markov	d17ecc6b19	[PERF] Allreduce fusion. Support torch native matching. Tuning of the thresholds (#24248 ) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: ilmarkov <markovilya197@gmail.com> Co-authored-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>	2025-11-10 18:33:11 -05:00
Zhewen Li	a65a934ebe	[CI/Build] Temporary fix to LM Eval Small Models (#28324 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-11-09 21:08:38 +00:00
Simon Mo	d0ceb38ae8	[Build] Fix release pipeline failing annotation (#28272 ) Signed-off-by: simon-mo <simon.mo@hey.com> Signed-off-by: Simon Mo <simon.mo@hey.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-11-07 10:06:45 -08:00
Copilot	a736e5ff77	[CI] Reduce Blackwell Fusion test runtime by filtering tests and only run all tests in nightly (#28074 )	2025-11-07 15:58:16 +08:00
Alexis MacAskill	a47d94f18c	Add runai model streamer e2e test for GCS (#28079 ) Signed-off-by: Alexis MacAskill <amacaskill@google.com>	2025-11-07 03:07:54 +00:00
Michael Goin	f32229293e	Disable nm-testing models with issues in CI (#28206 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-11-06 06:19:07 -08:00
gmagogsfm	bde5039325	[CI] Add compile/test_multimodal_compile.py to CI (#28151 ) Signed-off-by: Yanan Cao <gmagogsfm@gmail.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-06 05:41:47 +00:00
Samuel Shen	40db194446	[CI]: Add LMCacheConnector Unit Tests (#27852 ) Signed-off-by: Samuel Shen <slshen@uchciago.edu> Co-authored-by: Samuel Shen <slshen@uchciago.edu> Co-authored-by: Yihua Cheng <yihua98@uchicago.edu>	2025-11-05 09:45:57 -08:00
Alexei-V-Ivanov-AMD	80c9275348	Enabling cooperative multi-gpu tests on multi-gpu nodes (#27986 ) Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>	2025-11-05 10:35:49 -05:00
Ilya Markov	e50c454672	[BugFix] Support EP/DP + EPLB with MTP (#25311 ) Signed-off-by: ilmarkov <markovilya197@gmail.com> Signed-off-by: Sage Moore <sage@neuralmagic.com> Co-authored-by: Sage Moore <sage@neuralmagic.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com> Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>	2025-11-05 15:22:17 +00:00
Zhewen Li	878fd5a16f	[CI/Build] Enable some fixed tests in AMD CI (#28078 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-11-05 03:15:59 +00:00
Zhewen Li	53f6e81dfd	[CI/Build] Fix OpenAI API correctness on AMD CI (#28022 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-11-04 07:20:50 +00:00
QiliangCui	7956b0c0bc	Remove the tpu docker image nightly build. (#27997 ) Signed-off-by: Qiliang Cui <derrhein@gmail.com>	2025-11-04 00:35:54 +00:00
Matthew Bonanni	01baefe674	Add TP parameter to attention tests (#27683 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-11-03 13:04:40 -08:00
Lucas Wilkinson	4bc400f47e	[CI/Testing] Add basic single node dual batch overlap test (#27235 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-11-03 17:00:46 +00:00
Matthew Bonanni	f29aeb5a25	Add FLASHINFER_MLA to test_mla_backends and add B200 CI run (#27663 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-10-31 11:12:19 -07:00
Jee Jee Li	0384aa7150	[CI/Build] Add gpt-oss LoRA test (#27870 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-10-31 22:17:21 +08:00
Wentao Ye	2bf0bcc1fc	[CI Test] Add Scheduled Integration Test (#27765 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-10-30 17:29:26 -07:00
Jakub Sochacki	697f507a8e	[CI/Build][Intel] Enable performance benchmarks for Intel Gaudi 3 (#26919 ) Signed-off-by: jakub-sochacki <jakub.sochacki@wp.pl>	2025-10-31 07:57:22 +08:00
Zhewen Li	e806178d2a	[BugFix][VL] Fix FA selection on Qwen2.5-VL (#27790 ) Signed-off-by: zhewenli <zhewenli@meta.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-30 07:54:44 +00:00
Huamin Li	5be1bed790	[CI/Build]Add eval config for Qwen3-235B-A22B-Instruct-2507-FP8 (#27113 ) Signed-off-by: Huamin Li <3ericli@gmail.com>	2025-10-30 07:50:56 +00:00
Kuntai Du	8bff831f0a	[Benchmark] Cleanup deprecated nightly benchmark and adjust the docstring for performance benchmark (#25786 ) Signed-off-by: KuntaiDu <kuntai@uchicago.edu>	2025-10-30 04:43:37 +00:00
Kunshang Ji	b5bae42f91	[XPU] Update latest IPEX 2.8 release (#27735 ) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2025-10-30 11:17:13 +08:00
22quinn	f7a6682872	[CI/Build] Test torchrun with 8 cards (#27548 ) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-10-29 10:26:06 -07:00
bnellnm	1891cf605a	[Bugfix] Fix modular kernel tests (#27707 ) Signed-off-by: Bill Nell <bnell@redhat.com>	2025-10-29 16:14:33 +08:00
Cyrus Leung	4fb8771cc0	[CI/Build] Move pre-commit only scripts to `tools/pre_commit` (#27657 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-29 08:04:33 +00:00
Zhewen Li	8b62495076	[Bugfix] Fix non-contiguous tensor error in `rocm_unquantized_gemm_impl` (#27605 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-10-29 00:00:15 -07:00
Zhewen Li	83fd49b1fc	[CI/Build][Bugfix]Fix Quantized Models Test on AMD (#27712 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-10-29 06:27:30 +00:00
Mohammad Miadh Angkad	a8c02fb5bf	[Bugfix][CI] Fix v1 attention backend tests and add CI coverage (#26597 ) Signed-off-by: Mohammad Miadh Angkad <MAngkad.BSDSBA2027@aim.edu> Signed-off-by: Mohammad Miadh Angkad <mangkad.bsdsba2027@aim.edu> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>	2025-10-28 11:42:05 -04:00
Zhewen Li	0291fbf65c	[CI/Build] Fix amd model executor test (#27612 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-10-28 08:58:11 +00:00
Cyrus Leung	55cba4a05c	[CI/Build] Update causal-conv1d installation (#27529 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-26 22:14:22 +08:00
Cyrus Leung	c7abff2990	Revert "[CI/Build] Use CPU for mm processing test on CI (#27522 )" (#27531 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-26 04:44:27 -07:00

854 Commits