xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-07 03:29:10 +08:00

Author	SHA1	Message	Date
Nick Hill	45c0526ac9	[BugFix] Handle errors when preprocessing added requests (#30895 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-12-19 01:29:11 +00:00
Elizabeth Thomas	41b6f9200f	Remove all2all backend envvar (#30363 ) Signed-off-by: Elizabeth Thomas <email2eliza@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-18 19:46:28 +00:00
Andrey Talman	e06d0bf0aa	2.9.1 PyTorch release update (#28495 )	2025-12-17 12:20:22 -08:00
Chauncey	9ad5b21710	[Refactor] [4/N] Move VLLM_SERVER_DEV endpoints into the serve directory (#30749 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-12-17 02:27:30 -08:00
Michael Goin	10ee1c64cf	[CI] Generalize gsm8k test args and add Qwen3-Next MTP B200 test (#30723 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-12-16 14:28:34 -05:00
Lucas Wilkinson	00a8d7628c	[BugFix] Fix memory spike in workspace allocation (#30744 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-12-16 06:46:22 -08:00
Cyrus Leung	ed586e7724	[Refactor] [3/N] Move tool parser tests and run on CPU (#30693 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-15 13:45:36 +00:00
Michael Goin	2f32a68d75	[CI] Update several models in registry that are available online now (#30514 ) Signed-off-by: mgoin <mgoin64@gmail.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Isotr0py <2037008807@qq.com>	2025-12-12 18:28:13 -08:00
Kevin H. Luu	b4039c08b5	[ci] Mark PrimeRL integration test as soft fail (#30578 ) Signed-off-by: Kevin H. Luu <khluu000@gmail.com>	2025-12-12 14:13:09 -08:00
shivampr	cd7740ac5c	[ROCm] Enable Triton ScaledMM fallback + kernel selection fix (#26668 ) Signed-off-by: Shivam <shivampr.dev@gmail.com> Signed-off-by: Shivam <shivamprasad91@gmail.com>	2025-12-12 13:28:20 -05:00
Sage Moore	b4054c8ab4	Revert "[CI] Add Async Eplb nightly CI tests (#29385 )" (#30431 )	2025-12-11 00:48:35 +00:00
Ilya Markov	0b6a8a304c	[BugFix] Fix non detected failing tests (#30277 ) Signed-off-by: ilmarkov <markovilya197@gmail.com>	2025-12-09 17:57:55 +00:00
Zhewen Li	263c38d74d	[CI/Build] Update batch invariant test trigger (#30080 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-12-05 00:42:37 +00:00
Zhewen Li	c493b9d092	[CI/Build] Add MM code path to Examples Test (#29986 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-12-03 19:21:45 -08:00
WeiQing Chen	7fe9c1a223	[CI] Add Async Eplb nightly CI tests (#29385 ) Signed-off-by: David Chen <530634352@qq.com> Signed-off-by: WeiQing Chen <40507679+david6666666@users.noreply.github.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-12-03 09:51:08 +00:00
wang.yuqi	2eb4fe9129	[examples] Resettle pooling examples. (#29365 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-02 15:54:28 +00:00
Shengqi Chen	4b612664fd	[CI] Renovation of nightly wheel build & generation (take 2) (#29838 ) Signed-off-by: Shengqi Chen <harry-chen@outlook.com>	2025-12-01 22:17:10 -08:00
Kevin H. Luu	ec7035c9d4	[ci] Make distributed 8 gpus test optional (#29801 ) Signed-off-by: Kevin H. Luu <khluu000@gmail.com>	2025-12-01 10:22:05 -08:00
Cyrus Leung	2afcec4dec	[Misc] Update `TokenizerLike` interface and move `get_cached_tokenizer` (#29730 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-30 14:59:47 +08:00
Cyrus Leung	34a984274e	[Misc] Refactor tokenizer interface (#29693 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-29 04:02:21 -08:00
Angela Yi	4b17ce6815	Add gpu memory wait before test_async_tp (#28893 ) Signed-off-by: angelayi <yiangela7@gmail.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-11-28 20:19:05 -08:00
Isotr0py	d40c854009	[CI/Build] Rework CPU multimodal processor test (#29684 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-28 17:10:29 +00:00
HDCharles	df01eda4dc	[Bugfix] Make compressed-tensors MoEs respect ignored layers (#28878 ) Signed-off-by: HDCharles <charlesdavidhernandez@gmail.com>	2025-11-26 21:35:13 -05:00
Huamin Li	70d5953f82	Revert "[Bugfix] Fix GPT-OSS AR+NORM fusion (#28841 )" (#29483 ) Signed-off-by: Huamin Li <3ericli@gmail.com>	2025-11-26 22:27:26 +08:00
Harry Mellor	bf0c75cd4f	Make Transformers Nightly tests soft-fail and enable all tests (#29401 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-25 12:41:15 +00:00
elvischenv	6330f9477d	[Bugfix] Fix GPT-OSS AR+NORM fusion (#28841 ) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>	2025-11-25 07:59:40 +00:00
Rémi Delacourt	12c007e288	EAGLE Support DP>1 (#26086 ) Signed-off-by: Rémi Delacourt <remi@mistral.ai> Signed-off-by: Rémi Delacourt <54138269+Flechman@users.noreply.github.com> Signed-off-by: remi <remi@mistral.ai>	2025-11-25 07:32:21 +00:00
Varun Sundar Rabindranath	e924bbb4f4	[Build/CI][DP/EP] Add QWen/Qwen3-30B-A3B-FP8 + EPLB tests to Nightly H100 and B200 (#29195 ) Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com> Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>	2025-11-24 16:06:17 +00:00
Cyrus Leung	d1cf8214e5	[Bugfix] Use HF config fields as fallback when loading Mistral config (#29239 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-22 11:22:48 -07:00
Wentao Ye	1f400c58b8	[CI] Add batch invariant test to ci (#27842 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-11-21 09:20:33 -07:00
Michael Goin	986ab5db63	[CI Bugfix] Fix Kernels DeepGEMM Test (H100) (#29106 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-11-20 16:42:33 -08:00
Alexander Matveev	3aaa94ac99	[Performance] Reduce DeepGEMM N dim restriction from 128 to 64 multiplier (#28687 ) Signed-off-by: Alexander Matveev <amatveev@redhat.com> Signed-off-by: mgoin <mgoin64@gmail.com> Co-authored-by: mgoin <mgoin64@gmail.com>	2025-11-19 15:47:13 -08:00
Shu Wang	613abb50d5	[MoE] Nvfp4 Masked Gemm: Add flashinfer grouped_gemm_nt_masked (#25990 ) Signed-off-by: Shu Wang. <shuw@nvidia.com> Signed-off-by: mgoin <mgoin64@gmail.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>	2025-11-19 13:29:06 -08:00
Copilot	61728cd1df	Re-enable FlashInfer for Llama4 on Blackwell in e2e fusion tests (#28966 ) Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-11-19 13:32:19 -05:00
Harry Mellor	a8b70304d6	Update `rope_scaling` to `rope_parameters` in preparation for Transformers v5 (#28542 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-19 09:06:36 -08:00
Yanan Cao	2c8b9182b5	[CI] Reorganize compile tests so new tests are automatically included in CI (#28625 ) Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>	2025-11-19 06:13:50 -08:00
Nick Hill	637f292196	[CI] Fix broken pipeline (#28781 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-15 08:44:14 -08:00
Angela Yi	f36292dbee	[compile] Enable sequence parallelism matching w/o custom ops enabled (#27126 ) Signed-off-by: angelayi <yiangela7@gmail.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: ProExpertProg <lgovedic@redhat.com> Co-authored-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <luka.govedic@gmail.com>	2025-11-15 11:46:12 +00:00
Yanan Cao	262d263f6c	[Bugfix] Eliminate tuple inputs to submodules in graph partitioning (#28533 ) Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>	2025-11-13 15:09:05 -05:00
Nick Hill	8832fff972	[BugFix] Fix `mm_encoder_attn_backend` arg type checking (#28599 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-13 03:06:03 +00:00
Harry Mellor	51c599f0ec	Skip models that cannot currently init on Transformers v5 (#28471 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-12 23:43:57 +00:00
Harry Mellor	a742134cc5	Remove deprecated fields from `CompilationConfig` (#27593 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-12 16:10:28 +00:00
Huamin Li	c748355e0d	[CI] Introduce autorun_on_main feature (#27836 ) Signed-off-by: Huamin Li <3ericli@gmail.com>	2025-11-12 08:51:19 +00:00
zhrrr	68c09efc37	[Kernel][Perf] fuse QK Norm and RoPE into one cuda kernel for Qwen Model (#27165 ) Signed-off-by: zhuhaoran <zhuhaoran.zhr@alibaba-inc.com>	2025-11-11 12:00:31 -05:00
usberkeley	3143eb23fc	[BugFix] Add test_outputs.py to CI pipeline (#28466 ) Signed-off-by: Bradley <bradley.b.pitt@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-11 16:01:30 +00:00
Matthew Bonanni	b30dfa03c5	[Attention] Refactor CUDA attention backend selection logic (#24794 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-11-11 07:40:44 -05:00
Adrian Abeyta	a5a790eea6	[Bugfix] Ensure calculated KV scales are applied in attention. (#27232 ) Signed-off-by: adabeyta <aabeyta@redhat.com>	2025-11-10 23:42:37 +00:00
Ilya Markov	d17ecc6b19	[PERF] Allreduce fusion. Support torch native matching. Tuning of the thresholds (#24248 ) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: ilmarkov <markovilya197@gmail.com> Co-authored-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>	2025-11-10 18:33:11 -05:00
Zhewen Li	a65a934ebe	[CI/Build] Temporary fix to LM Eval Small Models (#28324 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-11-09 21:08:38 +00:00
Copilot	a736e5ff77	[CI] Reduce Blackwell Fusion test runtime by filtering tests and only run all tests in nightly (#28074 )	2025-11-07 15:58:16 +08:00

491 Commits