767 Commits

Author SHA1 Message Date
Michael Goin
60bc25e74c
[CI] Add Blackwell LM Eval Small Models test to nightly (#26052)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-10-05 14:59:50 -06:00
Jiangyun Zhu
9c3c21c519
[CI] fix mamba kernel test (#26250)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-10-05 18:26:59 +00:00
Harry Mellor
512b8affa4
Update ruff pre-commit hooks version (#26255)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-10-05 09:50:50 -07:00
Harry Mellor
d6953beb91
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
Angela Yi
7cfa4b24bf
[BugFix] Fix de-functionalization pass for rotary_embedding (#23953)
Signed-off-by: angelayi <yiangela7@gmail.com>
2025-10-03 15:44:18 -07:00
Sahithi Chigurupati
75003f34e8
[CI] Push multiarch manifests as nightly builds (#25764)
Signed-off-by: Sahithi Chigurupati <chigurupati.sahithi@gmail.com>
2025-10-03 15:42:55 -07:00
Cyrus Leung
d00d652998
[CI/Build] Replace vllm.entrypoints.openai.api_server entrypoint with vllm serve command (#25967)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-02 10:04:57 -07:00
leo-pony
e51de388a2
[Platform][CI] Added OOT platform interface e2e test that running on Ascend NPU (#25470)
Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-10-02 23:19:22 +08:00
Michael Goin
ee04c0cd04
[CI] Tweaks to GPT-OSS Eval (Blackwell) for stability (#26030)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-10-01 12:02:17 -07:00
Reza Barazesh
bc546f76a1
[CI] Move applicable tests to CPU (#24080)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-30 14:45:20 +01:00
Isotr0py
0899ba5b42
[CI/Build] Include Transformers backend test in nightly transformers test (#25885)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-29 09:33:39 -07:00
Kunshang Ji
143844fa43
[XPU]Fix xpu spec decoding UTs, avoid using cuda graph (#25847)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-09-29 05:15:10 +00:00
Cyrus Leung
cd87bfbf37
[CI/Build] Reorganize root-level V1 tests (#25767)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-27 13:51:15 +08:00
22quinn
b3613e3ace
[CI/Build] Add timing to Model Executor Test (#25799)
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
2025-09-26 21:57:27 -07:00
Cyrus Leung
d346ec695e
[CI/Build] Consolidate model loader tests and requirements (#25765)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-26 21:45:20 -07:00
Michael Goin
cf89202855
[CI] Fix FlashInfer AOT in release docker image (#25730)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-09-26 14:11:40 -07:00
Michael Goin
f708bd4904
[CI] Add E2E Blackwell Quantized MoE Test (#25723)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-09-26 12:23:00 -07:00
Cyrus Leung
db1e42f627
[CI/Build] Fix some V1 tests not being run (#25569)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-26 20:52:36 +08:00
Cyrus Leung
bc9d7b5595
[CI/Build] Split up Distributed Tests (#25572)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-26 14:49:33 +02:00
Matthew Bonanni
3468f17ebe
[V0 deprecation] Remove _VLLM_V1 suffixes from attention backend names (#25489)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
2025-09-25 17:37:50 +00:00
Isotr0py
03858e6d1c
[Bugfix] Fix InternS1 video processing after Transformers v4.56 (#25644)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-25 14:46:04 +00:00
Li, Jiang
eb32335e35
[CPU] update torch 2.8 and fix missing fields in TorchSDPAMetadata (#25652)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-09-25 13:29:11 +00:00
Jackmin801
77a7fce1bb
[CI/Build] add nightly prime-rl integration tests (#25207)
Signed-off-by: Jackmin801 <ongjackm@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-09-24 08:44:22 +00:00
kourosh hakhamaneshi
abad204be6
[BugFix] Fix OOM in vLLM replicas by ensuring consistent NCCL memory accounting (#25359)
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
2025-09-23 15:49:09 -07:00
Ilya Markov
8bdd8b5c51
Enable symmetric memory all reduce by default only enabling for TP (#25070)
Signed-off-by: ilmarkov <markovilya197@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-09-23 15:53:00 -04:00
Amir Samani
8c1c81a3de
[core] add nccl symmetric memory for all reduce (#24532)
Signed-off-by: Amir Samani <asamani@nvidia.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-09-23 14:33:06 -04:00
Ekagra Ranjan
867ecdd1c8
[Spec Decode][CI] Add e2e test for examples/spec_decode.py and prevent breaking Acceptance Length (#24531)
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-09-23 10:46:40 -07:00
Lucia Fang
922979bfcc
[DP] support torchrun external launcher with Data Parallelism (#24899)
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
2025-09-22 12:06:05 -07:00
Nicolò Lucchesi
a66d131381
[TPU][Bugfix][CI] Fix broken tests/build dependency (#25255)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-09-22 09:55:04 +00:00
Huamin Li
62b38dc832
[Doc] improve test-pipeline.yaml documentation (#25305)
Signed-off-by: Huamin Li <3ericli@gmail.com>
Co-authored-by: Lu Fang <30275821+houseroad@users.noreply.github.com>
2025-09-20 20:29:12 -07:00
Woosuk Kwon
c99db8c8dd
[V0 Deprecation] Remove V0 core (#25321)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-20 19:58:26 -07:00
Woosuk Kwon
52c2a8d4ad
[V0 Deprecation] Remove LLMEngine (#25033)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-20 17:56:30 -07:00
Or Ozeri
a53ad626d6
[KV offload][1b/N] rename offloading to kv_offload (#25191)
Signed-off-by: Or Ozeri <oro@il.ibm.com>
2025-09-18 20:53:52 +00:00
Or Ozeri
505805b645
[KV offload][1/N] Introduce an offloading component (#19848)
Signed-off-by: Or Ozeri <oro@il.ibm.com>
2025-09-18 10:57:07 -07:00
Aaron Pham
29283e8976
[Chore] Cleanup guided namespace, move to structured outputs config (#22772)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-18 09:20:27 +00:00
Woosuk Kwon
5c65a72bb1
[V0 Deprecation] Remove more V0 tests (#25117)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-17 22:05:25 -07:00
Woosuk Kwon
2fc24e94f9
[V0 Deprecation] Remove V0 Tracing & Metrics tests (#25115)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-17 19:40:44 -07:00
elvischenv
e6585ddb45
[Bugfix] Fix accuracy issue for silu_mul + nvfp4 quant fusion kernel (#24833)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-09-17 16:37:23 -07:00
Michael Goin
9f882d8791
Disable failing GPT-OSS Eval (Blackwell) for now (#25107)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-09-17 15:36:00 -07:00
Woosuk Kwon
4b946d693e
[V0 Deprecation] Remove V0 Core tests (#25082)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-17 09:32:42 -07:00
Woosuk Kwon
5801e49776
[V0 Deprecation] Remove MQLLMEngine (#25019)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
2025-09-16 21:29:27 -07:00
Michael Goin
493b10f8bf
[CI] GPT-OSS GPQA eval test for Blackwell (#24920)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-09-16 18:13:21 -07:00
Ming Yang
4e5affeaa1
[CI] Add Decode Context Parallelism (DCP) test to CI (#24487)
Signed-off-by: Ming Yang <minos.future@gmail.com>
2025-09-16 21:21:28 +08:00
Woosuk Kwon
759ef49b15
Remove V0 Encoder-Decoder Support (#24907)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
2025-09-15 21:17:14 -07:00
Jiangyun Zhu
5bcc153d7b
[Compile] Fix noop_elimination pass and add tests for noop_elimination (#24880)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-09-15 23:33:18 +00:00
Simon Mo
fd2f10546c
[ci] fix wheel names for arm wheels (#24898)
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-09-15 14:39:08 -07:00
Didier Durand
4979eb79da
[Doc]: fix typos in various files (#24821)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-09-15 01:08:52 -07:00
afeldman-nm
c8c42597ab
[CI] Speed up model unit tests in CI (#24253)
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
2025-09-12 10:36:50 -07:00
dongluw
a5b84f1cbf
[Core] Shared memory based object store for Multimodal data caching and IPC (#20452)
Signed-off-by: donglu <donglu@cohere.com>
2025-09-12 07:54:17 -07:00
RichardoMu
40b6c9122b
[V1] feat:add engine v1 tracing (#20372)
Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
Signed-off-by: Ye Zhang <zhysishu@gmail.com>
Signed-off-by: RichardoMu <44485717+RichardoMrMu@users.noreply.github.com>
Signed-off-by: simon-mo <simon.mo@hey.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
Co-authored-by: Mu Huai <tianbowen.tbw@antgroup.com>
Co-authored-by: Ye Zhang <zhysishu@gmail.com>
Co-authored-by: Benjamin Bartels <benjamin@bartels.dev>
Co-authored-by: simon-mo <simon.mo@hey.com>
Co-authored-by: 瑜琮 <ly186375@antfin.com>
Co-authored-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
2025-09-11 17:10:39 -07:00