Roger Wang
0ff70821c9
[Core] Deprecate xformers (#29262)
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-24 04:18:55 +00:00
Zero
30854783ad
[Model] Add OpenCUA-7B support (#29068)
Signed-off-by: lim4349 <rockmanzero@naver.com>
Signed-off-by: Zero <rockmanzero@naver.com>
Co-authored-by: Cloud User <ubuntu@a100-80g-4.novalocal>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-24 10:27:55 +08:00
Cyrus Leung
389aa1b2eb
[Doc] Update more docs with respect to V1 (#29188)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-23 10:58:48 +08:00
Michael Act
3ed767ec06
docs: fixes distributed executor backend config for multi-node vllm (#29173)
Signed-off-by: Michael Act <michael.a.c.tulenan@gdplabs.id>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-11-23 10:58:28 +08:00
Benjamin Bartels
eb5352a770
[CI/build] Removes source compilation from runtime image (#26966)
Signed-off-by: bbartels <benjamin@bartels.dev>
2025-11-22 10:23:09 -08:00
Angela Yi
d5dbdbfcb2
[docs] Fix cudagraph mode config (#29170)
Signed-off-by: angelayi <yiangela7@gmail.com>
2025-11-21 17:10:27 -08:00
Julien Denize
57430fc95c
Default model load/config/tokenizer to mistral format if relevant files exist (#28659)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-11-21 13:58:59 -08:00
wangxiyuan
4050bae417
[Doc] Update plugin doc (#28532)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-11-21 14:57:26 +00:00
Cyrus Leung
9452863088
Revert "Revert #28875 (#29159)" (#29179)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-21 04:27:43 -08:00
Cyrus Leung
aab0102a26
[V0 deprecation] Remove more V0 references (#29088)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-21 11:56:59 +00:00
Cyrus Leung
4d7231e774
Revert #28875 (#29159)
2025-11-21 01:40:17 -08:00
Cyrus Leung
56e96b37e4
[V0 Deprecation] Remove best_of (#29090)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-21 11:40:40 +08:00
Qidong Su
698024ecce
[Doc] update installation guide regarding aarch64+cuda pytorch build (#28875)
Signed-off-by: Qidong Su <soodoshll@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-11-20 19:40:25 -08:00
jeremyteboul
0730414999
[Core] Add audio_embeds support to chat completions (#29059)
Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>
2025-11-21 11:39:47 +08:00
Michael Goin
87cbbdff63
Update model references for OLMo3 (#29099)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-11-21 09:16:52 +08:00
Rob Mulla
dd39f91edb
[Doc] cleanup TPU documentation and remove outdated examples (#29048)
Signed-off-by: Rob Mulla <rob.mulla@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-21 00:05:59 +00:00
Shinichi Hemmi
c9e093116c
[MODEL] Implement plamo3 (#28834)
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com>
2025-11-20 03:00:19 -08:00
Shanshan Shen
d44e9df7d4
[Model][Mamba] Add selector for mamba attention backend and make it pluggable for other device (#26487)
Signed-off-by: shen-shanshan <467638484@qq.com>
2025-11-19 16:24:55 +00:00
Harry Mellor
4f5299f717
Relax Transformers modeling backend MoE experts check (#28952)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-19 21:50:30 +08:00
Didier Durand
09540cd918
[Doc]: fix typos in various files (#29010)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-11-19 04:56:21 -08:00
Harry Mellor
97cfa99d59
[Docs] Take env var definition out of folded admonition (#29005)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-19 03:32:04 -08:00
Michael Yao
fdf93486d6
[Docs] Clean up moe_kernel_features.md (#28530)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-11-19 02:35:29 -08:00
Louie Tsai
ae4821a108
Add CPU support model (#28697)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2025-11-18 23:47:57 -08:00
Didier Durand
7ed27f3cb5
[Doc]: fix typos in various files (#28945)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-11-18 22:52:30 -08:00
Uranus
6a25ea5f0e
[Docs] Update oneshot imports (#28188)
Signed-off-by: UranusSeven <109661872+UranusSeven@users.noreply.github.com>
2025-11-19 05:30:08 +00:00
Li, Jiang
20852c8f4c
[CPU] Refactor CPU WNA16 (#28826)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-11-19 10:32:00 +08:00
Kevin H. Luu
c64c0b78de
[chore] Move the rest of wikimedia url to S3 (#28921)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-18 09:44:18 -08:00
Didier Durand
083cf326dc
[Doc]: fix typos in various files (#28863)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-11-17 20:32:14 -08:00
Pranav
f77bce001a
[Model] Add Afmoe architecture implementation (#28332)
Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
Signed-off-by: Pranav <veldurthipranav@gmail.com>
Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
2025-11-17 15:11:20 -08:00
Jee Jee Li
3380ed5e11
[Doc] Add llama4 LoRA tag (#28825)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-11-17 14:08:48 +08:00
Didier Durand
63fed55506
[Doc]: fix typos in various files (#28811)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-11-16 14:30:06 +00:00
Didier Durand
2bb4435cb7
[Doc]: fix typos in various files (#28567)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-11-15 19:27:50 +00:00
Cyrus Leung
89d3679221
[Doc] Fix failing doc build (#28772)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-15 05:33:27 -08:00
tingtinggithub
cb15ee28db
Allow Gemma3 to take image embeddings (#28483)
Signed-off-by: tingtinggithub <streamttt@gmail.com>
2025-11-15 04:18:08 -08:00
Harry Mellor
67187554dd
[Docs] Enable some more markdown lint rules for the docs (#28731)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-14 18:39:19 +00:00
Chen Wang
9261eb3dc1
docs(lora_resolvers): clarify multi-resolver order and storage path requirement (#28153)
Signed-off-by: Chen Wang <Chen.Wang1@ibm.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-14 18:08:30 +00:00
Julien Denize
085424808e
Remove audio optional dependency for mistral-common (#28722)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-14 09:54:38 -08:00
Harry Mellor
5f3cd7f7f2
[Docs] Update the name of Transformers backend -> Transformers modeling backend (#28725)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-14 16:34:14 +00:00
Fasal Shah
8d3748d3c7
[Doc] Fix macOS installation dependency resolution issue (#26721)
Signed-off-by: faisal shah <fashah@redhat.com>
2025-11-14 12:43:56 +00:00
Chauncey
5c9ad138d5
[Frontend] supports interleaved thinking (#28531)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-11-13 16:14:13 +08:00
Harry Mellor
97d1c99302
Rename clashing method names for vLLM model protocol (#27583)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-12 19:14:33 -08:00
Harry Mellor
3226283461
[Docs] Add some details about what the MoE block needs for the Transformers backend (#28588)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-13 03:12:14 +00:00
Michael Goin
52eadcec9e
[Docs] Update meetups.md description (#28583)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-11-13 00:00:23 +00:00
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
4ca5cd5740
[Core][AMD] Migrate fully transparent sleep mode to ROCm platform (#12695)
Signed-off-by: Hollow Man <hollowman@opensuse.org>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: kliuae <kuanfu.liu@embeddedllm.com>
2025-11-12 15:24:12 -08:00
Benjamin Chislett
304419576a
[Perf] Refactor cudagraph_support to enable full CUDA graphs for spec decoding with FlashInfer (#28479)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
2025-11-13 01:56:40 +09:00
Harry Mellor
a742134cc5
Remove deprecated fields from CompilationConfig (#27593)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-12 16:10:28 +00:00
Chenguang Zheng
4ccffe561f
[Core] Encoder separation for Encode-Prefill-Decode Disaggregation (#25233)
Signed-off-by: n00909098 <nguyen.kha.long@huawei.com>
Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Signed-off-by: herotai214 <herotai214@gmail.com>
Signed-off-by: Khuong Le <khuong.le.manh@huawei.com>
Signed-off-by: Khuong Le <lemanhkhuong2611@gmail.com>
Co-authored-by: n00909098 <nguyen.kha.long@huawei.com>
Co-authored-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Co-authored-by: herotai214 <herotai214@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Khuong Le <khuong.le.manh@huawei.com>
Co-authored-by: Khuong Le <lemanhkhuong2611@gmail.com>
2025-11-11 18:58:33 -08:00
Li, Jiang
7f829be7d3
[CPU] Refactor CPU attention backend (#27954)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-11-12 09:43:06 +08:00
Michael Goin
28534b92b9
Add Zurich vLLM Meetup (#28488)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-11-11 14:53:59 -08:00
xuebwang-amd
05576df85c
[ROCm][Quantization] extend AMD Quark to support mixed-precision quantized model (#24239)
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Co-authored-by: fxmarty-amd <felmarty@amd.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-11 12:05:22 -05:00