Harry Mellor
ba5c5e5404
[Docs] Switch to better markdown linting pre-commit hook ( #21851 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-29 19:45:08 -07:00
David Xia
7b49cb1c6b
[Doc] update Contributing page's testing section ( #18272 )
...
Signed-off-by: David Xia <david@davidxia.com>
2025-07-29 10:32:46 -07:00
Varun Sundar Rabindranath
f03e9cf2bb
[Doc] Add FusedMoE Modular Kernel Documentation ( #21623 )
...
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
2025-07-29 10:32:30 -07:00
David Xia
37f86d9048
[Docs] use uv in GPU installation docs ( #20277 )
...
Signed-off-by: David Xia <david@davidxia.com>
2025-07-29 10:32:06 -07:00
Brittany
759b87ef3e
[TPU] Add an optimization doc on TPU ( #21155 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-29 07:23:19 -07:00
Harry Mellor
f693b067a2
[Docs] Merge design docs for a V1 only future ( #21832 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-29 07:22:50 -07:00
Cyrus Leung
ab714131e4
[Doc] Update compatibility matrix for pooling and multimodal models ( #21831 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-29 06:29:51 -07:00
Kay Yan
2470419119
[Docs] Fix the outdated URL for installing from vLLM binaries ( #21523 )
...
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-29 04:56:27 -07:00
Cyrus Leung
a2480251ec
[Doc] Link to RFC for pooling optimizations ( #21806 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-28 23:53:18 -07:00
Michael Goin
947e982ede
[Docs] Minimize spacing for supported_hardware.md table ( #21779 )
2025-07-28 18:46:39 -07:00
Harry Mellor
94b71ae106
Use metavar to list the choices for a CLI arg when custom values are also accepted ( #21760 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-28 19:31:10 +00:00
Anton Vlasjuk
656c24f1b5
[Ernie 4.5] Name Change for Base 0.3B Model ( #21735 )
...
Signed-off-by: vasqu <antonprogamer@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-28 12:22:32 +00:00
Shinichi Hemmi
c7ffe93d9c
[Model] Support TP/PP/mamba2 kernel for PLaMo2 ( #19674 )
...
Signed-off-by: Shinichi Hemmi <shemmi@preferred.jp>
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com>
Co-authored-by: Calvin Metzger <metzger@preferred.jp>
Co-authored-by: Sixue Wang <cecilwang@preferred.jp>
2025-07-28 05:00:47 +00:00
Cyrus Leung
86ae693f20
[Deprecation][2/N] Replace --task with --runner and --convert ( #21470 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-27 19:42:40 -07:00
Isotr0py
3d847a3125
[VLM] Add video support for Intern-S1 ( #21671 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-07-27 11:49:43 +00:00
Ye (Charlotte) Qi
01a395e9e7
[CI/Build][Doc] Clean up more docs that point to old bench scripts ( #21667 )
...
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
2025-07-27 04:02:12 +00:00
Isotr0py
eed2f463b2
[VLM] Support HF format Phi-4-MM model ( #17121 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-07-26 20:07:57 -07:00
Ye (Charlotte) Qi
e7c4f9ee86
[CI/Build][Doc] Move existing benchmark scripts in CI/document/example to vllm bench CLI ( #21355 )
...
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
2025-07-26 07:10:14 -07:00
Lyu Han
875af38e01
Support Intern-S1 ( #21628 )
...
Signed-off-by: Roger Wang <hey@rogerw.me>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Roger Wang <hey@rogerw.me>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-07-26 19:14:04 +08:00
WeiQing Chen
97349fe2bc
[Docs] add offline serving multi-modal video input example Qwen2.5-VL ( #21530 )
...
Signed-off-by: David Chen <530634352@qq.com>
2025-07-25 18:37:32 -07:00
Daniel Han
41d3082c41
Add Unsloth to RLHF.md ( #21636 )
2025-07-25 17:06:48 -07:00
Wenhua Cheng
5ac3168ee3
[Docs] add auto-round quantization readme ( #21600 )
...
Signed-off-by: Wenhua Cheng <wenhua.cheng@intel.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-25 08:52:42 -07:00
Kebe
396ee94180
[CI] Unifying Dockerfiles for ARM and X86 Builds ( #21343 )
...
Signed-off-by: Kebe <mail@kebe7jun.com>
2025-07-25 07:33:56 -07:00
bigshanedogg
29c6fbe58c
[MODEL] New model support for naver-hyperclovax/HyperCLOVAX-SEED-Vision-Instruct-3B ( #20931 )
...
Signed-off-by: bigshanedogg <bigshane319@gmail.com>
2025-07-25 06:05:42 -07:00
Zhou Fang
807a328bb6
[Docs] Add requirements/common.txt to run unit tests ( #21572 )
...
Signed-off-by: Zhou Fang <fang.github@gmail.com>
2025-07-24 20:51:15 -07:00
Simon Mo
a6c7fb8cff
[Docs] Add Expert Parallelism Initial Documentation ( #21373 )
...
Signed-off-by: simon-mo <simon.mo@hey.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-24 12:36:06 -07:00
Ricardo Decal
a7272c23d0
[Docs][minor] Fix broken gh-file link in distributed serving docs ( #21543 )
...
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
2025-07-24 10:36:56 -07:00
Ricardo Decal
684174115d
[Docs] Rewrite Distributed Inference and Serving guide ( #20593 )
...
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-24 08:13:05 -07:00
Sanger Steel
cdb79ee63d
[Docs] Update Tensorizer usage documentation ( #21190 )
...
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Signed-off-by: William Goldby <willgoldby@gmail.com>
Co-authored-by: William Goldby <willgoldby@gmail.com>
2025-07-24 06:56:18 -07:00
elvischenv
5a19a6c670
[Fix] Update mamba_ssm to 2.2.5 ( #21421 )
...
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
2025-07-24 03:25:41 -07:00
Harry Mellor
13abd0eaf9
[Model] Officially support Emu3 with Transformers backend ( #21319 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-24 03:22:12 -07:00
Shintarou Okada
6eca337ce0
Replace --expand-tools-even-if-tool-choice-none with --exclude-tools-when-tool-choice-none for v0.10.0 ( #20544 )
...
Signed-off-by: okada <kokuzen@gmail.com>
Signed-off-by: okada shintarou <okada@preferred.jp>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-24 02:56:36 -07:00
deven-labovitch
63d92abb7c
[Frontend] Set MAX_AUDIO_CLIP_FILESIZE_MB via env var instead of hardcoding ( #21374 )
...
Signed-off-by: Deven Labovitch <deven@videa.ai>
2025-07-23 20:22:19 -07:00
Michael Goin
82ec66f514
[V0 Deprecation] Remove Prompt Adapters ( #20588 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-23 16:36:48 -07:00
Asher
2671334d45
[Model] add Hunyuan V1 Dense Model support. ( #21368 )
...
Signed-off-by: Asher Zhang <asherszhang@tencent.com>
2025-07-23 03:54:08 -07:00
Michael Yao
2cc5016a19
[Docs] Clean up v1/metrics.md ( #21449 )
...
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-07-23 03:37:25 -07:00
Michael Yao
23637dcdef
[Docs] Fix bullets and grammars in tool_calling.md ( #21440 )
...
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-07-23 01:23:20 -07:00
Raushan Turganbay
f38ee34a0a
[feat] Enable mm caching for transformers backend ( #21358 )
...
Signed-off-by: raushan <raushan@huggingface.co>
2025-07-22 08:18:46 -07:00
Raghav Ravishankar
82b8027be6
Add arcee model ( #21296 )
...
Signed-off-by: alyosha-swamy <raghav@arcee.ai>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-22 00:57:43 -07:00
Li, Jiang
5e70dcd6e6
[Doc] Fix CPU doc format ( #21316 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-21 21:47:49 -07:00
Li, Jiang
a15a50fc17
[CPU] Enable shared-memory based pipeline parallel for CPU backend ( #21289 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-21 09:07:08 -07:00
Ning Xie
d97841078b
[Misc] unify variable for LLM instance ( #20996 )
...
Signed-off-by: Andy Xie <andy.xning@gmail.com>
2025-07-21 12:18:33 +01:00
Harry Mellor
e6b90a2805
[Docs] Make tables more space efficient in supported_models.md ( #21291 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-21 02:25:02 -07:00
Harry Mellor
be54a951a3
[Docs] Fix hardcoded links in docs ( #21287 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-21 02:23:57 -07:00
Cyrus Leung
042af0c8d3
[Model][1/N] Support multiple poolers at model level ( #21227 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-21 02:22:21 -07:00
Raushan Turganbay
9499e26e2a
[Model] Support VLMs with transformers backend ( #20543 )
...
Signed-off-by: raushan <raushan@huggingface.co>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-07-20 13:25:50 +00:00
Thomas Parnell
2b504eb770
[Docs] [V1] Update docs to remove enforce_eager limitation for hybrid models. ( #21233 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-07-19 16:09:58 -07:00
Yuxuan Zhang
10eb24cc91
GLM-4 Update ( #20736 )
...
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Lu Fang <fanglu@fb.com>
2025-07-19 22:40:31 +00:00
Woosuk Kwon
752c6ade2e
[V0 Deprecation] Deprecate BlockSparse Attention & Phi3-Small ( #21217 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-07-19 13:53:17 -07:00
Jiayi Yan
6a971ed692
[Docs] Update the link to the 'Prometheus/Grafana' example ( #21225 )
2025-07-19 06:58:07 -07:00