Michael Goin
1b86bd8e18
Add more libraries to rlhf.md ( #26374 )
...
Signed-off-by: Michael Goin <mgoin64@gmail.com>
2025-10-07 20:59:41 +00:00
Paul Pak
320feae6f5
[Model] Lfm2Moe ( #26344 )
...
Signed-off-by: Paul Pak <paulpak58@gmail.com>
2025-10-07 16:03:05 +00:00
antrec
6f59beaf0b
[Model] Add support for ModernBertForTokenClassification ( #26340 )
...
Signed-off-by: Antoine Recanati Le Goat <antoine.recanati@sancare.fr>
Signed-off-by: antrec <antoine.recanati@gmail.com>
Co-authored-by: Antoine Recanati Le Goat <antoine.recanati@sancare.fr>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-10-07 14:29:19 +00:00
fxmarty-amd
41f1cf38f2
[Feature][OCP MX] Support mxfp6 and mixed mxfp6-mxfp4 ( #21166 )
2025-10-07 09:35:26 -04:00
fhl2000
63773a6200
[Docs] add docs for cuda graph v1 ( #24374 )
...
Signed-off-by: fhl <2410591650@qq.com>
Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
2025-10-07 05:25:05 -07:00
Sergio Paniego Blanco
883b42896a
Add TRL example notebook to RLHF docs ( #26346 )
...
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
2025-10-07 11:31:28 +00:00
Cyrus Leung
7e4cd070b0
[V0 Deprecation] Remove VLLM_USE_V1 from docs and scripts ( #26336 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-07 16:46:44 +08:00
Sage Moore
c50901f3b9
[Docs][DBO] Add initial doc that describes the DBO implementation ( #26024 )
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-10-07 00:47:28 +00:00
Varun Sundar Rabindranath
93540958b8
[Docs] Fix broken table in moe_kernel_features doc ( #26314 )
...
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
2025-10-06 15:58:05 -04:00
Cyrus Leung
44b9af5bb2
[Benchmark] Enable MM Embedding benchmarks ( #26310 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-06 19:51:58 +00:00
abhisheksheth28
77c95f72f7
[Doc] add KAITO to integrations ( #25521 )
...
Signed-off-by: "Abhishek Sheth" <absheth@microsoft.com>
2025-10-06 17:30:03 +08:00
Aritra Roy Gosthipaty
59f30d0448
[Docs] Edit HF Inference Endpoints documentation ( #26275 )
...
Signed-off-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
Signed-off-by: ariG23498 <aritra.born2fly@gmail.com>
2025-10-06 10:13:09 +01:00
orangeng
59b477645c
[Doc] Edited minor typo ( #26266 )
...
Signed-off-by: Orange Ng <ngquanhao@outlook.com>
2025-10-05 19:53:09 -07:00
Elieser Pereira
f509a20846
[DOC] Update production-stack.md ( #26177 )
...
Signed-off-by: Elieser Pereira <elieser.pereiraa@gmail.com>
2025-10-05 21:32:48 +00:00
Harry Mellor
d6953beb91
Convert formatting to use ruff instead of yapf + isort ( #26247 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
Maximilien de Bayser
e0986ea07b
Add documentation for granite 4 tool calling ( #26175 )
...
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
2025-10-05 07:35:42 +00:00
Cyrus Leung
4570535ec4
[Model] CLIP Embedding Support ( #26010 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-04 06:21:42 -07:00
Harry Mellor
d3d649efec
Support expert parallel in Transformers backend ( #26162 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-10-04 04:35:04 +00:00
Varun Sundar Rabindranath
7ef40bb983
[GPTOSS][DP/EP][Marlin] Enable GPTOSS DP/EP using Marlin kernels ( #25488 )
...
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-10-03 20:13:13 -04:00
Wenlong Wang
79aa244678
[Multi Modal] Configurable MM Profiling ( #25631 )
...
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-03 03:59:10 -07:00
Cyrus Leung
f9a8084e48
[Model] Use merge_by_field_config for MM models (InternVL family) ( #26153 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-03 01:59:06 -07:00
Harry Mellor
10d765482d
FusedMoE support for the Transformers backend ( #22650 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-02 23:12:15 -07:00
Tyler Michael Smith
27edd2aeb4
[Build/CI] Revert back to Ubuntu 20.04, install python 3.12 with uv ( #26103 )
...
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2025-10-02 22:21:01 -07:00
Chenheli Hua
ad87ba927a
[Small] Prevent bypassing media domain restriction via HTTP redirects ( #26035 )
...
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
2025-10-02 10:27:10 -07:00
Cyrus Leung
d00d652998
[CI/Build] Replace vllm.entrypoints.openai.api_server entrypoint with vllm serve command ( #25967 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-02 10:04:57 -07:00
Huy Do
d4e7a1152d
Update base image to 22.04 (jammy) ( #26065 )
...
Signed-off-by: Huy Do <huydhn@gmail.com>
2025-10-02 05:48:04 -07:00
pwschuurman
be22bb6f3d
Run:ai model streamer add GCS package support ( #24909 )
...
Signed-off-by: Peter Schuurman <psch@google.com>
2025-10-01 20:59:13 -07:00
nadathurv
57b46d769e
[Doc] updating torch.compile doc link ( #25989 )
...
Signed-off-by: nadathurv <work.vnadathur@gmail.com>
Signed-off-by: WorldExplored <srreyansh.sethi@gmail.com>
Co-authored-by: Srreyansh Sethi <107075589+WorldExplored@users.noreply.github.com>
2025-10-01 07:04:56 +00:00
Param
99028fda44
Fix INT8 quantization error on Blackwell GPUs (SM100+) ( #25935 )
...
Signed-off-by: padg9912 <phone.and.desktop@gmail.com>
2025-09-30 19:19:53 -07:00
Harry Mellor
2ce26b9b5d
[Docs] Remove API Reference from search index ( #25949 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-30 22:10:02 +00:00
bnellnm
fb610ae684
[Docs] Add moe kernel features doc ( #25297 )
...
Signed-off-by: Bill Nell <bnell@redhat.com>
Signed-off-by: bnellnm <49004751+bnellnm@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-30 19:03:15 +00:00
Cyrus Leung
2f652e6cdf
[Doc] Improve MM Pooling model documentation ( #25966 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-30 18:58:29 +00:00
Sergio Paniego Blanco
099aaee536
Add Hugging Face Inference Endpoints guide to Deployment docs ( #25886 )
...
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-30 14:35:06 +00:00
Sergio Paniego Blanco
1ad3aca682
Updated TRL integration docs ( #25684 )
...
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-30 03:10:55 -07:00
a120092009
8d0afa9b42
[Doc] Add Cambricon MLU support ( #25942 )
...
Signed-off-by: a120092009 <zhaoty0121@gmail.com>
2025-09-30 17:59:47 +08:00
Andrew Sansom
78a47f87ce
Test Prompt Embeds/LoRA compatibility and Enable LoRA Support for OPT Models ( #25717 )
...
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
2025-09-30 08:10:58 +08:00
Nicolò Lucchesi
2e4fe48c37
[NIXL] Increase default KV block eviction timeout on P ( #25897 )
...
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-09-29 21:35:14 +00:00
Naman Lalit
9bedac9623
[Doc] Add documentation for vLLM continuous benchmarking and profiling ( #25819 )
...
Signed-off-by: Naman Lalit <nl2688@nyu.edu>
2025-09-29 20:49:49 +00:00
Jee Jee Li
e61eb5e09d
[Model] Remove MotifForCausalLM ( #25866 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-09-30 00:36:30 +08:00
Yingjun Mou
edbaadd91f
[Bugfix] Fix requirements paths in install instructions ( #25827 )
...
Signed-off-by: yingjun-mou <renzomou@gmail.com>
2025-09-29 03:49:35 -07:00
Yuxuan Zhang
b1ded114b9
Update GLM-4.5 Doc transformers version ( #25830 )
...
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
2025-09-28 12:05:51 +00:00
Jialin Ouyang
c216119d64
[Core] GC Debug callback ( #24829 )
...
Signed-off-by: Jialin Ouyang <jialino@meta.com>
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
Co-authored-by: Jialin Ouyang <jialino@meta.com>
2025-09-27 17:53:31 +00:00
yyzxw
ecb37e276a
[docs] transcriptions API audio upload ( #25446 )
...
Signed-off-by: zxw <1020938856@qq.com>
2025-09-27 15:00:35 +00:00
Cyrus Leung
27d7638b94
[Bugfix] Merge MM embeddings by index instead of token IDs ( #16229 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-09-27 08:15:12 +00:00
Russell Bryant
3958b96bf5
Add option to restrict media domains ( #25783 )
...
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Chenheli Hua <huachenheli@outlook.com>
2025-09-27 01:23:52 +00:00
Michael Goin
0002b7f0d1
[Docs] Add Toronto Meetup ( #25773 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-09-26 12:00:46 -07:00
Clouddude
b761df963c
[Doc]: improve CPU(x86) build-wheel-from-source section ( #25617 )
...
Signed-off-by: Kosseila (CloudThrill) <klouddude@gmail.com>
2025-09-26 10:26:33 -07:00
Cyrus Leung
633f943e30
[Doc] Update Batch-level DP docs ( #25757 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-26 02:37:40 -07:00
Xu Wenqing
b03b1b97f6
Support LongCat-Flash-Chat tool call ( #24083 )
...
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
2025-09-26 09:25:39 +00:00
Harry Mellor
70fbdb26e9
Add backward compatibility for guided_... API ( #25615 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-09-25 19:45:25 +08:00