Aaron Pham
|
b37685afbb
|
[CI] Uses Python 3.11 for TPU (#17359)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-04-29 17:39:16 +00:00 |
|
Nicolò Lucchesi
|
792595b59d
|
[TPU][V1][CI] Replace python3 setup.py develop with standard pip install --e on TPU (#17374)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-04-29 10:36:48 -07:00 |
|
casinca
|
0c1c788312
|
[Doc][Typo] Fixing label in new model requests link in overview.md (#17400)
|
2025-04-29 10:29:48 -07:00 |
|
Russell Bryant
|
56d64fbe30
|
[Docs] Propose a deprecation policy for the project (#17063)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-04-29 10:29:44 -07:00 |
|
Alexei-V-Ivanov-AMD
|
608968b7c5
|
Enabling multi-group kernel tests. (#17115)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
|
2025-04-29 10:27:27 -07:00 |
|
TY-AMD
|
06ffc7e1d3
|
[Misc][ROCm] Exclude cutlass_mla_decode for ROCm build (#17289)
Signed-off-by: Tianyuan Wu <Tianyuan.Wu@amd.com>
|
2025-04-29 10:26:42 -07:00 |
|
Qiming Zhang
|
d3cf61b89b
|
fix gemma3 results all zero (#17364)
Signed-off-by: mayuyuace <qiming1.zhang@intel.com>
|
2025-04-29 09:40:25 -07:00 |
|
mofanke
|
a39203f99e
|
[Bugfix] add qwen3 reasoning-parser fix content is None when disable … (#17369)
Signed-off-by: mofanke <mofanke@gmail.com>
|
2025-04-29 16:32:40 +00:00 |
|
Chen Zhang
|
24e6ad3f16
|
[V1] Remove num_input_tokens from attn_metadata (#17193)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-04-29 09:28:41 -07:00 |
|
Harry Mellor
|
2ef5d106bb
|
Improve literal dataclass field conversion to argparse argument (#17391)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-29 16:25:08 +00:00 |
|
a2q1p
|
0ed27ef66c
|
Fix: Spelling of inference (#17387)
|
2025-04-29 09:23:39 -07:00 |
|
Harry Mellor
|
900edfa8d4
|
Transformers backend tweaks (#17365)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-29 09:08:03 -07:00 |
|
Cyrus Leung
|
88ad9ec6b2
|
[Frontend] Support chat_template_kwargs in LLM.chat (#17356)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-29 22:03:35 +08:00 |
|
Harry Mellor
|
40896bdf3f
|
pre-commit autoupdate (#17380)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-29 06:46:55 -07:00 |
|
Cyrus Leung
|
00ee37efa2
|
[Bugfix] Clean up MiniMax-VL and fix processing (#17354)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-29 20:42:16 +08:00 |
|
Jee Jee Li
|
890f104cdf
|
[Doc] Fix QWen3MOE info (#17381)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-04-29 12:38:32 +00:00 |
|
Harry Mellor
|
4a5e13149a
|
Update docs requirements (#17379)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-29 11:35:47 +00:00 |
|
Ekagra Ranjan
|
97cc8729f0
|
[Model] Ignore rotary embed load for Cohere model (#17319)
|
2025-04-29 00:30:40 -07:00 |
|
Gregory Shtrasberg
|
4464109219
|
[Build][Bugfix] Restrict setuptools version to <80 (#17320)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
|
2025-04-29 00:17:23 -07:00 |
|
Hyogeun Oh (오효근)
|
193e78e35d
|
[Fix] Documentation spacing in compilation config help text (#17342)
Signed-off-by: Zerohertz <ohg3417@gmail.com>
|
2025-04-29 00:16:17 -07:00 |
|
ponix-j
|
bdb2cddafc
|
[Misc]Use a platform independent interface to obtain the device attributes (#17100)
|
2025-04-29 06:59:13 +00:00 |
|
Cyrus Leung
|
ebb3930d28
|
[Misc] Move config fields to MultiModalConfig (#17343)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-29 06:37:21 +00:00 |
|
qscqesze
|
cde384cd92
|
[Model] support MiniMax-VL-01 model (#16328)
Signed-off-by: qingjun <qingjun@minimaxi.com>
|
2025-04-29 12:05:50 +08:00 |
|
Chauncey
|
96e06e3cb7
|
[Misc] Add a Jinja template to support Mistral3 function calling (#17195)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-28 19:53:44 -07:00 |
|
Zhengyuan Su (苏政渊)
|
17eb306fcc
|
[Bugfix] Add contiguous call inside rope kernel wrapper (#17091)
Signed-off-by: 苏政渊 <suzhengyuan@moonshot.cn>
Co-authored-by: 苏政渊 <suzhengyuan@moonshot.cn>
|
2025-04-28 19:24:07 -07:00 |
|
Richard Zou
|
165cb56329
|
Ignore '<string>' filepath (#17330)
Signed-off-by: rzou <zou3519@gmail.com>
|
2025-04-28 19:23:29 -07:00 |
|
Richard Barnes
|
d6da8a8ff2
|
[Bugfix] Fix numel() downcast in fused_layernorm_dynamic_per_token_quant.cu (#17316)
|
2025-04-28 19:23:18 -07:00 |
|
Lucia Fang
|
b4ac4fa04d
|
[model] make llama4 compatible with pure dense layers (#17315)
Signed-off-by: Lucia Fang <fanglu@fb.com>
|
2025-04-29 10:22:22 +08:00 |
|
Ekagra Ranjan
|
e136000595
|
[V1][Spec Decode] Make Eagle model arch config driven (#17323)
|
2025-04-29 10:22:02 +08:00 |
|
Michał Moskal
|
86d9fc29cb
|
implement Structural Tag with Guidance backend (#17333)
Signed-off-by: Michal Moskal <michal@moskal.me>
|
2025-04-29 02:21:32 +00:00 |
|
Cyrus Leung
|
506475de5f
|
[Optim] Compute multimodal hash only once per item (#17314)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-29 09:40:35 +08:00 |
|
Ekagra Ranjan
|
cfe4532093
|
[Benchmark] Add single turn MTBench to Serving Bench (#17202)
|
2025-04-28 16:46:15 -07:00 |
|
Michael Goin
|
8fc88d63f1
|
[Model] Add tuned triton fused_moe configs for Qwen3Moe (#17328)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-04-28 15:20:24 -07:00 |
|
Alex Wu
|
6e74fd4945
|
Support loading transformers models with named parameters (#16868)
Signed-off-by: Alex <alexwu@character.ai>
|
2025-04-28 23:15:58 +01:00 |
|
Simon Mo
|
dcbac4cb4b
|
[Model] Qwen3 Dense FP8 Compat Fixes (#17318)
Signed-off-by: simon-mo <xmo@berkeley.edu>
|
2025-04-28 14:12:01 -07:00 |
|
Charlie Fu
|
ed2462030f
|
[Bugfix] Fix moe weight losing all extra attrs after process_weights_after_loading. (#16854)
Signed-off-by: charlifu <charlifu@amd.com>
|
2025-04-28 21:05:07 +00:00 |
|
Lucas Wilkinson
|
cc5befbced
|
[BugFix] Fix cascade attention - RuntimeError: scheduler_metadata must have shape (metadata_size) (#17283)
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
|
2025-04-28 13:55:50 -07:00 |
|
Aaron Pham
|
2c89cd96a8
|
[Chore] cleanup license indicators in light of SPDX (#17259)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
|
2025-04-28 19:43:52 +00:00 |
|
Russell Bryant
|
a0304dc504
|
[Security] Don't bind tcp zmq socket to all interfaces (#17197)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-04-28 10:08:20 -07:00 |
|
Harry Mellor
|
c7941cca18
|
Explicitly explain quant method override ordering and ensure all overrides are ordered (#17256)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-28 16:55:31 +00:00 |
|
Harry Mellor
|
b6dd32aa07
|
Make name of compressed-tensors quant method consistent across vLLM (#17255)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-28 16:28:13 +00:00 |
|
Harry Mellor
|
f94886946e
|
Improve conversion from dataclass configs to argparse arguments (#17303)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-28 16:22:12 +00:00 |
|
Russell Bryant
|
72dfe4c74f
|
[Docs] Add a security guide (#17230)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-04-28 15:12:17 +00:00 |
|
Cyrus Leung
|
8b464d9660
|
[Misc] Clean up Qwen2.5-Omni code (#17301)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-28 06:20:45 -07:00 |
|
Nicolò Lucchesi
|
889ebb2638
|
[Misc] Minor typo/grammar in platforms/interface.py (#17307)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-04-28 05:45:42 -07:00 |
|
Reid
|
3ad986c28b
|
[doc] update wrong model id (#17287)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-28 04:20:51 -07:00 |
|
Cyrus Leung
|
344e193b7d
|
[Bugfix] Add missing get_language_model to new MLLMs (#17300)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-28 04:09:57 -07:00 |
|
Harry Mellor
|
fb1c933ade
|
Add missing class docstring for PromptAdapterConfig (#17302)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-28 04:06:59 -07:00 |
|
idouba
|
72c5b97231
|
Update tpu_worker.py 's typo (#17288)
|
2025-04-28 04:01:15 -07:00 |
|
Alex Brooks
|
fa93cd9f60
|
[Model] Add Granite Speech Support (#16246)
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2025-04-28 10:05:00 +00:00 |
|