xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-01-09 14:50:47 +08:00

Author	SHA1	Message	Date
Qiming Zhang	d3cf61b89b	fix gemma3 results all zero (#17364) Signed-off-by: mayuyuace <qiming1.zhang@intel.com>	2025-04-29 09:40:25 -07:00
mofanke	a39203f99e	[Bugfix] add qwen3 reasoning-parser fix content is None when disable … (#17369) Signed-off-by: mofanke <mofanke@gmail.com>	2025-04-29 16:32:40 +00:00
Chen Zhang	24e6ad3f16	[V1] Remove num_input_tokens from attn_metadata (#17193) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-04-29 09:28:41 -07:00
Harry Mellor	2ef5d106bb	Improve literal dataclass field conversion to argparse argument (#17391) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-04-29 16:25:08 +00:00
a2q1p	0ed27ef66c	Fix: Spelling of inference (#17387)	2025-04-29 09:23:39 -07:00
Harry Mellor	900edfa8d4	Transformers backend tweaks (#17365) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-04-29 09:08:03 -07:00
Cyrus Leung	88ad9ec6b2	[Frontend] Support `chat_template_kwargs` in `LLM.chat` (#17356) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-04-29 22:03:35 +08:00
Cyrus Leung	00ee37efa2	[Bugfix] Clean up MiniMax-VL and fix processing (#17354) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-04-29 20:42:16 +08:00
Ekagra Ranjan	97cc8729f0	[Model] Ignore rotary embed load for Cohere model (#17319)	2025-04-29 00:30:40 -07:00
Hyogeun Oh (오효근)	193e78e35d	[Fix] Documentation spacing in compilation config help text (#17342) Signed-off-by: Zerohertz <ohg3417@gmail.com>	2025-04-29 00:16:17 -07:00
ponix-j	bdb2cddafc	[Misc]Use a platform independent interface to obtain the device attributes (#17100)	2025-04-29 06:59:13 +00:00
Cyrus Leung	ebb3930d28	[Misc] Move config fields to MultiModalConfig (#17343) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-04-29 06:37:21 +00:00
qscqesze	cde384cd92	[Model] support MiniMax-VL-01 model (#16328) Signed-off-by: qingjun <qingjun@minimaxi.com>	2025-04-29 12:05:50 +08:00
Zhengyuan Su (苏政渊)	17eb306fcc	[Bugfix] Add contiguous call inside rope kernel wrapper (#17091) Signed-off-by: 苏政渊 <suzhengyuan@moonshot.cn> Co-authored-by: 苏政渊 <suzhengyuan@moonshot.cn>	2025-04-28 19:24:07 -07:00
Richard Zou	165cb56329	Ignore `'<string>'` filepath (#17330) Signed-off-by: rzou <zou3519@gmail.com>	2025-04-28 19:23:29 -07:00
Lucia Fang	b4ac4fa04d	[model] make llama4 compatible with pure dense layers (#17315) Signed-off-by: Lucia Fang <fanglu@fb.com>	2025-04-29 10:22:22 +08:00
Ekagra Ranjan	e136000595	[V1][Spec Decode] Make Eagle model arch config driven (#17323)	2025-04-29 10:22:02 +08:00
Michał Moskal	86d9fc29cb	implement Structural Tag with Guidance backend (#17333) Signed-off-by: Michal Moskal <michal@moskal.me>	2025-04-29 02:21:32 +00:00
Cyrus Leung	506475de5f	[Optim] Compute multimodal hash only once per item (#17314) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-04-29 09:40:35 +08:00
Michael Goin	8fc88d63f1	[Model] Add tuned triton fused_moe configs for Qwen3Moe (#17328) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-04-28 15:20:24 -07:00
Alex Wu	6e74fd4945	Support loading transformers models with named parameters (#16868) Signed-off-by: Alex <alexwu@character.ai>	2025-04-28 23:15:58 +01:00
Simon Mo	dcbac4cb4b	[Model] Qwen3 Dense FP8 Compat Fixes (#17318) Signed-off-by: simon-mo <xmo@berkeley.edu>	2025-04-28 14:12:01 -07:00
Charlie Fu	ed2462030f	[Bugfix] Fix moe weight losing all extra attrs after `process_weights_after_loading`. (#16854) Signed-off-by: charlifu <charlifu@amd.com>	2025-04-28 21:05:07 +00:00
Lucas Wilkinson	cc5befbced	[BugFix] Fix cascade attention - RuntimeError: scheduler_metadata must have shape (metadata_size) (#17283) Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>	2025-04-28 13:55:50 -07:00
Russell Bryant	a0304dc504	[Security] Don't bind tcp zmq socket to all interfaces (#17197) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-04-28 10:08:20 -07:00
Harry Mellor	c7941cca18	Explicitly explain quant method override ordering and ensure all overrides are ordered (#17256) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-04-28 16:55:31 +00:00
Harry Mellor	b6dd32aa07	Make name of `compressed-tensors` quant method consistent across vLLM (#17255) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-04-28 16:28:13 +00:00
Harry Mellor	f94886946e	Improve conversion from dataclass configs to argparse arguments (#17303) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-04-28 16:22:12 +00:00
Cyrus Leung	8b464d9660	[Misc] Clean up Qwen2.5-Omni code (#17301) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-04-28 06:20:45 -07:00
Nicolò Lucchesi	889ebb2638	[Misc] Minor typo/grammar in `platforms/interface.py` (#17307) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-04-28 05:45:42 -07:00
Cyrus Leung	344e193b7d	[Bugfix] Add missing `get_language_model` to new MLLMs (#17300) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-04-28 04:09:57 -07:00
Harry Mellor	fb1c933ade	Add missing class docstring for `PromptAdapterConfig` (#17302) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-04-28 04:06:59 -07:00
idouba	72c5b97231	Update tpu_worker.py 's typo (#17288)	2025-04-28 04:01:15 -07:00
Alex Brooks	fa93cd9f60	[Model] Add Granite Speech Support (#16246) Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2025-04-28 10:05:00 +00:00
Cyrus Leung	aec9674dbe	[Core] Remove legacy input mapper/processor from V0 (#15686) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-04-28 15:38:48 +08:00
Wanrui Dai	7fcc4223dc	[Minor][Models] Pass partial_rotary_factor parameter to rope (#17266) Signed-off-by: evian <eviantai@u.nus.edu> Co-authored-by: evian <eviantai@u.nus.edu>	2025-04-28 04:28:59 +00:00
Nick Hill	8262a3e23b	[Misc] Validate `stop_token_ids` contents (#17268) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-04-28 03:54:05 +00:00
Michael Goin	cb3f2d8d10	[Bugfix] Fix Mistral3 spatial merge error (#17270) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-04-27 19:40:05 -07:00
Lucas Wilkinson	d8bccde686	[BugFix] Fix vllm_flash_attn install issues (#17267) Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com> Co-authored-by: Aaron Pham <contact@aarnphm.xyz>	2025-04-27 17:27:56 -07:00
Lily Liu	20e489eaa1	[V1][Spec Decode] Make eagle compatible with prefix caching. (#17137) Signed-off-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>	2025-04-27 09:29:43 -07:00
Cyrus Leung	4213475ec7	[Metrics] Fix minor inconsistencies in bucket progression (#17262) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-04-27 16:19:39 +00:00
cascade	690fe019f0	[Feature] support sequence parallelism using compilation pass (#16155) Signed-off-by: cascade812 <cascade812@outlook.com> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-04-27 06:29:35 -07:00
Kaixi Hou	ed7a29d9f8	[NVIDIA] Support Cutlass MLA for Blackwell GPUs (#16032) Signed-off-by: kaixih <kaixih@nvidia.com>	2025-04-27 06:29:21 -07:00
Alex Brooks	756848e79e	[Bugfix] Fix Lora Name Parsing (#17196) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2025-04-27 20:33:09 +08:00
Flex Wang	18445edd0f	[Misc] Change buckets of histogram_iteration_tokens to [1, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8096] to represent number of tokens (#17033) Signed-off-by: sfc-gh-zhwang <flex.wang@snowflake.com>	2025-04-27 12:30:53 +00:00
Jade Zheng	30215ca61f	[MISC] Use string annotation types for class definitions (#17244) Signed-off-by: Jade Zheng <zheng.shoujian@outlook.com>	2025-04-27 08:39:57 +00:00
Chen Zhang	838cedade7	[Bugfix] Get a specific type of layer from forward context (#17222) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-04-27 00:58:05 -07:00
Jee Jee Li	4283a28c2f	[Bugfix] Fix QWen2 VL multimodal mapping (#17240) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-04-27 05:53:23 +00:00
Cyrus Leung	93a126fbc7	[Misc] Make cached tokenizer pickle-compatible (#17048) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-04-27 13:05:00 +08:00
rasmith	8e4b351a0c	[Kernel][Triton][FP8] Adding fp8 and variable length sequence support to Triton FAv2 kernel (#12591) Signed-off-by: Randall Smith <Randall.Smith@amd.com>	2025-04-27 00:35:08 +00:00

4173 Commits