xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-03-25 20:59:08 +08:00

Author	SHA1	Message	Date
haoyangli-amd	06462392e4	[bugfix][quantization] fix quark qwen3 kv_cache quantization (#30308 ) Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com>	2025-12-10 03:24:12 +00:00
Tsukasa OI	73a484caa1	[Model][Quantization] Fix / Add GGUF support for Qwen2 MoE models (#30307 ) Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>	2025-12-09 19:13:10 +00:00
wang.yuqi	9c32df6101	[Bugfix] Qwen 3 VL Embedding loading (#30303 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: wang.yuqi <noooop@126.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-09 08:04:02 +00:00
shaharmor98	fcd5306f65	Add latent MoE support (#30203 ) Signed-off-by: Shahar Mor <smor@nvidia.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-12-08 17:35:01 +00:00
Daniel Cámpora	184076c3fe	[DeepSeek v3.2] Make top-k work for any logit values. (#27568 ) Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com> Co-authored-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-12-08 06:55:58 -08:00
wang.yuqi	9e77ffca3f	[Model][7/N] Improve all pooling task | Deprecation as_reward_model. Extract hidden states prefer using new multi-vector retrieval API (#26686 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-12-08 08:10:09 +00:00
Dazhi Jiang	bcb6f5947f	[Perf] Remove sync point in vit torch sdpa attn backend (#30232 ) Signed-off-by: Dazhi Jiang <dazhi_jiang@163.com>	2025-12-08 07:12:42 +00:00
Cyrus Leung	e83b7e379c	Revert "[Renderer] Separate out `RendererConfig` from `ModelConfig` (#30145 )" (#30199 )	2025-12-07 00:00:22 -08:00
Cyrus Leung	27f4c2fd46	[Renderer] Separate out `RendererConfig` from `ModelConfig` (#30145 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-06 23:15:42 -08:00
Cyrus Leung	671427efbf	[Model] Move `multimodal_cpu_fields` definition to field config (#30181 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-06 13:40:02 +00:00
Cyrus Leung	c46b932df2	[Chore] Deprecate `SupportsMultiModal.merge_by_field_config` (#30170 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-06 07:57:28 +00:00
Peter Salas	e858bc4d14	[Model] Add support for transformer-based Ultravox v0.7 projector (#30089 ) Signed-off-by: Peter Salas <peter@fixie.ai>	2025-12-05 20:55:43 -08:00
Divakar Verma	962d703818	[Bugfix][llama4_eagle] Fix missing 'lm_head' attribute (#29926 ) Signed-off-by: Divakar Verma <divakar.verma@amd.com>	2025-12-05 19:57:26 +00:00
Matthew Bonanni	66e674cdd5	[Attention][UX][1/N] Add AttentionConfig and change attention env vars to CLI arguments (#26315 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>	2025-12-05 09:48:43 -08:00
amitz-nv	6038b1b04b	[Frontend][Model] Add 'float16' to possible mamba cache dtype values, override mamba SSM cache dtype value for NemotronH (#29978 ) Signed-off-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>	2025-12-05 00:34:33 -08:00
Harry Mellor	e10c84e06a	Access `partial_rotary_factor` from `rope_parameters` (#29966 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-04 18:42:49 +00:00
Tao Yun	6dcb07f676	support qwen3-vl handle requests with embeddings (#30037 ) Signed-off-by: taoyun <1069423820@qq.com> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-12-04 17:34:06 +00:00
Cyrus Leung	b286a311c2	[Chore] Deprecate `merge_by_field_config` arg (#30035 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-04 17:21:24 +00:00
Harry Mellor	9998ea5b57	Delete HF version of Phi 4 MM (#30049 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-04 13:44:50 +00:00
wang.yuqi	74c4d80c6c	[Model][6/N] Improve all pooling task | Support chunked prefill with ALL pooling (#27145 ) Signed-off-by: wang.yuqi <noooop@126.com> Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-04 13:44:15 +00:00
Cyrus Leung	68eb5c8d97	[Misc] Move functions into `PoolingMetadata` (#30027 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-04 08:21:19 +00:00
TJian	3f1b03739a	[ROCm] [Bugfix] `compute_attn_mask_seqlen` for qwen3 omni (#29974 ) Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-12-04 08:20:24 +00:00
Cyrus Leung	9ae2f60374	[Misc] Various cleanups for MM input processing (#29970 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-04 06:22:20 +00:00
Isotr0py	a21cd9ed23	[Bugfix] Fix incorrect `image_grid_thw` rank for HunyuanOCR from missing `merge_by_field_config=True` (#29950 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-12-03 10:05:10 +00:00
Julien Denize	5e5646e206	[BUGFIX] llama_4_scaling wrongly passed to DeepseekAttention (#29908 ) Signed-off-by: juliendenize <julien.denize@mistral.ai>	2025-12-02 14:51:20 -08:00
Harry Mellor	6fc5841db1	Fix some more Transformers nightly tests (#29872 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-02 21:49:44 +00:00
Navanit Dubey	a2b053dc85	feat(model): Add BitsAndBytes quantization support for Qwen3-Omni-MoE (#29896 ) Signed-off-by: navanit-git <navanitdubey@gmail.com>	2025-12-02 19:28:35 +00:00
Isotr0py	0ec8422171	[Bugfix] Fix incorrect channel order for idefics3 in edge case (#29881 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-02 16:03:52 +00:00
Matthew Bonanni	51c57b51dd	[Bugfix] Fix DeepSeek R1 MTP weight loading (#29545 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Co-authored-by: Benjamin Chislett <bchislett@nvidia.com>	2025-12-02 15:52:18 +00:00
Cyrus Leung	68ffbca7e4	[Chore] Use `tokenizer.encode` and `tokenizer.decode` directly (#29851 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-02 12:30:40 +00:00
Julien Denize	d8c6210eea	Add Mistral Large 3 and Ministral 3 (#29757 ) Signed-off-by: Julien Denize <julien.denize@mistral.ai> Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com> Signed-off-by: Mickael Seznec <mickael@mistral.ai> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io> Co-authored-by: Mickael Seznec <mickael@mistral.ai>	2025-12-02 10:29:00 +00:00
Harry Mellor	f5b0846ba0	Fix some Transformers nightly tests (#29802 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-02 07:05:27 +00:00
Cyrus Leung	653591d5e7	[Chore] Move tokenizer initialization methods (#29793 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-02 13:33:37 +08:00
Johnny Yang	f441d36cee	Add missing return in _check_vllm_model_embed_input_ids (#29834 ) Signed-off-by: Johnny Yang <johnnyyang@google.com>	2025-12-01 19:22:50 -08:00
sangbumlikeagod	092bb73b8a	[Frontend] add 'verbose_json' and 'timestamp' feature on Whisper Transcription/Translation (#24209 ) Signed-off-by: sangbumlikeagod <oironese@naver.com> Signed-off-by: sangbumlikeagod <98077576+sangbumlikeagod@users.noreply.github.com>	2025-12-01 18:19:17 +01:00
Xingyu Liu	21c2627934	[Misc]Remove redundant hidden_size property in ModelConfig (#29749 ) Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-30 17:14:23 +00:00
Cyrus Leung	64bc09ba27	[Core] Enable `inputs_embeds_size` separate from `hidden_size` (#29741 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-30 17:31:12 +08:00
Cyrus Leung	fe3398fab2	[Chore] Enable passing `tokenizer=None` into MM processor (#29724 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-29 06:25:10 -08:00
Cyrus Leung	34a984274e	[Misc] Refactor tokenizer interface (#29693 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-29 04:02:21 -08:00
Jee Jee Li	39e63dec7c	[LoRA] Cleanup LoRA unused code (#29611 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-11-28 22:52:58 -08:00
Jiangyun Zhu	a51f4186f2	[Bugfix] fix dots.llm1.inst (#29687 ) Signed-off-by: zjy0516 <riverclouds.zhu@qq.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-11-28 15:25:26 -08:00
Cyrus Leung	7675ba30de	[Misc] Remove redundant `ClassRegistry` (#29681 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-11-28 15:24:47 -08:00
Isotr0py	f946a8d743	[Chore]: Reorganize model repo operating functions in `transformers_utils` (#29680 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-28 08:46:51 -08:00
Didier Durand	fae6943068	[Doc]: fixing typos in multiple files. (#29685 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-11-28 08:41:41 -08:00
Mingyuan Ma	460d8bbf2d	Remove upstream fa checks (#29471 ) Signed-off-by: mingyuanm <mingyuanm@nvidia.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-11-28 05:52:42 -08:00
Cyrus Leung	33b06a6f24	[Misc] Remove redundant attention var constants (#29650 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-28 04:35:19 -08:00
Filipp Fisin	5f5521bd5d	Fix parameter order in GPT-OSS weight loading function for non-MXFP4 weights (#29506 ) Signed-off-by: Filipp Fisin <48059208+qGentry@users.noreply.github.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-11-28 00:45:10 -08:00
Cyrus Leung	b34e8775a3	Revert "[CPU]Update CPU PyTorch to 2.9.0 (#29589 )" (#29647 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-27 22:43:18 -08:00
wang.yuqi	f4b76056ee	Improve enable chunked_prefill & prefix_caching logic. (#26623 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: wang.yuqi <noooop@126.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-11-27 22:05:48 -08:00
EanWang211123	37b15e97e8	[Multimodal][Speculative Decoding]Eagle3 mm support, enablement on qwen3vl (#29594 ) Signed-off-by: Tsai, Louie <louie.tsai@intel.com> Signed-off-by: EanWang211123 <wangyiheng@sangfor.com.cn> Co-authored-by: Louie Tsai <louie.tsai@intel.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-11-27 22:05:45 -08:00

1951 Commits