xinyun/vllm - vllm - 丝路新云-代码仓

Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2025-12-21 13:55:01 +08:00

Author	SHA1	Message	Date
Roger Wang	0ff70821c9	[Core] Deprecate `xformers` (#29262 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-11-24 04:18:55 +00:00
Lukas Geiger	a9705a290a	[Model][QwenVL] Replace `torch.repeat_interleave` with faster `np.repeat` (#28964 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-11-19 22:04:23 -08:00
Harry Mellor	a8b70304d6	Update `rope_scaling` to `rope_parameters` in preparation for Transformers v5 (#28542 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-19 09:06:36 -08:00
Lukas Geiger	3d4e7d34be	[Model][QwenVL] Simplify cos/sin rotary embedding indexing (#28962 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-11-19 05:43:01 +00:00
Canlin Guo	b9489f51e1	[Model][Perf] Use cos and sin cache in QwenVL (#28798 ) Signed-off-by: gcanlin <canlinguosdu@gmail.com>	2025-11-18 11:51:54 +00:00
Shanshan Shen	41b92f7d38	[Model][MM] Extract conv layer as CustomOp (#28455 ) Signed-off-by: shen-shanshan <467638484@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-14 19:16:13 +08:00
Harry Mellor	97d1c99302	Rename clashing method names for vLLM model protocol (#27583 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-12 19:14:33 -08:00
Canlin Guo	bc5bd45c7d	[Refactor] Remove redundant TP gather/split in split_qkv in QwenVL (#28271 ) Signed-off-by: gcanlin <canlinguosdu@gmail.com>	2025-11-12 15:56:47 +00:00
Cyrus Leung	afffd3cc8a	[Model] Pass `mm_features` directly into `get_mrope_input_positions` (#28399 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-11 21:14:48 +08:00
Matthew Bonanni	b30dfa03c5	[Attention] Refactor CUDA attention backend selection logic (#24794 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-11-11 07:40:44 -05:00
Cyrus Leung	d0e186c16f	[V0 Deprecation] Remove unused `context_len` and `seq_len` from M-RoPE (#28395 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-11 00:30:06 +08:00
Lukas Geiger	e0919f331d	[Core][MM] Add mechanism to configure multimodal fields which should stay on CPU (#28168 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-11-07 12:14:29 +00:00
Cyrus Leung	879a06579e	[CI/Build] Bump transformers version (#27528 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-31 22:11:07 -07:00
Yan Ma	7e2729b57e	[Multimodal][XPU]Enable vision attn backend for xpu platform (#27525 ) Signed-off-by: Yan Ma <yan.ma@intel.com> Signed-off-by: Kunshang Ji <kunshang.ji@intel.com> Co-authored-by: Yejing Lai <yejing.lai@intel.com> Co-authored-by: Guancheng Fu <110874468+gc-fu@users.noreply.github.com> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>	2025-11-01 04:45:02 +00:00
Cyrus Leung	cbd5e07a51	[Model] Use merge_by_field_config for MM models (Qwen series) (#27546 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-27 05:38:05 +00:00
JartX	65d2cf9511	[BUGFIX][ROCM] ViT FlashAttention on ROCm (no GFX9) and contiguous on qwen3vl ROCm TORCH_SDPA (#27190 ) Signed-off-by: JartX <sagformas@epdcenter.es> Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-10-26 15:08:52 +08:00
Isotr0py	42efe609ba	[MM][Bugfix] Replace `PatchEmbed`'s conv3d to linear layer (#27418 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-24 07:32:47 +00:00
Bradley D	570c3e1cd4	[Bugfix] Honor --mm_encoder_attn_backend when used (#27124 ) Co-authored-by: Bradley D <4551889+bradleyhd@users.noreply.github.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-23 20:09:52 +08:00
Roger Wang	c3a2c6ac5f	[MM][Core] Decouple ViT backend from LM backend (#27061 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-10-21 00:30:10 -07:00
Lukas Geiger	5c2acb270a	[Models][QwenVL] Remove unnecessary `.contiguous()` calls (#27106 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-10-18 07:05:05 -07:00
Cyrus Leung	d2f816d6ff	[Bugfix] Standardize merging multimodal embeddings (#26771 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-14 09:36:21 +00:00
Harry Mellor	8fcaaf6a16	Update `Optional[x]` -> `x | None` and `Union[x, y]` to `x | y` (#26633 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-12 09:51:31 -07:00
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
yuafng	86ee949128	Fix tensor device and dtype placement in Qwen2VL model (#26219 ) Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Yuanfeng Li <yuanfengli@meta.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-10-04 06:41:39 -07:00
Chendi.Xue	dd96465fd7	[BugFix][QWEN-VL]fix wrong apply_rotary_emb_torch selection introduced by #24642 (#26123 ) Signed-off-by: Chendi Xue <Chendi.Xue@intel.com> Signed-off-by: Chendi.Xue <chendi.xue@intel.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-10-03 08:52:26 -07:00
Wenlong Wang	79aa244678	[Multi Modal] Configurable MM Profiling (#25631 ) Signed-off-by: wwl2755 <wangwenlong2755@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-03 03:59:10 -07:00
TJian	9c5ee91b2a	[ROCm] [VL] [Bugfix] Fix vit flash attn dispatcher logic for ROCm (#26104 ) Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-10-02 22:34:53 -07:00
Matthew Bonanni	2aaa423842	[Attention] Move Backend enum into registry (#25893 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-10-02 20:32:24 -07:00
vllmellm	5e4a8223c6	[Qwen][ROCm] Flash Attention Rotary Embeddings (#24642 ) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>	2025-10-02 08:26:08 -07:00
Isotr0py	bd51f78e39	[V0 Deprecation][Models] Remove all V0 condition for mm embeddings merge (#25331 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: isotr0py <2037008807@qq.com>	2025-09-29 14:09:18 +08:00
Isotr0py	0efd540dbc	[VLM] Update Qwen3-VL max_num_video_tokens calculation for configurable video profiling (#25557 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-09-28 04:21:01 +00:00
Cyrus Leung	27d7638b94	[Bugfix] Merge MM embeddings by index instead of token IDs (#16229 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: NickLucche <nlucches@redhat.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: NickLucche <nlucches@redhat.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-09-27 08:15:12 +00:00
Isotr0py	d4d9899860	[Quantization] Add field to skip unquantized modules for GPTQ config (#25455 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-26 15:47:41 +00:00
Isotr0py	17b4c6685c	[Bugfix] Fix Qwen3-VL max_num_video_tokens calculation for video profiling (#25648 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-25 18:36:01 +08:00
Cyrus Leung	babad6e5dd	[Misc] Move DP for ViT code inside model executor dir (#25459 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-23 09:20:52 +00:00
Cyrus Leung	c98be0a232	[Model] Enable DP for ViT in Qwen2-VL (#25445 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-09-23 05:17:10 +00:00
Woosuk Kwon	1c3ffdbecc	[V0 Deprecation] Remove V0 sampling metadata (#25345 ) Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>	2025-09-21 10:37:11 -07:00
Wenlong Wang	035fd2bd2c	[Multi Modal][Performance] Fused Q,K's apply_rope in more models (#25005 ) Signed-off-by: wwl2755 <wangwenlong2755@gmail.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-09-21 03:55:10 +00:00
Aziz	38db529f66	[feat]: Create interface for model-specific M-RoPE (#24194 ) Signed-off-by: AzizCode92 <azizbenothman76@gmail.com> Signed-off-by: Aziz <azizbenothman76@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-09-18 19:18:56 +00:00
Roger Wang	0f7acdd73c	[Model] Support Qwen3-VL Model Series (#24727 ) Signed-off-by: Roger Wang <hey@rogerw.io> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Huang Jie <92386084+JJJYmmm@users.noreply.github.com> Co-authored-by: 松灵 <26085463+wulipc@users.noreply.github.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-17 05:01:04 +00:00
Hyogeun Oh (오효근)	9a8966bcc2	[Docs] Fix warnings in mkdocs build (continued) (#24791 ) Signed-off-by: Zerohertz <ohg3417@gmail.com>	2025-09-13 00:13:44 -07:00
Lukas Geiger	57f94e88ea	[Models] Optimise and simplify `_validate_and_reshape_mm_tensor` (#24742 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-09-12 15:37:37 +00:00
Wenlong Wang	72fc8aa412	[Multi Modal] Add FA3 in VIT (#24347 ) Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>	2025-09-12 21:27:24 +08:00
Chatcharin Sangbutsarakum	60f0843ef8	[Model] Remove unnecessary CUDA sync of Qwen2VL image and video preprocess (#24334 ) Signed-off-by: Win <chatcharinsang@gmail.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-09-07 23:11:12 -07:00
Benji Beck	37a6fa95fd	Migrate Qwen2 inputs to TensorSchema (#23475 ) Signed-off-by: Benji Beck <benjibeck@meta.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-09-06 20:07:31 -07:00
Roger Wang	eddaafc1c7	[Multimodal] Improve max video embedding length estimation in V1 (#24312 ) Signed-off-by: Roger Wang <hey@rogerw.me> Co-authored-by: Roger Wang <hey@rogerw.me>	2025-09-06 02:33:19 -07:00
bppps	424fb7a5d2	[BugFix] Fix the issue where image embeddings were incorrectly split.… (#23366 ) Signed-off-by: bppps <bpppsaka@gmail.com> Co-authored-by: zouyu.zzx <zouyu.zzx@alibaba-inc.com> Co-authored-by: bppps <bpppsaka@gmail.com>	2025-08-22 16:56:46 +00:00
TJian	1298c67795	[FEAT] [Performance] Enable DP for ViT in Qwen2.5VL (#22742 ) Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-08-19 15:25:57 +00:00
Cyrus Leung	27e8d1ea3e	[Refactor] Define MultiModalKwargsItems separate from MultiModalKwargs (#23053 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-18 09:52:00 +00:00
Woosuk Kwon	c55bc1db26	[Misc] Remove dead return (#23061 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-08-17 10:36:46 -07:00

151 Commits