xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-12 03:13:36 +08:00

Author	SHA1	Message	Date
Bowen Wang	e9fd658a73	[Feature] Expert Parallelism Load Balancer (EPLB) (#18343 ) Signed-off-by: Bowen Wang <abmfy@icloud.com>	2025-06-26 15:30:21 -07:00
Li, Jiang	0567c8249f	[CPU] Fix torch version in x86 CPU backend (#19258 ) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-06-26 03:34:47 -07:00
Michael Goin	754b00edb3	[Bugfix] Fix Mistral tool-parser regex for nested JSON (#20093 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-06-26 01:01:17 +00:00
Li, Jiang	53da4cd397	[Bugfix][CPU] Fix InputBatch for pooling models in the CPU v1 (#20014 ) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-06-24 13:20:04 +00:00
cascade	e6327c9b3e	[Feature] Support sequence parallelism for static fp8 quantization (#19181 ) Signed-off-by: cascade812 <cascade812@outlook.com>	2025-06-23 16:09:02 -04:00
Isotr0py	61f4fc5dc6	[Bugfix][v1] Fix step pooler implementation and step pooling usage in v1 (#19956 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-06-23 18:38:06 +00:00
汪志鹏	c3bf9bad11	[New model support]Support Tarsier2 (#19887 ) Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>	2025-06-21 04:01:51 +00:00
Li, Jiang	79f2f1c2a1	[CPU][CI] Fallback sliding window to v0 and fix CPU pooling model tests (#19901 ) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-06-20 15:30:36 +00:00
Adrian	f1e840e842	[Model] GPT2ForSequenceClassification model (#19663 ) Signed-off-by: nie3e <adrcwiek@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-06-20 12:07:41 +00:00
Yu-Hang "Maxin" Tang	83ca9ae47b	Mark invariant normalizer in Gemma as non-persistent (#19788 ) Signed-off-by: Yu-Hang Tang <Tang.Maxin@gmail.com>	2025-06-18 22:56:03 -07:00
Maximilien de Bayser	799397ee4f	Support embedding models in V1 (#16188 ) Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Signed-off-by: Max de Bayser <maxdebayser@gmail.com> Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-06-18 21:36:33 -07:00
Chen Zhang	a89209b78d	[v1] Support mamba2 (#19327 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-06-18 20:34:15 +00:00
Isotr0py	ca94d7fa00	[Bugfix] Update multimodel models mapping to fit new checkpoint after Transformers v4.52 (#19151 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-06-17 15:58:38 +00:00
qscqesze	387bdf0ab9	[Model] Add support for MiniMaxM1ForCausalLM (shares architecture with MiniMaxText01ForCausalLM) (#19677 ) Signed-off-by: QscQ <qscqesze@gmail.com>	2025-06-16 09:47:14 -07:00
wang.yuqi	f40f763f12	[CI] Add mteb testing for rerank models (#19344 )	2025-06-16 01:36:43 -07:00
Ning Xie	2f1c19b245	[CI] change spell checker from codespell to typos (#18711 ) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-06-11 19:57:10 -07:00
Michael Goin	1e473b3010	[CI] Disable failing GGUF model test (#19454 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-06-11 05:12:38 +00:00
wang.yuqi	3952731e8f	[New Model]: Support Qwen3 Embedding & Reranker (#19260 )	2025-06-10 20:07:30 -07:00
Luka Govedič	2d8476e465	[BugFix][V1] Fix memory profiling bug (#18974 ) Signed-off-by: luka <luka@neuralmagic.com>	2025-06-07 10:34:51 -07:00
Isotr0py	d2f0e7e615	[CI/Build] Improve Llama GGUF test robustness (#19287 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-06-07 17:23:28 +08:00
Luis Vega	cb6d572e85	[Model] NemotronH support (#18863 ) Signed-off-by: Luis Vega <2478335+vegaluisjose@users.noreply.github.com> Co-authored-by: Luis Vega <2478335+vegaluisjose@users.noreply.github.com>	2025-06-05 21:29:28 +00:00
Cyrus Leung	01dc9a76db	[CI/Build][Bugfix] Ensure compatibility with transformers 4.52 (#18678 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-06-04 04:49:20 -07:00
wang.yuqi	35cf32df30	Improve the output precision of embedding models (#19092 )	2025-06-04 11:48:57 +00:00
Li, Jiang	4555143ea7	[CPU] V1 support for the CPU backend (#16441 )	2025-06-03 18:43:01 -07:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100 ) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
汪志鹏	1282bd812e	Add tarsier model support (#18985 ) Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>	2025-06-03 13:13:13 +08:00
Cyrus Leung	6aa8f9a4e7	[Core] Rework dtype resolution (#18751 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-06-01 11:04:23 +08:00
Shawn Huang	e1fadf1197	[Feature] minicpm eagle support (#18943 ) Signed-off-by: huangyuxiang03 <huangyx0321@gmail.com> Co-authored-by: huangyuxiang03 <huangyx0321@gmail.com>	2025-05-30 06:45:56 -07:00
Isotr0py	c9479b2920	[Bugfix] Fix the failing gte embedding test (#18720 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-05-29 07:39:25 -07:00
wang.yuqi	de65fc8e1e	[CI] improve embed testing (#18747 )	2025-05-28 00:16:35 -07:00
wang.yuqi	3e9ce609bd	[Bugfix] Fix nomic max_model_len (#18755 )	2025-05-27 20:29:53 -07:00
Isotr0py	1f1b1bc03b	[V1][Quantization] Add CUDA graph compatible v1 GGUF support (#18646 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Isotr0py <2037008807@qq.com>	2025-05-27 04:40:28 +00:00
Cyrus Leung	38b13dfe78	[CI/Build] Replace `math.isclose` with `pytest.approx` (#18703 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-26 02:05:17 -07:00
Cyrus Leung	fba0642704	[CI/Build][Doc] Update `gte-Qwen2-1.5B-instruct` usage (#18683 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com>	2025-05-25 20:27:50 -07:00
Cyrus Leung	57fd13a707	[Bugfix] Fix profiling dummy data for Pixtral (#18677 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-25 14:05:30 +00:00
Isotr0py	75f81750f3	[VLM] Initialize video input support for InternVL models (#18499 ) Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-05-25 04:51:25 +00:00
Yuanhao WU	a859320575	[Model] Add support for Qwen2.5-Omni-7B-AWQ (Qwen2_5OmniForConditionalGeneration) (#18647 )	2025-05-24 09:15:36 +00:00
Feng XiaoLong	4fc1bf813a	[Bugfix] Migrate to REGEX Library to prevent catastrophic backtracking (#18454 ) Signed-off-by: Crucifixion-Fxl <xmufxl@gmail.com> Co-authored-by: Crucifixion-Fxl <xmufxl@gmail.com>	2025-05-23 16:16:26 -07:00
Michael Goin	0ddf88e16e	[CI] Enable test_initialization to run on V1 (#16736 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-05-23 15:09:44 -07:00
Harry Mellor	4b0da7b60e	Enable hybrid attention models for Transformers backend (#18494 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-23 10:12:08 +08:00
Chenheli Hua	04eb88dc80	Re-submit: Fix: Proper RGBA -> RGB conversion for PIL images. (#18569 ) Signed-off-by: Chenheli Hua <huachenheli@outlook.com>	2025-05-23 01:59:18 +00:00
Harry Mellor	ca86a7cf6e	[CI/Build] Update bamba test model location (#18544 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-22 06:01:07 -07:00
Dhia Eddine Rhaiem	eca18691d2	[MODEL] FalconH1 (#18406 ) Signed-off-by: dhia.rhaiem <dhia.rhaiem@tii.ae> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Ilyas Chahed <ilyas.chahed@tii.ae> Co-authored-by: Jingwei Zuo <jingwei.zuo@tii.ae>	2025-05-21 04:59:06 -07:00
Michael Goin	f4a8a37465	[Minor] Rename quantization nvfp4 to modelopt_fp4 (#18356 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-05-20 09:08:37 -07:00
wang.yuqi	86847700d7	[CI] Add mteb testing to test the accuracy of the embedding model (#17175 )	2025-05-20 06:51:12 -07:00
Isotr0py	f07a673eb2	[Misc] Allow `AutoWeightsLoader` to skip loading weights with specific substr in name (#18358 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-05-19 20:20:12 -07:00
Isotr0py	390ec88905	[Misc] Consolidate Audio tests into multimodal common generation tests (#18214 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-05-16 09:18:08 +00:00
Alexei-V-Ivanov-AMD	566ec04c3d	Adding "Basic Models Test" and "Multi-Modal Models Test (Extended) 3" in AMD Pipeline (#18106 ) Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-05-15 08:49:23 -07:00
Cyrus Leung	d62a076e84	[Model] GritLM supports other attention backends (#18109 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-14 03:33:19 -07:00
rongfu.leng	82e7f9bb03	[Misc] replace does not exist model (#18119 ) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-05-14 02:13:47 -07:00

633 Commits