xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-21 07:55:01 +08:00

Author	SHA1	Message	Date
fxmarty-amd	7e0b121812	[Bugfix] Add missing `packed_modules_mapping` to `DeepseekV2ForCausalLM` (#22352 ) Signed-off-by: Felix Marty <Felix.Marty@amd.com>	2025-08-07 06:30:48 -07:00
Jee Jee Li	fc91da5499	[Model] Remove DSV2 unused code (#21903 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-07-30 00:55:03 -07:00
Chendi.Xue	08d2bd78da	[BUGFIX] deepseek-v2-lite failed due to fused_qkv_a_proj name update (#21414 ) Signed-off-by: Chendi.Xue <chendi.xue@intel.com>	2025-07-22 20:33:57 -07:00
Mickaël Seznec	4fb56914c5	[perf] Add fused MLA QKV + strided layernorm (#21116 ) Signed-off-by: Mickael Seznec <mickael@mistral.ai> Co-authored-by: mgoin <mgoin64@gmail.com>	2025-07-22 07:07:44 -07:00
Rui Qiao	217937221b	Elastic Expert Parallel Initial Support (#20775 ) Signed-off-by: Rui Qiao <ruisearch42@gmail.com>	2025-07-18 17:46:09 -07:00
Seiji Eicher	ad6c2e1a0b	Correct PPMissingLayer handling in Deepseek-V2-Lite PP deployment (#20665 ) Signed-off-by: Seiji Eicher <seiji@anyscale.com>	2025-07-09 20:34:40 -07:00
Chendi.Xue	5a52f389dd	[BUGFIX][DEEPSEEK][MODEL_LOAD] fix w13, w2 weight not initialized assert (#20202 ) Signed-off-by: Chendi Xue <chendi.xue@intel.com>	2025-06-29 19:46:19 -07:00
Bowen Wang	e9fd658a73	[Feature] Expert Parallelism Load Balancer (EPLB) (#18343 ) Signed-off-by: Bowen Wang <abmfy@icloud.com>	2025-06-26 15:30:21 -07:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100 ) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
Harry Mellor	26d0419309	Update deprecated type hinting in `models` (#18132 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-14 22:06:50 -07:00
bnellnm	f9c069c85e	Modularize fused experts and integrate PPLX kernels (#15956 )	2025-05-14 13:11:54 -07:00
Lucas Wilkinson	5e6f939484	[Attention] MLA move rotary embedding to cuda-graph region (#17668 ) Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>	2025-05-09 11:14:42 +08:00
Lucas Wilkinson	afcb3f8863	[Attention] MLA move o_proj q_proj into cuda-graph region (#17484 ) Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>	2025-05-02 03:16:26 +00:00
Woosuk Kwon	b411418ff0	[Chore] Remove Sampler from Model Code (#17084 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-04-24 02:49:33 -07:00
DefTruth	905e91e9ac	Revert "[Model] use AutoWeightsLoader for deepseek_v2, internlm2" (#16453 )	2025-04-11 06:44:22 +00:00
Aaron Ang	a9bd832fc5	[Model] use AutoWeightsLoader for deepseek_v2, internlm2 (#16383 ) Signed-off-by: Aaron Ang <aaron.angyd@gmail.com>	2025-04-09 23:01:00 -07:00
Jinzhen Lin	db10422184	[Bugfix] fix deepseek fp16 scale bug (#14809 ) Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com> Co-authored-by: mgoin <mgoin64@gmail.com>	2025-04-08 16:56:09 -04:00
billishyahao	742369d35a	[Frontend][Bugfix] support prefill decode disaggregation on deepseek (#14824 ) Signed-off-by: billishyahao <bill.he@amd.com> Co-authored-by: Zhai Feiyue <80079571+ZhaiFeiyue@users.noreply.github.com>	2025-03-20 00:00:33 -07:00
Concurrensee	c982ac5722	[Bugfix] Fix FP16 overflow for DeepSeek V2 (#13232 ) Signed-off-by: Yida Wu <yida.wu@amd.com>	2025-03-10 20:46:59 -07:00
Tyler Michael Smith	4f5b059f14	Clean up unused padding_idx variables across many model definitions (#13240 ) Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-03-04 21:27:00 +00:00
Michael Goin	2b04c209ee	[Bugfix] Allow shared_experts skip quantization for DeepSeekV2/V3 (#14100 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-03-03 14:20:24 -07:00
wang.yuqi	e584b85afd	[Misc] duplicate code in deepseek_v2 (#14106 )	2025-03-03 14:10:11 +08:00
Yang Chen	58d1b2aa77	[Attention] MLA support for V1 (#13789 ) Signed-off-by: Yang Chen <yangche@fb.com>	2025-02-27 13:14:17 -05:00
Harry Mellor	24679788ed	DeepSeek V2/V3/R1 only place `lm_head` on last pp rank (#13833 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-02-26 01:24:57 +00:00
Lucas Wilkinson	4a8cfc7551	[Bugfix] Fix deepseek-v2 error: "missing 1 required positional argument: 'residual'" (#13802 )	2025-02-24 20:33:59 -08:00
Harry Mellor	cdc1fa12eb	Remove unused kwargs from model definitions (#13555 )	2025-02-24 17:13:52 -08:00
Jongseok Park	781096e385	Expert Parallelism (EP) Support for DeepSeek V2 (#12583 )	2025-02-24 07:33:20 -08:00
Lucia Fang	f525c0be8b	[Model][Speculative Decoding] DeepSeek MTP spec decode (#12755 ) Signed-off-by: Lu Fang <fanglu@fb.com> Co-authored-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>	2025-02-19 17:06:23 +08:00
Szymon Ożóg	aa375dca9f	[Bugfix] Missing quant_config in deepseek embedding layer (#12836 )	2025-02-06 21:35:09 -08:00
Isotr0py	85ac82d228	[Kernel] Make rotary_embedding ops more flexible with input shape (#12777 )	2025-02-06 08:46:13 -08:00
Michael Goin	449d1bce02	[Misc] Remove duplicated DeepSeek V2/V3 model definition (#12793 )	2025-02-05 23:16:20 -08:00
Isotr0py	98fd089fc9	[VLM] Add MLA with pure RoPE support for deepseek-vl2 models (#12729 )	2025-02-04 20:44:26 -08:00
Russell Bryant	e489ad7a21	[Misc] Add SPDX-License-Identifier headers to python source files (#12628 ) Adds SPDX license headers to Python source files, as recommended to the project by the Linux Foundation, and checks for those headers using pre-commit (from commits 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745 and 5a1cf1cb3b80759131c73f6a9dddebccac039dea, both by Russell Bryant, dated 2025-01-31). The headers give a concise, human- and machine-readable way to communicate the license of each source file; this avoids ambiguity about the license of the code and lets tools manage license compliance. The Linux Foundation runs license scans against the codebase, including dependencies, and the headers help that tooling do its job. More information: https://spdx.dev/learn/handling-license-info/ Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-02-02 11:58:18 -08:00
Lucas Wilkinson	cabaf4eff3	[Attention] MLA decode optimizations (#12528 ) Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com> Signed-off-by: simon-mo <xmo@berkeley.edu> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Co-authored-by: simon-mo <simon.mo@hey.com> Co-authored-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Zhuohan Li <zhuohan123@gmail.com> Co-authored-by: Tyler Michael Smith <tysmith@redhat.com> Co-authored-by: Alexander Matveev <59768536+alexm-neuralmagic@users.noreply.github.com> Co-authored-by: simon-mo <xmo@berkeley.edu>	2025-01-30 23:49:37 -08:00
Isotr0py	dd7c9ad870	[Bugfix] Remove hardcoded `head_size=256` for Deepseek v2 and v3 (#12067 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-01-16 10:11:54 +00:00
Concurrensee	cf6bbcb493	[Misc] Fix Deepseek V2 fp8 kv-scale remapping (#11947 ) Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu>	2025-01-12 23:05:06 -08:00
Isotr0py	f967e51f38	[Model] Initialize support for Deepseek-VL2 models (#11578 ) Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-01-12 00:17:24 -08:00
youkaichao	eebad39f26	[torch.compile] support all attention backends (#10558 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-22 14:04:42 -08:00
Isotr0py	c4e464333e	[Misc] Add uninitialized params tracking for `AutoWeightsLoader` (#10327 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-11-18 09:07:46 +08:00
Roger Wang	643ecf7b11	[V1] Refactor model executable interface for all text-only language models (#10374 ) Signed-off-by: Roger Wang <ywang@roblox.com>	2024-11-17 05:18:46 +00:00
youkaichao	f89d18ff74	[6/N] pass whole config to inner model (#10205 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-11 06:41:46 +00:00
youkaichao	1a95f10ee7	[5/N] pass the whole config to model (#9983 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-09 14:17:28 +08:00
Joe Runde	d58268c56a	[V1] Make v1 more testable (#9888 ) Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>	2024-11-06 11:57:35 -08:00
Aaron Pham	21063c11c7	[CI/Build] drop support for Python 3.8 EOL (#8464 ) Signed-off-by: Aaron Pham <contact@aarnphm.xyz>	2024-11-06 07:11:55 +00:00
youkaichao	76ed5340f0	[torch.compile] add deepseek v2 compile (#9775 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-10-28 14:35:17 -07:00
Cyrus Leung	7e7eae338d	[Misc] Standardize RoPE handling for Qwen2-VL (#9250 )	2024-10-16 13:56:17 +08:00
Murali Andoorveedu	0f6d7a9a34	[Models] Add remaining model PP support (#7168 ) Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai> Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-04 10:56:58 +08:00
afeldman-nm	428dd1445e	[Core] Logprobs support in Multi-step (#7652 )	2024-08-29 19:19:08 -07:00
Dipika Sikka	d3bdfd3ab9	[Misc] Update Fused MoE weight loading (#7334 )	2024-08-13 14:57:45 -04:00
Cyrus Leung	7025b11d94	[Bugfix] Fix weight loading for Chameleon when TP>1 (#7410 )	2024-08-13 05:33:41 +00:00

105 Commits