xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-22 14:16:14 +08:00

Author	SHA1	Message	Date
Harry Mellor	8fcaaf6a16	Update `Optional[x]` -> `x | None` and `Union[x, y]` to `x | y` (#26633 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-12 09:51:31 -07:00
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
Cyrus Leung	27d7638b94	[Bugfix] Merge MM embeddings by index instead of token IDs (#16229 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: NickLucche <nlucches@redhat.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: NickLucche <nlucches@redhat.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-09-27 08:15:12 +00:00
Woosuk Kwon	1c3ffdbecc	[V0 Deprecation] Remove V0 sampling metadata (#25345 ) Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>	2025-09-21 10:37:11 -07:00
Lukas Geiger	de533ab2a1	[Models] Improve iteration over layers (#19497 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-08-29 09:26:34 +08:00
Jee Jee Li	a7b8788d2c	[Misc] Modify the organization of GLM series (#22171 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-08-03 23:51:20 -07:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100 ) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
Harry Mellor	26d0419309	Update deprecated type hinting in `models` (#18132 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-14 22:06:50 -07:00
Woosuk Kwon	b411418ff0	[Chore] Remove Sampler from Model Code (#17084 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-04-24 02:49:33 -07:00
Kyle Sayers	82e7e19a6e	[SupportsQuant] Chameleon, Chatglm, Commandr (#15952 ) Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>	2025-04-03 08:25:22 -07:00
Jee Jee Li	91276c5721	[Model] Adding torch compile annotations to chatglm (#15624 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-03-28 21:14:09 +08:00
Cyrus Leung	f53a0586b9	[Bugfix] Fix prompt format of GLM4V (#14539 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-03-13 11:37:17 +00:00
Harry Mellor	cdc1fa12eb	Remove unused kwargs from model definitions (#13555 )	2025-02-24 17:13:52 -08:00
Jee Jee Li	105b8ce4c0	[Misc] Reduce LoRA-related static variable (#13166 )	2025-02-22 00:21:30 -08:00
Cyrus Leung	1bc3b5e71b	[VLM] Separate text-only and vision variants of the same model architecture (#13157 )	2025-02-13 06:19:15 -08:00
Cyrus Leung	51f0b5f7f6	[Bugfix] Clean up and fix multi-modal processors (#13012 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-02-10 10:45:21 +00:00
Jee Jee Li	86222a3dab	[VLM] Merged multi-modal processor for GLM4V (#12449 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-02-08 20:32:16 +00:00
Kyle Sayers	7ff7a638b6	[Model][Quant] Fix GLM, Fix fused module mappings for quantization (#12634 ) Signed-off-by: mgoin <michael@neuralmagic.com> Signed-off-by: Kyle Sayers <kylesayrs@gmail.com> Co-authored-by: mgoin <michael@neuralmagic.com>	2025-02-05 05:32:06 +00:00
Russell Bryant	e489ad7a21	[Misc] Add SPDX-License-Identifier headers to python source files (#12628 ) - Add SPDX license headers to python source files - Check for SPDX headers using pre-commit commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745 Author: Russell Bryant <rbryant@redhat.com> Date: Fri Jan 31 14:18:24 2025 -0500 Add SPDX license headers to python source files This commit adds SPDX license headers to python source files as recommended to the project by the Linux Foundation. These headers provide a concise way that is both human and machine readable for communicating license information for each source file. It helps avoid any ambiguity about the license of the code and can also be easily used by tools to help manage license compliance. The Linux Foundation runs license scans against the codebase to help ensure we are in compliance with the licenses of the code we use, including dependencies. Having these headers in place helps that tool do its job. More information can be found on the SPDX site: - https://spdx.dev/learn/handling-license-info/ Signed-off-by: Russell Bryant <rbryant@redhat.com> commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea Author: Russell Bryant <rbryant@redhat.com> Date: Fri Jan 31 14:36:32 2025 -0500 Check for SPDX headers using pre-commit Signed-off-by: Russell Bryant <rbryant@redhat.com> --------- Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-02-02 11:58:18 -08:00
Isotr0py	edaae198e7	[Misc] Add BNB support to GLM4-V model (#12184 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-01-19 19:49:22 +08:00
sixgod	4b657d3292	[Model] Add cogagent model support vLLM (#11742 ) Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com>	2025-01-11 19:05:56 +00:00
Cyrus Leung	8d9b6721e7	[VLM] Abstract out multi-modal data parsing in merged processor (#11620 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-30 15:01:35 +00:00
Roger Wang	2f0a0a17a4	[V1] Refactor model executable interface for multimodal models (#10570 ) Signed-off-by: Roger Wang <ywang@roblox.com>	2024-11-26 20:46:11 +00:00
Jee Jee Li	1700c543a5	[Bugfix] Fix LoRA weight sharding (#10450 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-11-23 17:23:17 -08:00
youkaichao	eebad39f26	[torch.compile] support all attention backends (#10558 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-22 14:04:42 -08:00
Jee Jee Li	382b6a4852	[Misc] Avoid misleading warning messages (#10438 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2024-11-19 08:54:58 +00:00
B-201	5be4e52b65	[Model][LoRA]LoRA support added for glm-4v (#10418 ) Signed-off-by: B-201 <Joy25810@foxmail.com>	2024-11-18 12:57:10 +00:00
Isotr0py	c4e464333e	[Misc] Add uninitialized params tracking for `AutoWeightsLoader` (#10327 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-11-18 09:07:46 +08:00
youkaichao	504ac53d18	[misc] error early for old-style class (#10304 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-13 18:55:39 -08:00
Cyrus Leung	0b8bb86bf1	[1/N] Initial prototype for multi-modal processor (#10044 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-13 12:39:03 +00:00
youkaichao	f89d18ff74	[6/N] pass whole config to inner model (#10205 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-11 06:41:46 +00:00
youkaichao	1a95f10ee7	[5/N] pass the whole config to model (#9983 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-09 14:17:28 +08:00
Cyrus Leung	e0191a95d8	[0/N] Rename `MultiModalInputs` to `MultiModalKwargs` (#10040 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-09 11:31:02 +08:00
Joe Runde	d58268c56a	[V1] Make v1 more testable (#9888 ) Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>	2024-11-06 11:57:35 -08:00
Aaron Pham	21063c11c7	[CI/Build] drop support for Python 3.8 EOL (#8464 ) Signed-off-by: Aaron Pham <contact@aarnphm.xyz>	2024-11-06 07:11:55 +00:00
zifeitong	43300bd98a	[Bugfix] Properly propagate trust_remote_code settings (#10047 ) Signed-off-by: Zifei Tong <zifeitong@gmail.com>	2024-11-05 16:34:40 -08:00
youkaichao	74b529ceee	[bugfix] fix chatglm dummy_data_for_glmv (#9955 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-02 08:03:33 -07:00
Cyrus Leung	836e8ef6ee	[Bugfix] Fix PP for ChatGLM and Molmo (#9422 )	2024-10-24 06:12:05 +00:00
Cyrus Leung	cee711fdbb	[Core] Rename input data types (#8688 )	2024-10-16 10:49:37 +00:00
sixgod	6cf1167c1a	[Model] Add GLM-4v support and meet vllm==0.6.2 (#9242 )	2024-10-11 17:36:13 +00:00
Murali Andoorveedu	0f6d7a9a34	[Models] Add remaining model PP support (#7168 ) Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai> Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-04 10:56:58 +08:00
afeldman-nm	428dd1445e	[Core] Logprobs support in Multi-step (#7652 )	2024-08-29 19:19:08 -07:00
Zijian Hu	f4fc7337bf	[Bugfix] support `tie_word_embeddings` for all models (#5724 )	2024-08-19 20:00:04 -07:00
Cyrus Leung	7025b11d94	[Bugfix] Fix weight loading for Chameleon when TP>1 (#7410 )	2024-08-13 05:33:41 +00:00
Qubitium-ModelCloud	ee93f4f92a	[CORE] Quantized lm-head Framework (#4442 ) Co-authored-by: Robert Shaw <rshaw@neuralmagic.com> Co-authored-by: ZX <zx@lbx.dev>	2024-07-02 22:25:17 +00:00
Murali Andoorveedu	c5832d2ae9	[Core] Pipeline Parallel Support (#4412 ) Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>	2024-07-02 10:58:08 -07:00
Cyrus Leung	98cf2ed678	[Model][Bugfix] Implicit model flags and reenable Phi-3-Vision (#5896 )	2024-06-27 09:08:10 -07:00
Cyrus Leung	96354d6a29	[Model] Add base class for LoRA-supported models (#5018 )	2024-06-27 16:03:04 +08:00
Cody Yu	a3a73ab069	[Misc] Load FP8 kv-cache scaling factors from checkpoints (#4893 ) The 2nd PR for #4532. This PR supports loading FP8 kv-cache scaling factors from a FP8 checkpoint (with .kv_scale parameter).	2024-05-22 13:28:20 -07:00
SangBin Cho	2e9a2227ec	[Lora] Support long context lora (#4787 ) Currently we need to call rotary embedding kernel for each LoRA, which makes it hard to serve multiple long context length LoRA. Add batched rotary embedding kernel and pipe it through. It replaces the rotary embedding layer to the one that is aware of multiple cos-sin-cache per scaling factors. Follow up of https://github.com/vllm-project/vllm/pull/3095/files	2024-05-18 16:05:23 +09:00

69 Commits