xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-17 10:15:50 +08:00

Author	SHA1	Message	Date
Russell Bryant	776dbd74f1	[CI/Build] mypy: Resolve some errors from checking vllm/engine (#9267 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2024-10-16 22:55:59 +00:00
Junhao Li	5b8a1fde84	[Model][Bugfix] Add FATReLU activation and support for openbmb/MiniCPM-S-1B-sft (#9396 )	2024-10-16 16:40:24 +00:00
Mor Zusman	fb60ae9b91	[Kernel][Model] Improve continuous batching for Jamba and Mamba (#9189 )	2024-10-16 12:12:43 -04:00
Isotr0py	cf1d62a644	[Model] Support SDPA attention for Molmo vision backbone (#9410 )	2024-10-16 11:52:01 +00:00
Cyrus Leung	cee711fdbb	[Core] Rename input data types (#8688 )	2024-10-16 10:49:37 +00:00
Cyrus Leung	7abba39ee6	[Model] VLM2Vec, the first multimodal embedding model in vLLM (#9303 )	2024-10-16 14:31:00 +08:00
Cyrus Leung	7e7eae338d	[Misc] Standardize RoPE handling for Qwen2-VL (#9250 )	2024-10-16 13:56:17 +08:00
Reza Salehi	ed920135c8	[Bugfix] Molmo text-only input bug fix (#9397 ) Co-authored-by: sanghol <sanghol@allenai.org> Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com> Co-authored-by: Roger Wang <ywang@roblox.com>	2024-10-16 04:56:09 +00:00
Michael Goin	22f8a69549	[Misc] Directly use compressed-tensors for checkpoint definitions (#8909 ) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-15 15:40:25 -07:00
hhzhang16	55e081fbad	[Bugfix] Update InternVL input mapper to support image embeds (#9351 )	2024-10-14 21:29:19 -07:00
Tyler Michael Smith	169b530607	[Bugfix] Clean up some cruft in mamba.py (#9343 )	2024-10-15 00:24:25 +00:00
Xiang Xu	f0fe4fe86d	[Model] Make llama3.2 support multiple and interleaved images (#9095 )	2024-10-14 15:24:26 -07:00
Reza Salehi	dfe43a2071	[Model] Molmo vLLM Integration (#9016 ) Co-authored-by: sanghol <sanghol@allenai.org> Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com> Co-authored-by: Roger Wang <ywang@roblox.com>	2024-10-14 07:56:24 -07:00
Jee Jee Li	250e26a63e	[Bugfix]Fix MiniCPM's LoRA bug (#9286 )	2024-10-12 09:36:47 -07:00
sixgod	6cf1167c1a	[Model] Add GLM-4v support and meet vllm==0.6.2 (#9242 )	2024-10-11 17:36:13 +00:00
Burkhard Ringlein	f710090d8e	[Kernel] adding fused moe kernel config for L40S TP4 (#9245 )	2024-10-11 08:54:22 -07:00
Tyler Michael Smith	7342a7d7f8	[Model] Support Mamba (#6484 )	2024-10-11 15:40:06 +00:00
Cyrus Leung	e808156f30	[Misc] Collect model support info in a single process per model (#9233 )	2024-10-11 11:08:11 +00:00
youkaichao	cbc2ef5529	[misc] hide best_of from engine (#9261 ) Co-authored-by: Brendan Wong <bjwpokemon@gmail.com>	2024-10-10 21:30:44 -07:00
youkaichao	e00c094f15	[torch.compile] generic decorators (#9258 )	2024-10-10 15:54:23 -07:00
youkaichao	e4d652ea3e	[torch.compile] integration with compilation control (#9058 )	2024-10-10 12:39:36 -07:00
whyiug	04de9057ab	[Model] support input image embedding for minicpmv (#9237 )	2024-10-10 15:00:47 +00:00
Isotr0py	07c11cf4d4	[Bugfix] Fix lm_head weights tying with lora for llama (#9227 )	2024-10-10 21:11:56 +08:00
youkaichao	de895f1697	[misc] improve model support check in another process (#9208 )	2024-10-09 21:58:27 -07:00
Li, Jiang	ca77dd7a44	[Hardware][CPU] Support AWQ for CPU backend (#7515 )	2024-10-09 10:28:08 -06:00
Cyrus Leung	8bfaa4e31e	[Bugfix] fix composite weight loading and EAGLE weight loading (#9160 )	2024-10-09 00:36:55 -07:00
Hui Liu	cdc72e3c80	[Model] Remap FP8 kv_scale in CommandR and DBRX (#9174 )	2024-10-09 06:43:06 +00:00
chenqianfzh	2f4117c38e	support bitsandbytes quantization with more models (#9148 )	2024-10-08 19:52:19 -06:00
Cyrus Leung	151ef4efd2	[Model] Support NVLM-D and fix QK Norm in InternViT (#9045 ) Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2024-10-07 11:55:12 +00:00
Isotr0py	f19da64871	[Core] Refactor GGUF parameters packing and forwarding (#8859 )	2024-10-07 10:01:46 +00:00
Cyrus Leung	8c6de96ea1	[Model] Explicit interface for vLLM models and support OOT embedding models (#9108 )	2024-10-07 06:10:35 +00:00
youkaichao	18b296fdb2	[core] remove beam search from the core (#9105 )	2024-10-07 05:47:04 +00:00
Cyrus Leung	b22b798471	[Model] PP support for embedding models and update docs (#9090 ) Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>	2024-10-06 16:35:27 +08:00
Xin Yang	15986f598c	[Model] Support Gemma2 embedding model (#9004 )	2024-10-05 06:57:05 +00:00
hhzhang16	53b3a33027	[Bugfix] Fixes Phi3v & Ultravox Multimodal EmbeddingInputs (#8979 )	2024-10-04 22:05:37 -07:00
Chongming Ni	cc90419e89	[Hardware][Neuron] Add on-device sampling support for Neuron (#8746 ) Co-authored-by: Ashraf Mahgoub <ashymahg@amazon.com>	2024-10-04 16:42:20 -07:00
ElizaWszola	05d686432f	[Kernel] Zero point support in fused MarlinMoE kernel + AWQ Fused MoE (#8973 ) Co-authored-by: Dipika <dipikasikka1@gmail.com> Co-authored-by: Dipika Sikka <ds3822@columbia.edu>	2024-10-04 12:34:44 -06:00
Roger Wang	26aa325f4f	[Core][VLM] Test registration for OOT multimodal models (#8717 ) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-04 10:38:25 -07:00
Prashant Gupta	9ade8bbc8d	[Model] add a bunch of supported lora modules for mixtral (#9008 ) Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>	2024-10-04 16:24:40 +00:00
whyiug	3d826d2c52	[Bugfix] Reshape the dimensions of the input image embeddings in Qwen2VL (#9071 )	2024-10-04 14:34:58 +00:00
Cyrus Leung	0e36fd4909	[Misc] Move registry to its own file (#9064 )	2024-10-04 10:01:37 +00:00
Murali Andoorveedu	0f6d7a9a34	[Models] Add remaining model PP support (#7168 ) Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai> Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-04 10:56:58 +08:00
Domen Vreš	2838d6b38e	[Bugfix] Weight loading fix for OPT model (#9042 ) Co-authored-by: dvres <dvres@fri.uni-lj.si>	2024-10-03 19:53:29 -04:00
Divakar Verma	01843c89b8	[Misc] log when using default MoE config (#8971 )	2024-10-03 04:31:07 +00:00
Shawn Tan	19f0d25796	[Model] Adding Granite MoE. (#8206 ) Co-authored-by: Nick Hill <nickhill@us.ibm.com>	2024-10-03 09:33:57 +08:00
Sergey Shlyapnikov	f58d4fccc9	[OpenVINO] Enable GPU support for OpenVINO vLLM backend (#8192 )	2024-10-02 17:50:01 -04:00
Lily Liu	1570203864	[Spec Decode] (1/2) Remove batch expansion (#8839 )	2024-10-01 16:04:42 -07:00
Cyrus Leung	4f341bd4bf	[Doc] Update list of supported models (#8987 )	2024-10-02 00:35:39 +08:00
Alex Brooks	1fe0a4264a	[Bugfix] Fix Token IDs Reference for MiniCPM-V When Images are Provided With No Placeholders (#8991 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2024-10-01 09:52:44 +00:00
Isotr0py	bc4eb65b54	[Bugfix] Fix Fuyu tensor parallel inference (#8986 )	2024-10-01 17:51:41 +08:00

837 Commits