xinyun/vllm - vllm - 丝路新云 (Silk Road New Cloud) Code Repository

Mirror of https://git.datalinker.icu/vllm-project/vllm.git, last synced 2026-01-12 00:14:42 +08:00

Author	SHA1	Message	Date
Cyrus Leung	3f3e92e1f2	[Model] Automatic conversion of classification and reward models (#11469 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-24 18:22:22 +00:00
Jee Jee Li	196c34b0ac	[Misc] Move weights mapper (#11443 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2024-12-24 13:05:25 +00:00
Jee Jee Li	b1b1038fbd	[Bugfix] Fix Qwen2-VL LoRA weight loading (#11430 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2024-12-24 09:56:10 +00:00
Michael Goin	60fb4f3bcf	[Bugfix] Add kv cache scales to gemma2.py (#11269 )	2024-12-23 19:30:45 +00:00
Dipika Sikka	b866cdbd05	[Misc] Add assertion and helpful message for marlin24 compressed models (#11388 )	2024-12-24 02:23:38 +08:00
Michael Goin	5bfb30a529	[Bugfix] Fix CFGGuide and use outlines for grammars that can't convert to GBNF (#11389 ) Signed-off-by: mgoin <michael@neuralmagic.com>	2024-12-23 23:06:20 +08:00
Roger Wang	c2d1b075ba	[Bugfix] Fix issues for `Pixtral-Large-Instruct-2411` (#11393 ) Signed-off-by: ywang96 <ywang@example.com> Co-authored-by: ywang96 <ywang@example.com>	2024-12-21 10:15:03 +00:00
George	51ff216d85	[Bugfix] update should_ignore_layer (#11354 ) Signed-off-by: George Ohashi <george@neuralmagic.com>	2024-12-21 06:36:23 +00:00
omer-dayan	995f56236b	[Core] Loading model from S3 using RunAI Model Streamer as optional loader (#10192 ) Signed-off-by: OmerD <omer@run.ai>	2024-12-20 16:46:24 +00:00
Wallas Henrique	86c2d8fd1c	[Bugfix] Fix spec decoding when seed is none in a batch (#10863 ) Signed-off-by: Wallas Santos <wallashss@ibm.com>	2024-12-20 05:15:31 +00:00
Isotr0py	276738ce0f	[Bugfix] Fix broken CPU compressed-tensors test (#11338 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-12-19 17:37:31 +00:00
Isotr0py	e24113a8fe	[Model] Refactor Qwen2-VL to use merged multimodal processor (#11258 ) Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-19 16:28:00 +00:00
Roger Wang	7379b3d4b2	[V1] Fix multimodal profiling for `Molmo` (#11325 ) Signed-off-by: ywang96 <ywang@example.com> Co-authored-by: ywang96 <ywang@example.com>	2024-12-19 16:27:22 +00:00
Yehoshua Cohen	6c7f881541	[Model] Add JambaForSequenceClassification model (#10860 ) Signed-off-by: Yehoshua Cohen <yehoshuaco@ai21.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: Yehoshua Cohen <yehoshuaco@ai21.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-19 22:48:06 +08:00
Cyrus Leung	a0f7d53beb	[Bugfix] Cleanup Pixtral HF code (#11333 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-19 13:22:00 +00:00
Cyrus Leung	6142ef0ada	[VLM] Merged multimodal processor for Qwen2-Audio (#11303 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-19 06:14:17 +00:00
Michael Goin	a30482f054	[CI] Expand test_guided_generate to test all backends (#11313 ) Signed-off-by: mgoin <michael@neuralmagic.com>	2024-12-19 04:00:38 +00:00
Tyler Michael Smith	5a9da2e6e9	[Bugfix][Build/CI] Fix sparse CUTLASS compilation on CUDA [12.0, 12.2) (#11311 ) Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>	2024-12-19 02:43:30 +00:00
Isotr0py	996aa70f00	[Bugfix] Fix broken phi3-v mm_processor_kwargs tests (#11263 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-12-18 10:16:40 -08:00
Dipika Sikka	60508ffda9	[Kernel]: Cutlass 2:4 Sparsity + FP8/Int8 Quant Support (#10995 ) Co-authored-by: Faraz Shahsavan <faraz.shahsavan@gmail.com> Co-authored-by: ilmarkov <markovilya197@gmail.com> Co-authored-by: Rahul Tuli <rahul@neuralmagic.com> Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>	2024-12-18 09:57:16 -05:00
Wallas Henrique	8b79f9e107	[Bugfix] Fix guided decoding with tokenizer mode mistral (#11046 )	2024-12-17 22:34:08 -08:00
Roger Wang	59c9b6ebeb	[V1][VLM] Proper memory profiling for image language models (#11210 ) Signed-off-by: Roger Wang <ywang@roblox.com> Co-authored-by: ywang96 <ywang@example.com>	2024-12-16 22:10:57 -08:00
Isotr0py	d927dbcd88	[Model] Refactor Ultravox to use merged input processor (#11198 ) Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-12-16 10:09:53 +00:00
Jani Monoses	bddbbcb132	[Model] Support Cohere2ForCausalLM (Cohere R7B) (#11203 )	2024-12-16 09:56:19 +00:00
Cyrus Leung	96d673e0f8	[Bugfix] Fix error handling of unsupported sliding window (#11213 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-15 10:59:42 -07:00
Jee Jee Li	15859f2357	[Misc] Upgrade bitsandbytes to the latest version 0.45.0 (#11201 )	2024-12-15 03:03:06 +00:00
Cyrus Leung	93abf23a64	[VLM] Fully dynamic prompt replacement in merged input processor (#11199 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-14 17:52:18 +00:00
Russell Bryant	48259264a4	[Core] Update outlines and increase its threadpool size (#11140 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2024-12-14 07:46:18 +00:00
Roger Wang	969da7d70b	[V1][VLM] Fix edge case bug for InternVL2 (#11165 ) Signed-off-by: Roger Wang <ywang@roblox.com>	2024-12-13 11:09:30 +00:00
Cyrus Leung	eeec9e3390	[Frontend] Separate pooling APIs in offline inference (#11129 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-13 10:40:07 +00:00
Jani Monoses	7cd7409142	PaliGemma 2 support (#11142 )	2024-12-13 07:40:07 +00:00
youkaichao	be39e3cd18	[core] clean up cudagraph batchsize padding logic (#10996 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-12-13 06:57:50 +00:00
Dipika Sikka	3989a79824	[Bugfix] Update starcoder2 to remap k/v scale names for kv_cache quantization (#11148 )	2024-12-13 05:07:20 +00:00
Pooya Davoodi	1efce68605	[Bugfix] Use runner_type instead of task in GritLM (#11144 ) Signed-off-by: Pooya Davoodi <pooya.davoodi@parasail.io>	2024-12-13 04:09:53 +00:00
Cody Yu	2c97eca1ff	[Misc] Validate grammar and fail early (#11119 )	2024-12-12 18:34:26 +00:00
Jeff Cook	5d712571af	[Bugfix] Quick fix to make Pixtral-HF load correctly again after 39e227c7ae. (#11024 )	2024-12-12 18:09:20 +00:00
Pooya Davoodi	1da8f0e1dd	[Model] Add support for embedding model GritLM (#10816 ) Signed-off-by: Pooya Davoodi <pooya.davoodi@parasail.io>	2024-12-12 06:39:16 +00:00
Cyrus Leung	8f10d5e393	[Misc] Split up pooling tasks (#10820 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-11 01:28:00 -08:00
B-201	2e32f5d28d	[Bugfix] Fix Idefics3 fails during multi-image inference (#11080 ) Signed-off-by: B-201 <Joy25810@foxmail.com>	2024-12-11 01:27:07 -08:00
Kevin H. Luu	9974fca047	[ci/build] Fix entrypoints test and pin outlines version (#11088 )	2024-12-11 01:01:53 -08:00
Mor Zusman	ffa48c9146	[Model] PP support for Mamba-like models (#10992 ) Signed-off-by: mzusman <mor.zusmann@gmail.com>	2024-12-10 21:53:37 -05:00
Russell Bryant	e739194926	[Core] Update to outlines >= 0.1.8 (#10576 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2024-12-10 12:08:16 -08:00
Jeff Cook	e35879c276	[Bugfix] Fix xgrammar failing to read a vocab_size from LlavaConfig on PixtralHF. (#11043 )	2024-12-10 14:54:22 +08:00
Tyler Michael Smith	28b3a1c7e5	[V1] Multiprocessing Tensor Parallel Support for v1 (#9856 ) Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>	2024-12-10 06:28:14 +00:00
Patrick von Platen	bc192a2b09	[Pixtral] Improve loading (#11040 )	2024-12-10 06:09:32 +00:00
Isotr0py	d1f6d1c8af	[Model] Add has_weight to RMSNorm and re-enable weights loading tracker for Mamba (#10739 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-12-10 10:23:07 +08:00
Isotr0py	a811dd6608	[Model] merged input processor for Phi-3-Vision models (#10977 ) Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-12-09 12:55:10 -08:00
Roger Wang	a11f326528	[V1] Initial support of multimodal models for V1 re-arch (#10699 ) Signed-off-by: Roger Wang <ywang@roblox.com>	2024-12-08 12:50:51 +00:00
Cyrus Leung	c889d5888b	[Doc] Explicitly state that PP isn't compatible with speculative decoding yet (#10975 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-07 17:20:49 +00:00
Cyrus Leung	39e227c7ae	[Model] Update multi-modal processor to support Mantis(LLaVA) model (#10711 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-07 17:10:05 +00:00

1101 Commits