xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-08-04 01:37:03 +08:00

Author	SHA1	Message	Date
Gregory Shtrasberg	e97f802b2d	[FP8][Kernel] Dynamic kv cache scaling factors computation (#11906 ) Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com> Co-authored-by: Micah Williamson <micah.williamson@amd.com>	2025-01-23 18:04:03 +00:00
Kevin H. Luu	64ea24d0b3	[ci/lint] Add back default arg for pre-commit (#12279 ) Signed-off-by: kevin <kevin@anyscale.com>	2025-01-22 01:15:27 +00:00
Cyrus Leung	df76e5af26	[VLM] Simplify post-processing of replacement info (#12269 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-21 16:48:13 -08:00
Nicolò Lucchesi	5fe6bf29d6	[BugFix] Fix GGUF tp>1 when vocab_size is not divisible by 64 (#12230 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-01-21 12:23:14 +08:00
Cyrus Leung	18572e3384	[Bugfix] Fix `HfExampleModels.find_hf_info` (#12223 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-20 15:35:36 +00:00
Cyrus Leung	b37d82791e	[Model] Upgrade Aria to transformers 4.48 (#12203 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-20 17:58:48 +08:00
Cyrus Leung	59a0192fb9	[Core] Interface for accessing model from `VllmRunner` (#10353 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-20 15:00:59 +08:00
Isotr0py	83609791d2	[Model] Add Qwen2 PRM model support (#12202 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-01-20 14:59:46 +08:00
Martin Gleize	bbe5f9de7d	[Model] Support for fairseq2 Llama (#11442 ) Signed-off-by: Martin Gleize <mgleize@meta.com> Co-authored-by: mgleize user <mgleize@a100-st-p4de24xlarge-4.fair-a100.hpcaas>	2025-01-19 10:40:40 -08:00
Roger Wang	81763c58a0	[V1] Add V1 support of Qwen2-VL (#12128 ) Signed-off-by: Roger Wang <ywang@roblox.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: imkero <kerorek@outlook.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-19 19:52:13 +08:00
Isotr0py	02798ecabe	[Model] Port deepseek-vl2 processor, remove dependency (#12169 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-01-18 13:59:39 +08:00
Isotr0py	62b06ba23d	[Model] Add support for deepseek-vl2-tiny model (#12068 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-01-16 17:14:48 +00:00
Roger Wang	874f7c292a	[Bugfix] Fix max image feature size for Llava-one-vision (#12104 ) Signed-off-by: Roger Wang <ywang@roblox.com>	2025-01-16 14:54:06 +00:00
RunningLeon	97eb97b5a4	[Model]: Support internlm3 (#12037 )	2025-01-15 11:35:17 +00:00
Isotr0py	d14e98d924	[Model] Support GGUF models newly added in `transformers` 4.46.0 (#9685 ) Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-01-13 00:13:44 +00:00
Isotr0py	f967e51f38	[Model] Initialize support for Deepseek-VL2 models (#11578 ) Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-01-12 00:17:24 -08:00
Nicolò Lucchesi	d697dc01b4	[Bugfix] Fix RobertaModel loading (#11940 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-01-11 14:05:09 +00:00
Cyrus Leung	a991f7d508	[Doc] Basic guide for writing unit tests for new models (#11951 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-11 21:27:24 +08:00
Cyrus Leung	7a3a83e3b8	[CI/Build] Move model-specific multi-modal processing tests (#11934 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-11 13:50:05 +08:00
Li, Jiang	aa1e77a19c	[Hardware][CPU] Support MOE models on x86 CPU (#11831 ) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-01-10 11:07:58 -05:00
Harry Mellor	d85c47d6ad	Replace "online inference" with "online serving" (#11923 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-01-10 12:05:56 +00:00
Maximilien de Bayser	1fe554bac3	treat do_lower_case in the same way as the sentence-transformers library (#11815 ) Signed-off-by: Max de Bayser <mbayser@br.ibm.com>	2025-01-09 11:05:43 +08:00
Cyrus Leung	2a0596bc48	[VLM] Reorganize profiling/processing-related code (#11812 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-08 18:59:58 +08:00
Cyrus Leung	8f37be38eb	[Bugfix] Comprehensively test and fix LLaVA-NeXT feature size calculation (#11800 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-07 18:25:02 +08:00
Cyrus Leung	08fb75c72e	[Bugfix] Fix LLaVA-NeXT feature size precision error (for real) (#11772 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-07 01:10:54 +00:00
Jee Jee Li	32c9eff2ff	[Bugfix][V1] Fix molmo text-only inputs (#11676 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-01-06 15:22:25 +00:00
Cyrus Leung	ba214dffbe	[Bugfix] Fix precision error in LLaVA-NeXT (#11735 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-04 23:45:57 +08:00
Cyrus Leung	eed11ebee9	[VLM] Merged multi-modal processors for LLaVA-NeXT-Video and LLaVA-OneVision (#11717 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-04 11:40:53 +00:00
Aurick Qiao	e1a5c2f0a1	[Model] Whisper model implementation (#11280 ) Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>	2025-01-03 16:39:19 +08:00
Cyrus Leung	8c38ee7007	[VLM] Merged multi-modal processor for LLaVA-NeXT (#11682 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-02 16:39:27 +00:00
Cyrus Leung	a115ac46b5	[VLM] Move supported limits and max tokens to merged multi-modal processor (#11669 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com>	2025-01-01 15:44:42 +00:00
Roger Wang	e7c7c5e822	[V1][VLM] V1 support for selected single-image models. (#11632 ) Signed-off-by: Roger Wang <ywang@roblox.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: Isotr0py <2037008807@qq.com>	2024-12-31 21:17:22 +00:00
youkaichao	328841d002	[bugfix] interleaving sliding window for cohere2 model (#11583 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-12-28 16:55:42 +00:00
Isotr0py	d34be24bb1	[Model] Support InternLM2 Reward models (#11571 ) Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-12-28 06:14:10 +00:00
Cyrus Leung	101418096f	[VLM] Support caching in merged multi-modal processor (#11396 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-27 17:22:48 +00:00
Robert Shaw	46d4359450	[CI] Fix broken CI (#11543 )	2024-12-26 18:49:16 -08:00
Cyrus Leung	51a624bf02	[Misc] Move some multimodal utils to modality-specific modules (#11494 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-26 04:23:20 +00:00
Cyrus Leung	3f3e92e1f2	[Model] Automatic conversion of classification and reward models (#11469 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-24 18:22:22 +00:00
Isotr0py	e24113a8fe	[Model] Refactor Qwen2-VL to use merged multimodal processor (#11258 ) Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-19 16:28:00 +00:00
Yehoshua Cohen	6c7f881541	[Model] Add JambaForSequenceClassification model (#10860 ) Signed-off-by: Yehoshua Cohen <yehoshuaco@ai21.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: Yehoshua Cohen <yehoshuaco@ai21.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-19 22:48:06 +08:00
Cyrus Leung	6142ef0ada	[VLM] Merged multimodal processor for Qwen2-Audio (#11303 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-19 06:14:17 +00:00
Isotr0py	996aa70f00	[Bugfix] Fix broken phi3-v mm_processor_kwargs tests (#11263 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-12-18 10:16:40 -08:00
Wallas Henrique	8b79f9e107	[Bugfix] Fix guided decoding with tokenizer mode mistral (#11046 )	2024-12-17 22:34:08 -08:00
Isotr0py	d927dbcd88	[Model] Refactor Ultravox to use merged input processor (#11198 ) Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-12-16 10:09:53 +00:00
Jani Monoses	bddbbcb132	[Model] Support Cohere2ForCausalLM (Cohere R7B) (#11203 )	2024-12-16 09:56:19 +00:00
Cyrus Leung	93abf23a64	[VLM] Fully dynamic prompt replacement in merged input processor (#11199 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-14 17:52:18 +00:00
Cyrus Leung	eeec9e3390	[Frontend] Separate pooling APIs in offline inference (#11129 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-13 10:40:07 +00:00
youkaichao	be39e3cd18	[core] clean up cudagraph batchsize padding logic (#10996 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-12-13 06:57:50 +00:00
Pooya Davoodi	1efce68605	[Bugfix] Use runner_type instead of task in GritLM (#11144 ) Signed-off-by: Pooya Davoodi <pooya.davoodi@parasail.io>	2024-12-13 04:09:53 +00:00
Pooya Davoodi	1da8f0e1dd	[Model] Add support for embedding model GritLM (#10816 ) Signed-off-by: Pooya Davoodi <pooya.davoodi@parasail.io>	2024-12-12 06:39:16 +00:00

474 Commits