xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-15 08:35:48 +08:00

Author	SHA1	Message	Date
Reza Barazesh	37efc63b64	[V0 deprecation] Guided decoding (#21347 ) Signed-off-by: Reza Barazesh <rezabarazesh@meta.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-07-29 03:15:30 -07:00
rongfu.leng	2cc571199b	[feature] add log non default args in LLM (#21680 ) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-07-28 02:21:22 -07:00
Cyrus Leung	86ae693f20	[Deprecation][2/N] Replace `--task` with `--runner` and `--convert` (#21470 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-07-27 19:42:40 -07:00
Cyrus Leung	46d81d6951	[V1] Get supported tasks from model runner instead of model config (#21585 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-25 05:36:45 -07:00
Michael Goin	82ec66f514	[V0 Deprecation] Remove Prompt Adapters (#20588 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-07-23 16:36:48 -07:00
Wang Yijun	44554a0068	Add tokenization_kwargs to encode for embedding model truncation (#21033 )	2025-07-22 08:24:00 -07:00
Cyrus Leung	45badd05d0	[Core] Set pooling params based on task and model (#21128 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-18 05:41:17 -07:00
Nicolò Lucchesi	020f58abcd	[Core] Support multiple tasks per model (#20771 ) Signed-off-by: NickLucche <nlucches@redhat.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-12 19:40:11 -07:00
Alex Brooks	41060c6e08	[Core] Add Support for Default Modality Specific LoRAs [generate / chat completions] (#19126 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2025-07-10 21:09:37 +01:00
shineran96	4bed167768	[Model][VLM] Support JinaVL Reranker (#20260 ) Signed-off-by: shineran96 <shinewang96@gmail.com>	2025-07-10 10:43:43 -07:00
wang.yuqi	110df74332	[Model][Last/4] Automatic conversion of CrossEncoding model (#19675 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-07-07 14:46:04 +00:00
Cyrus Leung	c18b3b8e8b	[Bugfix] Add `use_cross_encoder` flag to use correct activation in `ClassifierPooler` (#20527 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-06 14:01:48 -07:00
wang.yuqi	6f1229f91d	[Model][2/N] Automatic conversion of CrossEncoding model (#19978 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-07-03 13:59:23 +00:00
Lifans	9ec1e3065a	[Misc][Doc] Add missing comment for LLM (#20285 ) Signed-off-by: Lifan Shen <lifans@meta.com>	2025-07-01 19:04:24 -07:00
Kyle Sayers	d8cf819a9a	[Core] [Bugfix] [Multimodal] Fix multimodal profiling and generation for SFT/PTQed models (#20058 ) Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>	2025-06-30 17:26:49 +00:00
Nick Hill	8619e7158c	[BugFix] Fix multi-node offline data parallel (#19937 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-06-24 12:45:20 -07:00
Rabin Adhikari	8ca81bb069	Fix: Check the type of params to be a Sequence not list. (#19910 ) Signed-off-by: Rabin Adhikari <rabin.adk1@gmail.com>	2025-06-20 23:03:17 +00:00
Alex Brooks	ead2110297	[Core][Bugfix] Fix Online MM Beam Search (#19688 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2025-06-19 17:18:07 +00:00
NekoMimiUnagi	466166dcfd	[Frontend] Add optional token-level progress bar to `LLM.beam_search` (#19301 ) Signed-off-by: Ruosen Li <rxl190028@utdallas.edu> Signed-off-by: Aaron Pham <contact@aarnphm.xyz> Signed-off-by: Ubuntu <ubuntu@ip-172-31-71-179.ec2.internal> Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-06-19 03:21:41 -04:00
Maximilien de Bayser	799397ee4f	Support embedding models in V1 (#16188 ) Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Signed-off-by: Max de Bayser <maxdebayser@gmail.com> Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-06-18 21:36:33 -07:00
maobaolong	08500011d3	[Fix] Convert kv_transfer_config from dict to KVTransferConfig (#19262 )	2025-06-14 12:32:07 -07:00
Luka Govedič	3597b06a4f	[CUDA] Enable full cudagraph for FlashMLA (#18581 ) Signed-off-by: luka <luka@neuralmagic.com>	2025-06-13 18:12:26 +00:00
Reid	6cd4ae8acd	[Frontend] Add tqdm_leave_pbar to control progress bar visibility (#19357 ) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-06-10 04:55:09 +00:00
Kseniya Parkhamchuk	8335667c22	[Frontend] Remove unreachable code from llm.py (#19288 ) Signed-off-by: KsuParkhamchuk <k.parkhamchuk@gmail.com>	2025-06-09 10:22:10 +08:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100 ) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
Cyrus Leung	c29034037d	[Deprecation] Disallow pos-args other than `model` when initializing `LLM` (#18802 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-29 09:36:58 -07:00
Alex Brooks	321331b8ae	[Core] Add Lora Support to Beam Search (#18346 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2025-05-28 08:58:24 -07:00
Harry Mellor	4c2b38ce9e	Enable Pydantic mypy checks and convert configs to Pydantic dataclasses (#17599 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-28 12:46:04 +00:00
Mark McLoughlin	06a0338015	[V1][Metrics] Add API for accessing in-memory Prometheus metrics (#17010 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-05-27 09:37:06 +00:00
Hyogeun Oh (오효근)	a68e293cb9	[Doc] Convert Sphinx directives ( `{class}`, `{meth}`, `{attr}`, ...) to MkDocs format for better documentation linking (#18663 ) Signed-off-by: Zerohertz <ohg3417@gmail.com>	2025-05-27 01:44:20 -07:00
Cyrus Leung	273cb3b4d9	[Doc] Fix top-level API links/docs (#18621 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-23 09:46:56 -07:00
Harry Mellor	a1fe24d961	Migrate docs from Sphinx to MkDocs (#18145 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-23 02:09:53 -07:00
CYJiang	fae453f8ce	[Misc] refactor: simplify input validation and num_requests handling in _convert_v1_inputs (#18482 ) Signed-off-by: googs1025 <googs1025@gmail.com>	2025-05-23 10:15:32 +08:00
lkchen	6685890d11	[Fix] Move "model_config" as keyword args in chat_utils.py (#18098 ) Signed-off-by: Linkun <github@lkchen.net>	2025-05-13 23:27:26 -07:00
Harry Mellor	4b2ed7926a	Improve configs - the rest! (#17562 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-09 15:18:44 -07:00
Cyrus Leung	96722aa81d	[Frontend] Chat template fallbacks for multimodal models (#17805 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-07 23:05:54 -07:00
Harry Mellor	d6484ef3c3	Add full API docs and improve the UX of navigating them (#17485 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-03 19:42:43 -07:00
Cyrus Leung	cb234955df	[Misc] Clean up input processing (#17582 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-02 08:11:53 -07:00
Cyrus Leung	1903c0b8a3	[Frontend] Show progress bar for adding requests (#17525 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-01 05:15:32 -07:00
Harry Mellor	13698db634	Improve configs - `ModelConfig` (#17130 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-04-30 10:38:22 +08:00
Gabriel Marinho	1c2bc7ead0	Truncation control for embedding models (#14776 ) Signed-off-by: Gabriel Marinho <gmarinho@ibm.com> Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Co-authored-by: Max de Bayser <mbayser@br.ibm.com>	2025-04-30 09:24:57 +08:00
Cyrus Leung	88ad9ec6b2	[Frontend] Support `chat_template_kwargs` in `LLM.chat` (#17356 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-04-29 22:03:35 +08:00
Russell Bryant	f8acd01ff7	[V1] Add `structural_tag` support using xgrammar (#17085 )	2025-04-26 14:06:37 +00:00
Nick Hill	70116459c3	[BugFix][Frontend] Fix `LLM.chat()` tokenization (#16081 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-04-25 22:20:05 +00:00
Alex Brooks	7feae92c1f	[Doc] Move todo out of beam search docstring (#17183 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2025-04-25 04:44:58 -07:00
Harry Mellor	0a05ed57e6	Simplify `TokenizerGroup` (#16790 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-04-24 04:43:56 -07:00
Alex Brooks	6b40996ae8	[Core][Bugfix] Fix Offline MM Beam Search (#16390 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-04-15 10:33:02 +08:00
wang.yuqi	fbf722c6e6	[Frontend] support matryoshka representation / support embedding API dimensions (#16331 )	2025-04-11 23:23:10 -07:00
Benjamin Kitor	82eb61dd4c	[misc] use tqdm.auto where appropriate (#16290 ) Signed-off-by: Benjamin Kitor <bkitor@gigaio.com>	2025-04-09 21:54:54 -07:00
Alex Brooks	69ecaa7c79	[Misc] Add warning for multimodal data in LLM.beam_search (#16241 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2025-04-08 04:05:27 -07:00

234 Commits