xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-04-07 01:47:03 +08:00

Author	SHA1	Message	Date
Ben Browning	e1dd706cd1	[Frontend] Respect Chat Completion parallel_tool_calls param (#26233 ) Signed-off-by: Ben Browning <bbrownin@redhat.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com>	2025-11-25 09:56:15 +00:00
Nick Hill	db2906108a	[Misc] Streamline unique id generation (#29375 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-25 08:30:11 +00:00
Nick Hill	a178a0b40b	[BugFix] Fix duplicate id tool-call race condition (#29355 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-25 01:54:26 +00:00
Software Developer	4d01b64284	[Bugfix] - Add Trace Headers to Beam Search Path (#29100 ) Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>	2025-11-20 20:00:33 +00:00
rookie	56f45eddaf	[Frontend] Optimize beam search loop by sorting and then splicing (#19347 ) Signed-off-by: zhangguozhu <zhangguozhu@360.cn> Signed-off-by: mgoin <mgoin64@gmail.com> Co-authored-by: zhangguozhu <zhangguozhu@360.cn> Co-authored-by: mgoin <mgoin64@gmail.com>	2025-11-20 09:02:30 -08:00
Nicolò Lucchesi	6f1e7f7226	[DisaggEverything] Tokens in<>out `/generate` endpoint (#24261 ) Signed-off-by: NickLucche <nlucches@redhat.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-14 09:58:01 -07:00
Srreyansh Sethi	360bd8762f	[Frontend] Added chat-style multimodal support to /classify. (#27516 ) Signed-off-by: WorldExplored <srreyansh.sethi@gmail.com> Signed-off-by: Srreyansh Sethi <107075589+WorldExplored@users.noreply.github.com> Signed-off-by: vnadathur <glvikramn@gmail.com> Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Co-authored-by: vnadathur <236933696+vnadathur@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: vnadathur <glvikramn@gmail.com> Co-authored-by: wang.yuqi <noooop@126.com> Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-11-14 11:03:55 +00:00
Andrew Xia	7c38ed0f1c	[Frontend] split append tool output (#28333 ) Signed-off-by: Andrew Xia <axia@fb.com> Co-authored-by: Andrew Xia <axia@fb.com>	2025-11-13 04:03:23 +00:00
Andrew Xia	4b94ed8f92	[Frontend][2/n] remove empty content from _parse_tool_calls_from_content (#28331 ) Signed-off-by: Andrew Xia <axia@fb.com> Co-authored-by: Andrew Xia <axia@fb.com>	2025-11-10 14:07:49 -08:00
Chauncey	59a50afa08	[Frontend] OpenAI Responses API supports Tool/Function calling - non-harmony (#26874 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-11-06 10:40:03 +00:00
Chauncey	0976711f3b	[Refactor] to simplify and extract the shared logic between chat completion and responses (#27961 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-11-05 15:46:39 +08:00
Chenguang Zheng	103a468bbf	[bugfix] Missing cached item in beam search (#27874 ) Signed-off-by: fake0fan <645327136@qq.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-10-31 17:34:27 +00:00
cong-meta	a2981c4272	[EP/DP][API Server] Enable DP-aware routing in OpenAI API requests (#24945 ) Co-authored-by: Cong Chen <prowindy@gmail.com>	2025-10-30 12:10:16 -07:00
Cyrus Leung	d31f7844f8	[Misc] Move utils to avoid conflicts with stdlib, and move tests (#27169 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-19 05:20:55 -07:00
Cyrus Leung	d2740fafbf	[Chore] Separate out `vllm.utils.collections` (#26990 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-16 08:35:35 +00:00
Cyrus Leung	f6cdc9a02f	[Chore] Rename `utils` submodules (#26920 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-16 03:58:13 +00:00
Cyrus Leung	828523ad8e	[Chore] Separate out `vllm.utils.async_utils` (#26913 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-15 15:33:00 +00:00
Cyrus Leung	136a17fe6e	[Chore] Separate out `vllm.utils.func` (#26904 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-15 13:03:58 +00:00
Max Wittig	fd85c9f426	[Bugfix][FE]: Always include usage with `--enable-force-include-usage` (#20983 ) Signed-off-by: Max Wittig <max.wittig@siemens.com> Signed-off-by: Antoine Auger <antoineauger@users.noreply.github.com> Co-authored-by: Antoine Auger <antoineauger@users.noreply.github.com>	2025-10-14 09:17:39 +02:00
Harry Mellor	8fcaaf6a16	Update `Optional[x]` -> `x | None` and `Union[x, y]` to `x | y` (#26633 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-12 09:51:31 -07:00
Chauncey	d0bed837ac	[Refactor]Reduce duplicate code in serving_chat (#26627 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-10-11 12:04:49 +00:00
Cyrus Leung	ad430a67ca	[Metrics] Log multi-modal cache stats and fix reset (#26285 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-10 01:45:55 -07:00
Cyrus Leung	4bdf7ac593	[Bugfix] Fix SHM cache initialization (#26427 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-09 02:48:04 -07:00
Cyrus Leung	391612e78b	[Frontend] Consolidate tokenizer init code (#26276 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-06 09:34:52 +00:00
Harry Mellor	4e256cadc2	Remove all references to `yapf` as it's no longer used (#26251 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 09:18:11 -07:00
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
Isotr0py	a42d2df75f	[Frontend] Cache chat template kwargs resolution (#26227 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-10-04 15:32:30 +00:00
Ben Browning	ea25a76c05	[BugFix] Use async Mistral Tokenizer in Chat Completions (#26134 ) Signed-off-by: Ben Browning <bbrownin@redhat.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-10-04 09:42:08 +08:00
Cyrus Leung	d78fda7cda	[Renderer] Move Processor out of LLMEngine (#26165 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-03 15:08:22 +00:00
Yang Liu	812b7f54a8	[Renderer] Move Processor out of AsyncLLM (#24138 ) Signed-off-by: Yang <lymailforjob@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-03 11:29:45 +00:00
Hyogeun Oh (오효근)	b419937c78	[Docs] Fix warnings in mkdocs build (continued) (#25163 ) Signed-off-by: Zerohertz <ohg3417@gmail.com>	2025-09-18 08:23:26 -07:00
Harry Mellor	c1eda615ba	Fix model name included in responses (#24663 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-09-11 10:47:51 -07:00
Flora Feng	77f62613f9	Consolidate rendering parameters into RenderConfig dataclass (#24543 ) Signed-off-by: sfeng33 <4florafeng@gmail.com>	2025-09-10 08:44:47 +00:00
Flora Feng	15cb047e25	Extend renderer with embedding support and integrate completion endpoint (#24405 ) Signed-off-by: sfeng33 <4florafeng@gmail.com>	2025-09-10 01:46:46 +08:00
Chenheli Hua	01dfb5e982	[Frontend] User-provided uuids for medias in chat. (RFC #22044 ) (#23449 ) Signed-off-by: Roger Wang <hey@rogerw.io> Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Roger Wang <hey@rogerw.me> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.me> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-09-08 06:42:20 -07:00
Flora Feng	0661cb9df3	Add renderer-based prompt processing for embedding and classification endpoints (#24356 ) Signed-off-by: sfeng33 <4florafeng@gmail.com>	2025-09-07 08:26:48 +00:00
Flora Feng	712b273f65	[Refactor] Introduce basic Renderer for completion-style request (#24010 ) Signed-off-by: sfeng33 <4florafeng@gmail.com>	2025-09-04 05:21:12 +00:00
Chenheli Hua	f399182e8c	Run ruff format on a few files. (#24075 ) Signed-off-by: Chenheli Hua <huachenheli@outlook.com>	2025-09-02 17:55:32 +00:00
Woosuk Kwon	5685370271	[Chore][V0 Deprecation] Move LogProb to a separate file (#24055 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-01 12:07:53 -07:00
Christian Pinto	1cb39dbcdd	[Misc] IO Processor plugins for pooling models (#22820 ) Signed-off-by: Christian Pinto <christian.pinto@ibm.com> Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Co-authored-by: Max de Bayser <mbayser@br.ibm.com>	2025-08-31 23:07:12 -07:00
Roger Wang	749be00a98	[Core][Multimodal] Allow passing `multi_modal_uuids` as multimodal identifiers. (#23394 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-08-30 18:01:22 -07:00
Gabriel Marinho	5b8077b8ac	Fix wrong truncate_prompt_tokens type hint (#22761 ) Signed-off-by: Gabriel Marinho <gmarinho@ibm.com> Signed-off-by: Gabriel Marinho <104592062+gmarinho2@users.noreply.github.com> Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Max de Bayser <mbayser@br.ibm.com>	2025-08-30 20:39:38 +00:00
22quinn	4d7fe40fc0	[RL][BugFix] Fix missing tokenizer error for token-in-token-out (#23904 ) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-08-30 01:09:55 +08:00
Chen Zhang	3210264421	[Frontend] Add --log-error-stack to print stack trace for error response (#22960 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-27 04:58:59 +00:00
Andrew Sansom	78863f8c5c	[BugFix] Add support for loading prompt embeds tensors serialized on unavailable devices and sparse tensors (#22962 ) Signed-off-by: Andrew Sansom <andrew@protopia.ai>	2025-08-16 06:25:10 +00:00
Roger Wang	da2705198f	[Misc] clear and separate error messages for input too long and input + max-tokens too long (#22803 ) Signed-off-by: Roger Wang <hey@rogerw.me>	2025-08-13 07:22:56 -07:00
Andrew Sansom	e2c8f1edec	[PERF] Use pybase64 to more quickly decode prompt embeddings (#22469 ) Signed-off-by: Andrew Sansom <andrew@protopia.ai>	2025-08-07 19:15:32 -07:00
Moritz Sanft	370661856b	[Frontend] Update OpenAI error response to upstream format (#22099 ) Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>	2025-08-06 23:06:00 -07:00
Woosuk Kwon	ec7cb19224	[gpt-oss] Add loop for built-in tool call (#22374 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Co-authored-by: LiuXiaoxuanPKU <lilyliupku@gmail.com> Co-authored-by: simon-mo <xmo@berkeley.edu> Co-authored-by: Chen Zhang <zhangch99@outlook.com> Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com> Co-authored-by: Minseok Lee <47620120+minseokl@users.noreply.github.com> Co-authored-by: Yongye Zhu <zyy1102000@gmail.com>	2025-08-06 10:32:21 -07:00
Alexandre JUAN	2f6e6b33fb	[Bugfix] Fix isinstance check for tensor types in _load_prompt_embeds to use dtype comparison (#21612 ) Signed-off-by: Alexandre Juan <a.juan@netheos.net>	2025-07-25 20:11:10 -07:00

140 Commits