xinyun/vllm - vllm - 丝路新云 code repository

Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2025-12-24 05:55:02 +08:00)

Author	SHA1	Message	Date
Cyrus Leung	ad430a67ca	[Metrics] Log multi-modal cache stats and fix reset (#26285 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-10 01:45:55 -07:00
Cyrus Leung	4bdf7ac593	[Bugfix] Fix SHM cache initialization (#26427 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-09 02:48:04 -07:00
Cyrus Leung	391612e78b	[Frontend] Consolidate tokenizer init code (#26276 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-06 09:34:52 +00:00
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
Yang Liu	812b7f54a8	[Renderer] Move Processor out of AsyncLLM (#24138 ) Signed-off-by: Yang <lymailforjob@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-03 11:29:45 +00:00
Cyrus Leung	0ad9951c41	[Input] Remove unused `prompt` field (#26097 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-03 00:23:21 -07:00
Nick Hill	169313b9f8	[Misc] Make handling of SamplingParams clearer in n>1 case (#26032 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-10-01 19:31:39 -07:00
Kenichi Maehashi	3b7c20a6b5	[Bugfix] Apply same sampling parameters for both `n=1` and `n>1` (#26005 ) Signed-off-by: Kenichi Maehashi <maehashi@preferred.jp>	2025-10-01 14:37:35 +00:00
Woosuk Kwon	c99db8c8dd	[V0 Deprecation] Remove V0 core (#25321 ) Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai> Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-20 19:58:26 -07:00
Aaron Pham	29283e8976	[Chore] Cleanup guided namespace, move to structured outputs config (#22772 ) Signed-off-by: Aaron Pham <contact@aarnphm.xyz> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-09-18 09:20:27 +00:00
Zhuohan Li	6c47f6bfa4	[Core] Remove tokenizer group in vLLM (#24078 ) Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>	2025-09-17 08:42:59 +00:00
Michael Goin	9d2a44606d	[UX] Remove AsyncLLM torch profiler disabled log (#24609 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-09-12 10:08:44 -07:00
RichardoMu	40b6c9122b	[V1] feat:add engine v1 tracing (#20372 ) Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com> Signed-off-by: Ye Zhang <zhysishu@gmail.com> Signed-off-by: RichardoMu <44485717+RichardoMrMu@users.noreply.github.com> Signed-off-by: simon-mo <simon.mo@hey.com> Signed-off-by: Aaron Pham <contact@aarnphm.xyz> Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: Mu Huai <tianbowen.tbw@antgroup.com> Co-authored-by: Ye Zhang <zhysishu@gmail.com> Co-authored-by: Benjamin Bartels <benjamin@bartels.dev> Co-authored-by: simon-mo <simon.mo@hey.com> Co-authored-by: 瑜琮 <ly186375@antfin.com> Co-authored-by: Aaron Pham <contact@aarnphm.xyz> Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-09-11 17:10:39 -07:00
Chauncey	e680723eba	[Bugfix] Disable the statslogger if the api_server_count is greater than 1 (#22227 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-09-08 15:28:03 -07:00
Seiji Eicher	60b755cbcb	[Misc] Have AsyncLLM `custom_stat_loggers` extend default logger list (#20952 ) Signed-off-by: Seiji Eicher <seiji@anyscale.com> Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-09-04 14:25:30 -07:00
Raghavan	05d839c19e	Fix(async): Add support for truncate_prompt_tokens in AsyncLLM (#23800 )	2025-08-28 22:55:06 -07:00
Yong Hoon Shin	cb293f6a79	[V1] Enable prefill optimization for Gemma3n (#22628 ) Signed-off-by: Yong Hoon Shin <yhshin@meta.com>	2025-08-28 14:54:30 -07:00
Cyrus Leung	69244e67e6	[Core] Use key-only cache for `BaseMultiModalProcessor` (#23018 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-27 14:19:13 +08:00
Chenheli Hua	e58c5a9768	[Core] Add torch profiler CPU traces for AsyncLLM. (#21794 ) Signed-off-by: Chenheli Hua <huachenheli@outlook.com>	2025-08-20 02:32:47 +00:00
Nick Hill	ad0297d113	[Misc] Support passing multiple request ids at once to `AsyncLLM.abort()` (#22944 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-08-15 17:00:36 -07:00
iAmir97	7655dc3e45	[Bugfix] Add reset prefix cache for online serving (#22726 ) Signed-off-by: iAmir97 <Amir.balwel@embeddedllm.com> Signed-off-by: iAmir97 <71513472+iAmir97@users.noreply.github.com> Co-authored-by: iAmir97 <Amir.balwel@embeddedllm.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-08-14 04:04:18 -07:00
Nick Hill	ccdae737a0	[BugFix] Don't cancel asyncio tasks directly from destructors (#22476 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-08-08 01:13:18 -07:00
Cyrus Leung	1712543df6	[CI/Build] Fix multimodal tests (#22491 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-08 00:31:19 -07:00
Nick Hill	8d524ce79f	[BugFix] Improve internal DP load balancing (#21617 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-08-01 19:45:27 -07:00
Harry Mellor	2d7b09b998	Deprecate `--disable-log-requests` and replace with `--enable-log-requests` (#21739 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-01 17:16:37 +01:00
Cyrus Leung	46d81d6951	[V1] Get supported tasks from model runner instead of model config (#21585 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-25 05:36:45 -07:00
Robert Shaw	d5b981f8b1	[DP] Internal Load Balancing Per Node [`one-pod-per-node`] (#21238 ) Signed-off-by: Robert Shaw <robshaw@redhat.com> Signed-off-by: Nick Hill <nhill@redhat.com> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> Co-authored-by: Robert Shaw <robshaw@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-07-23 20:57:32 -07:00
Michael Goin	82ec66f514	[V0 Deprecation] Remove Prompt Adapters (#20588 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-07-23 16:36:48 -07:00
Christian Pinto	8560a5b258	[Core][Model] PrithviMAE Enablement on vLLM v1 engine (#20577 ) Signed-off-by: Christian Pinto <christian.pinto@ibm.com>	2025-07-23 11:00:23 -07:00
Wang Yijun	44554a0068	Add tokenization_kwargs to encode for embedding model truncation (#21033 )	2025-07-22 08:24:00 -07:00
Robert Shaw	29d1ffc5b4	[DP] Fix Prometheus Logging (#21257 ) Signed-off-by: Robert Shaw <robshaw@redhat.com> Co-authored-by: Robert Shaw <robshaw@redhat.com>	2025-07-21 09:11:35 -07:00
Rui Qiao	217937221b	Elastic Expert Parallel Initial Support (#20775 ) Signed-off-by: Rui Qiao <ruisearch42@gmail.com>	2025-07-18 17:46:09 -07:00
kourosh hakhamaneshi	e2148dc5ea	[Bugfix] Add check_health to v1 async client. (#19821 ) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2025-06-18 21:47:01 -07:00
Maximilien de Bayser	799397ee4f	Support embedding models in V1 (#16188 ) Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Signed-off-by: Max de Bayser <maxdebayser@gmail.com> Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-06-18 21:36:33 -07:00
Adolfo Victoria	ca27f0f9c1	[Bugfix][Core] Update cancellation logic in `generate()` to handle Generator exits (#19225 ) Co-authored-by: Adolfo Victoria <adovi@meta.com>	2025-06-06 20:17:54 +00:00
CYJiang	23027e2daf	[Misc] refactor: simplify EngineCoreClient.make_async_mp_client in AsyncLLM (#18817 ) Signed-off-by: googs1025 <googs1025@gmail.com>	2025-06-04 15:37:25 -07:00
jmswen	c8dcc15921	Allow AsyncLLMEngine.generate to target a specific DP rank (#19102 ) Signed-off-by: Jon Swenson <jmswen@gmail.com>	2025-06-04 08:26:47 -07:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100 ) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
Rui Qiao	bdce64f236	[V1] Support DP with Ray (#18779 )	2025-06-02 21:15:13 -07:00
Nick Hill	2dbe8c0774	[Perf] API-server scaleout with many-to-many server-engine comms (#17546 )	2025-05-30 08:17:00 -07:00
Seiji Eicher	7891fdf0c6	[V1] Fix _pickle.PicklingError: Can't pickle <class 'transformers_modules.deepseek-ai.DeepSeek-V2-Lite... (#18640 ) Signed-off-by: Seiji Eicher <seiji@anyscale.com>	2025-05-24 20:07:20 -07:00
Cyrus Leung	61e0a506a3	[Bugfix] Avoid repeatedly creating dummy data during engine startup (#17935 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-12 22:40:19 -07:00
Nick Hill	3d13ca0e24	[BugFix] Fix `--disable-log-stats` in V1 server mode (#17600 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-05-08 04:08:15 +00:00
rongfu.leng	d803786731	[V1][Bugfix]: vllm v1 verison metric num_gpu_blocks is None (#15755 ) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-04-30 18:20:39 +08:00
Gabriel Marinho	1c2bc7ead0	Truncation control for embedding models (#14776 ) Signed-off-by: Gabriel Marinho <gmarinho@ibm.com> Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Co-authored-by: Max de Bayser <mbayser@br.ibm.com>	2025-04-30 09:24:57 +08:00
Nick Hill	df6f3ce883	[Core] Remove prompt string from engine core data structures (#17214 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-04-25 23:41:05 -07:00
Zijing Liu	53e8cf53a4	[V1][Metrics] Allow V1 AsyncLLM to use custom logger (#14661 ) Signed-off-by: Zijing Liu <liuzijing2014@gmail.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Mark McLoughlin <markmc@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-04-25 22:05:40 -07:00
Daniel Li	48cb2109b6	[V1] Move usage stats to worker and start logging TPU hardware (#16211 )	2025-04-25 14:06:01 -06:00
Yinghai Lu	fe92176321	Add collective_rpc to llm engine (#16999 ) Signed-off-by: Yinghai Lu <yinghai@thinkingmachines.ai>	2025-04-24 20:16:52 +00:00
Harry Mellor	0a05ed57e6	Simplify `TokenizerGroup` (#16790 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-04-24 04:43:56 -07:00

103 Commits