xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-05 03:57:09 +08:00

Author	SHA1	Message	Date
youkaichao	6e650f56a1	[torch.compile] decouple compile sizes and cudagraph sizes (#12243) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-24 02:01:30 +08:00
Cody Yu	7206ce4ce1	[Core] Support `reset_prefix_cache` (#12284)	2025-01-22 18:52:27 +00:00
Konrad Zawora	96f6a7596f	[Bugfix] Fix HPU multiprocessing executor (#12167) Signed-off-by: Konrad Zawora <kzawora@habana.ai>	2025-01-23 02:07:07 +08:00
youkaichao	68ad4e3a8d	[Core] Support fully transparent sleep mode (#11743) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-22 14:39:32 +08:00
Aleksandr Malyshev	69196a9bc7	[BUGFIX] When skip_tokenize_init and multistep are set, execution crashes (#12277) Signed-off-by: maleksan85 <maleksan@amd.com> Co-authored-by: maleksan85 <maleksan@amd.com>	2025-01-21 23:30:46 +00:00
Adrian Cole	347eeebe3b	[Misc] Remove experimental dep from tracing.py (#12007) Signed-off-by: Adrian Cole <adrian.cole@elastic.co>	2025-01-21 11:51:55 -08:00
Jannis Schönleber	9705b90bcf	[Bugfix] fix race condition that leads to wrong order of token returned (#10802) Signed-off-by: Jannis Schönleber <joennlae@gmail.com>	2025-01-21 09:47:04 -08:00
Cyrus Leung	59a0192fb9	[Core] Interface for accessing model from `VllmRunner` (#10353) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-20 15:00:59 +08:00
Yuan Tang	d2643128f7	[DOC] Add missing docstring in LLMEngine.add_request() (#12195) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-01-20 14:59:00 +08:00
Yuan Tang	c5c06209ec	[DOC] Fix typo in docstring and assert message (#12194) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-01-20 14:58:29 +08:00
youkaichao	87a0c076af	[core] allow callable in collective_rpc (#12151) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-17 20:47:01 +08:00
Jee Jee Li	07934cc237	[Misc][LoRA] Improve the readability of LoRA error messages (#12102) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-01-17 19:32:28 +08:00
youkaichao	bf53e0c70b	Support torchrun and SPMD-style offline inference (#12071) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-16 19:58:53 +08:00
maang-h	57e729e874	[Doc]: Update `OpenAI-Compatible Server` documents (#12082)	2025-01-15 16:07:45 +00:00
youkaichao	ad34c0df0f	[core] platform agnostic executor via collective_rpc (#11256) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-15 13:45:21 +08:00
maang-h	87054a57ab	[Doc]: Update the Json Example of the `Engine Arguments` document (#12045)	2025-01-14 17:03:04 +00:00
Joe Runde	ac2f3f7fee	[Bugfix] Validate lora adapters to avoid crashing server (#11727) Signed-off-by: Joe Runde <Joseph.Runde@ibm.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2025-01-10 15:56:36 +08:00
Jie Fu (傅杰)	a4e2b26856	[Bugfix] Significant performance drop on CPUs with --num-scheduler-steps > 1 (#11794)	2025-01-07 16:15:50 -08:00
Cyrus Leung	ee77fdb5de	[Doc][2/N] Reorganize Models and Usage sections (#11755) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-06 21:40:31 +08:00
youkaichao	b12e87f942	[platforms] enable platform plugins (#11602) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-12-30 20:24:45 +08:00
Rajveer Bachkaniwala	b5cbe8eeb3	[Bugfix] Last token measurement fix (#11376) Signed-off-by: rajveerb <46040700+rajveerb@users.noreply.github.com> Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>	2024-12-28 11:34:46 +08:00
Rafael Vasquez	32aa2059ad	[Docs] Convert rST to MyST (Markdown) (#11145) Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>	2024-12-23 22:35:38 +00:00
yansh97	94d545a1a1	[Doc] Fix typo in the help message of '--guided-decoding-backend' (#11440)	2024-12-23 20:20:44 +00:00
Ricky Xu	584f0ae40d	[V1] Make AsyncLLMEngine v1-v0 opaque (#11383) Signed-off-by: Ricky Xu <xuchen727@hotmail.com>	2024-12-21 15:14:08 +08:00
omer-dayan	995f56236b	[Core] Loading model from S3 using RunAI Model Streamer as optional loader (#10192) Signed-off-by: OmerD <omer@run.ai>	2024-12-20 16:46:24 +00:00
Yanyi Liu	5aef49806d	[Feature] Add load generation config from model (#11164) Signed-off-by: liuyanyi <wolfsonliu@163.com> Signed-off-by: Yanyi Liu <wolfsonliu@163.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2024-12-19 10:50:38 +00:00
Alexander Matveev	fdea8ec167	[V1] VLM - enable processor cache by default (#11305) Signed-off-by: Alexander Matveev <alexm@neuralmagic.com>	2024-12-18 18:54:46 -05:00
Konrad Zawora	866fa4550d	[Bugfix] Restore support for larger block sizes (#11259) Signed-off-by: Konrad Zawora <kzawora@habana.ai>	2024-12-17 16:39:07 -08:00
Cody Yu	bf8717ebae	[V1] Prefix caching for vision language models (#11187) Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>	2024-12-17 16:37:59 -08:00
Joe Runde	2d1b9baa8f	[Bugfix] Fix request cancellation without polling (#11190)	2024-12-17 12:26:32 -08:00
wangxiyuan	e88db68cf5	[Platform] platform agnostic for EngineArgs initialization (#11225) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2024-12-16 22:11:06 -08:00
youkaichao	551603feff	[core] overhaul memory profiling and fix backward compatibility (#10511) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-12-16 13:32:25 -08:00
chenqianfzh	69ba344de8	[Bugfix] Fix block size validation (#10938)	2024-12-15 16:38:40 -08:00
Brad Hilton	9c3dadd1c9	[Frontend] Add `logits_processors` as an extra completion argument (#11150) Signed-off-by: Brad Hilton <brad.hilton.nw@gmail.com>	2024-12-14 16:46:42 +00:00
Cyrus Leung	eeec9e3390	[Frontend] Separate pooling APIs in offline inference (#11129) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-13 10:40:07 +00:00
Gregory Shtrasberg	00c1bde5d8	[ROCm][AMD] Disable auto enabling chunked prefill on ROCm (#11146) Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>	2024-12-13 05:31:26 +00:00
Jeremy Arnold	9f3974a319	Fix logging of the vLLM Config (#11143)	2024-12-12 12:05:57 -08:00
Alexander Matveev	4e11683368	[V1] VLM preprocessor hashing (#11020) Signed-off-by: Roger Wang <ywang@roblox.com> Signed-off-by: Alexander Matveev <alexm@neuralmagic.com> Co-authored-by: Michael Goin <michael@neuralmagic.com> Co-authored-by: Roger Wang <ywang@roblox.com>	2024-12-12 00:55:30 +00:00
Cyrus Leung	cad5c0a6ed	[Doc] Update docs to refer to pooling models (#11093) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-11 13:36:27 +00:00
Cyrus Leung	8f10d5e393	[Misc] Split up pooling tasks (#10820) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-11 01:28:00 -08:00
Woosuk Kwon	134810b3d9	[V1][Bugfix] Always set enable_chunked_prefill = True for V1 (#11061) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-12-10 14:41:23 -08:00
Joe Runde	9b9cef3145	[Bugfix] Backport request id validation to v0 (#11036) Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>	2024-12-10 16:38:23 +00:00
youkaichao	ebf778061d	monitor metrics of tokens per step using cudagraph batchsizes (#11031) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-12-09 22:35:36 -08:00
Cyrus Leung	391d7b2763	[Bugfix] Fix usage of `deprecated` decorator (#11025) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-10 13:45:47 +08:00
youkaichao	46004e83a2	[misc] clean up and unify logging (#10999) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-12-08 17:28:27 -08:00
Roger Wang	a11f326528	[V1] Initial support of multimodal models for V1 re-arch (#10699) Signed-off-by: Roger Wang <ywang@roblox.com>	2024-12-08 12:50:51 +00:00
youkaichao	fd57d2b534	[torch.compile] allow candidate compile sizes (#10984) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-12-08 11:05:21 +00:00
Russell Bryant	69d357ba12	[Core] Cleanup startup logging a bit (#10961) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2024-12-07 02:30:23 +00:00
youkaichao	b031a455a9	[torch.compile] add logging for compilation time (#10941) Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-12-06 10:07:15 +00:00
Cyrus Leung	aa39a8e175	[Doc] Create a new "Usage" section (#10827) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-12-05 11:19:35 +08:00

514 Commits