xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-05 17:57:10 +08:00

Author	SHA1	Message	Date
tlipoca9	8a6e108e76	fix: kimi_k2 return empty tool call list (#22149 ) Signed-off-by: tlipoca9 <tlipoca9@gmail.com>	2025-08-04 19:15:31 -07:00
Woosuk Kwon	9af654cc38	[Responses API] Ignore `store=True` and process the request by default (#22185 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-08-04 05:12:48 -07:00
Woosuk Kwon	6d98843b31	[Responses API] Disable response store by default (#22137 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-08-03 04:04:21 -07:00
Cyrus Leung	f5d0f4784f	[Frontend] Improve error message for too many mm items (#22114 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-02 02:20:38 -07:00
Nick Hill	8d524ce79f	[BugFix] Improve internal DP load balancing (#21617 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-08-01 19:45:27 -07:00
Harry Mellor	2d7b09b998	Deprecate `--disable-log-requests` and replace with `--enable-log-requests` (#21739 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-01 17:16:37 +01:00
Nick Hill	3146519add	[BugFix] Don't change title of top-level process (#22032 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-08-01 07:37:55 -07:00
wuhang	e6680f9e25	[Bugfix] Add log prefix in non-dp mode engine core (#21889 ) Signed-off-by: wuhang <wuhang6@huawei.com>	2025-08-01 09:04:16 +00:00
Sungyoon Jeong	98df153abf	[Frontend] Align tool_choice="required" behavior with OpenAI when tools is empty (#21052 ) Signed-off-by: Sungyoon Jeong <sungyoon.jeong@furiosa.ai>	2025-08-01 07:54:17 +00:00
Cyrus Leung	b4e081cb15	[Bugfix] Disable multi-modal preprocessor cache for DP (#21896 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-01 08:03:56 +01:00
Song	9484641616	[Model] Add step3 vl (#21998 ) Signed-off-by: oliveryuan <yuansong@step.ai> Co-authored-by: oliveryuan <yuansong@step.ai>	2025-07-31 23:19:06 +08:00
Cyrus Leung	9532a6d563	[Deprecation] Remove deprecated args and methods (#21907 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-30 23:46:38 -07:00
Sanchit Gandhi	ec02e536df	[Bugfix] Relax lang pin for voxtral (#21833 ) Signed-off-by: Sanchit Gandhi <sgandhi3141@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-07-30 20:38:52 -07:00
Yan Pashkovsky	bf668b5bf5	[Feature] Support multiple api keys in server (#18548 ) Signed-off-by: Yan Pashkovsky <yanp.bugz@gmail.com>	2025-07-30 07:03:23 -07:00
rongfu.leng	da3e0bd6e5	[Bugfix] we should use metavar is not choices (#21902 ) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-07-30 06:51:58 -07:00
wang.yuqi	65f311ce59	[Frontend] Add LLM.reward specific to reward models (#21720 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-07-29 20:56:03 -07:00
Cyrus Leung	44bc46da60	[Bugfix] Actually disable processing cache when API server is scaled out (#21839 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-29 20:36:04 -07:00
Reza Barazesh	37efc63b64	[V0 deprecation] Guided decoding (#21347 ) Signed-off-by: Reza Barazesh <rezabarazesh@meta.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-07-29 03:15:30 -07:00
Nick Hill	7234fe2685	[Misc] Rework process titles (#21780 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-07-29 05:14:47 +00:00
Keyang Ru	9ace2eaf35	[Bugfix] Improve JSON extraction in LlamaToolParser (#19024 ) Signed-off-by: keru <keyang.ru@oracle.com> Co-authored-by: keru <keyang.ru@oracle.com>	2025-07-28 12:36:58 +00:00
rongfu.leng	2cc571199b	[feature] add log non default args in LLM (#21680 ) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-07-28 02:21:22 -07:00
Hongsheng Liu	7656cf4cf3	[Bugfix] [issue-21565] Fix the incompatibility issue with stream and named function calling when Thinking is disabled (#21573 ) Signed-off-by: wangzi <3220100013@zju.edu.cn> Co-authored-by: wangzi <3220100013@zju.edu.cn>	2025-07-27 22:43:50 -07:00
Yuxuan Zhang	93269bb43e	Fix GLM tool parser (#21668 ) Co-authored-by: Chenhui Zhang <zhang.chenhui@outlook.com>	2025-07-28 10:46:38 +08:00
Cyrus Leung	86ae693f20	[Deprecation][2/N] Replace `--task` with `--runner` and `--convert` (#21470 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-07-27 19:42:40 -07:00
Alexandre JUAN	2f6e6b33fb	[Bugfix] Fix isinstance check for tensor types in _load_prompt_embeds to use dtype comparison (#21612 ) Signed-off-by: Alexandre Juan <a.juan@netheos.net>	2025-07-25 20:11:10 -07:00
mgazz	e189b50f53	Add support for Prithvi in Online serving mode (#21518 ) Signed-off-by: Michele Gazzetti <michele.gazzetti1@ibm.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-07-25 07:01:27 -07:00
kourosh hakhamaneshi	9fe98d4250	[Frontend] Add request_id to the Request object so they can be controlled better via external load balancers (#21009 ) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2025-07-25 06:49:11 -07:00
Cyrus Leung	46d81d6951	[V1] Get supported tasks from model runner instead of model config (#21585 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-25 05:36:45 -07:00
Nick Hill	9c8b2c2a8a	[DP] Support api-server-count > 0 in hybrid DP LB mode (#21510 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-07-24 20:18:16 -07:00
Cyrus Leung	34ddcf9ff4	[Frontend] `run-batch` supports V1 (#21541 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-24 20:05:55 -07:00
Chauncey	6da0078523	[Feat] Allow custom naming of vLLM processes (#21445 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-07-24 03:15:23 -07:00
Shintarou Okada	6eca337ce0	Replace `--expand-tools-even-if-tool-choice-none` with `--exclude-tools-when-tool-choice-none` for v0.10.0 (#20544 ) Signed-off-by: okada <kokuzen@gmail.com> Signed-off-by: okada shintarou <okada@preferred.jp> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-07-24 02:56:36 -07:00
Yuxuan Zhang	85bda9e7d0	remove GLM-4.5 quantization wrong Code (#21435 )	2025-07-24 01:52:43 -07:00
Julien Denize	6d8d0a24c0	Add think chunk (#21333 ) Signed-off-by: Julien Denize <julien.denize@mistral.ai>	2025-07-23 21:51:32 -07:00
Robert Shaw	d5b981f8b1	[DP] Internal Load Balancing Per Node [`one-pod-per-node`] (#21238 ) Signed-off-by: Robert Shaw <robshaw@redhat.com> Signed-off-by: Nick Hill <nhill@redhat.com> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> Co-authored-by: Robert Shaw <robshaw@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-07-23 20:57:32 -07:00
deven-labovitch	63d92abb7c	[Frontend] Set MAX_AUDIO_CLIP_FILESIZE_MB via env var instead of hardcoding (#21374 ) Signed-off-by: Deven Labovitch <deven@videa.ai>	2025-07-23 20:22:19 -07:00
Michael Goin	82ec66f514	[V0 Deprecation] Remove Prompt Adapters (#20588 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-07-23 16:36:48 -07:00
Guillaume Calmettes	7aaa2bd5a8	[Bugfix] ensure tool_choice is popped when `tool_choice:null` is passed in json payload (#19679 ) Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>	2025-07-23 00:30:05 -07:00
Yiheng Xu	4594fc3b28	[Model] Add Qwen3CoderToolParser (#21396 ) Signed-off-by: simon-mo <xmo@berkeley.edu> Co-authored-by: simon-mo <xmo@berkeley.edu>	2025-07-22 15:05:57 -07:00
Wang Yijun	44554a0068	Add tokenization_kwargs to encode for embedding model truncation (#21033 )	2025-07-22 08:24:00 -07:00
Cyrus Leung	042af0c8d3	[Model][1/N] Support multiple poolers at model level (#21227 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-21 02:22:21 -07:00
Yuxuan Zhang	10eb24cc91	GLM-4 Update (#20736 ) Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Lu Fang <fanglu@fb.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Lu Fang <fanglu@fb.com>	2025-07-19 22:40:31 +00:00
22quinn	b3d82108e7	[Bugfix][Frontend] Fix openai CLI arg `middleware` (#21220 ) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-07-19 02:40:38 -07:00
Rui Qiao	217937221b	Elastic Expert Parallel Initial Support (#20775 ) Signed-off-by: Rui Qiao <ruisearch42@gmail.com>	2025-07-18 17:46:09 -07:00
Cyrus Leung	45badd05d0	[Core] Set pooling params based on task and model (#21128 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-18 05:41:17 -07:00
Cyrus Leung	90bd2ab6e3	[Model] Update pooling model interface (#21058 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-17 16:05:40 +00:00
wangxiyuan	89e3c4e9b4	[Misc] Avoid unnecessary import (#21106 ) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-07-17 12:57:41 +00:00
Asher	5a7fb3ab9e	[Model] Add ToolParser and MoE Config for Hunyuan A13B (#20820 ) Signed-off-by: Asher Zhang <asherszhang@tencent.com>	2025-07-17 09:10:09 +00:00
Chauncey	fdc5b43d20	[Bugfix]: Fix final_res_batch list index out of range error (#21055 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-07-17 00:29:09 -07:00
Mac Misiura	18bdcf4113	feat - add a new endpoint `get_tokenizer_info` to provide tokenizer/chat-template information (#20575 ) Signed-off-by: m-misiura <mmisiura@redhat.com>	2025-07-16 21:52:14 +08:00

884 Commits