xinyun/vllm - vllm - 丝路新云 (code repository)

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-01-19 03:04:28 +08:00

Author	SHA1	Message	Date
wang.yuqi	d9e00dbd1f	[Performance] V1 Classify Models E2E Performance Optimization (#23541 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-08-29 03:12:32 -07:00
Maximilien de Bayser	2554b27baa	[V0 Deprecation] Remove pooling model support in V0 (#23434 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: Max de Bayser <mbayser@br.ibm.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-08-29 00:04:02 -07:00
Didier Durand	d3da2eea54	[Doc]: fix typos in Python scripts (#23828 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-08-28 05:37:38 -07:00
rongfu.leng	daa1273b14	[Bugfix] when set offline model running error (#23711 ) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-08-28 07:27:45 +00:00
Jan Kessler	a11adafdca	Gracefully handle edge cases in harmony utils (#23155 ) Signed-off-by: Jan Kessler <jakessle@uni-mainz.de> Co-authored-by: Chen Zhang <zhangch99@outlook.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-08-27 20:14:00 -07:00
Chen Zhang	142ac08030	[Frontend] Optimize beam search performance by limiting concurrency (#23599 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-27 04:59:14 +00:00
Chen Zhang	3210264421	[Frontend] Add --log-error-stack to print stack trace for error response (#22960 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-27 04:58:59 +00:00
Yiheng Xu	786835807b	[Bugfix]: Qwen3 Coder Tool Parser (#23099 ) Signed-off-by: Yiheng Xu <charlesyihengxu@gmail.com> Co-authored-by: Aaron Pham <contact@aarnphm.xyz>	2025-08-26 19:58:32 -07:00
wuhang	6891205b16	[Feature][Responses API] Support MCP tool in background mode (#23494 ) Signed-off-by: wuhang <wuhang6@huawei.com>	2025-08-27 01:06:58 +00:00
Federico	585e0bde36	[Bugfix] UnboundLocalError when GptOss reasoning specified (#23054 ) Signed-off-by: Federico <65908512+coval3nte@users.noreply.github.com>	2025-08-27 00:29:52 +00:00
Hyogeun Oh (오효근)	730d0ac8b9	[Docs] Fix warnings in `mkdocs build` (#23649 ) Signed-off-by: Zerohertz <ohg3417@gmail.com> Signed-off-by: Hyogeun Oh (오효근) <ohg3417@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-26 18:19:23 +00:00
Guillaume Calmettes	ebd5a77bb5	feat: add usage to TranscriptionResponse (text and json response_format) (#23576 ) Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>	2025-08-26 05:26:26 -07:00
Matúš Námešný	384dd1b0a8	[Bugfix] Add missing enable_log_outputs parameter to init_app_state function (#23634 ) Signed-off-by: Matúš Námešný <matus.namesny@ameria.com>	2025-08-26 12:13:15 +00:00
Bin Jia	959783fb99	[fix] fix seed-oss-parser (#23560 ) Signed-off-by: jiabin.00 <jiabin.00@bytedance.com>	2025-08-25 23:16:36 -07:00
ZiTian Zhao	2da02dd0d8	[Fix] DeepSeek V3.1 tool parser error message (#23492 ) Signed-off-by: zitian.zhao <zitian.zhao@tencentmusic.com>	2025-08-25 00:56:39 -07:00
Yu Guo	49ab23b3cc	[gpt-oss] use reasoning channel for reasoning text in serving_chat (#22920 ) Signed-off-by: Yu Guo <yuguo@meta.com>	2025-08-25 06:29:34 +00:00
Jiangyun Zhu	c55c028998	[gpt-oss] Streaming Output for Python Tool (#23409 ) Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>	2025-08-24 04:42:38 +00:00
Xu Wenqing	b8f17f5d98	Support DeepSeek-V3.1 tool call (#23454 ) Signed-off-by: Xu Wenqing <xuwq1993@qq.com>	2025-08-23 05:50:16 +00:00
Didier Durand	22cf679aad	[Doc]: fix various typos in multiple files (#23179 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-08-22 10:38:46 -07:00
Guillaume Calmettes	0ba1b54ac6	[gpt-oss] add input/output usage in responses api when harmony context is leveraged (#22667 ) Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>	2025-08-22 08:32:24 +00:00
Bin Jia	5964069367	[New Model] Add Seed-Oss model (#23241 ) Signed-off-by: jiabin.00 <jiabin.00@bytedance.com> Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2025-08-22 04:58:10 +00:00
Cyrus Leung	8896eb72eb	[Deprecation] Remove `prompt_token_ids` arg fallback in `LLM.generate` and `LLM.embed` (#18800 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-22 10:56:57 +08:00
Kebe	5368f76855	[Feature][Responses API] Support logprobs(non-stream) (#23319 ) Signed-off-by: Kebe <mail@kebe7jun.com>	2025-08-21 23:09:16 +00:00
Chen Zhang	8a19303173	[BugFix][gpt-oss] Fix Chat Completion with Multiple Output Message (#23318 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-21 10:31:11 -07:00
Russell Bryant	4e51fa8cba	Do not use eval() to convert unknown types (#23266 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-08-20 13:28:30 -07:00
Chen Zhang	b95697d731	[Frontend] improve error logging of chat completion (#22957 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-20 13:03:37 -07:00
bigmoyan	582bbe6bd7	[Fix] correct tool_id for kimi-k2 when use tool_choice=required (#21259 ) Co-authored-by: wangzhengtao <wangzhengtao@msh.team>	2025-08-20 12:59:54 -07:00
Russell Bryant	f77a0802b7	Limit HTTP header count and size (#23267 ) Signed-off-by: Taneem Ibrahim <taneem.ibrahim@gmail.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Taneem Ibrahim <taneem.ibrahim@gmail.com>	2025-08-20 17:57:37 +00:00
Marko Rosenmueller	80141bbf2f	fix: use cache_salt for gpt-oss (#23186 ) Signed-off-by: Marko Rosenmueller <5467316+dr75@users.noreply.github.com>	2025-08-19 18:12:25 +00:00
22quinn	f7cf5b512e	[Frontend] Add `/collective_rpc` API endpoint (#23075 ) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-08-19 17:29:32 +00:00
Yuge Zhang	24f4d1a224	Add return_token_ids parameter to OpenAI API endpoints (#22587 ) Signed-off-by: Yuge Zhang <scottyugochang@gmail.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Simon Mo <simon.mo@hey.com>	2025-08-19 09:48:31 -07:00
Breno Baldas Skuk	ac6eb49de3	fix: OpenAI SDK compat (ResponseTextConfig) (#23126 ) Signed-off-by: breno.skuk <breno.skuk@hcompany.ai> Signed-off-by: Breno Baldas Skuk <breno.skuk@hcompany.ai> Signed-off-by: mgoin <mgoin64@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>	2025-08-18 15:22:59 -07:00
afeldman-nm	bf7f470b22	[V1] Logits processors extensibility (#19912 ) Signed-off-by: Andrew Feldman <afeldman@redhat.com> Signed-off-by: Andrew Feldman <afeld2012@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Andrew Feldman <afeld2012@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-16 12:59:17 -07:00
Woonggi Min	68373d3126	[Frontend] Added support for HermesToolParser for models without special tokens (#16890 ) Signed-off-by: minpeter <kali2005611@gmail.com>	2025-08-16 17:38:42 +00:00
Andrew Sansom	78863f8c5c	[BugFix] Add support for loading prompt embeds tensors serialized on unavailable devices and sparse tensors (#22962 ) Signed-off-by: Andrew Sansom <andrew@protopia.ai>	2025-08-16 06:25:10 +00:00
Nick Hill	f6b5040590	[Frontend] Avoid list copies in `serving_chat.py` (#22947 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-08-16 02:06:30 +00:00
Csrayz	a0632a3e03	[Frontend] Expose do_log_stats interval to env (#22905 ) Signed-off-by: Csrayz <jover@cmbchina.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-15 13:00:20 +00:00
Roger Wang	da2705198f	[Misc] clear and separate error messages for input too long and input + max-tokens too long (#22803 ) Signed-off-by: Roger Wang <hey@rogerw.me>	2025-08-13 07:22:56 -07:00
Kdump	653124bd46	[Frontend] Add chunked processing to handle long inputs in embedding models (#22280 ) Signed-off-by: x22x22 <wadeking@qq.com> Signed-off-by: Kdump <rootshellexp@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Maximilien de Bayser <maxdebayser@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-13 04:14:24 -07:00
Chen Zhang	6807af8f46	[gpt-oss] upgrade gpt-oss to v0.0.3 and add version check (#22768 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-12 21:37:26 -07:00
Chen Zhang	ad344ef552	[gpt-oss] Small bug fixes for frontend (#22512 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-11 22:04:38 -07:00
Chen Zhang	95a935fc48	[gpt-oss] Support streaming in response API (#22431 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-11 17:46:59 -07:00
wang.yuqi	84cf78acee	[Model] Pooling models default to using chunked prefill & prefix caching if supported. (#20930 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-08-11 09:41:37 -07:00
Harry Mellor	bc1d02ac85	[Docs] Add comprehensive CLI reference for all large `vllm` subcommands (#22601 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-11 00:13:33 -07:00
Maximilien de Bayser	39052dbca8	Support token_type_ids in V1 with less code changes (#21985 ) Signed-off-by: Max de Bayser <mbayser@br.ibm.com>	2025-08-10 22:54:59 -07:00
yyweiss	baece8c3d2	[Frontend] Add unix domain socket support (#18097 ) Signed-off-by: <yyweiss@gmail.com> Signed-off-by: yyw <yyweiss@gmail.com>	2025-08-08 16:23:44 -07:00
Chen Zhang	fe6d8257a1	[gpt-oss] Support tool call and implement MCP tool server (#22427 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-08-08 15:06:37 -07:00
Andrew Sansom	e2c8f1edec	[PERF] Use pybase64 to more quickly decode prompt embeddings (#22469 ) Signed-off-by: Andrew Sansom <andrew@protopia.ai>	2025-08-07 19:15:32 -07:00
Cyrus Leung	139d155781	[Frontend] Use engine argument to control MM cache size (#22441 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-07 09:47:10 -07:00
Woosuk Kwon	399d2a10e2	Fix pre-commit error in main (#22462 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-08-07 08:54:39 -07:00

848 Commits