Chauncey
cdb57015a7
[Misc] Replace print with logger ( #15923 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-04-02 01:37:38 -07:00
yihong
93491aefc7
[BugFix] make sure socket close ( #15875 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-04-01 13:10:24 -07:00
Jennifer Zhao
38327cf454
[Model] Aya Vision ( #15441 )
...
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-04-01 16:30:43 +00:00
Michael Goin
51d7c6a2b2
[Model] Support Mistral3 in the HF Transformers format ( #15505 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-04-01 06:10:05 -07:00
Wei Zeng
30d6a015e0
[Feature] specify model in config.yaml ( #15798 )
...
Signed-off-by: weizeng <weizeng@roblox.com>
2025-04-01 01:20:06 -07:00
Kinfey
a164aea35d
[Frontend] Add Phi-4-mini function calling support ( #14886 )
...
Signed-off-by: Kinfey <kinfeylo@microsoft.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-03-31 22:50:05 -07:00
wwl2755
94744ba41a
[V1] [Feature] Collective RPC ( #15444 )
...
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
2025-03-29 03:39:14 -07:00
Jinzhen Lin
5b800f0932
[Bugfix] set VLLM_WORKER_MULTIPROC_METHOD=spawn for vllm.entrypoints.openai.api_server ( #15700 )
...
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
2025-03-28 21:12:26 -07:00
Varun Sundar Rabindranath
1286211f57
[Bugfix] LoRA V1: add and fix entrypoints tests ( #15715 )
...
Signed-off-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
2025-03-28 21:10:41 -07:00
pengyuange
de1cb38769
[Model] Support Skywork-R1V ( #15397 )
...
Signed-off-by: jiacai.liu <932997367@qq.com>
Co-authored-by: jiacai.liu <932997367@qq.com>
2025-03-28 20:39:21 -07:00
daniel-salib
f3f8d8fff4
implement prometheus fast-api-instrumentor for http service metrics ( #15657 )
2025-03-29 00:12:02 +00:00
Reid
26df46ee59
[Misc] cli auto show default value ( #15582 )
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
2025-03-28 22:23:00 +00:00
Reid
fd5fd26902
[Frontend] update priority for --api-key and VLLM_API_KEY ( #15588 )
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-03-28 19:40:12 +08:00
Ce Gao
32b14baf8a
[Refactor][Frontend] Keep all logic about reasoning into one class ( #14428 )
...
Signed-off-by: Ce Gao <cegao@tensorchord.ai>
2025-03-28 00:23:30 -07:00
Jason (Siyu) Zhu
cec8c7d7f8
Refactor error handling for multiple exceptions in preprocessing ( #15650 )
...
Signed-off-by: JasonZhu1313 <jasonchu13@outlook.com>
2025-03-28 03:27:20 +00:00
Yuan Tang
66aa4c0bf4
[Feature] Add middleware to log API Server responses ( #15593 )
...
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-03-27 17:49:38 +00:00
Alex Brooks
1711b929b6
[Model] Add Reasoning Parser for Granite Models ( #14202 )
...
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
Co-authored-by: Joe Runde <joe@joerun.de>
2025-03-26 14:28:07 +00:00
wwl2755
99f536f830
[Misc] Enhance warning information to user-defined chat template ( #15408 )
...
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
2025-03-26 02:21:15 -07:00
daniel-salib
5aefd6ac31
Fix raw_request extraction in load_aware_call decorator ( #15382 )
...
Signed-off-by: Daniel Salib <danielsalib@meta.com>
2025-03-25 22:29:54 -07:00
Maximilien de Bayser
e977c11111
Add workaround for shared field_names in pydantic model class ( #13925 )
...
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
2025-03-25 20:31:08 +00:00
Chauncey
10b34e36b9
[Bugfix] Fixed the issue of not being able to input video and image simultaneously ( #15387 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-03-25 03:48:08 +00:00
Cyrus Leung
cbcdf2c609
[Bugfix] Fix chat template loading ( #15143 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: chaunceyjiang <chaunceyjiang@gmail.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-03-24 13:50:09 +00:00
Robin
d6cd59f122
[Frontend] Support tool calling and reasoning parser ( #14511 )
...
Signed-off-by: WangErXiao <863579016@qq.com>
2025-03-23 14:00:07 -07:00
Cyrus Leung
baec0d4de9
Revert "[Feature] specify model in config.yaml ( #14855 )" ( #15293 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-21 08:30:23 -07:00
Wei Zeng
0fa3970deb
[Feature] specify model in config.yaml ( #14855 )
...
Signed-off-by: weizeng <weizeng@roblox.com>
2025-03-21 00:26:03 -07:00
Chauncey
ae65f3e237
[Misc] fixed disable these http request logs ( #14754 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-03-19 21:53:40 -07:00
maobaolong
26dd972adb
[FEAT]Support reset prefix cache by specified device ( #15003 )
2025-03-19 10:54:41 -07:00
Simon Mo
3b457143d2
[Bugfix] Register serializers for V0 MQ Engine ( #15009 )
...
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-03-18 09:14:47 -04:00
Sebastian Schoennenbeck
dd732028f5
[Bugfix][Frontend] Fix validation of logprobs in ChatCompletionRequest ( #14352 )
...
Signed-off-by: Sebastian Schönnenbeck <sebastian.schoennenbeck@comma-soft.com>
2025-03-18 05:50:05 -07:00
Jun Duan
74bc397b0a
[Core] Expose API endpoint /is_sleeping ( #14312 )
...
Signed-off-by: Jun Duan <jun.duan.phd@outlook.com>
2025-03-15 06:28:14 -07:00
Robert Shaw
d4d93db2c5
[V1] V1 Enablement Oracle ( #13726 )
...
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2025-03-14 22:02:20 -07:00
daniel-salib
73deea2fdb
[Frontend] track server_load ( #13950 )
2025-03-14 09:53:17 -07:00
Russell Bryant
0b0d6421b2
[Frontend] Fix log message to use http vs https ( #14774 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-03-14 09:21:09 -07:00
Guillaume Calmettes
fd8e055ffb
[BugFix]: properly catch templating error when preprocess input ( #13976 )
...
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
2025-03-14 05:58:34 -07:00
Cyrus Leung
ab93f1360f
[VLM] Various cleanup and fixes ( #14806 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-14 05:58:19 -07:00
Cyrus Leung
f53a0586b9
[Bugfix] Fix prompt format of GLM4V ( #14539 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-13 11:37:17 +00:00
Woosuk Kwon
c0c25e25fa
[Model] Add support for Gemma 3 ( #14660 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-12 08:36:33 -07:00
Randy Chen
36e0c8f7da
[Feature] Add vllm bench CLI ( #13993 )
...
Signed-off-by: Randy Chen <acad.randyjhc@gmail.com>
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
2025-03-12 00:31:48 +00:00
Woosuk Kwon
0967110e42
[Minor] Update the tqdm bar for parallel sampling ( #14571 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-03-10 14:23:48 -07:00
Harry Mellor
3b352a2f92
Correct capitalisation: VLLM -> vLLM ( #14562 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-10 16:36:21 +00:00
Chauncey
b0746fae3d
[Frontend] support image embeds ( #13955 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-03-10 12:36:03 +00:00
Yuchen Yan
73ae0b44e9
[Bugfix] Fix tqdm progress bar when SamplingParams.n > 1 ( #12428 )
...
Signed-off-by: Yuchen Yan <740987012@qq.com>
2025-03-08 20:14:53 -08:00
Harry Mellor
47512b3200
Default to generation_config from model ( #12622 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-08 14:46:15 +08:00
மனோஜ்குமார் பழனிச்சாமி
cc10281498
[Misc] Set default value of seed to None ( #14274 )
...
Signed-off-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
2025-03-07 10:40:01 +00:00
Harry Mellor
bf0560bda9
Reinstate best_of for V0 ( #14356 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-06 08:34:22 -08:00
Nicolò Lucchesi
fa82b93853
[Frontend][Docs] Transcription API streaming ( #13301 )
...
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-03-06 10:39:35 +00:00
Vincent
a4f1ee35d6
Deprecate best_of Sampling Parameter in anticipation for vLLM V1 ( #13997 )
...
Signed-off-by: vincent-4 <vincentzhongy+githubvincent4@gmail.com>
Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-05 20:22:43 +00:00
Benjamin Chislett
32985bed7c
[Frontend] Allow return_tokens_as_token_ids to be passed as a request param ( #14066 )
...
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
2025-03-05 06:30:40 +00:00
Congcong Chen
0a995d5434
[Model] New model support for Phi-4-multimodal-instruct ( #14119 )
2025-03-04 20:57:01 -08:00
Harry Mellor
e5b2f1601a
[Frontend] Do prompt_logprobs clamping for chat as well as completions ( #14225 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-04 20:13:06 +00:00