Peng-YM
48a5fff66e
[Bugfix] Missing tokens in return_token_ids when tool parsers is enabled in streaming mode ( #29074 )
Signed-off-by: Peng-YM <1048217874pengym@gmail.com>
2025-12-04 19:09:39 +00:00
Xu Wenqing
ffdd18111b
Add DeepSeek-V3.2 tool parser. ( #29848 )
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
2025-12-04 08:46:34 +00:00
daniel-salib
404fc4bfc0
[Frontend] refactor harmony utils output message parsing ( #29820 )
Signed-off-by: Daniel Salib <danielsalib@meta.com>
2025-12-04 15:36:57 +08:00
Cyrus Leung
9ae2f60374
[Misc] Various cleanups for MM input processing ( #29970 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-04 06:22:20 +00:00
Benjamin Bartels
fca3f46658
[Frontend] Fixes anthropic /v1/messages streaming not containing input_tokens on first chunk ( #29971 )
Signed-off-by: bbartels <benjamin@bartels.dev>
2025-12-04 05:50:27 +00:00
Wentao Ye
ac1886588f
[CI] Fix re import error ( #29973 )
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-12-03 15:16:54 -05:00
avigny
dd5d1ef780
[Bugfix] Mistral tool parser streaming update ( #19425 )
Signed-off-by: avigny <47987522+avigny@users.noreply.github.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Co-authored-by: Jeff Cook <jeff@jeffcook.io>
Co-authored-by: sfbemerk <benjaminmerkel@mail.de>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-12-03 17:45:31 +00:00
Yu Jiaqi
9ae3c55b10
SigLIP example add chat_template ( #29902 )
Signed-off-by: piood <2477084691@qq.com>
2025-12-03 16:12:58 +00:00
Chauncey
b78772c433
[Frontend] supports deepseekv32 chat template ( #29837 )
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-12-03 20:53:44 +08:00
Chauncey
3f42b05fbc
[Refactor] [1/N] to simplify the vLLM serving architecture ( #28040 )
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-12-03 01:26:39 -08:00
Andrew Xia
3a7751485b
[responsesAPI] support input output messages for non harmony models ( #29549 )
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
2025-12-02 23:59:23 -08:00
Russell Bryant
b08025a83b
[Docs] Discuss api key limitations in security guide ( #29922 )
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-12-02 20:57:28 -08:00
Julien Denize
1b1e35aaf9
[BUGFIX] Fix regex pattern for Mistral Tool Call ( #29918 )
Signed-off-by: juliendenize <julien.denize@mistral.ai>
2025-12-02 14:51:58 -08:00
Andrew Xia
52cb349fc0
[responsesAPI][3] ResponsesParser to set up non harmony MCP ( #29413 )
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
2025-12-02 11:24:45 -05:00
Cyrus Leung
68ffbca7e4
[Chore] Use tokenizer.encode and tokenizer.decode directly ( #29851 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-02 12:30:40 +00:00
Julien Denize
d8c6210eea
Add Mistral Large 3 and Ministral 3 ( #29757 )
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Mickael Seznec <mickael@mistral.ai>
2025-12-02 10:29:00 +00:00
Zhuohan Li
d0cd728907
[Core] Support reseting all running requests' KV while calling reset_prefix_cache ( #28827 )
Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
2025-12-02 02:25:05 +00:00
Andrew Xia
fa8804ad9c
[responsesAPI][4] fix responseOutputItem Kimi K2 thinking bug ( #29555 )
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
2025-12-02 02:11:35 +00:00
sangbumlikeagod
092bb73b8a
[Frontend] add 'verbose_json' and 'timestamp' feature on Whisper Transcription/Translation ( #24209 )
Signed-off-by: sangbumlikeagod <oironese@naver.com>
Signed-off-by: sangbumlikeagod <98077576+sangbumlikeagod@users.noreply.github.com>
2025-12-01 18:19:17 +01:00
Cyrus Leung
f0a28bf661
[Misc] Unify tokenizer registration ( #29767 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-01 11:34:58 +00:00
daniel-salib
014ece97c7
[Frontend] Add tool filtering support to ToolServer ( #29224 )
Signed-off-by: Daniel Salib <danielsalib@meta.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
2025-12-01 08:03:57 +00:00
wang.yuqi
62de4f4257
[Frontend] Resettle pooling entrypoints ( #29634 )
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2025-12-01 15:30:43 +08:00
Cyrus Leung
2afcec4dec
[Misc] Update TokenizerLike interface and move get_cached_tokenizer ( #29730 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-30 14:59:47 +08:00
Cyrus Leung
fe3398fab2
[Chore] Enable passing tokenizer=None into MM processor ( #29724 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-29 06:25:10 -08:00
Cyrus Leung
34a984274e
[Misc] Refactor tokenizer interface ( #29693 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-29 04:02:21 -08:00
Didier Durand
04a797cd0e
[Doc]: fixing typos in various files. ( #29717 )
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-11-29 01:15:39 -08:00
Cyrus Leung
8d9338fae4
[Chore] Rename Processor to InputProcessor ( #29682 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-28 09:35:41 -08:00
Cyrus Leung
0808eb813b
[Misc] Remove yapf directives ( #29675 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-28 15:07:23 +00:00
HappyAmazonian
f8151b66fa
Revert "Supress verbose logs from model_hosting_container_standards (… ( #29335 )
Signed-off-by: Shen Teng <sheteng@amazon.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-28 05:29:05 -08:00
maang-h
51906c8c55
[Docs] Improve priority parameter documentation ( #29572 )
Signed-off-by: maang <maang_h@163.com>
Signed-off-by: maang-h <55082429+maang-h@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-11-27 02:09:24 -08:00
Andrew Xia
b07555d26f
[responsesAPI][2] parse ResponseFunctionToolCallOutputItem ( #29383 )
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
2025-11-25 10:27:26 -08:00
Harry Mellor
a1f2676879
Scheduled removal of override_pooler_config and disable_log_requests ( #29402 )
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-25 16:08:57 +00:00
Ben Browning
e1dd706cd1
[Frontend] Respect Chat Completion parallel_tool_calls param ( #26233 )
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
2025-11-25 09:56:15 +00:00
Andrew Xia
a685b47c57
[responsesAPI] refactor construct_input_messages ( #29359 )
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
2025-11-25 09:47:10 +00:00
Nick Hill
db2906108a
[Misc] Streamline unique id generation ( #29375 )
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-11-25 08:30:11 +00:00
Nick Hill
7992324f23
[BugFix] Use unique ids for different transcription prompts ( #29372 )
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-11-25 06:55:16 +00:00
Harry Mellor
316c8492bf
Scheduled removal of guided_* config fields ( #29326 )
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-25 05:24:05 +00:00
Nick Hill
a178a0b40b
[BugFix] Fix duplicate id tool-call race condition ( #29355 )
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-11-25 01:54:26 +00:00
Aydin Abiar
656516c315
[Bugfix] properly handle nested json with llama3 tool parser ( #27701 )
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Signed-off-by: Aydin Abiar <62435714+Aydin-ab@users.noreply.github.com>
Co-authored-by: Aydin Abiar <aydin@anyscale.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
2025-11-24 15:28:51 +00:00
Mads Kildegård
ea38474ac5
[Frontend][Responses API] Multi-turn (with type: "output_text") support for non-harmony requests ( #29175 )
Signed-off-by: Mads Kildegård <mkildegaard99@gmail.com>
2025-11-22 09:58:22 +00:00
Andrew Xia
742e9ff6b3
[responsesAPI] parse reasoning item input ( #28248 )
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-22 15:42:11 +08:00
sfbemerk
2092ce8c39
Tool Call Parser logs should not contain user input / model output except on DEBUG ( #29160 )
Signed-off-by: Benjamin Merkel <benjamin.merkel@tngtech.com>
Co-authored-by: Benjamin Merkel <benjamin.merkel@tngtech.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
2025-11-21 20:57:19 +08:00
Cyrus Leung
aab0102a26
[V0 deprecation] Remove more V0 references ( #29088 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-21 11:56:59 +00:00
Alex Brooks
b4734b9550
[Bugfix] Fix default MM LoRA alignment for single str prompts ( #29140 )
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
2025-11-21 13:32:30 +08:00
Cyrus Leung
56e96b37e4
[V0 Deprecation] Remove best_of ( #29090 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-21 11:40:40 +08:00
jeremyteboul
0730414999
[Core] Add audio_embeds support to chat completions ( #29059 )
Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>
2025-11-21 11:39:47 +08:00
Software Developer
4d01b64284
[Bugfix] - Add Trace Headers to Beam Search Path ( #29100 )
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
2025-11-20 20:00:33 +00:00
rookie
56f45eddaf
[Frontend] Optimize beam search loop by sorting and then splicing ( #19347 )
Signed-off-by: zhangguozhu <zhangguozhu@360.cn>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: zhangguozhu <zhangguozhu@360.cn>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-11-20 09:02:30 -08:00
Samit
371b1d4c61
[RL] Add Pause and Resume Generation for Asynchronous RL Training ( #28037 )
Signed-off-by: SamitHuang <285365963@qq.com>
Signed-off-by: Samit <285365963@qq.com>
Signed-off-by: samithuang <285365963@qq.com>
Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
2025-11-20 03:01:03 -08:00
Quentin Gallouédec
1c7bcc55b8
[Frontend] Allow parsed tool arguments ( #28820 )
Signed-off-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-19 22:20:12 -08:00