Chauncey
|
910abdbd08
|
[Bugfix] fixed top_logprobs: -1 does not appear to work as intended (#26470)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-11 00:41:17 +08:00 |
|
Mark McLoughlin
|
e519281920
|
[Metrics] Add test for multi-modal cache stats logging (#26588)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-10-10 16:00:50 +00:00 |
|
Chauncey
|
1e6848a65d
|
[CI] fix test_run_batch.py::test_completions - AssertionError (#26578)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-10 22:16:28 +08:00 |
|
Chauncey
|
720d3cd0f0
|
[CI] fix ruff format (#26579)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-10 03:02:12 -07:00 |
|
Ashwin Phadke
|
ab196edefb
|
Remove LoRA bias support (#25807)
Signed-off-by: Ashwin Phadke <ashwinphadke12@rediffmail.com>
Signed-off-by: Ashwin Phadke <23502062+ashwin-phadke@users.noreply.github.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-10-10 09:50:33 +00:00 |
|
Luis Tomas Bolivar
|
3ee202ea1e
|
[GPT-OSS] Add support for arrays at tool message content (#25593)
Signed-off-by: Luis Tomas Bolivar <ltomasbo@redhat.com>
|
2025-10-10 09:00:45 +00:00 |
|
Cyrus Leung
|
ad430a67ca
|
[Metrics] Log multi-modal cache stats and fix reset (#26285)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-10 01:45:55 -07:00 |
|
Ben Browning
|
da4455609d
|
[Chore]: One pythonic tool parser test uses the wrong parser (#26515)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
|
2025-10-10 04:03:55 +00:00 |
|
Julien Denize
|
c6187f55f7
|
Refactor MistralTokenizer (#26358)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
|
2025-10-09 22:48:58 +00:00 |
|
Cyrus Leung
|
4bdf7ac593
|
[Bugfix] Fix SHM cache initialization (#26427)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-09 02:48:04 -07:00 |
|
Cyrus Leung
|
dc7976dd9f
|
[Misc] Upgrade more code to Python 3.10 (#26463)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-09 10:43:53 +01:00 |
|
Thomas Parnell
|
31a4b3e6c4
|
Revert #24446 and #26168 (#26332)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2025-10-07 16:38:19 -06:00 |
|
Cyrus Leung
|
1e4ecca1d0
|
[V0 Deprecation] Remove VLLM_USE_V1 from tests (#26341)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-07 15:42:31 +00:00 |
|
Andrew Xia
|
185d8ed44f
|
[responsesAPI][bugfix] serialize harmony messages (#26185)
Signed-off-by: Andrew Xia <axia@meta.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-10-07 07:07:53 +00:00 |
|
Harry Mellor
|
6c04638214
|
Fix per file ruff ignores related to line length (#26262)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-06 05:12:40 +00:00 |
|
wuhang
|
91ac7f764d
|
[CI][gpt-oss] Enable python tool tests in CI (#24315)
Signed-off-by: wuhang <wuhang6@huawei.com>
|
2025-10-06 04:20:06 +00:00 |
|
Harry Mellor
|
1c0c68202c
|
Fix per file ruff ignores related to typing (#26254)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 16:37:55 +00:00 |
|
Harry Mellor
|
4e256cadc2
|
Remove all references to yapf as it's no longer used (#26251)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 09:18:11 -07:00 |
|
Harry Mellor
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
Cyrus Leung
|
a964e5e6c3
|
[Bugfix] Allow --skip-tokenizer-init with echo and return_token_ids (#26238)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-05 05:38:53 +00:00 |
|
Cyrus Leung
|
119f00630b
|
[Renderer] Clean up renderer code (#26216)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-04 17:05:29 +00:00 |
|
Yannick Schnider
|
f05fea1f5e
|
[Core] Enable decode of context length equal to max model length (#26168)
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
|
2025-10-04 09:59:26 +00:00 |
|
Ben Browning
|
ea25a76c05
|
[BugFix] Use async Mistral Tokenizer in Chat Completions (#26134)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-10-04 09:42:08 +08:00 |
|
Andrew Xia
|
831b124151
|
[responsesAPI] add better error messaging for long prompts (#25724)
Signed-off-by: Andrew Xia <axia@meta.com>
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-10-03 14:33:13 -07:00 |
|
Yang Liu
|
812b7f54a8
|
[Renderer] Move Processor out of AsyncLLM (#24138)
Signed-off-by: Yang <lymailforjob@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-03 11:29:45 +00:00 |
|
kyt
|
2ed3f20dba
|
[openai] Fix missing tool usage check (system message) (#24768)
Signed-off-by: kyt <eluban4532@gmail.com>
|
2025-10-03 18:55:44 +08:00 |
|
HUIJONG JEONG
|
3e70e3d4d5
|
add(v1): RequestStatesStats to RequestOutput (#24947)
Signed-off-by: huijjj <huijong.jeong@squeezebits.com>
|
2025-10-03 08:56:25 +00:00 |
|
Andrew Xia
|
e5017cd6d6
|
[gpt-oss] disable tool server initialization if no tool in request (#25790)
Signed-off-by: Andrew Xia <axia@meta.com>
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-10-03 05:08:35 +00:00 |
|
Andrew Xia
|
5db1870bb9
|
[gpt-oss] use vLLM instead of openai types for streaming (#25186)
Signed-off-by: Andrew Xia <axia@meta.com>
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-09-30 22:47:07 +00:00 |
|
Cyrus Leung
|
2f652e6cdf
|
[Doc] Improve MM Pooling model documentation (#25966)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-09-30 18:58:29 +00:00 |
|
Andrew Sansom
|
78a47f87ce
|
Test Prompt Embeds/LoRA compatibility and Enable LoRA Support for OPT Models (#25717)
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
|
2025-09-30 08:10:58 +08:00 |
|
Russell Bryant
|
7977e5027c
|
Add filtering for chat template kwargs (#25794)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-27 10:46:49 +00:00 |
|
Russell Bryant
|
3958b96bf5
|
Add option to restrict media domains (#25783)
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Chenheli Hua <huachenheli@outlook.com>
|
2025-09-27 01:23:52 +00:00 |
|
Matthew Bonanni
|
3468f17ebe
|
[V0 deprecation] Remove _VLLM_V1 suffixes from attention backend names (#25489)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
|
2025-09-25 17:37:50 +00:00 |
|
Cyrus Leung
|
0bcc3a160d
|
[CI/Build] Fix flaky entrypoints test (#25663)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-09-25 12:19:40 +00:00 |
|
Ben Browning
|
5caaeb714c
|
[Bugfix] [Frontend] Cleanup gpt-oss non-streaming chat tool calls (#25514)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
|
2025-09-24 03:20:38 +00:00 |
|
Andrew Xia
|
95bc60e4cb
|
[gpt-oss][bugfix] remove logic to require resp_ in ResponseAPI (#25428)
Signed-off-by: Andrew Xia <axia@meta.com>
|
2025-09-23 15:46:46 -07:00 |
|
Andreas Hartel
|
4322c553a6
|
[Test]: Hermes tool parser stream output error in Qwen3 case (#25203)
Signed-off-by: Andreas Hartel <andreas.hartel@aleph-alpha.com>
|
2025-09-23 17:56:31 +08:00 |
|
Alec S
|
45d7d852d3
|
[Frontend] Responses API MCP tools for built in tools and to pass through headers (#24628)
Signed-off-by: Alec Solder <alecs@fb.com>
Signed-off-by: Alec S <10566873+alecsolder@users.noreply.github.com>
Co-authored-by: Alec Solder <alecs@fb.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-09-22 23:38:19 +00:00 |
|
WeiQing Chen
|
0eecb31663
|
[Bugfix] Fix hermes tool parser handling of non-string argument types (#22002)
Signed-off-by: wangzi <3220100013@zju.edu.cn>
Signed-off-by: David Chen <530634352@qq.com>
Co-authored-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-09-22 11:35:39 +08:00 |
|
Yang Liu
|
04d3752329
|
[Bugfix][V0 Deprecation][CI] use async mock and await for async method (#25325)
Signed-off-by: Yang <lymailforjob@gmail.com>
|
2025-09-22 07:06:16 +08:00 |
|
Woosuk Kwon
|
72dd1595b4
|
[CI] Skip tests failing on main (#25326)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-09-20 19:57:46 -07:00 |
|
Woosuk Kwon
|
52c2a8d4ad
|
[V0 Deprecation] Remove LLMEngine (#25033)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-09-20 17:56:30 -07:00 |
|
Chauncey
|
f91480b2d4
|
[Bugfix] fix tool call arguments is empty (#25223)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Co-authored-by: xin.li <xin.li@daocloud.io>
|
2025-09-20 13:29:54 +08:00 |
|
Andrew Sansom
|
c7e713616a
|
test: Remove vestigial skip for prompt embeds tests after landing v1 Prompt Embeds support (#25291)
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
|
2025-09-19 17:33:40 -07:00 |
|
Alec S
|
e69e0b8b5f
|
[Frontend] Responses API messages out, just harmony for now (#24985)
Signed-off-by: Alec Solder <alecs@fb.com>
Co-authored-by: Alec Solder <alecs@fb.com>
Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-09-19 21:40:16 +00:00 |
|
Chauncey
|
47fd08aaf9
|
[CI/Build] fix test function_calling (#25072)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-09-19 12:16:32 -06:00 |
|
Cyrus Leung
|
6c117cff7d
|
[Frontend] Pass API server count to each process (#23717)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-09-20 01:15:19 +08:00 |
|
Harry Mellor
|
058525b997
|
Move PoolerConfig from config/__init__.py to config/pooler.py (#25181)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-09-19 11:02:55 +00:00 |
|
Andrew Xia
|
6d8246aaff
|
[gpt-oss] Add ResponseReasoningPartAddedEvent, ResponseReasoningPartDoneEvent for streaming (#24938)
Signed-off-by: Andrew Xia <axia@meta.com>
|
2025-09-18 19:11:59 -07:00 |
|