59 Commits

Author SHA1 Message Date
Breno Baldas Skuk
ac6eb49de3
fix: OpenAI SDK compat (ResponseTextConfig) (#23126)
Signed-off-by: breno.skuk <breno.skuk@hcompany.ai>
Signed-off-by: Breno Baldas Skuk <breno.skuk@hcompany.ai>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-08-18 15:22:59 -07:00
Michael Goin
bf756321c7
[CI Bugfix] Pin openai<1.100 to unblock CI (#23118)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-18 12:14:01 -07:00
rongfu.leng
904063907c
[Misc] fix openai version (#22485)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-08-08 01:12:54 -07:00
Harry Mellor
ecbea55ca2
Update hf_xet pin to resolve hangs (#22356)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-06 20:31:41 -07:00
Woosuk Kwon
134a8ee8fd
[gpt-oss] Add openai-harmony as default dependency (#22332)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>
Co-authored-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Co-authored-by: Minseok Lee <47620120+minseokl@users.noreply.github.com>
Co-authored-by: Yongye Zhu <zyy1102000@gmail.com>
2025-08-06 00:10:14 -07:00
Harry Mellor
796bae07c5
Update transformers to v4.55 (#21931)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-08-05 22:56:14 -07:00
Woosuk Kwon
dd16bdc798
Increase openai-python version (#22316)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-08-05 21:43:21 -07:00
Woosuk Kwon
0f81b310db
[Misc] Remove upper bound in openai package version (#22060)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-08-01 02:11:40 -07:00
Chauncey
6da0078523
[Feat] Allow custom naming of vLLM processes (#21445)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-07-24 03:15:23 -07:00
Julien Denize
6d8d0a24c0
Add think chunk (#21333)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
2025-07-23 21:51:32 -07:00
Chauncey
b5c3b68359
[Misc] bump xgrammar version to v0.1.21 (#20992)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-07-15 19:42:16 -07:00
Patrick von Platen
e7e3e6d263
Voxtral (#20970)
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-07-15 07:35:30 -07:00
Maroon Ayoub
66f6fbd393
[Prefix Cache] Add reproducible prefix-cache block hashing using SHA-256 + CBOR (64bit) (#20511)
Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
2025-07-14 02:45:31 +00:00
Woosuk Kwon
f45a332886
[Sched] Enhance the logic to remove stopped requests from queues (#20739) 2025-07-12 15:33:13 -07:00
Luka Govedič
762be26a8e
[Bugfix] Upgrade depyf to 0.19 and streamline custom pass logging (#20777)
Signed-off-by: Luka Govedic <lgovedic@redhat.com>
Signed-off-by: luka <lgovedic@redhat.com>
2025-07-11 00:15:22 -07:00
Nathan Hoos
d6902ce79f
[V0][V1][Core] Add outlines integration for V1, and update V0 integration. (#15975)
Signed-off-by: Nathan Hoos <thwackyy.y@gmail.com>
2025-07-10 15:30:26 -04:00
Jee Jee Li
4ff79a136e
[Misc] Set the minimum openai version (#20539)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-07 09:15:26 +00:00
Nicolò Lucchesi
d1b689c445
[Bugfix] Fix flaky test_streaming_response test (#20363)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-07-03 14:46:24 +00:00
Prashant Gupta
22e9d42040
[Misc] add xgrammar for arm64 (#18359)
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
2025-07-01 07:02:20 +00:00
Dipika Sikka
a57d57fa72
[Quantization] Bump to use latest compressed-tensors (#20033)
Signed-off-by: Dipika <dipikasikka1@gmail.com>
Co-authored-by: Kyle Sayers <kylesayrs@gmail.com>
2025-06-26 20:50:06 -07:00
h-avsha
3443aaf8dd
Move to a faster base64 implementation (#19984)
Signed-off-by: h-avsha <avshalom.manevich@hcompany.ai>
2025-06-24 20:33:51 -07:00
Ning Xie
c3fec47bb7
[MISC] bump huggingface_hub pkg to 0.33.0 (#19547)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
2025-06-16 05:22:28 -07:00
汪志鹏
ace5cdaff0
[Fix] bump mistral common to support magistral (#19533)
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
2025-06-12 22:28:12 -07:00
kourosh hakhamaneshi
e6aab5de29
Revert "[Build/CI] Add tracing deps to vllm container image (#15224)" (#19378) 2025-06-12 17:26:40 -07:00
Kyle Sayers
ebb2f383b8
[Quantization] Bump compressed-tensors version (#19295)
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-06-09 14:33:15 -07:00
Dipika Sikka
94870359cd
[Quantization] Bump compressed-tensors version; update NVFP4A16 test model (#19224)
Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>
2025-06-06 01:21:54 -07:00
Luka Govedič
f274581f44
[BugFix] Update pydantic to fix error on python 3.10 (#18852)
Signed-off-by: luka <luka@neuralmagic.com>
2025-05-29 03:05:46 -07:00
Cyrus Leung
82e2339b06
[Doc] Move examples and further reorganize user guide (#18666)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-26 07:38:04 -07:00
dylan
243eb9199f
[Bugfix]: handle hf-xet CAS error when loading Qwen3 weights in vLLM (#18701) 2025-05-26 07:10:56 -07:00
Feng XiaoLong
4fc1bf813a
[Bugfix] Migrate to REGEX Library to prevent catastrophic backtracking (#18454)
Signed-off-by: Crucifixion-Fxl <xmufxl@gmail.com>
Co-authored-by: Crucifixion-Fxl <xmufxl@gmail.com>
2025-05-23 16:16:26 -07:00
Chauncey
dc1a821768
[Feature][V1] Support tool_choice: required when using Xgrammar as the StructuredOutputBackend. (#17845)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-05-12 23:01:31 -07:00
Shanshan Shen
760e3ecc8f
[V1][Structured Output] Update llguidance (>= 0.7.11) to avoid AttributeError (no StructTag) (#17839)
Signed-off-by: shen-shanshan <467638484@qq.com>
2025-05-08 20:14:18 -07:00
Mikhail Podvitskii
c747d84576
[Installation] OpenTelemetry version update (#17771)
Signed-off-by: Mikhail Podvitskii <podvitskiymichael@gmail.com>
2025-05-07 22:32:49 -07:00
Christian Heimes
1a6af1453d
Only depend on importlib-metadata for Python < 3.10 (#17776)
Signed-off-by: Christian Heimes <christian@python.org>
2025-05-07 07:51:06 -07:00
Russell Bryant
7a0a146c54
[Build] Require setuptools >= 77.0.3 for PEP 639 (#17389)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-04-30 23:25:36 -07:00
Rahul Tuli
200bbf92e8
Bump Compressed Tensors version to 0.9.4 (#17478)
Signed-off-by: Rahul Tuli <rtuli@redhat.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-04-30 15:24:45 -07:00
Gregory Shtrasberg
4464109219
[Build][Bugfix] Restrict setuptools version to <80 (#17320)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-04-29 00:17:23 -07:00
Nick Hill
05fcd1b430
[V1][Perf] Faster incremental detokenization (#15137)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-04-17 07:45:24 -07:00
Taneem Ibrahim
70e7ed841d
[BugFix]: Update minimum pyzmq version (#16549)
Signed-off-by: Taneem Ibrahim <taneem.ibrahim@gmail.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
2025-04-14 20:06:03 -07:00
Russell Bryant
fee5b8d37f
[Build/CI] Add tracing deps to vllm container image (#15224)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-04-09 19:14:06 +00:00
Russell Bryant
cb84e45ac7
[Core] Upgrade to xgrammar 0.1.18, add cache size limit (#16283)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-04-08 19:13:22 -07:00
Harry Mellor
90cb44eb02
Update to transformers==4.51.1 (#16257)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-08 06:53:39 -07:00
Isotr0py
f6b32efb7f
[Bugfix] Fix and reorganize broken GGUF tests and bump gguf version (#16194)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-04-08 13:38:13 +08:00
Miles Williams
f94ab12f79
[Misc] Update compressed-tensors to version 0.9.3 (#16196)
Signed-off-by: Miles Williams <42222518+mlsw@users.noreply.github.com>
2025-04-07 19:09:06 -07:00
Lu Fang
55dcce91df
Upstream Llama4 Support to Main (#16113)
Signed-off-by: Aston Zhang <22279212+astonzhang@users.noreply.github.com>
Signed-off-by: Chris Thi <chris.c.thi@gmail.com>
Signed-off-by: drisspg <drisspguessous@gmail.com>
Signed-off-by: Jon Swenson <jmswen@gmail.com>
Signed-off-by: Keyun Tong <tongkeyun@gmail.com>
Signed-off-by: Lu Fang <fanglu@meta.com>
Signed-off-by: Xiaodong Wang <xdwang@meta.com>
Signed-off-by: Yang Chen <yangche@fb.com>
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Signed-off-by: Lu Fang <lufang@fb.com>
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: Lucia Fang <fanglu@fb.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-07 08:06:27 -07:00
Harry Mellor
55acf86bf8
Fix huggingface-cli[hf-xet] -> huggingface-cli[hf_xet] (#15969)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-02 23:37:30 +00:00
Harry Mellor
51826d51fa
Add minimum version for huggingface_hub to enable Xet downloads (#15873)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-02 02:03:36 -07:00
Russell Bryant
14e53ed11f
[V1] Fix json_object support with xgrammar (#15488)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-04-02 02:00:08 -07:00
Isotr0py
4e5a0f6ae2
[Misc] Allow using OpenCV as video IO fallback (#15055)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-01 15:55:13 +00:00
Harry Mellor
e5ef4fa99a
Upgrade transformers to v4.50.3 (#13905)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-31 08:59:37 -07:00