Russell Bryant
|
8eafe5eaea
|
[CI/Build] Ignore ruff warning up007 (#13182)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-02-13 11:48:31 +08:00 |
|
Murali Andoorveedu
|
4c0d93f4b2
|
[V1][Bugfix] Copy encoder input ids to fix set iteration issue during VLM abort (#13173)
Signed-off-by: andoorve <37849411+andoorve@users.noreply.github.com>
|
2025-02-12 12:58:11 -08:00 |
|
Michael Goin
|
14b7899d10
|
[CI] Fix failing FP8 cpu offload test (#13170)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-02-12 19:16:06 +00:00 |
|
Michael Goin
|
09972e716c
|
[Bugfix] Allow fallback to AWQ from AWQMarlin at per-layer granularity (#13119)
|
2025-02-12 09:19:53 -08:00 |
|
Qubitium-ModelCloud
|
36a08630e8
|
[CORE] [QUANT] Support for GPTQModel's dynamic quantization per module override/control (#7086)
|
2025-02-12 09:19:43 -08:00 |
|
Russell Bryant
|
2c2b560f48
|
[CI/Build] Use mypy matcher for pre-commit CI job (#13162)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-02-12 17:12:22 +00:00 |
|
Lu Fang
|
042c3419fa
|
Introduce VLLM_CUDART_SO_PATH to allow users to specify the .so path (#12998)
Signed-off-by: Lu Fang <lufang@fb.com>
|
2025-02-12 09:06:13 -08:00 |
|
Jee Jee Li
|
82cabf53a3
|
[Misc] Delete unused LoRA modules (#13151)
|
2025-02-12 08:58:24 -08:00 |
|
Rafael Vasquez
|
314cfade02
|
[Frontend] Generate valid tool call IDs when using tokenizer-mode=mistral (#12332)
|
2025-02-12 08:29:56 -08:00 |
|
Cyrus Leung
|
985b4a2b19
|
[Bugfix] Fix num video tokens calculation for Qwen2-VL (#13148)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-02-12 11:55:23 +00:00 |
|
bnellnm
|
f4d97e4fc2
|
[Bug] [V1] Try fetching stop_reason from EngineOutput before checking the request (#13108)
|
2025-02-12 02:39:16 -08:00 |
|
Shiyan Deng
|
f1042e86f0
|
[Misc] AMD Build Improvements (#12923)
|
2025-02-12 02:36:10 -08:00 |
|
Maximilien de Bayser
|
7c4033acd4
|
Further reduce the HTTP calls to huggingface.co (#13107)
|
2025-02-12 02:34:09 -08:00 |
|
dependabot[bot]
|
d59def4730
|
Bump actions/setup-python from 5.3.0 to 5.4.0 (#12672)
|
2025-02-12 16:41:22 +08:00 |
|
dependabot[bot]
|
0c7d9effce
|
Bump helm/chart-testing-action from 2.6.1 to 2.7.0 (#12463)
|
2025-02-12 16:41:06 +08:00 |
|
dependabot[bot]
|
dd3b4a01f8
|
Bump actions/stale from 9.0.0 to 9.1.0 (#12462)
|
2025-02-12 00:40:25 -08:00 |
|
dependabot[bot]
|
a0597c6b75
|
Bump helm/kind-action from 1.10.0 to 1.12.0 (#11612)
|
2025-02-12 00:40:19 -08:00 |
|
Lingfan Yu
|
e92694b6fe
|
[Neuron][Kernel] Support Longer Sequences in NKI-based Flash PagedAttention and Improve Efficiency (#12921)
Signed-off-by: Lingfan Yu <lingfany@amazon.com>
|
2025-02-11 21:12:37 -08:00 |
|
Kevin H. Luu
|
842b0fd402
|
[ci] Add more source file dependencies for some tests (#13123)
Signed-off-by: <>
Co-authored-by: EC2 Default User <ec2-user@ip-172-31-20-117.us-west-2.compute.internal>
|
2025-02-11 20:38:10 -08:00 |
|
Christian Pinto
|
974dfd4971
|
[Model] IBM/NASA Prithvi Geospatial model (#12830)
|
2025-02-11 20:34:30 -08:00 |
|
Keyun Tong
|
3ee696a63d
|
[RFC][vllm-API] Support tokenizer registry for customized tokenizer in vLLM (#12518)
Signed-off-by: Keyun Tong <tongkeyun@gmail.com>
|
2025-02-12 12:25:58 +08:00 |
|
Russell Bryant
|
72c2b68dc9
|
[Misc] Move pre-commit suggestion back to the end (#13114)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-02-11 22:34:16 +00:00 |
|
Yuan Tang
|
14ecab5be2
|
[Bugfix] Guided decoding falls back to outlines when it fails to import xgrammar (#12976)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2025-02-11 18:17:44 +00:00 |
|
Harry Mellor
|
deb6c1c6b4
|
[Doc] Improve OpenVINO installation doc (#13102)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-02-11 18:02:46 +00:00 |
|
Li, Jiang
|
565c1efa65
|
[CI/Build][Bugfix] Fix CPU backend default threads num (#13077)
|
2025-02-11 16:55:56 +00:00 |
|
Szymon Ożóg
|
2b25b7d2e1
|
Fix initializing GGUF weights for ColumnParallelLinear when using tensor parallel > 1 (#13023)
|
2025-02-11 08:38:48 -08:00 |
|
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
|
6c4dbe23eb
|
[BugFix] Pop instead of del CUDA_VISIBLE_DEVICES (#12962)
Signed-off-by: Hollow Man <hollowman@opensuse.org>
|
2025-02-12 00:21:50 +08:00 |
|
MoonRide303
|
21f5d50fa5
|
[Bugfix] Do not use resource module on Windows (#12858) (#13029)
|
2025-02-11 08:21:18 -08:00 |
|
Jewon Lee
|
bf3e05215c
|
[Misc] Fix typo in comments in metrics.py (#13024)
|
2025-02-11 08:20:37 -08:00 |
|
Harry Mellor
|
ad9776353e
|
Set torch_dtype in TransformersModel (#13088)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-02-11 23:51:19 +08:00 |
|
Mark McLoughlin
|
75e6e14516
|
[V1][Metrics] Add several request timing histograms (#12644)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-02-11 10:14:00 -05:00 |
|
மனோஜ்குமார் பழனிச்சாமி
|
110f59a33e
|
[Bugfix] fix flaky test (#13089)
Signed-off-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
|
2025-02-11 14:41:20 +00:00 |
|
wangxiyuan
|
2e3b969ec0
|
[Platform] add pre_register_and_update function (#12432)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2025-02-11 22:06:46 +08:00 |
|
Yuhong Guo
|
da317197dd
|
[Build] Fix cuda link target of cumem_allocator in CPU env (#12863)
Signed-off-by: YuhongGuo <yuhong.gyh@antgroup.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2025-02-11 21:55:57 +08:00 |
|
Gregory Shtrasberg
|
7539bbc6a6
|
[ROCm] Using a more precise memory profiling (#12624)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
|
2025-02-11 21:47:10 +08:00 |
|
Mengqing Cao
|
9cf4759493
|
[executor] init local_rank as device index (#13027)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
|
2025-02-11 21:20:53 +08:00 |
|
Cody Yu
|
41c5dd45b9
|
[V1][Metrics] Add GPU prefix cache hit rate % gauge (#12592)
|
2025-02-11 08:27:25 +00:00 |
|
Ce Gao
|
fc6485d277
|
[Bugfix]: Reasoning output bug according to the chat template change (#13025)
Signed-off-by: Ce Gao <cegao@tensorchord.ai>
|
2025-02-11 15:49:03 +08:00 |
|
Varun Sundar Rabindranath
|
78a141d768
|
[Misc] LoRA - Refactor Punica ops tests (#12970)
Signed-off-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
|
2025-02-11 07:26:03 +00:00 |
|
Russell Bryant
|
c320ca8edd
|
[Core] Don't do platform detection at import time (#12933)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-02-11 07:25:25 +00:00 |
|
Woosuk Kwon
|
58047c6f04
|
[Benchmark] Add BurstGPT to benchmark_serving (#13063)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
|
2025-02-10 21:25:30 -08:00 |
|
Florian Greinacher
|
cb080f32e3
|
[Bugfix] Support missing tool parameters in mistral tokenizer (#12884)
Signed-off-by: Florian Greinacher <florian.greinacher@siemens.com>
|
2025-02-11 03:33:33 +00:00 |
|
Simon Mo
|
2c0f58203c
|
[Docs] Announce Meta Meetup (#13065)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-02-10 18:24:29 -08:00 |
|
Woosuk Kwon
|
2ff4857678
|
[V1][Minor] Move scheduler outputs to a separate file (#13062)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-02-11 02:10:06 +00:00 |
|
Kevin H. Luu
|
91e876750e
|
[misc] Fix setup.py condition to avoid AMD from being mistaken with CPU (#13022)
Signed-off-by: kevin <kevin@anyscale.com>
|
2025-02-10 18:06:16 -08:00 |
|
Farzad Abdolhosseini
|
08b2d845d6
|
[Model] Ultravox Model: Support v0.5 Release (#12912)
Signed-off-by: Farzad Abdolhosseini <farzad@fixie.ai>
|
2025-02-10 22:02:48 +00:00 |
|
மனோஜ்குமார் பழனிச்சாமி
|
2ae889052c
|
Fix seed parameter behavior in vLLM (#13007)
Signed-off-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
|
2025-02-10 23:26:50 +08:00 |
|
Cyrus Leung
|
51f0b5f7f6
|
[Bugfix] Clean up and fix multi-modal processors (#13012)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-02-10 10:45:21 +00:00 |
|
Kevin H. Luu
|
fde71262e0
|
[misc] Add retries with exponential backoff for HF file existence check (#13008)
|
2025-02-10 01:15:02 -08:00 |
|
Yuan Tang
|
243137143c
|
[Doc] Add link to tool_choice tracking issue in tool_calling.md (#13003)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2025-02-10 06:09:33 +00:00 |
|