Russell Bryant
|
5390d6664f
|
[Doc] Add the start of an arch overview page (#10368)
|
2024-11-19 09:52:11 +00:00 |
|
Jee Jee Li
|
382b6a4852
|
[Misc] Avoid misleading warning messages (#10438)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-11-19 08:54:58 +00:00 |
|
Travis Johnson
|
272e31c0bd
|
[Bugfix] Guard for negative counter metrics to prevent crash (#10430)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
|
2024-11-19 04:57:10 +00:00 |
|
Michael Goin
|
74f8c2cf5f
|
Add openai.beta.chat.completions.parse example to structured_outputs.rst (#10433)
|
2024-11-19 04:37:46 +00:00 |
|
Mengqing Cao
|
8c1fb50705
|
[Platform][Refactor] Extract func get_default_attn_backend to Platform (#10358)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
|
2024-11-19 11:22:26 +08:00 |
|
Jee Jee Li
|
7eb719df13
|
[Bugfix]Fix Phi-3 BNB online quantization (#10417)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-11-19 03:21:42 +00:00 |
|
Kevin H. Luu
|
284203f171
|
[ci/build] Have dependabot ignore all patch update (#10436)
We have many dependencies, and patch updates across all of them are noisy. This change makes dependabot ignore all patch version updates.
|
2024-11-19 01:04:25 +00:00 |
|
Ricky Xu
|
90a6c759ca
|
[misc] partial prefix & random input generation benchmark (#9929)
Signed-off-by: rickyx <rickyx@anyscale.com>
|
2024-11-18 15:39:14 -08:00 |
|
youkaichao
|
2298e69b5f
|
[ci][bugfix] fix kernel tests (#10431)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-18 15:29:37 -08:00 |
|
youkaichao
|
a03ea40792
|
[3/N][torch.compile] consolidate custom op logging (#10399)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-18 15:14:59 -08:00 |
|
Lucas Wilkinson
|
96d999fbe8
|
[Kernel] Initial Machete W4A8 support + Refactors (#9855)
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
|
2024-11-18 12:59:29 -07:00 |
|
Angus Wang
|
c2170a5b39
|
[Kernel] Explicitly specify other value in tl.load calls (#9014)
Signed-off-by: Angus Wang <wangjadehao@gmail.com>
|
2024-11-18 11:39:40 -08:00 |
|
Yan Ma
|
6b2d25efc7
|
[Hardware][XPU] AWQ/GPTQ support for xpu backend (#10107)
Signed-off-by: yan ma <yan.ma@intel.com>
|
2024-11-18 11:18:05 -07:00 |
|
Michael Goin
|
281cc4b3cd
|
[Model][Bugfix] Support TP for PixtralHF ViT (#10405)
Signed-off-by: mgoin <michael@neuralmagic.com>
|
2024-11-18 10:04:14 -08:00 |
|
Andrew Nesbitt
|
4f686d139f
|
Fix open_collective value in FUNDING.yml (#10426)
Signed-off-by: Andrew Nesbitt <andrewnez@gmail.com>
|
2024-11-18 09:52:42 -08:00 |
|
ismael-dm
|
31894a2155
|
[Doc] Add documentation for Structured Outputs (#9943)
Signed-off-by: ismael-dm <ismaeldm99@gmail.com>
|
2024-11-18 09:52:12 -08:00 |
|
youkaichao
|
7851b45196
|
[5/N][torch.compile] torch.jit.script --> torch.compile (#10406)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-18 23:20:06 +08:00 |
|
B-201
|
4186be8111
|
[Doc] Update doc for LoRA support in GLM-4V (#10425)
Signed-off-by: B-201 <Joy25810@foxmail.com>
|
2024-11-18 15:08:30 +00:00 |
|
Isotr0py
|
e7ebb662d7
|
[Model] Remove transformers attention porting in VITs (#10414)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-18 21:45:21 +08:00 |
|
B-201
|
5be4e52b65
|
[Model][LoRA]LoRA support added for glm-4v (#10418)
Signed-off-by: B-201 <Joy25810@foxmail.com>
|
2024-11-18 12:57:10 +00:00 |
|
Maybewuss
|
01aae1cc68
|
[Model] Remove redundant softmax when using PoolingType.STEP (#10415)
|
2024-11-18 10:05:36 +00:00 |
|
lkchen
|
c7dec926f6
|
[VLM] Report multi_modal_placeholders in output (#10407)
Signed-off-by: Linkun Chen <lkchen+anyscale@github.com>
|
2024-11-18 16:06:16 +08:00 |
|
youkaichao
|
51bb12d17b
|
[4/N][torch.compile] clean up set_torch_compile_backend (#10401)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-17 23:57:20 -08:00 |
|
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
|
47826cacf0
|
[Bugfix] Ignore ray reinit error when current platform is ROCm or XPU (#10375)
Signed-off-by: Hollow Man <hollowman@opensuse.org>
|
2024-11-18 11:29:26 +08:00 |
|
Isotr0py
|
c4e464333e
|
[Misc] Add uninitialized params tracking for AutoWeightsLoader (#10327)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-18 09:07:46 +08:00 |
|
wchen61
|
d1557e66d3
|
[Misc] Enhance offline_inference to support user-configurable paramet… (#10392)
Signed-off-by: wchen61 <wchen61@foxmail.com>
|
2024-11-17 11:32:40 +00:00 |
|
电脑星人
|
80d85c5d7b
|
[Bugfix] Fix mrope_position_delta in non-last prefill chunk (#10403)
Signed-off-by: imkero <kerorek@outlook.com>
|
2024-11-17 08:50:24 +00:00 |
|
Kunshang Ji
|
76aab90ab6
|
[Hardware] [HPU]add mark_step for hpu (#10239)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2024-11-17 00:44:44 -08:00 |
|
youkaichao
|
8d74b5aee9
|
[platforms] refactor cpu code (#10402)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-16 23:14:23 -08:00 |
|
Isotr0py
|
cf349c4a97
|
[Bugfix][CPU] Fix CPU embedding runner with tensor parallel (#10394)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-16 23:12:04 -08:00 |
|
Chendi.Xue
|
905d0f0af4
|
[CI/Build] Fix IDC hpu [Device not found] issue (#10384)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
|
2024-11-17 14:58:22 +08:00 |
|
Roger Wang
|
643ecf7b11
|
[V1] Refactor model executable interface for all text-only language models (#10374)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2024-11-17 05:18:46 +00:00 |
|
youkaichao
|
4fd9375028
|
[2/N][torch.compile] make compilation cfg part of vllm cfg (#10383)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-16 18:02:14 -08:00 |
|
Woosuk Kwon
|
661a34fd4f
|
[V1] Add code owners for V1 (#10397)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-11-16 10:45:26 -08:00 |
|
电脑星人
|
361c29e174
|
[Bugfix] Fix M-RoPE position calculation when chunked prefill is enabled (#10388)
Signed-off-by: imkero <kerorek@outlook.com>
|
2024-11-17 02:10:00 +08:00 |
|
Sky Lee
|
b98d89efd4
|
[Misc] Medusa supports custom bias (#10361)
|
2024-11-16 16:33:01 +00:00 |
|
Jaehyun An
|
8b6725b0cf
|
[Misc] Update benchmark to support image_url file or http (#10287)
Signed-off-by: rbbang <anjaehyun87@gmail.com>
|
2024-11-16 18:15:40 +08:00 |
|
rasmith
|
1d75472626
|
[BugFix] [Kernel] Fix GPU SEGV occurring in fused_moe kernel (#10385)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
|
2024-11-16 09:55:05 +00:00 |
|
youkaichao
|
2f427c2d16
|
[misc][plugin] improve log messages (#10386)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-16 01:23:20 -08:00 |
|
youkaichao
|
755b85359b
|
[doc] add doc for the plugin system (#10372)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-15 21:46:27 -08:00 |
|
Cyrus Leung
|
32e46e000f
|
[Frontend] Automatic detection of chat content format from AST (#9919)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-16 13:35:40 +08:00 |
|
Michael Green
|
4f168f69a3
|
[Docs] Misc updates to TPU installation instructions (#10165)
|
2024-11-15 13:26:17 -08:00 |
|
Russell Bryant
|
3e8d14d8a1
|
[Doc] Move PR template content to docs (#10159)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-15 13:20:20 -08:00 |
|
Russell Bryant
|
a067f85e08
|
[Frontend] Add --version flag to CLI (#10369)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-15 13:13:53 -08:00 |
|
Simon Mo
|
c76ac49d26
|
[Docs] Add Nebius as sponsors (#10371)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2024-11-15 12:47:40 -08:00 |
|
Simon Mo
|
a6221a144a
|
[Misc] bump mistral common version (#10367)
Signed-off-by: simon-mo <simon.mo@hey.com>
v0.6.4.post1
|
2024-11-15 09:48:07 -08:00 |
|
ElizaWszola
|
79ee45b428
|
[Misc] Bump up test_fused_moe tolerance (#10364)
Signed-off-by: ElizaWszola <eliza@neuralmagic.com>
|
2024-11-15 16:31:18 +00:00 |
|
Guillaume Calmettes
|
691a3ec047
|
[Bugfix] Ensure special tokens are properly filtered out for guided structured output with MistralTokenizer (#10363)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2024-11-15 14:50:40 +00:00 |
|
youkaichao
|
3a763ba0c3
|
[core][misc] keep compatibility for old-style classes (#10356)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-15 13:55:51 +00:00 |
|
shangmingc
|
f2056f726d
|
[Misc] Fix some help info of arg_utils to improve readability (#10362)
|
2024-11-15 12:40:30 +00:00 |
|