Michael Goin
8e836d982a
[Doc] Fix code formatting in spec_decode.rst ( #9348 )
2024-10-14 21:29:11 -07:00
Steve Grubb
44eaa5a5d9
[Frontend] Clarify model_type error messages ( #9345 )
2024-10-14 21:29:01 -07:00
Tyler Michael Smith
169b530607
[Bugfix] Clean up some cruft in mamba.py ( #9343 )
2024-10-15 00:24:25 +00:00
Xiang Xu
f0fe4fe86d
[Model] Make llama3.2 support multiple and interleaved images ( #9095 )
2024-10-14 15:24:26 -07:00
Brendan Wong
4d31cd424b
[Frontend] merge beam search implementations ( #9296 )
2024-10-14 15:05:52 -07:00
Woosuk Kwon
473e7b3606
[TPU] Fix TPU SMEM OOM by Pallas paged attention kernel ( #9350 )
2024-10-14 15:02:06 -07:00
Simon Mo
fd47e57f4b
[Docs] Remove PDF build from Readthedocs ( #9347 )
v0.6.3
2024-10-14 11:57:47 -07:00
Daniele
203ab8f80f
[CI/Build] setuptools-scm fixes ( #8900 )
2024-10-14 11:34:47 -07:00
Kunshang Ji
4141608c6a
[Hardware][intel GPU] add async output process for xpu ( #8897 )
2024-10-14 12:23:33 -06:00
Reza Salehi
dfe43a2071
[Model] Molmo vLLM Integration ( #9016 )
...
Co-authored-by: sanghol <sanghol@allenai.org>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-10-14 07:56:24 -07:00
Tyler Michael Smith
16b24e7dcd
[Bugfix] Bandaid fix for speculative decoding tests ( #9327 )
2024-10-13 23:02:11 +00:00
Lily Liu
f519902c52
[CI] Fix merge conflict ( #9317 )
2024-10-13 06:41:23 +00:00
Jee Jee Li
250e26a63e
[Bugfix] Fix MiniCPM's LoRA bug ( #9286 )
2024-10-12 09:36:47 -07:00
Yunmeng
2b184ddd4f
[Misc][Installation] Improve source installation script and doc ( #9309 )
...
Co-authored-by: youkaichao <youkaichao@126.com>
2024-10-12 09:36:40 -07:00
Xiang Xu
00298e092c
[Bugfix] Fix bug of xformer prefill for encoder-decoder ( #9026 )
2024-10-12 15:00:43 +08:00
Lily Liu
89feb4c84d
[SpecDec] Remove Batch Expansion (2/3) ( #9298 )
2024-10-12 05:13:37 +00:00
Maximilien de Bayser
ec10cb8511
[BugFix] Fix tool call finish reason in streaming case ( #9209 )
...
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
2024-10-11 18:24:26 -07:00
Prashant Gupta
d11b46f3a5
[bugfix] fix f-string for error ( #9295 )
...
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
2024-10-11 17:03:48 -07:00
Allen Wang
c6cf9295e1
[Bugfix] Sets is_first_step_output for TPUModelRunner ( #9202 )
2024-10-11 13:28:10 -07:00
Lucas Wilkinson
de9fb4bef8
[Bugfix][CI/Build] Fix docker build where CUDA archs < 7.0 are being detected ( #9254 )
2024-10-11 15:57:39 -04:00
Wallas Henrique
8baf85e4e9
[Doc] Compatibility matrix for mutual exclusive features ( #8512 )
...
Signed-off-by: Wallas Santos <wallashss@ibm.com>
2024-10-11 11:18:50 -07:00
homeffjy
1a1823871d
[Doc] Remove outdated comment to avoid misunderstanding ( #9287 )
2024-10-11 18:02:03 +00:00
sixgod
6cf1167c1a
[Model] Add GLM-4v support and meet vllm==0.6.2 ( #9242 )
2024-10-11 17:36:13 +00:00
Burkhard Ringlein
f710090d8e
[Kernel] adding fused moe kernel config for L40S TP4 ( #9245 )
2024-10-11 08:54:22 -07:00
Tyler Michael Smith
7342a7d7f8
[Model] Support Mamba ( #6484 )
2024-10-11 15:40:06 +00:00
Sebastian Schoennenbeck
df3dcdf49d
[Bugfix] Fix priority in multiprocessing engine ( #9277 )
2024-10-11 15:35:35 +00:00
Jee Jee Li
36ea79079b
[Misc][LoRA] Support loading LoRA weights for target_modules in reg format ( #9275 )
2024-10-11 12:31:21 +00:00
Cyrus Leung
e808156f30
[Misc] Collect model support info in a single process per model ( #9233 )
2024-10-11 11:08:11 +00:00
youkaichao
cbc2ef5529
[misc] hide best_of from engine ( #9261 )
...
Co-authored-by: Brendan Wong <bjwpokemon@gmail.com>
2024-10-10 21:30:44 -07:00
Andy Dai
94bf9ae4e9
[Misc] Fix sampling from sonnet for long context case ( #9235 )
2024-10-11 00:33:16 +00:00
omrishiv
f990bab2a4
[Doc][Neuron] add note to neuron documentation about resolving triton issue ( #9257 )
...
Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>
2024-10-10 23:36:32 +00:00
youkaichao
e00c094f15
[torch.compile] generic decorators ( #9258 )
2024-10-10 15:54:23 -07:00
Kevin H. Luu
a78c6ba7c8
[ci/build] Add placeholder command for custom models test ( #9262 )
2024-10-10 15:45:09 -07:00
dependabot[bot]
fb870fd491
Bump actions/setup-python from 3 to 5 ( #9195 )
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-10 13:30:46 -07:00
dependabot[bot]
270953bafb
Bump actions/checkout from 3 to 4 ( #9196 )
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-10 13:30:35 -07:00
dependabot[bot]
9cc811c4ff
Bump actions/github-script from 6 to 7 ( #9197 )
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-10 13:30:24 -07:00
youkaichao
e4d652ea3e
[torch.compile] integration with compilation control ( #9058 )
2024-10-10 12:39:36 -07:00
Simon Mo
78c0b4166c
Suggest codeowners for the core components ( #9210 )
2024-10-10 12:29:24 -07:00
jordanyono
21efb603f5
[CI/Build] Make the Dockerfile.cpu file's PIP_EXTRA_INDEX_URL Configurable as a Build Argument ( #9252 )
2024-10-10 18:18:18 +00:00
Rafael Vasquez
055f3270d4
[Doc] Improve debugging documentation ( #9204 )
...
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
2024-10-10 10:48:51 -07:00
Lucas Wilkinson
18511aeda6
[Bugfix] Fix Machete unittests failing with NotImplementedError ( #9218 )
2024-10-10 17:39:56 +00:00
Ilya Lavrenov
83ea5c72b9
[OpenVINO] Use torch 2.4.0 and newer optimum version ( #9121 )
...
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-10-10 11:18:58 -06:00
whyiug
04de9057ab
[Model] support input image embedding for minicpmv ( #9237 )
2024-10-10 15:00:47 +00:00
Isotr0py
07c11cf4d4
[Bugfix] Fix lm_head weights tying with lora for llama ( #9227 )
2024-10-10 21:11:56 +08:00
sroy745
f3a507f1d3
[Core] Add an environment variable which needs to be set explicitly to allow BlockSpaceManagerV1 ( #9149 )
2024-10-10 14:17:17 +08:00
Lucas Wilkinson
a64e7b9407
[Bugfix] Machete garbage results for some models (large K dim) ( #9212 )
2024-10-10 14:16:17 +08:00
Michael Goin
ce00231a8b
[Bugfix] Fix Weight Loading Multiple GPU Test - Large Models ( #9213 )
2024-10-10 14:15:40 +08:00
youkaichao
de895f1697
[misc] improve model support check in another process ( #9208 )
2024-10-09 21:58:27 -07:00
Russell Bryant
cf25b93bdd
[Core] Fix invalid args to _process_request ( #9201 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-10-10 12:10:09 +08:00
Michael Goin
d5fbb8706d
[CI/Build] Update Dockerfile install+deploy image to ubuntu 22.04 ( #9130 )
...
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-10-09 12:51:47 -06:00