Cyrus Leung
|
18572e3384
|
[Bugfix] Fix HfExampleModels.find_hf_info (#12223)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-20 15:35:36 +00:00 |
|
wangxiyuan
|
86bfb6dba7
|
[Misc] Pass attention to impl backend (#12218)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2025-01-20 23:25:28 +08:00 |
|
Chen Zhang
|
5f0ec3935a
|
[V1] Remove _get_cache_block_size (#12214)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-01-20 21:54:16 +08:00 |
|
youkaichao
|
c222f47992
|
[core][bugfix] configure env var during import vllm (#12209)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-20 19:35:59 +08:00 |
|
youkaichao
|
170eb35079
|
[misc] print a message to suggest how to bypass commit hooks (#12217)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-20 18:06:24 +08:00 |
|
Cyrus Leung
|
b37d82791e
|
[Model] Upgrade Aria to transformers 4.48 (#12203)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-20 17:58:48 +08:00 |
|
Cyrus Leung
|
3127e975fb
|
[CI/Build] Make pre-commit faster (#12212)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-20 17:36:24 +08:00 |
|
Cyrus Leung
|
4001ea1266
|
[CI/Build] Remove dummy CI steps (#12208)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-20 16:41:57 +08:00 |
|
youkaichao
|
5c89a29c22
|
[misc] add placeholder format.sh (#12206)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-20 16:04:49 +08:00 |
|
Cyrus Leung
|
59a0192fb9
|
[Core] Interface for accessing model from VllmRunner (#10353)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-20 15:00:59 +08:00 |
|
Isotr0py
|
83609791d2
|
[Model] Add Qwen2 PRM model support (#12202)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-01-20 14:59:46 +08:00 |
|
Yuan Tang
|
0974c9bc5c
|
[Bugfix] Fix incorrect types in LayerwiseProfileResults (#12196)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2025-01-20 14:59:20 +08:00 |
|
Yuan Tang
|
d2643128f7
|
[DOC] Add missing docstring in LLMEngine.add_request() (#12195)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2025-01-20 14:59:00 +08:00 |
|
Yuan Tang
|
c5c06209ec
|
[DOC] Fix typo in docstring and assert message (#12194)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2025-01-20 14:58:29 +08:00 |
|
Harry Mellor
|
3ea7b94523
|
Move linting to pre-commit (#11975)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-20 14:58:01 +08:00 |
|
youkaichao
|
51ef828f10
|
[torch.compile] fix sym_tensor_indices (#12191)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-20 11:37:50 +08:00 |
|
shangmingc
|
df450aa567
|
[Bugfix] Fix num_heads value for simple connector when tp enabled (#12074)
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
|
2025-01-20 02:56:43 +00:00 |
|
Martin Gleize
|
bbe5f9de7d
|
[Model] Support for fairseq2 Llama (#11442)
Signed-off-by: Martin Gleize <mgleize@meta.com>
Co-authored-by: mgleize user <mgleize@a100-st-p4de24xlarge-4.fair-a100.hpcaas>
|
2025-01-19 10:40:40 -08:00 |
|
Roger Wang
|
81763c58a0
|
[V1] Add V1 support of Qwen2-VL (#12128)
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: imkero <kerorek@outlook.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-19 19:52:13 +08:00 |
|
Isotr0py
|
edaae198e7
|
[Misc] Add BNB support to GLM4-V model (#12184)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-01-19 19:49:22 +08:00 |
|
gujing
|
936db119ed
|
benchmark_serving support --served-model-name param (#12109)
Signed-off-by: zibai <zibai.gj@alibaba-inc.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
|
2025-01-19 09:59:56 +00:00 |
|
youkaichao
|
e66faf4809
|
[torch.compile] store inductor compiled Python file (#12182)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-19 16:27:26 +08:00 |
|
Cyrus Leung
|
630eb5b5ce
|
[Bugfix] Fix multi-modal processors for transformers 4.48 (#12187)
|
2025-01-18 19:16:34 -08:00 |
|
Michal Adamczyk
|
4e94951bb1
|
[BUGFIX] Move scores to float32 in case of running xgrammar on cpu (#12152)
Signed-off-by: Michal Adamczyk <madamczyk@habana.ai>
|
2025-01-19 11:12:05 +08:00 |
|
Simon Mo
|
7a8a48d51e
|
[V1] Collect env var for usage stats (#12115)
|
2025-01-19 03:07:15 +00:00 |
|
yancong
|
32eb0da808
|
[Misc] Support register quantization method out-of-tree (#11969)
|
2025-01-18 16:13:16 -08:00 |
|
youkaichao
|
6d0e3d3724
|
[core] clean up executor class hierarchy between v1 and v0 (#12171)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-18 14:35:15 +08:00 |
|
Isotr0py
|
02798ecabe
|
[Model] Port deepseek-vl2 processor, remove dependency (#12169)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-01-18 13:59:39 +08:00 |
|
Russell Bryant
|
813f249f02
|
[Docs] Fix broken link in SECURITY.md (#12175)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-01-18 04:35:21 +00:00 |
|
youkaichao
|
da02cb4b27
|
[core] further polish memory profiling (#12126)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-18 12:25:08 +08:00 |
|
Hongxia Yang
|
c09503ddd6
|
[AMD][CI/Build][Bugfix] use pytorch stale wheel (#12172)
Signed-off-by: hongxyan <hongxyan@amd.com>
|
2025-01-18 11:15:53 +08:00 |
|
youkaichao
|
2b83503227
|
[misc] fix cross-node TP (#12166)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-18 10:53:27 +08:00 |
|
youkaichao
|
7b98a65ae6
|
[torch.compile] disable logging when cache is disabled (#12043)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-17 20:29:31 +00:00 |
|
Gregory Shtrasberg
|
b5b57e301e
|
[AMD][FP8] Using MI300 FP8 format on ROCm for block_quant (#12134)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
|
2025-01-17 17:12:26 +00:00 |
|
Kunshang Ji
|
54cacf008f
|
[Bugfix] Mistral tokenizer encode accept list of str (#12149)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-01-17 16:47:53 +00:00 |
|
Wallas Henrique
|
58fd57ff1d
|
[Bugfix] Fix score api for missing max_model_len validation (#12119)
Signed-off-by: Wallas Santos <wallashss@ibm.com>
|
2025-01-17 16:24:22 +00:00 |
|
youkaichao
|
87a0c076af
|
[core] allow callable in collective_rpc (#12151)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-17 20:47:01 +08:00 |
|
Li, Jiang
|
d4e6194570
|
[CI/Build][CPU][Bugfix] Fix CPU CI (#12150)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-01-17 19:39:52 +08:00 |
|
Jee Jee Li
|
07934cc237
|
[Misc][LoRA] Improve the readability of LoRA error messages (#12102)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-01-17 19:32:28 +08:00 |
|
Chen Zhang
|
69d765f5a5
|
[V1] Move more control of kv cache initialization from model_executor to EngineCore (#11960)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
|
2025-01-17 07:39:35 +00:00 |
|
Divakar Verma
|
8027a72461
|
[ROCm][MoE] moe tuning support for rocm (#12049)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
|
2025-01-17 14:49:16 +08:00 |
|
Isotr0py
|
d75ab55f10
|
[Misc] Add deepseek_vl2 chat template (#12143)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-01-17 06:34:48 +00:00 |
|
Chen Zhang
|
d1adb9b403
|
[BugFix] add more is not None check in VllmConfig.__post_init__ (#12138)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-01-17 05:33:22 +00:00 |
|
Yuan Tang
|
b8bfa46a18
|
[Bugfix] Fix issues in CPU build Dockerfile (#12135)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2025-01-17 12:54:01 +08:00 |
|
Yuan Tang
|
1475847a14
|
[Doc] Add instructions on using Podman when SELinux is active (#12136)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2025-01-17 04:45:36 +00:00 |
|
Kunshang Ji
|
fead53ba78
|
[CI]add genai-perf benchmark in nightly benchmark (#10704)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-01-17 04:15:09 +00:00 |
|
Kuntai Du
|
ebc73f2828
|
[Bugfix] Fix a path bug in disaggregated prefill example script. (#12121)
Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
|
2025-01-17 11:12:41 +08:00 |
|
Chen Zhang
|
d06e824006
|
[Bugfix] Set enforce_eager automatically for mllama (#12127)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-01-16 15:30:08 -05:00 |
|
Isotr0py
|
62b06ba23d
|
[Model] Add support for deepseek-vl2-tiny model (#12068)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-01-16 17:14:48 +00:00 |
|
Varun Sundar Rabindranath
|
5fd24ec02e
|
[misc] Add LoRA kernel micro benchmarks (#11579)
|
2025-01-16 15:51:40 +00:00 |
|