Raushan Turganbay
9499e26e2a
[Model] Support VLMs with transformers backend ( #20543 )
...
Signed-off-by: raushan <raushan@huggingface.co>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-07-20 13:25:50 +00:00
Yuxuan Zhang
10eb24cc91
GLM-4 Update ( #20736 )
...
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Lu Fang <fanglu@fb.com>
2025-07-19 22:40:31 +00:00
Woosuk Kwon
752c6ade2e
[V0 Deprecation] Deprecate BlockSparse Attention & Phi3-Small ( #21217 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-07-19 13:53:17 -07:00
Thomas Parnell
881e3cbe3b
[V1] [Hybrid] Enable piecewise CUDA Graph for mamba layers ( #21194 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-07-19 19:27:21 +00:00
Huy Do
cf8cc32674
Fix a couple of Voxtral tests ( #21218 )
...
Signed-off-by: Huy Do <huydhn@gmail.com>
2025-07-19 09:13:41 +00:00
김종곤
3e04107d97
[Model] EXAONE 4.0 model support ( #21060 )
...
Signed-off-by: Deepfocused <rlawhdrhs27@gmail.com>
Signed-off-by: woongsik <rlawhdrhs27@gmail.com>
2025-07-19 14:25:44 +08:00
Woosuk Kwon
dd572c0ab3
[V0 Deprecation] Remove V0 Spec Decode workers ( #21152 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-07-18 21:47:50 -07:00
Cyrus Leung
45badd05d0
[Core] Set pooling params based on task and model ( #21128 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-18 05:41:17 -07:00
wang.yuqi
ca4eb82bcb
[Model] Re-add the implicit conversion feature for as_seq_cls_model ( #21103 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-07-18 07:15:07 +00:00
kYLe
4ef00b5cac
[VLM] Add Nemotron-Nano-VL-8B-V1 support ( #20349 )
...
Signed-off-by: Kyle Huang <kylhuang@nvidia.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-07-17 03:07:55 -07:00
zhiweiz
c11013db8b
[Meta] Llama4 EAGLE Support ( #20591 )
...
Signed-off-by: qizixi <qizixi@meta.com>
Co-authored-by: qizixi <qizixi@meta.com>
2025-07-15 21:14:15 -07:00
Peter Pan
1eb2b9c102
[CI] update typos config for CI pre-commit and fix some spells ( #20919 )
...
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2025-07-15 21:12:40 -07:00
Patrick von Platen
cfbcb9ed87
[Voxtral] Add more tests ( #21010 )
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-07-15 21:11:49 -07:00
Patrick von Platen
e7e3e6d263
Voxtral ( #20970 )
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-07-15 07:35:30 -07:00
Thomas Parnell
3534c39a20
[V1] [Hybrid] Refactor mamba state shape calculation; enable V1 via cli ( #20840 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-07-15 04:04:35 -07:00
ant-yy
38efa28278
[Model] Add Ling implementation ( #20680 )
...
Signed-off-by: vito.yy <vito.yy@antgroup.com>
2025-07-14 22:10:32 +08:00
Cyrus Leung
dcf2a5e208
[CI/Build] Fix OOM issue in Jina-VL test ( #20907 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-14 10:32:35 +00:00
Minkyu Kim
bd4c1e6fdb
Support for LlamaForSequenceClassification ( #20807 )
...
Signed-off-by: thechaos16 <thechaos16@gmail.com>
2025-07-13 00:09:34 -07:00
Congcong Chen
2c11a738b3
[Model] New model support for microsoft/Phi-4-mini-flash-reasoning ( #20702 )
...
Signed-off-by: Congcong Chen <congcongchen@microsoft.com>
2025-07-12 06:02:10 -07:00
Isotr0py
01cae37713
[CI/Build] Ensure compatability with Transformers v4.53 ( #20541 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-07-11 20:53:07 -07:00
yurhett
11c0198615
[Bugfix] Fix tensor parallel issue in Qwen3 reranker weight loading ( #20682 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2025-07-11 20:52:43 -07:00
Isotr0py
0d21b2664c
[Bugfix] Fix OOM in language generation test ( #20814 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-07-11 11:21:52 -07:00
Jee Jee Li
8020e98c9f
[Quantization][1/N] MoE support BNB-Inflight Quantization ( #20061 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-11 08:01:13 +00:00
Michael Goin
922f316441
[Model] Support HF format of minimax ( #20211 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-11 02:55:21 +00:00
shineran96
4bed167768
[Model][VLM] Support JinaVL Reranker ( #20260 )
...
Signed-off-by: shineran96 <shinewang96@gmail.com>
2025-07-10 10:43:43 -07:00
Isotr0py
77f77a951e
[Misc] Clean up mark to fork process in BNB tests ( #20692 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-07-10 13:59:40 +00:00
Jee Jee Li
7571a4a7e5
[CI/Build] Fix Basic Models Test ( #20728 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-10 09:57:19 +00:00
Li, Jiang
e59ba9e142
[CI/Build] Enlarge tolerance for a CPU multi-modal test ( #20684 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-09 17:48:52 +00:00
B-201
6bbf1795b7
[Misc] Fix the size of batched_dummy_mm_inputs in profile_run ( #20434 )
...
Signed-off-by: bk-201 <joy25810@foxmail.com>
2025-07-08 20:15:44 -07:00
wang.yuqi
baba0389f7
[CI] Increase the threshold of the MTEB RERANK tests ( #20615 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-07-08 08:10:11 -07:00
Li, Jiang
7721ef1786
[CI/Build][CPU] Fix CPU CI and remove all CPU V0 files ( #20560 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-07 22:13:44 -07:00
wang.yuqi
110df74332
[Model][Last/4] Automatic conversion of CrossEncoding model ( #19675 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-07-07 14:46:04 +00:00
Cyrus Leung
9fb52e523a
[V1] Support any head size for FlexAttention backend ( #20467 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-06 09:54:36 -07:00
Thomas Parnell
2f35a022e6
Enable V1 for Hybrid SSM/Attention Models ( #20016 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Stanislaw Wozniak <stw@zurich.ibm.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
2025-07-04 17:46:53 +00:00
wang.yuqi
2e26f9156a
[Model][3/N] Automatic conversion of CrossEncoding model ( #20168 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-07-04 05:47:39 -07:00
Jee Jee Li
1caca5a589
[Misc] Add SPDX-FileCopyrightText ( #20428 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-04 07:40:42 +00:00
wang.yuqi
6f1229f91d
[Model][2/N] Automatic conversion of CrossEncoding model ( #19978 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-07-03 13:59:23 +00:00
CSWYF3634076
e303dcf523
[Model] Add Ernie4.5 and Ernie4.5MoE Model Support ( #20220 )
...
Signed-off-by: wangyafeng <wangyafeng@baidu.com>
2025-07-02 03:37:01 -07:00
Kwai-Keye
8452946c06
[Model][VLM] Support Keye-VL-8B-Preview ( #20126 )
...
Signed-off-by: Kwai-Keye <Keye@kuaishou.com>
2025-07-01 23:35:04 -07:00
aiyiwang2025
ecad851cbd
[Model]Add Tencent HunYuanMoEV1 Model Support ( #20114 )
...
Signed-off-by: aiyiwang <aiyiwang@tencent.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: quinnrong <quinnrong@tencent.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-01 07:28:13 -07:00
Yuxuan Zhang
ed70f3c64f
Add GLM4.1V model (Draft) ( #19331 )
...
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-07-01 12:48:26 +00:00
Li, Jiang
6cc1e7d96d
[CPU] Update custom ops for the CPU backend ( #20255 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-01 07:25:03 +00:00
redmoe-moutain
65b1cbb138
[Model] support dots1 ( #18254 )
...
Signed-off-by: redmoe-moutain <agiredmoe@gmail.com>
2025-06-29 19:34:36 -07:00
Stan Wozniak
daec9dea6e
[Bugfix] Correct behavior of GraniteMoeHybrid for TensorParallel execution ( #20137 )
...
Signed-off-by: Stanislaw Wozniak <stw@zurich.ibm.com>
2025-06-28 08:16:41 -07:00
Thomas Parnell
8615d9776f
[CI/Build] Add new CI job to validate Hybrid Models for every PR ( #20147 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-06-27 23:00:25 -07:00
Michael Goin
c329ceca6d
[CI Fix] Pin tests/models/registry.py MiniMaxText01ForCausalLM to revision due to model changes ( #20199 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-06-28 13:43:06 +08:00
Robert Shaw
d1c956dc0f
Gemma3n (Text-only) ( #20134 )
...
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: Roger Wang <hey@rogerw.me>
2025-06-27 07:16:26 +00:00
wang.yuqi
cd4cfee689
[Model][1/N] Automatic conversion of CrossEncoding model ( #20012 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-06-26 21:10:04 -07:00
Thomas Parnell
e110930680
[Fix] Fix gemma CI test failing on main ( #20124 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-06-26 21:06:59 -07:00
Michael Goin
71799fd005
[CI Failure] Fix OOM with test_oot_registration_embedding ( #20144 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-06-27 11:21:04 +08:00