1361 Commits

Author SHA1 Message Date
Cyrus Leung
e2db1164a1
[Model] Enable BLOOM on V1 (#23488)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-24 13:30:47 +00:00
汪志鹏
416f05929a
[New Model]Donut model (#23229)
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
2025-08-24 12:52:24 +00:00
Benji Beck
a75277285b
Migrate Paligemma inputs to TensorSchema (#23470)
Signed-off-by: Benji Beck <benjibeck@meta.com>
2025-08-24 04:56:56 +00:00
Benji Beck
053278a5dc
Migrate Pixtral inputs to TensorSchema (#23472)
Signed-off-by: Benji Beck <benjibeck@meta.com>
2025-08-24 04:55:53 +00:00
WeiQing Chen
23c939fd30
[Model] Support DP for ViT on MiniCPM-V-4 (#23327)
Signed-off-by: ycyaw66 <497410282@qq.com>
Co-authored-by: ycyaw66 <497410282@qq.com>
2025-08-23 02:14:41 +00:00
Isotr0py
cd7a3df26f
[Bugfix] Fix broken Florence-2 model (#23426)
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: 汪志鹏 <wangzhipeng628@gmail.com>
2025-08-22 17:50:52 +00:00
Isotr0py
32d2b4064f
[Model] Add Ovis2.5 PP support (#23405)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-22 17:46:34 +00:00
bppps
424fb7a5d2
[BugFix] Fix the issue where image embeddings were incorrectly split.… (#23366)
Signed-off-by: bppps <bpppsaka@gmail.com>
Co-authored-by: zouyu.zzx <zouyu.zzx@alibaba-inc.com>
Co-authored-by: bppps <bpppsaka@gmail.com>
2025-08-22 16:56:46 +00:00
PapaGoose
88491c1b6b
[Speculators][Speculative Decoding] Fix Qwen 2 Eagle3 Support (#23337) 2025-08-22 16:39:19 +00:00
Benji Beck
998720859c
Migrate MiniCPMOAudioInputs to TensorSchema (#21847)
Signed-off-by: Benji Beck <benjibeck@meta.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-08-22 16:43:29 +08:00
Chen Zhang
17373dcd93
[Attention] Refactor AttentionMetadata Preparation for Encoder-only Models (#23154)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-08-22 05:05:59 +00:00
Bin Jia
5964069367
[New Model] Add Seed-Oss model (#23241)
Signed-off-by: jiabin.00 <jiabin.00@bytedance.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-08-22 04:58:10 +00:00
Benji Beck
0b9cc56fac
Migrate MllamaImagePixelInputs to TensorSchema (#22020)
Signed-off-by: Benji Beck <benjibeck@meta.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-08-22 11:28:49 +08:00
Benji Beck
a482e4e769
Migrate MolmoImageInputs to TensorSchema (#22022)
Signed-off-by: Benji Beck <benjibeck@meta.com>
2025-08-21 16:54:08 +00:00
Roger Wang
79f05e4436
[Multimodal] Always enable hashing mm data (#23308)
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-21 07:23:28 -07:00
wang.yuqi
d70a16625d
[Performance] V1 Pooling Models E2E Performance Optimization (#23162)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-08-21 13:26:09 +00:00
Paul Pak
2e2000f352
[Model] Add LFM2 architecture (#22845)
Signed-off-by: Paul Pak <paulpak58@gmail.com>
2025-08-21 09:35:07 +02:00
Cyrus Leung
0c31e28e95
[Bugfix] Fix extra whitespace in strings caused by newline (#23272)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-20 22:03:00 -07:00
杨奇(yann qi)
655a09f653
[Model][VLM] Support R-4B Model (#23246)
Signed-off-by: yannqi <yannqi@qq.com>
Signed-off-by: 杨奇(yann qi) <51905299+yannqi@users.noreply.github.com>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: yannqiyang <yannqiyang@tencent.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-08-21 04:08:52 +00:00
Asaf Joseph Gardin
3663870c72
[V1][Mamba1] - Full CUDA and Piecewise CUDA Graphs Support (#23035)
Signed-off-by: asafg <asafg@ai21.com>
Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>
Co-authored-by: asafg <asafg@ai21.com>
2025-08-20 20:08:51 -07:00
Cyrus Leung
2461d9e562
[CI/Build] Split out mm processor tests (#23260)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-20 20:05:20 -07:00
rongfu.leng
4fbda0b20c
[Feature] use --eplb_config to set eplb param (#20562)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-20 14:07:28 -07:00
dongluw
ebe56a0064
Small fix for Command-A-Vision (#23268)
Signed-off-by: donglu <donglu@cohere.com>
2025-08-20 18:15:18 +00:00
Benji Beck
c4477f55e5
Migrate Mistral3ImagePixelInputs to TensorSchema (#21945)
Signed-off-by: Benji Beck <benjibeck@meta.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-08-20 17:37:29 +00:00
Cyrus Leung
5efd6905bc
[CLI][Doc] Formalize --mm-encoder-tp-mode (#23190)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-20 23:42:28 +08:00
Cyrus Leung
4449235843
[Bugfix] Ensure correctness of HCXVision processing (#23254)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-20 14:19:30 +00:00
Jee Jee Li
c6d80a7a96
[Model] Improve olmo and olmo2 (#23228)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-08-20 12:47:05 +00:00
xyxinyang
7cd17e22d7
[Model][V1] Support Ernie MTP (#22169)
Signed-off-by: zhouchong <zhouchong03@baidu.com>
Co-authored-by: zhouchong <zhouchong03@baidu.com>
2025-08-20 20:41:55 +08:00
Cyrus Leung
68fcd3fa73
[Bugfix] Ensure correctness of Cohere2Vision processing (#23245)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-20 11:09:18 +00:00
Xin Yang
83e69a09d6
[Model] Support deepseek with eagle (#21086)
Signed-off-by: Xin Yang <xyangx@amazon.com>
2025-08-20 19:01:31 +08:00
Calvin Chen
103f1ec8d3
[Model] use autoWeightsLoader for gptoss (#22446)
Signed-off-by: calvin chen <wen.chen@dynamia.ai>
2025-08-20 10:16:27 +00:00
Cyrus Leung
64ab3c7253
[Doc] Update V1 status of various pooling models (#23189)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-20 10:33:41 +08:00
Benji Beck
a70d0bd0a3
Migrate LlavaOnevisionMultiInputs to TensorSchema (#21844)
Signed-off-by: Benji Beck <benjibeck@meta.com>
2025-08-19 17:02:02 +00:00
yiz-liu
4f510bc2a1
[Model] Removes redundant all-reduce operation in Qwen3MoeSparseMoeBlock (#23169)
Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
2025-08-19 16:18:41 +00:00
TJian
1298c67795
[FEAT] [Performance] Enable DP for ViT in Qwen2.5VL (#22742)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-08-19 15:25:57 +00:00
myselvess
b87cb97a53
[Model] support new model ovis2.5 (#23084)
Signed-off-by: myselvess <244285088@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-19 13:12:59 +00:00
qizixi
5bfe0dea7a
[bug fix] Fix llama4 spec decoding (#22691)
Signed-off-by: qizixi <qizixi@meta.com>
Co-authored-by: Lu Fang <30275821+houseroad@users.noreply.github.com>
2025-08-19 08:53:24 +00:00
Isotr0py
31fd3265c8
[Bugfix] Fix broken Minimax-01-VL model (#22116)
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-19 08:49:29 +00:00
qizixi
4efd43e9b4
Fix GLM-4.5V-FP8 numerical issue (#22949)
Signed-off-by: qizixi <qizixi@meta.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-08-19 07:56:31 +00:00
Jiangyun Zhu
fda9537c5e
[Model] Support Pipeline Parallelism for moonshotai/Kimi-VL-A3B-Thinking-2506 (#23114)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-08-19 14:24:31 +08:00
Benji Beck
e75f342261
Migrate InternVLImagePixelInputs (in nemotron_vl.py) to TensorSchema (#22023)
Signed-off-by: Benji Beck <benjibeck@meta.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-08-19 13:48:26 +08:00
Raushan Turganbay
0e3bb543f0
[Bugfix] Support compile for Transformers multimodal (#23095)
Signed-off-by: raushan <raushan@huggingface.co>
2025-08-18 13:35:48 +00:00
Cyrus Leung
d3f71f1224
[Refactor] Get prompt updates earlier (#23097)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-18 12:31:53 +00:00
Cyrus Leung
27e8d1ea3e
[Refactor] Define MultiModalKwargsItems separate from MultiModalKwargs (#23053)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-18 09:52:00 +00:00
double7
9f1c642254
[Bugfix] fix Qwen2.5-Omni processor output mapping (#23058)
Signed-off-by: double7 <33449816+DoubleVII@users.noreply.github.com>
Co-authored-by: 杨森 <yangsen.double7@bytedance.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-08-17 22:09:11 -07:00
Woosuk Kwon
c55bc1db26
[Misc] Remove dead return (#23061)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-08-17 10:36:46 -07:00
947132885
fe0411fc6f
[Bugfix] should use stack instead of concat (#22972)
Signed-off-by: 947132885 <947132885@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-17 08:46:36 +00:00
Jee Jee Li
4d4061b6e7
[Kernel] Add cuda kernel for gpt_oss activation (#22951)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-08-17 05:03:24 +00:00
Cyrus Leung
5c32143b9d
[Refactor] Defer tensor data construction in MultiModalKwargs (#23030)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-16 21:05:50 -07:00
汪志鹏
829bbd7882
[New Model]mBART model (#22883)
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
2025-08-16 12:16:58 +00:00