160 Commits

Author SHA1 Message Date
Cyrus Leung
909fdaf152
[Bugfix] Fix standard models tests (#17217)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-26 02:26:41 -07:00
Cyrus Leung
43faa0461a
[Bugfix] Fix hybrid model tests (#17182)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-25 15:14:37 -07:00
Harry Mellor
0bd7f8fca5
Bump Transformers to 4.51.3 (#17116)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-25 08:34:34 -07:00
Cyrus Leung
19dcc02a72
[Bugfix] Fix mistral model tests (#17181)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-25 06:03:34 -07:00
Isotr0py
83f3c3bd91
[Model] Refactor Phi-4-multimodal to use merged processor and support V1 (#15477)
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-19 02:26:11 -07:00
Yang Fan
2c1bd848a6
[Model][VLM] Add Qwen2.5-Omni model support (thinker only) (#15130)
Signed-off-by: fyabc <suyang.fy@alibaba-inc.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Xiong Wang <wangxiongts@163.com>
2025-04-18 23:14:36 -07:00
Harry Mellor
686623c5e7
Fix nullable_kvs fallback (#16837)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-18 05:58:39 -07:00
Harry Mellor
e78587a64c
Improve-mm-and-pooler-and-decoding-configs (#16789)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-17 22:13:32 -07:00
Shinichi Hemmi
3badb0213b
[Model] Add PLaMo2 (#14323)
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com>
Signed-off-by: shemmi <shemmi@preferred.jp>
Co-authored-by: Kento Nozawa <nzw0301@preferred.jp>
Co-authored-by: Hiroaki Mikami <mhiroaki@preferred.jp>
Co-authored-by: Calvin Metzger <metzger@preferred.jp>
2025-04-15 19:31:30 -07:00
courage17340
b1308b84a3
[Model][VLM] Add Kimi-VL model support (#16387)
Signed-off-by: courage17340 <courage17340@163.com>
2025-04-14 21:41:48 +00:00
Cyrus Leung
d9fc8cd9da
[V1] Enable multi-input by default (#15799)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-12 08:52:39 +00:00
Isotr0py
93195146ea
[Bugfix][VLM] Fix failing Phi-4-MM multi-images tests and add vision-speech test (#16424)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-04-11 04:57:16 +00:00
Cyrus Leung
3d4c87758e
[Misc] Update transformers version limits of multi-modal tests (#16381)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-09 23:03:33 -07:00
Chauncey
102bf967f0
[Model] Add smolvlm support (#16017)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-04-08 19:12:17 -07:00
Isotr0py
f6b32efb7f
[Bugfix] Fix and reorganize broken GGUF tests and bump gguf version (#16194)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-04-08 13:38:13 +08:00
Roger Wang
f2ebb6f541
[V1] Scatter and gather placeholders in the model runner (#16076)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Jennifer Zhao <ai.jenniferzhao@gmail.com>
2025-04-08 10:43:41 +08:00
Lu Fang
55dcce91df
Upstream Llama4 Support to Main (#16113)
Signed-off-by: Aston Zhang <22279212+astonzhang@users.noreply.github.com>
Signed-off-by: Chris Thi <chris.c.thi@gmail.com>
Signed-off-by: drisspg <drisspguessous@gmail.com>
Signed-off-by: Jon Swenson <jmswen@gmail.com>
Signed-off-by: Keyun Tong <tongkeyun@gmail.com>
Signed-off-by: Lu Fang <fanglu@meta.com>
Signed-off-by: Xiaodong Wang <xdwang@meta.com>
Signed-off-by: Yang Chen <yangche@fb.com>
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Signed-off-by: Lu Fang <lufang@fb.com>
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: Lucia Fang <fanglu@fb.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-07 08:06:27 -07:00
Roger Wang
af51d80fa1
Revert "[V1] Scatter and gather placeholders in the model runner" (#16075) 2025-04-04 14:50:57 -07:00
Cyrus Leung
f5722a5052
[V1] Scatter and gather placeholders in the model runner (#15712)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-04-04 21:26:44 +00:00
Jennifer Zhao
38327cf454
[Model] Aya Vision (#15441)
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-04-01 16:30:43 +00:00
Cyrus Leung
09e974d483
[Bugfix] Check dimensions of multimodal embeddings in V1 (#15816)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-31 09:01:35 -07:00
Harry Mellor
e5ef4fa99a
Upgrade transformers to v4.50.3 (#13905)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-31 08:59:37 -07:00
Cyrus Leung
803d5c35f3
[V1] Override mm_counts for dummy data creation (#15703)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-30 03:20:42 -07:00
Roger Wang
c67abd614f
[V1] Support interleaved modality items (#15605)
Signed-off-by: Roger Wang <ywang@roblox.com>
2025-03-29 06:30:09 -07:00
pengyuange
de1cb38769
[Model] Support Skywork-R1V (#15397)
Signed-off-by: jiacai.liu <932997367@qq.com>
Co-authored-by: jiacai.liu <932997367@qq.com>
2025-03-28 20:39:21 -07:00
vllmellm
5ebf66748b
[FEAT][ROCm] Integrate Fused MoE Kernels from AITER (#14967)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-03-26 16:30:30 +08:00
Cyrus Leung
997c8811d6
[Model] Support multi-image for Molmo (#15438)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-26 11:26:33 +08:00
Cyrus Leung
a9e879b316
[Misc] Clean up MiniCPM-V/O code (#15337)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-25 10:22:52 +00:00
TJian
ec870fba9a
[FEAT] [ROCm]: Add AITER RMS Norm (Layer Norm) Feature (#14959)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-03-21 22:36:14 -07:00
Isotr0py
1e508343e1
[Bugfix] Fix incorrect qwen2.5-vl attention mask pre-computation (#15200)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-03-20 19:18:04 -07:00
Cyrus Leung
f690372b68
[Core] Update dtype detection and defaults (#14858)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-19 13:49:33 +08:00
Jee Jee Li
46c759c165
[Bugfix] Fix LoRA extra vocab size (#15047)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-03-18 09:40:29 -07:00
yury-tokpanov
452e8fd968
[MODEL] Add support for Zamba2 models (#13185)
Signed-off-by: Yury Tokpanov <yury@zyphra.com>
Signed-off-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-03-18 08:56:21 -07:00
Patrick von Platen
f863ffc965
[Mistral-Small 3.1] Update docs and tests (#14977)
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-03-18 03:29:42 -07:00
Cyrus Leung
b89fb2a4a1
[CI/Build] Use AutoModelForImageTextToText to load VLMs in tests (#14945)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-17 18:35:17 +00:00
vllmellm
2bb0e1a799
[Bugfix][ROCm] running new process using spawn method for rocm in tests. (#14810)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-03-17 11:33:35 +00:00
Sibi
a73e183e36
[Misc] Replace os environ to monkeypatch in test suite (#14516)
Signed-off-by: sibi <85477603+t-sibiraj@users.noreply.github.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Aaron Pham <contact@aarnphm.xyz>
2025-03-16 20:35:57 -07:00
Robert Shaw
bb3aeddfaf
[CI] Nightly Tests (#14898)
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
2025-03-17 02:06:43 +00:00
Isotr0py
def232e122
[VLM] Clean up Phi-4-MM ViT implementation (#14812)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-03-15 18:53:52 -07:00
Li, Jiang
a2ae496589
[CPU] Support FP8 KV cache (#14741)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-03-14 22:07:36 -07:00
Robert Shaw
d4d93db2c5
[V1] V1 Enablement Oracle (#13726)
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2025-03-14 22:02:20 -07:00
Cyrus Leung
613c5bb945
[Bugfix] Fix Aria test loading (#14823)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-14 09:11:23 -07:00
Cyrus Leung
f53a0586b9
[Bugfix] Fix prompt format of GLM4V (#14539)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-13 11:37:17 +00:00
Cyrus Leung
382403921f
[VLM] Support pan-and-scan for Gemma3 multi-modal processor (#14672)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-03-13 02:23:12 -07:00
Pavani Majety
debd6bbf09
[Kernel] Add ModelOpt FP4 Checkpoint Support (#12520)
Signed-off-by: Pavani Majety <pmajety@nvidia.com>
2025-03-12 05:13:11 +00:00
Farzad Abdolhosseini
80e78d02ac
[Model] Extend Ultravox to accept audio longer than 30s (#13631)
Signed-off-by: Farzad Abdolhosseini <farzad@fixie.ai>
2025-03-12 10:27:10 +08:00
kYLe
1769928079
[Model] Update Paligemma multimodal processing with PromptUpdate (#14015)
Signed-off-by: Kyle Huang <kylhuang@nvidia.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-03-06 08:31:38 +00:00
Harry Mellor
cf069aa8aa
Update deprecated Python 3.8 typing (#13971) 2025-03-02 17:34:51 -08:00
Isotr0py
edf309ebbe
[VLM] Support multimodal inputs for Florence-2 models (#13320) 2025-02-27 02:06:41 -08:00
Tyler Michael Smith
1f69c4a892
[Model] Support Mamba2 (Codestral Mamba) (#9292)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
2025-02-17 20:17:50 +08:00