Cyrus Leung
a84e598e21
[CI/Build] Reorganize models tests ( #7820 )
2024-09-13 10:20:06 -07:00
Cyrus Leung
8427550488
[CI/Build] Update pixtral tests to use JSON ( #8436 )
2024-09-13 03:47:52 +00:00
Patrick von Platen
d31174a4e1
[Hotfix][Pixtral] Fix multiple images bugs ( #8415 )
2024-09-12 15:21:51 -07:00
Alex Brooks
c6202daeed
[Model] Support multiple images for qwen-vl ( #8247 )
...
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-09-12 10:10:54 -07:00
Isotr0py
e56bf27741
[Bugfix] Fix InternVL2 inference with various num_patches ( #8375 )
...
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-09-12 10:10:35 -07:00
Patrick von Platen
d394787e52
Pixtral ( #8377 )
...
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-09-11 14:41:55 -07:00
bnellnm
73202dbe77
[Kernel][Misc] register ops to prevent graph breaks ( #6917 )
...
Co-authored-by: Sage Moore <sage@neuralmagic.com>
2024-09-11 12:52:19 -07:00
Yang Fan
3b7fea770f
[Model][VLM] Add Qwen2-VL model support ( #7905 )
...
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-09-11 09:31:19 -07:00
Yangshen⚡Deng
6a512a00df
[model] Support for Llava-Next-Video model ( #7559 )
...
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-09-10 22:21:36 -07:00
Pavani Majety
efcf946a15
[Hardware][NV] Add support for ModelOpt static scaling checkpoints. ( #6112 )
2024-09-11 00:38:40 -04:00
Isotr0py
e807125936
[Model][VLM] Support multi-images inputs for InternVL2 models ( #8201 )
2024-09-07 16:38:23 +08:00
Cyrus Leung
2f707fcb35
[Model] Multi-input support for LLaVA ( #8238 )
2024-09-07 02:57:24 +00:00
Patrick von Platen
29f49cd6e3
[Model] Allow loading from original Mistral format ( #8168 )
...
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-09-06 17:02:05 -06:00
Alex Brooks
9da25a88aa
[MODEL] Qwen Multimodal Support (Qwen-VL / Qwen-VL-Chat) ( #8029 )
...
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-09-05 12:48:10 +00:00
Cody Yu
2ad2e5608e
[MISC] Consolidate FP8 kv-cache tests ( #8131 )
2024-09-04 18:53:25 +00:00
Peter Salas
2be8ec6e71
[Model] Add Ultravox support for multiple audio chunks ( #7963 )
2024-09-04 04:38:21 +00:00
Shawn Tan
f8d60145b4
[Model] Add Granite model ( #7436 )
...
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
2024-09-01 18:37:18 -07:00
Roger Wang
5b86b19954
[Misc] Optional installation of audio related packages ( #8063 )
2024-09-01 14:46:57 -07:00
Pavani Majety
622f8abff8
[Bugfix] bugfix and add model test for flashinfer fp8 kv cache. ( #8013 )
2024-08-30 22:18:50 -07:00
Wenxiang
1248e8506a
[Model] Adding support for MSFT Phi-3.5-MoE ( #7729 )
...
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Zeqi Lin <zelin@microsoft.com>
Co-authored-by: Zeqi Lin <Zeqi.Lin@microsoft.com>
2024-08-30 13:42:57 -06:00
Jungho Christopher Cho
f97be32d1d
[VLM][Model] TP support for ViTs ( #7186 )
...
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-08-30 08:19:27 -07:00
afeldman-nm
428dd1445e
[Core] Logprobs support in Multi-step ( #7652 )
2024-08-29 19:19:08 -07:00
Cyrus Leung
4abed65c58
[VLM] Disallow overflowing max_model_len for multimodal models ( #7998 )
2024-08-29 17:49:04 -07:00
zifeitong
5340a2dccf
[Model] Add multi-image input support for LLaVA-Next offline inference ( #7230 )
2024-08-28 07:09:02 +08:00
Isotr0py
9db642138b
[CI/Build][VLM] Cleanup multiple images inputs model test ( #7897 )
2024-08-27 15:28:30 +00:00
Patrick von Platen
6fc4e6e07a
[Model] Add Mistral Tokenization to improve robustness and chat encoding ( #7739 )
2024-08-27 12:40:02 +00:00
Isotr0py
2059b8d9ca
[Misc] Remove snapshot_download usage in InternVL2 test ( #7835 )
2024-08-25 15:53:09 +00:00
Isotr0py
8aaf3d5347
[Model][VLM] Support multi-images inputs for Phi-3-vision models ( #7783 )
2024-08-25 11:51:20 +00:00
zifeitong
80162c44b1
[Bugfix] Fix Phi-3v crash when input images are of certain sizes ( #7840 )
2024-08-24 18:16:24 -07:00
zifeitong
df1a21131d
[Model] Fix Phi-3.5-vision-instruct 'num_crops' issue ( #7710 )
2024-08-22 09:36:24 +08:00
Peter Salas
1ca0d4f86b
[Model] Add UltravoxModel and UltravoxConfig ( #7615 )
2024-08-21 22:49:39 +00:00
Isotr0py
12e1c65bc9
[Model] Add AWQ quantization support for InternVL2 model ( #7187 )
2024-08-20 23:18:57 -07:00
Isotr0py
aae6927be0
[VLM][Model] Add test for InternViT vision encoder ( #7409 )
2024-08-20 23:10:20 +08:00
Isotr0py
7601cb044d
[Core] Support tensor parallelism for GGUF quantization ( #7520 )
2024-08-19 17:30:14 -04:00
Alex Brooks
40e1360bb6
[CI/Build] Add text-only test for Qwen models ( #7475 )
...
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
2024-08-19 07:43:46 +08:00
Mor Zusman
7fc23be81c
[Kernel] W8A16 Int8 inside FusedMoE ( #7415 )
2024-08-16 10:06:51 -07:00
Cyrus Leung
3f674a49b5
[VLM][Core] Support profiling with multiple multi-modal inputs per prompt ( #7126 )
2024-08-14 17:55:42 +00:00
youkaichao
ea49e6a3c8
[misc][ci] fix cpu test with plugins ( #7489 )
2024-08-13 19:27:46 -07:00
youkaichao
16422ea76f
[misc][plugin] add plugin system implementation ( #7426 )
2024-08-13 16:24:17 -07:00
Cyrus Leung
7025b11d94
[Bugfix] Fix weight loading for Chameleon when TP>1 ( #7410 )
2024-08-13 05:33:41 +00:00
Roger Wang
e6e42e4b17
[Core][VLM] Support image embeddings as input ( #6613 )
2024-08-12 16:16:06 +08:00
Isotr0py
4c5d8e8ea9
[Bugfix] Fix phi3v batch inference when images have different aspect ratio ( #7392 )
2024-08-10 16:19:33 +00:00
Cyrus Leung
7eb4a51c5f
[Core] Support serving encoder/decoder models ( #7258 )
2024-08-09 10:39:41 +08:00
Isotr0py
b764547616
[Bugfix] Fix input processor for InternVL2 model ( #7164 )
...
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-08-07 09:32:07 -07:00
afeldman-nm
fd95e026e0
[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) ( #4942 )
...
Co-authored-by: Andrew Feldman <afeld2012@gmail.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
2024-08-06 16:51:47 -04:00
Cyrus Leung
1f26efbb3a
[Model] Support SigLIP encoder and alternative decoders for LLaVA models ( #7153 )
...
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
2024-08-06 16:55:31 +08:00
Isotr0py
360bd67cf0
[Core] Support loading GGUF model ( #5191 )
...
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-08-05 17:54:23 -06:00
Alphi
7b86e7c9cd
[Model] Add multi-image support for minicpmv ( #7122 )
...
Co-authored-by: hezhihui <hzh7269@modelbest.cn>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-08-05 09:23:17 +08:00
Michael Goin
fb3db61688
[CI/Build] Remove sparseml requirement from testing ( #7037 )
2024-08-01 12:00:51 -07:00
Cyrus Leung
daed30c4a9
[Bugfix] Fix feature size calculation for LLaVA-NeXT ( #6982 )
2024-07-31 23:46:17 +08:00