Yong Hoon Shin
cb293f6a79
[V1] Enable prefill optimization for Gemma3n ( #22628 )
...
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
2025-08-28 14:54:30 -07:00
Harry Mellor
c49848396d
Refactor sliding window configuration to Transformers best practice ( #21927 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-09 20:50:48 -07:00
Nicolò Lucchesi
5a16fa614c
[Model] Gemma3n MM ( #20495 )
...
Signed-off-by: ShriKode <shrikode@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: ShriKode <shrikode@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.me>
2025-08-09 09:56:25 -07:00
Kyle Sayers
0f46a780d4
[Model] [Quantization] Support quantization for Gemma3n ( #21974 )
...
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-07-31 22:45:15 -07:00
Yong Hoon Shin
ad510309ee
Override attention metadata for fast prefill in some KV sharing setups ( #21590 )
...
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
2025-07-30 08:54:15 -07:00
Areeb Syed
fdde18229e
[Bugfix] Fix shape mismatch assertion error when loading Gemma3n model with BitsAndBytes quantization ( #21808 )
...
Signed-off-by: sydarb <areebsyed237@gmail.com>
2025-07-30 11:35:21 +08:00
Yong Hoon Shin
9266d98048
[BugFix] Fix interleaved sliding window not set for Gemma3n ( #21863 )
...
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
2025-07-29 16:34:19 -07:00
Robert Shaw
d1c956dc0f
Gemma3n (Text-only) ( #20134 )
...
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: Roger Wang <hey@rogerw.me>
2025-06-27 07:16:26 +00:00