12 Commits

Author SHA1 Message Date
Jee Jee Li
4d4061b6e7
[Kernel] Add cuda kernel for gpt_oss activation (#22951)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-08-17 05:03:24 +00:00
Simon Mo
f1f0d2fab8
Revert "[Kernel] Add cuda kernel for gpt_oss activation" (#22948) 2025-08-14 17:38:10 -07:00
Jee Jee Li
81f4b96481
[Kernel] Add cuda kernel for gpt_oss activation (#22538)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-08-14 17:21:29 -07:00
Isotr0py
b76753f0b5
[Bugfix][Kernel] Support partial rotary embedding for MRoPE triton kernel (#22593)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-10 09:00:36 -07:00
Mickaël Seznec
4fb56914c5
[perf] Add fused MLA QKV + strided layernorm (#21116)
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-07-22 07:07:44 -07:00
Ning Xie
2f1c19b245
[CI] change spell checker from codespell to typos (#18711)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
2025-06-11 19:57:10 -07:00
Simon Mo
02f0c7b220
[Misc] Add SPDX-FileCopyrightText (#19100)
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-06-03 11:20:17 -07:00
Cyrus Leung
1aa2f81b43
[Misc] Update type annotation for rotary embedding base (#18914)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-30 10:17:01 +08:00
Lucas Wilkinson
4e1c6a0264
[Bugfix] fix rotary embedding test for _get_padded_tensor_shape (#18229)
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
2025-05-16 01:32:45 +00:00
Lucas Wilkinson
d93c976a0d
[Kernel] Have rotary embeddings support tensors (#18046)
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
2025-05-14 15:43:55 -07:00
Yong Hoon Shin
98c89e16ff
Make key optional for rotary embedding (#17566)
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
2025-05-07 00:11:46 -07:00
Michael Goin
6317a5174a
Categorize tests/kernels/ based on kernel type (#16799)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-04-23 09:21:07 -04:00