19 Commits

Author SHA1 Message Date
Harry Mellor
8fcaaf6a16
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-12 09:51:31 -07:00
Harry Mellor
d6953beb91
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
Tyler Michael Smith
a5354b3ed2
[Bugfix][WideEP] Apply TP Attn + EP MoE fix to other models (#24982)
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
2025-09-27 14:22:28 +00:00
Eldar Kurtić
21467f9a1c
Enable Eagle3 speculative decoding for GPT-OSS model (#25246)
Signed-off-by: Eldar Kurtic <8884008+eldarkurtic@users.noreply.github.com>
2025-09-22 08:50:39 +00:00
Woosuk Kwon
1c3ffdbecc
[V0 Deprecation] Remove V0 sampling metadata (#25345)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
2025-09-21 10:37:11 -07:00
Nikhil Gupta
064cac7bb7
[fix]: remove data type hardcoding from gptoss model implementation (#23807)
Signed-off-by: Nikhil Gupta <nikhil.gupta2@arm.com>
2025-09-18 18:15:23 +00:00
whx
4a9375fe9d
[Model] Pass param prefix to LLMHead (#24862)
Signed-off-by: whx-sjtu <2952154980@qq.com>
2025-09-17 16:01:27 +08:00
Jiangyun Zhu
bfab219648
[Model] [gpt-oss] fix gpt-oss pp support (#23815)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-08-28 05:36:55 -07:00
Isotr0py
c5d004aaaf
[Model] Add PP support and VLM backbone compatability for GPT-OSS (#23680)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-28 16:03:28 +08:00
Wei
fecbb7c782
[Bugfix][gpt-oss] passing the cache config in gpt-oss (#23613)
Signed-off-by: Wei Wei <wwei6@meta.com>
2025-08-27 02:54:23 +00:00
Calvin Chen
103f1ec8d3
[Model] use autoWeightsLoader for gptoss (#22446)
Signed-off-by: calvin chen <wen.chen@dynamia.ai>
2025-08-20 10:16:27 +00:00
Jee Jee Li
4d4061b6e7
[Kernel] Add cuda kernel for gpt_oss activation (#22951)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-08-17 05:03:24 +00:00
Simon Mo
f1f0d2fab8
Revert "[Kernel] Add cuda kernel for gpt_oss activation" (#22948)
2025-08-14 17:38:10 -07:00
Jee Jee Li
81f4b96481
[Kernel] Add cuda kernel for gpt_oss activation (#22538)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-08-14 17:21:29 -07:00
Michael Goin
c6b928798e
Force TRTLLM attention for gpt-oss on SM100 (#22678)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-12 21:22:16 -07:00
Jee Jee Li
0c5254b82a
[oss] Init gpt-oss bf16 support (#22508)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-08-09 20:19:13 -07:00
Lain
9a3835aaa9
Fix trtllm-gen attention env and add attention sink (#22378)
Signed-off-by: Siyuan Fu <siyuanf@nvidia.com>
Signed-off-by: Lain <fusiyuan2000@hotmail.com>
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Yongye Zhu <zyy1102000@gmail.com>
2025-08-06 18:07:41 -07:00
Yongye Zhu
5c7cc33f4d
[gpt-oss] fix model config with hf_config (#22401)
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
2025-08-06 18:04:04 -07:00
Woosuk Kwon
de98252f49
Add GPT-OSS model code and config [1/N] (#22327)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-08-05 23:26:00 -07:00