Harry Mellor
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
Harry Mellor
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
Tyler Michael Smith
|
a5354b3ed2
|
[Bugfix][WideEP] Apply TP Attn + EP MoE fix to other models (#24982)
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
|
2025-09-27 14:22:28 +00:00 |
|
Eldar Kurtić
|
21467f9a1c
|
Enable Eagle3 speculative decoding for GPT-OSS model (#25246)
Signed-off-by: Eldar Kurtic <8884008+eldarkurtic@users.noreply.github.com>
|
2025-09-22 08:50:39 +00:00 |
|
Woosuk Kwon
|
1c3ffdbecc
|
[V0 Deprecation] Remove V0 sampling metadata (#25345)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
|
2025-09-21 10:37:11 -07:00 |
|
Nikhil Gupta
|
064cac7bb7
|
[fix]: remove data type hardcoding from gptoss model implementation (#23807)
Signed-off-by: Nikhil Gupta <nikhil.gupta2@arm.com>
|
2025-09-18 18:15:23 +00:00 |
|
whx
|
4a9375fe9d
|
[Model] Pass param prefix to LLMHead (#24862)
Signed-off-by: whx-sjtu <2952154980@qq.com>
|
2025-09-17 16:01:27 +08:00 |
|
Jiangyun Zhu
|
bfab219648
|
[Model] [gpt-oss] fix gpt-oss pp support (#23815)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
|
2025-08-28 05:36:55 -07:00 |
|
Isotr0py
|
c5d004aaaf
|
[Model] Add PP support and VLM backbone compatability for GPT-OSS (#23680)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-08-28 16:03:28 +08:00 |
|
Wei
|
fecbb7c782
|
[Bugfix][gpt-oss] passing the cache config in gpt-oss (#23613)
Signed-off-by: Wei Wei <wwei6@meta.com>
|
2025-08-27 02:54:23 +00:00 |
|
Calvin Chen
|
103f1ec8d3
|
[Model] use autoWeightsLoader for gptoss (#22446)
Signed-off-by: calvin chen <wen.chen@dynamia.ai>
|
2025-08-20 10:16:27 +00:00 |
|
Jee Jee Li
|
4d4061b6e7
|
[Kernel] Add cuda kernel for gpt_oss activation (#22951)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-08-17 05:03:24 +00:00 |
|
Simon Mo
|
f1f0d2fab8
|
Revert "[Kernel] Add cuda kernel for gpt_oss activation" (#22948)
|
2025-08-14 17:38:10 -07:00 |
|
Jee Jee Li
|
81f4b96481
|
[Kernel] Add cuda kernel for gpt_oss activation (#22538)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-08-14 17:21:29 -07:00 |
|
Michael Goin
|
c6b928798e
|
Force TRTLLM attention for gpt-oss on SM100 (#22678)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-08-12 21:22:16 -07:00 |
|
Jee Jee Li
|
0c5254b82a
|
[oss] Init gpt-oss bf16 support (#22508)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-08-09 20:19:13 -07:00 |
|
Lain
|
9a3835aaa9
|
Fix trtllm-gen attention env and add attention sink (#22378)
Signed-off-by: Siyuan Fu <siyuanf@nvidia.com>
Signed-off-by: Lain <fusiyuan2000@hotmail.com>
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Yongye Zhu <zyy1102000@gmail.com>
|
2025-08-06 18:07:41 -07:00 |
|
Yongye Zhu
|
5c7cc33f4d
|
[gpt-oss] fix model config with hf_config (#22401)
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
|
2025-08-06 18:04:04 -07:00 |
|
Woosuk Kwon
|
de98252f49
|
Add GPT-OSS model code and config [1/N] (#22327)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-08-05 23:26:00 -07:00 |
|