Calvin Chen | 103f1ec8d3 | [Model] use autoWeightsLoader for gptoss (#22446); Signed-off-by: calvin chen <wen.chen@dynamia.ai> | 2025-08-20 10:16:27 +00:00
Jee Jee Li | 4d4061b6e7 | [Kernel] Add cuda kernel for gpt_oss activation (#22951); Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> | 2025-08-17 05:03:24 +00:00
Simon Mo | f1f0d2fab8 | Revert "[Kernel] Add cuda kernel for gpt_oss activation" (#22948) | 2025-08-14 17:38:10 -07:00
Jee Jee Li | 81f4b96481 | [Kernel] Add cuda kernel for gpt_oss activation (#22538); Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> | 2025-08-14 17:21:29 -07:00
Michael Goin | c6b928798e | Force TRTLLM attention for gpt-oss on SM100 (#22678); Signed-off-by: mgoin <mgoin64@gmail.com> | 2025-08-12 21:22:16 -07:00
Jee Jee Li | 0c5254b82a | [oss] Init gpt-oss bf16 support (#22508); Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> | 2025-08-09 20:19:13 -07:00
Lain | 9a3835aaa9 | Fix trtllm-gen attention env and add attention sink (#22378); Signed-off-by: Siyuan Fu <siyuanf@nvidia.com>; Signed-off-by: Lain <fusiyuan2000@hotmail.com>; Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>; Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>; Co-authored-by: Michael Goin <mgoin64@gmail.com>; Co-authored-by: Yongye Zhu <zyy1102000@gmail.com> | 2025-08-06 18:07:41 -07:00
Yongye Zhu | 5c7cc33f4d | [gpt-oss] fix model config with hf_config (#22401); Signed-off-by: Yongye Zhu <zyy1102000@gmail.com> | 2025-08-06 18:04:04 -07:00
Woosuk Kwon | de98252f49 | Add GPT-OSS model code and config [1/N] (#22327); Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> | 2025-08-05 23:26:00 -07:00