| yyzxw | 19f76ee68e | [misc] refactor speculative config (#25657) Signed-off-by: zxw <1020938856@qq.com> | 2025-09-26 01:22:06 -07:00 |
| XuruiYang | 845adb3ec6 | [Model] Add LongCat-Flash (#23991) Signed-off-by: yangxurui <yangxurui@meituan.com> Co-authored-by: yangxurui <yangxurui@meituan.com> | 2025-09-24 21:53:40 -07:00 |
| Woosuk Kwon | 2e19a848d4 | [V0 Deprecation] Remove max_seq_len_to_capture (#25543) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> | 2025-09-24 01:51:39 -07:00 |
| Eldar Kurtić | 21467f9a1c | Enable Eagle3 speculative decoding for GPT-OSS model (#25246) Signed-off-by: Eldar Kurtic <8884008+eldarkurtic@users.noreply.github.com> | 2025-09-22 08:50:39 +00:00 |
| qizixi | c4cb0af98a | [spec decode] Fix MTP inference path for MiMo-7B model (#25136) Signed-off-by: zixi-qi <qizixi@meta.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk> | 2025-09-18 09:12:19 -07:00 |
| Benjamin Chislett | b7433ca1a4 | [Spec Decode] Efficient padded speculation (#24539) Signed-off-by: Benjamin Chislett <bchislett@nvidia.com> | 2025-09-18 01:07:24 -04:00 |
| Harry Mellor | 0faf3cc3e8 | Move SpeculativeConfig from config/__init__.py to config/speculative.py (#24904) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> | 2025-09-16 12:51:35 +01:00 |