Harry Mellor
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
Michael Goin
|
d83f3f7cb3
|
Fixes and updates to bench_per_token_quant_fp8 (#25591)
Signed-off-by: Michael Goin <mgoin64@gmail.com>
|
2025-09-24 08:30:15 -07:00 |
|
Tahsin Tunan
|
cef32104b4
|
[FP8] Extend per-token-group quantization support to QuantFP8 (#24342)
Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>
|
2025-09-16 18:31:06 -07:00 |
|
Luka Govedič
|
31d5c1797f
|
[Perf][fp8] Use CustomOp abstraction for fp8 quant for better perf (#19830)
Signed-off-by: Luka Govedic <lgovedic@redhat.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
|
2025-07-11 04:56:28 +00:00 |
|