Lily Liu
|
775f00f81e
|
[Speculative Decoding] Test refactor (#8317)
Co-authored-by: youkaichao <youkaichao@126.com>
|
2024-09-11 14:07:34 -07:00 |
|
Thomas Parnell
|
a5314e8698
|
[Model] RowParallelLinear: pass bias to quant_method.apply (#6327)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
|
2024-07-19 07:15:22 -06:00 |
|
Sirej Dua
|
15aba081f3
|
[Speculative Decoding] MLPSpeculator Tensor Parallel support (1/2) (#6050)
Co-authored-by: Sirej Dua <sirej.dua@databricks.com>
Co-authored-by: Sirej Dua <Sirej Dua>
|
2024-07-02 07:20:29 -07:00 |
|
Woo-Yeon Lee
|
2ce5d6688b
|
[Speculative Decoding] Support draft model on different tensor-parallel size than target model (#5414)
|
2024-06-25 09:56:06 +00:00 |
|