mirror of
https://git.datalinker.icu/vllm-project/vllm.git
synced 2025-12-14 06:45:00 +08:00
Super tiny little typo fix (#10633)
parent ed46f14321
commit 2b0879bfc2
@@ -4,7 +4,7 @@ FP8 E5M2 KV Cache
 ==================
 
 The int8/int4 quantization scheme requires additional scale GPU memory storage, which reduces the expected GPU memory benefits.
-The FP8 data format retains 2~3 mantissa bits and can convert float/fp16/bflaot16 and fp8 to each other.
+The FP8 data format retains 2~3 mantissa bits and can convert float/fp16/bfloat16 and fp8 to each other.
 
 Here is an example of how to enable this feature:
 
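The sentence touched by this commit says FP8 keeps 2~3 mantissa bits and converts to and from wider float formats. As a rough illustration of why that conversion is cheap for E5M2 (1 sign bit, 5 exponent bits, 2 mantissa bits), the sketch below treats an E5M2 value as the top byte of an IEEE fp16 value. This is an assumption-laden toy, not vLLM's kernel code: the helper names are hypothetical, it uses only the Python standard library, and it truncates (rounds toward zero) where a real implementation may round to nearest.

```python
import struct


def fp16_bits(x: float) -> int:
    """Return the 16-bit IEEE half-precision bit pattern of x."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def fp16_to_e5m2(x: float) -> int:
    """Hypothetical helper: keep the top 8 bits of the fp16 pattern.

    fp16 is 1 sign + 5 exponent + 10 mantissa bits, so dropping the low
    8 bits leaves 1 sign + 5 exponent + 2 mantissa bits (E5M2).
    Truncation rounds toward zero; this is a simplification.
    """
    return fp16_bits(x) >> 8


def e5m2_to_float(b: int) -> float:
    """Hypothetical helper: widen an E5M2 byte back to fp16 by zero-filling."""
    return struct.unpack("<e", struct.pack("<H", (b & 0xFF) << 8))[0]


if __name__ == "__main__":
    for value in (1.0, 1.75, 1.9, -2.5):
        byte = fp16_to_e5m2(value)
        print(f"{value:>5} -> 0x{byte:02X} -> {e5m2_to_float(byte)}")
```

Because E5M2 shares fp16's exponent width, no per-tensor scale has to be stored alongside the data, which is the contrast the diff context draws with int8/int4 schemes. Values that need more than 2 mantissa bits lose precision on the round trip.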