mirror of
https://git.datalinker.icu/vllm-project/vllm.git
synced 2025-12-14 18:05:01 +08:00
[Doc] Update supported_hardware.rst (#7276)
This commit is contained in:
parent fc1493a01e
commit 6d94420246
@@ -5,18 +5,20 @@ Supported Hardware for Quantization Kernels
 
 The table below shows the compatibility of various quantization implementations with different hardware platforms in vLLM:
 
-============== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
-Implementation Volta  Turing  Ampere  Ada   Hopper AMD GPU Intel GPU x86 CPU AWS Inferentia Google TPU
-============== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
-AQLM           ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-AWQ            ❌      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-DeepSpeedFP    ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-FP8            ❌      ❌       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-Marlin         ❌      ❌       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-GPTQ           ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-SqueezeLLM     ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-bitsandbytes   ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-============== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
+===================== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
+Implementation        Volta  Turing  Ampere  Ada   Hopper AMD GPU Intel GPU x86 CPU AWS Inferentia Google TPU
+===================== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
+AWQ                   ❌      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+GPTQ                  ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+Marlin (GPTQ/AWQ/FP8) ❌      ❌       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+INT8 (W8A8)           ❌      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+FP8 (W8A8)            ❌      ❌       ❌       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+AQLM                  ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+bitsandbytes          ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+DeepSpeedFP           ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+GGUF                  ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+SqueezeLLM            ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+===================== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
 
 Notes:
 ^^^^^^
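For a quick sanity check of the updated table, the new rows can be encoded as a small lookup in Python. This is an illustrative sketch only, not part of vLLM's API; the names `PLATFORMS`, `SUPPORT`, and `is_supported` are hypothetical and simply mirror the table after this change.

```python
# Illustrative sketch: encodes the updated compatibility table as a dict.
# Not part of vLLM's API; all names here are hypothetical.
PLATFORMS = ["Volta", "Turing", "Ampere", "Ada", "Hopper",
             "AMD GPU", "Intel GPU", "x86 CPU", "AWS Inferentia", "Google TPU"]

# Each entry lists the platforms the table marks with a checkmark.
SUPPORT = {
    "AWQ":                   {"Turing", "Ampere", "Ada", "Hopper"},
    "GPTQ":                  {"Volta", "Turing", "Ampere", "Ada", "Hopper"},
    "Marlin (GPTQ/AWQ/FP8)": {"Ampere", "Ada", "Hopper"},
    "INT8 (W8A8)":           {"Turing", "Ampere", "Ada", "Hopper"},
    "FP8 (W8A8)":            {"Ada", "Hopper"},
    "AQLM":                  {"Volta", "Turing", "Ampere", "Ada", "Hopper"},
    "bitsandbytes":          {"Volta", "Turing", "Ampere", "Ada", "Hopper"},
    "DeepSpeedFP":           {"Volta", "Turing", "Ampere", "Ada", "Hopper"},
    "GGUF":                  {"Volta", "Turing", "Ampere", "Ada", "Hopper"},
    "SqueezeLLM":            {"Volta", "Turing", "Ampere", "Ada", "Hopper"},
}

def is_supported(impl: str, platform: str) -> bool:
    """Return True if the table marks `impl` as supported on `platform`."""
    if platform not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform}")
    return platform in SUPPORT[impl]
```

For example, `is_supported("FP8 (W8A8)", "Hopper")` is `True`, while `is_supported("FP8 (W8A8)", "Ampere")` is `False`, matching the narrowed FP8 row in this commit.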