[docs] convert supported configs to table (#20858)
Signed-off-by: reidliu41 <reid201711@gmail.com>
parent c2a2f19aba
commit a86754a12b
@@ -133,36 +133,20 @@ docker run \
 
 The following configurations have been validated to function with
 Gaudi2 devices. Configurations that are not listed may or may not work.
 
-- [meta-llama/Llama-2-7b](https://huggingface.co/meta-llama/Llama-2-7b)
-  on single HPU, or with tensor parallelism on 2x and 8x HPU, BF16
-  datatype with random or greedy sampling
-- [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
-  on single HPU, or with tensor parallelism on 2x and 8x HPU, BF16
-  datatype with random or greedy sampling
-- [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)
-  on single HPU, or with tensor parallelism on 2x and 8x HPU, BF16
-  datatype with random or greedy sampling
-- [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
-  on single HPU, or with tensor parallelism on 2x and 8x HPU, BF16
-  datatype with random or greedy sampling
-- [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B)
-  on single HPU, or with tensor parallelism on 2x and 8x HPU, BF16
-  datatype with random or greedy sampling
-- [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)
-  on single HPU, or with tensor parallelism on 2x and 8x HPU, BF16
-  datatype with random or greedy sampling
-- [meta-llama/Llama-2-70b](https://huggingface.co/meta-llama/Llama-2-70b)
-  with tensor parallelism on 8x HPU, BF16 datatype with random or greedy sampling
-- [meta-llama/Llama-2-70b-chat-hf](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf)
-  with tensor parallelism on 8x HPU, BF16 datatype with random or greedy sampling
-- [meta-llama/Meta-Llama-3-70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B)
-  with tensor parallelism on 8x HPU, BF16 datatype with random or greedy sampling
-- [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)
-  with tensor parallelism on 8x HPU, BF16 datatype with random or greedy sampling
-- [meta-llama/Meta-Llama-3.1-70B](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B)
-  with tensor parallelism on 8x HPU, BF16 datatype with random or greedy sampling
-- [meta-llama/Meta-Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct)
-  with tensor parallelism on 8x HPU, BF16 datatype with random or greedy sampling
+| Model | TP Size | dtype | Sampling |
+|-------|---------|-------|----------|
+| [meta-llama/Llama-2-7b](https://huggingface.co/meta-llama/Llama-2-7b) | 1, 2, 8 | BF16 | Random / Greedy |
+| [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) | 1, 2, 8 | BF16 | Random / Greedy |
+| [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 1, 2, 8 | BF16 | Random / Greedy |
+| [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 1, 2, 8 | BF16 | Random / Greedy |
+| [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) | 1, 2, 8 | BF16 | Random / Greedy |
+| [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) | 1, 2, 8 | BF16 | Random / Greedy |
+| [meta-llama/Llama-2-70b](https://huggingface.co/meta-llama/Llama-2-70b) | 8 | BF16 | Random / Greedy |
+| [meta-llama/Llama-2-70b-chat-hf](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf) | 8 | BF16 | Random / Greedy |
+| [meta-llama/Meta-Llama-3-70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B) | 8 | BF16 | Random / Greedy |
+| [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) | 8 | BF16 | Random / Greedy |
+| [meta-llama/Meta-Llama-3.1-70B](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B) | 8 | BF16 | Random / Greedy |
+| [meta-llama/Meta-Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct) | 8 | BF16 | Random / Greedy |
 
 ## Performance tuning
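For reference, one of the validated configurations from the table can be exercised through vLLM's offline Python API. The following is a minimal sketch that assumes a Gaudi-enabled vLLM installation with access to 2 HPUs; the model name, tensor-parallel size, and dtype come from the table, while the prompt and token limit are illustrative.

```python
from vllm import LLM, SamplingParams

# Validated Gaudi2 configuration from the table above:
# Meta-Llama-3-8B, tensor parallelism across 2 devices, BF16.
# Assumes a Gaudi-enabled vLLM build; prompt and max_tokens are illustrative.
llm = LLM(
    model="meta-llama/Meta-Llama-3-8B",
    tensor_parallel_size=2,
    dtype="bfloat16",
)

# temperature=0.0 gives greedy decoding; raise it for random sampling.
sampling_params = SamplingParams(temperature=0.0, max_tokens=64)

outputs = llm.generate(["The capital of France is"], sampling_params)
for out in outputs:
    print(out.outputs[0].text)
```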