Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2025-12-15 04:35:01 +08:00)
Update bnb.md with example for OpenAI (#11718)
parent 9c93636d84
commit d1d49397e7
@@ -37,3 +37,10 @@ model_id = "huggyllama/llama-7b"
 llm = LLM(model=model_id, dtype=torch.bfloat16, trust_remote_code=True, \
 quantization="bitsandbytes", load_format="bitsandbytes")
 ```
+## OpenAI Compatible Server
+
+Append the following to your 4bit model arguments:
+
+```
+--quantization bitsandbytes --load-format bitsandbytes
+```
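
To illustrate what "append to your model arguments" could look like in practice, here is a minimal Python sketch that assembles a server launch command with the two flags from the added lines placed at the end. The `vllm serve` entry point, the helper name `build_serve_command`, and the reuse of the `huggyllama/llama-7b` model id from the hunk context are assumptions made for illustration; the commit itself only documents the two flags.

```python
import shlex

# The two flags are quoted from the documentation lines added in this commit.
BNB_FLAGS = ["--quantization", "bitsandbytes", "--load-format", "bitsandbytes"]


def build_serve_command(model_id, extra_args=()):
    """Return an argv list for the server with the bitsandbytes flags appended.

    Hypothetical helper: `vllm serve <model>` is assumed as the launch command.
    """
    return ["vllm", "serve", model_id, *extra_args, *BNB_FLAGS]


if __name__ == "__main__":
    # Model id taken from the diff's hunk context; substitute your own model.
    cmd = build_serve_command("huggyllama/llama-7b")
    # Print a shell-ready command line instead of launching the server.
    print(shlex.join(cmd))
```

The sketch only prints the command so it can be inspected or pasted into a shell; passing the list to `subprocess.run` would start the server, provided vLLM and bitsandbytes are installed.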