Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2025-12-10 10:40:44 +08:00)
Update bnb.md with example for OpenAI (#11718)
parent 9c93636d84
commit d1d49397e7
@ -37,3 +37,10 @@ model_id = "huggyllama/llama-7b"
llm = LLM(model=model_id, dtype=torch.bfloat16, trust_remote_code=True, \
quantization="bitsandbytes", load_format="bitsandbytes")
```

## OpenAI Compatible Server

Append the following to your 4-bit model arguments:

```
--quantization bitsandbytes --load-format bitsandbytes
```
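
As a rough illustration of where these flags go, the sketch below assembles a full launch command in Python. It assumes the server is started through the `vllm serve` entry point and reuses the `huggyllama/llama-7b` model id from the Python example purely as a placeholder; `build_serve_command` is a hypothetical helper, not part of vLLM.

```python
import shlex

# Flags taken verbatim from the section above.
BNB_FLAGS = ["--quantization", "bitsandbytes", "--load-format", "bitsandbytes"]


def build_serve_command(model_id, extra_args=()):
    """Hypothetical helper: return the argv list for launching the server
    with the bitsandbytes flags appended after any other arguments."""
    # Assumption: the server is launched via the `vllm serve <model>` CLI.
    return ["vllm", "serve", model_id, *extra_args, *BNB_FLAGS]


# Placeholder model id, reused from the Python example in this document.
cmd = build_serve_command("huggyllama/llama-7b")

# Print a shell-safe command line that can be pasted into a terminal.
print(shlex.join(cmd))
```

The resulting list can also be passed directly to `subprocess.run(cmd)` if you prefer to start the server from a script rather than a shell.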