Update openai_compatible_server.md (#16507)

Signed-off-by: Christian Sears <csears@redhat.com>
This commit is contained in:
Christian Sears 2025-04-11 18:54:58 -04:00 committed by GitHub
parent a3bf8d4a2b
commit c09632a66c
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

View File

@ -2,15 +2,15 @@
# OpenAI-Compatible Server
vLLM provides an HTTP server that implements OpenAI's [Completions API](https://platform.openai.com/docs/api-reference/completions), [Chat API](https://platform.openai.com/docs/api-reference/chat), and more!
vLLM provides an HTTP server that implements OpenAI's [Completions API](https://platform.openai.com/docs/api-reference/completions), [Chat API](https://platform.openai.com/docs/api-reference/chat), and more! This functionality lets you serve models and interact with them using an HTTP client.
You can start the server via the [`vllm serve`](#vllm-serve) command, or through [Docker](#deployment-docker):
In your terminal, you can [install](../getting_started/installation.md) vLLM, then start the server with the [`vllm serve`](#vllm-serve) command. (You can also use our [Docker](#deployment-docker) image.)
```bash
vllm serve NousResearch/Meta-Llama-3-8B-Instruct --dtype auto --api-key token-abc123
```
To call the server, you can use the [official OpenAI Python client](https://github.com/openai/openai-python), or any other HTTP client.
To call the server, in your preferred text editor, create a script that uses an HTTP client. Include any messages that you want to send to the model. Then run that script. Below is an example script using the [official OpenAI Python client](https://github.com/openai/openai-python).
```python
from openai import OpenAI