Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2025-12-11 09:25:29 +08:00)
Update openai_compatible_server.md (#11536)
Co-authored-by: Simon Mo <simon.mo@hey.com>
commit 0c0c2015c5
parent 82d24f7aac
@@ -112,7 +112,13 @@ completion = client.chat.completions.create(
 
 ## Extra HTTP Headers
 
-Only `X-Request-Id` HTTP request header is supported for now.
+Only `X-Request-Id` HTTP request header is supported for now. It can be enabled
+with `--enable-request-id-headers`.
+
+> Note that enablement of the headers can impact performance significantly at high QPS
+> rates. We recommend implementing HTTP headers at the router level (e.g. via Istio),
+> rather than within the vLLM layer for this reason.
+> See https://github.com/vllm-project/vllm/pull/11529 for more details.
 
 ```python
 completion = client.chat.completions.create(
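For illustration, a minimal sketch of how a client could attach the header once the server is launched with `--enable-request-id-headers`. The base URL, model name, and request-id value are placeholders, and the sketch uses the `extra_headers` argument of the `openai` Python client to pass the custom header; it is not part of the documented change above.

```python
from openai import OpenAI

# Sketch only: assumes a local vLLM OpenAI-compatible server started with
# something like `vllm serve <model> --enable-request-id-headers`.
client = OpenAI(
    base_url="http://localhost:8000/v1",  # placeholder server address
    api_key="EMPTY",                      # placeholder; adjust if the server enforces an API key
)

completion = client.chat.completions.create(
    model="my-model",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
    # Pass a caller-chosen X-Request-Id so the request can be correlated in logs/traces.
    extra_headers={"X-Request-Id": "example-request-0001"},
)
print(completion.id)
```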