mirror of
https://git.datalinker.icu/vllm-project/vllm.git
synced 2025-12-16 12:45:01 +08:00
Update deploying_with_k8s.rst (#10922)
parent 25ebed2f8c
commit da6f409246
@@ -162,7 +162,7 @@ To test the deployment, run the following ``curl`` command:
 curl http://mistral-7b.default.svc.cluster.local/v1/completions \
 -H "Content-Type: application/json" \
 -d '{
-"model": "facebook/opt-125m",
+"model": "mistralai/Mistral-7B-Instruct-v0.3",
 "prompt": "San Francisco is a",
 "max_tokens": 7,
 "temperature": 0
@@ -172,4 +172,4 @@ If the service is correctly deployed, you should receive a response from the vLL
 
 Conclusion
 ----------
-Deploying vLLM with Kubernetes allows for efficient scaling and management of ML models leveraging GPU resources. By following the steps outlined above, you should be able to set up and test a vLLM deployment within your Kubernetes cluster. If you encounter any issues or have suggestions, please feel free to contribute to the documentation.
+Deploying vLLM with Kubernetes allows for efficient scaling and management of ML models leveraging GPU resources. By following the steps outlined above, you should be able to set up and test a vLLM deployment within your Kubernetes cluster. If you encounter any issues or have suggestions, please feel free to contribute to the documentation.
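
For reference, the request changed in the first hunk can also be issued without ``curl``. The following is a minimal Python sketch, not part of the commit: the URL and payload values are copied from the diff, the helper names ``build_completion_request`` and ``send`` are hypothetical, and it assumes the service hostname resolves from wherever the script runs (normally inside the cluster). No particular response shape is assumed.

```python
import json
import urllib.request

# URL and payload values are copied from the diff; the helper names below are
# illustrative only and are not part of vLLM.
URL = "http://mistral-7b.default.svc.cluster.local/v1/completions"


def build_completion_request(url: str = URL) -> urllib.request.Request:
    """Build (but do not send) the POST request that the curl command issues."""
    payload = {
        # The model name the commit switches the example to.
        "model": "mistralai/Mistral-7B-Instruct-v0.3",
        "prompt": "San Francisco is a",
        "max_tokens": 7,
        "temperature": 0,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body (needs network access to the service)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Building the request separately from sending it makes it possible to inspect the payload, in particular the ``model`` field, before anything reaches the cluster; calling ``send(build_completion_request())`` then performs the actual test.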