Yuan Tang 0736f901e7
docs: Add llm-d integration to the website (#31234)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-12-23 20:27:22 +00:00

KServe

vLLM can be deployed with KServe on Kubernetes for highly scalable, distributed model serving.

You can use vLLM through KServe's Hugging Face serving runtime, or through the LLMInferenceService custom resource, which is backed by llm-d.
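As a rough sketch, a KServe InferenceService using the Hugging Face serving runtime (which serves supported models with vLLM on GPU) might look like the following; the service name, model ID, and resource limits here are illustrative placeholders, so consult the KServe documentation for the exact fields supported by your KServe version:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: huggingface-llm          # placeholder service name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface        # selects the Hugging Face serving runtime
      args:
        - --model_name=my-model                          # name exposed by the endpoint (placeholder)
        - --model_id=meta-llama/Llama-3.1-8B-Instruct    # Hugging Face model to load (placeholder)
      resources:
        limits:
          nvidia.com/gpu: "1"    # adjust to your cluster's GPU availability
```

Once the InferenceService is ready, requests can be sent to its OpenAI-compatible endpoint in the same way as to a standalone vLLM server.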