KServe
vLLM can be deployed with KServe on Kubernetes for highly scalable distributed model serving.
You can use vLLM with KServe's Hugging Face serving runtime, or via LLMInferenceService, which uses llm-d.
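
As a rough sketch of the first option, the snippet below uses the KServe Python SDK to create an InferenceService backed by the Hugging Face serving runtime, which serves models with vLLM by default. The service name, namespace, model ID, and resource sizes are illustrative assumptions, not values prescribed by vLLM; consult the KServe documentation for the authoritative schema and runtime arguments.

```python
# Sketch: deploy a model on KServe's Hugging Face serving runtime (vLLM backend).
# All names, the model ID, and the resource sizes are illustrative assumptions.
from kubernetes import client
from kserve import (
    KServeClient,
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1ModelFormat,
    V1beta1ModelSpec,
    V1beta1PredictorSpec,
    constants,
)

isvc = V1beta1InferenceService(
    api_version=constants.KSERVE_GROUP + "/v1beta1",  # serving.kserve.io/v1beta1
    kind="InferenceService",
    metadata=client.V1ObjectMeta(name="huggingface-llama3", namespace="default"),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            model=V1beta1ModelSpec(
                # Selects KServe's Hugging Face serving runtime.
                model_format=V1beta1ModelFormat(name="huggingface"),
                # Arguments forwarded to the runtime; the model ID is an example.
                args=[
                    "--model_name=llama3",
                    "--model_id=meta-llama/meta-llama-3-8b-instruct",
                ],
                resources=client.V1ResourceRequirements(
                    requests={"cpu": "4", "memory": "16Gi", "nvidia.com/gpu": "1"},
                    limits={"cpu": "4", "memory": "16Gi", "nvidia.com/gpu": "1"},
                ),
            )
        )
    ),
)

# Submit the InferenceService to the cluster configured in the local kubeconfig.
KServeClient().create(isvc)
```

Once the predictor pods are ready, the service can be queried through the runtime's OpenAI-compatible endpoints; see the KServe documentation for the exact routes and supported runtime versions.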