(deploying-with-lws)=
# Deploying with LWS
LeaderWorkerSet (LWS) is a Kubernetes API that aims to address common deployment patterns of AI/ML inference workloads. A major use case is for multi-host/multi-node distributed inference.
vLLM can be deployed with LWS on Kubernetes for distributed model serving.
See the LWS project's documentation for more details on deploying vLLM on Kubernetes using LWS.
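
To make the leader/worker layout concrete, here is a minimal sketch that builds a `LeaderWorkerSet` manifest as a Python dict and prints it as JSON, which `kubectl apply -f` accepts. It is an illustration under stated assumptions, not the manifest from the LWS guide: the function `build_lws_manifest`, the resource name, the image tag, the model name, the ports, and the Ray-based start commands are all placeholders chosen for this example, and the worker command assumes the LWS controller exposes the leader's address to each pod as `LWS_LEADER_ADDRESS`.

```python
import json


def build_lws_manifest(name, image, model, group_size, gpus_per_pod, replicas=1):
    """Return a LeaderWorkerSet manifest as a plain dict (hypothetical helper).

    group_size is the number of pods per group (one leader plus workers).
    """

    def pod_template(role, command):
        # Leader and worker share the same pod shape and differ only in command.
        return {
            "metadata": {"labels": {"role": role}},
            "spec": {
                "containers": [
                    {
                        "name": f"vllm-{role}",
                        "image": image,
                        "command": ["sh", "-c", command],
                        "resources": {
                            "limits": {"nvidia.com/gpu": str(gpus_per_pod)}
                        },
                    }
                ]
            },
        }

    # Assumption: the leader starts a Ray head node, then serves the model,
    # sharding it across GPUs within a pod (tensor parallel) and across
    # pods in the group (pipeline parallel).
    leader_cmd = (
        "ray start --head --port=6379 && "
        f"vllm serve {model} --port 8080 "
        f"--tensor-parallel-size {gpus_per_pod} "
        f"--pipeline-parallel-size {group_size}"
    )
    # Assumption: workers join the Ray cluster through the leader address
    # that LWS provides in the LWS_LEADER_ADDRESS environment variable.
    worker_cmd = "ray start --address=$LWS_LEADER_ADDRESS:6379 --block"

    return {
        "apiVersion": "leaderworkerset.x-k8s.io/v1",
        "kind": "LeaderWorkerSet",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,  # number of independent leader/worker groups
            "leaderWorkerTemplate": {
                "size": group_size,
                "leaderTemplate": pod_template("leader", leader_cmd),
                "workerTemplate": pod_template("worker", worker_cmd),
            },
        },
    }


# Placeholder values: two pods per group, eight GPUs per pod.
manifest = build_lws_manifest(
    name="vllm",
    image="vllm/vllm-openai:latest",
    model="my-org/my-model",
    group_size=2,
    gpus_per_pod=8,
)
print(json.dumps(manifest, indent=2))
```

Saving the output to a file and running `kubectl apply -f` on it would create one group of two pods, provided the LWS controller is already installed in the cluster. A real deployment typically also needs a Service in front of the leader pod, shared memory settings, and model credentials, none of which are shown here.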