vllm/modal.md at d99c3a4f7bd33e3e3acf7c2c82d52d15ba501eaf

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-10 07:04:53 +08:00

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

2025-07-08 03:27:40 -07:00

vLLM can be run on cloud GPUs with Modal, a serverless computing platform designed for fast auto-scaling.

For details on how to deploy vLLM on Modal, see this tutorial in the Modal documentation.

Modal