diff --git a/docs/source/serving/deploying_with_docker.rst b/docs/source/serving/deploying_with_docker.rst
index c73288cb945f1..4d6c18512b045 100644
--- a/docs/source/serving/deploying_with_docker.rst
+++ b/docs/source/serving/deploying_with_docker.rst
@@ -3,11 +3,25 @@
 Deploying with Docker
 ============================
 
+vLLM offers an official Docker image for deployment.
+The image can be used to run an OpenAI-compatible server.
+The image is available on Docker Hub as `vllm/vllm-openai <https://hub.docker.com/r/vllm/vllm-openai>`_.
+
+.. code-block:: console
+
+    $ docker run --runtime nvidia --gpus all \
+        -v ~/.cache/huggingface:/root/.cache/huggingface \
+        -p 8000:8000 \
+        --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
+        vllm/vllm-openai:latest \
+        --model mistralai/Mistral-7B-v0.1
+
+
 You can build and run vLLM from source via the provided dockerfile. To build vLLM:
 
 .. code-block:: console
 
-    $ DOCKER_BUILDKIT=1 docker build . --target vllm --tag vllm --build-arg max_jobs=8
+    $ DOCKER_BUILDKIT=1 docker build . --target vllm-openai --tag vllm/vllm-openai --build-arg max_jobs=8
 
 To run vLLM:
 
@@ -17,5 +31,5 @@ To run vLLM:
     $ docker run --runtime nvidia --gpus all \
         -v ~/.cache/huggingface:/root/.cache/huggingface \
         -p 8000:8000 \
         --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
-        vllm
+        vllm/vllm-openai
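
Once a container started from either image is up, the OpenAI-compatible API is reachable on the published port. A minimal smoke test might look like the following (a sketch, not part of the patch: it assumes the server above is running on ``localhost:8000`` and serving ``mistralai/Mistral-7B-v0.1``):

.. code-block:: console

    $ curl http://localhost:8000/v1/completions \
        -H "Content-Type: application/json" \
        -d '{
            "model": "mistralai/Mistral-7B-v0.1",
            "prompt": "San Francisco is a",
            "max_tokens": 8
        }'

A JSON response with a ``choices`` array indicates the server is accepting requests.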