diff --git a/docs/source/getting_started/installation.rst b/docs/source/getting_started/installation.rst
index 421bf19cc1d5..7611bc2be228 100644
--- a/docs/source/getting_started/installation.rst
+++ b/docs/source/getting_started/installation.rst
@@ -3,31 +3,15 @@
 Installation
 ============
 
-vLLM is a Python library that also contains some C++ and CUDA code.
-This additional code requires compilation on the user's machine.
+vLLM is a Python library that also contains pre-compiled C++ and CUDA (11.8) binaries.
 
 Requirements
 ------------
 
 * OS: Linux
-* Python: 3.8 or higher
-* CUDA: 11.0 -- 11.8
+* Python: 3.8 -- 3.11
 * GPU: compute capability 7.0 or higher (e.g., V100, T4, RTX20xx, A100, L4, etc.)
 
-.. note::
-    As of now, vLLM does not support CUDA 12.
-    If you are using Hopper or Lovelace GPUs, please use CUDA 11.8 instead of CUDA 12.
-
-.. tip::
-    If you have trouble installing vLLM, we recommend using the NVIDIA PyTorch Docker image.
-
-    .. code-block:: console
-
-        $ # Pull the Docker image with CUDA 11.8.
-        $ docker run --gpus all -it --rm --shm-size=8g nvcr.io/nvidia/pytorch:22.12-py3
-
-    Inside the Docker container, please execute :code:`pip uninstall torch` before installing vLLM.
-
 Install with pip
 ----------------
 
@@ -40,7 +24,7 @@ You can install vLLM using pip:
     $ conda activate myenv
 
     $ # Install vLLM.
-    $ pip install vllm  # This may take 5-10 minutes.
+    $ pip install vllm
 
 .. _build_from_source:
 
@@ -55,3 +39,11 @@ You can also build and install vLLM from source:
     $ git clone https://github.com/vllm-project/vllm.git
    $ cd vllm
     $ pip install -e .  # This may take 5-10 minutes.
+
+.. tip::
+    If you have trouble building vLLM, we recommend using the NVIDIA PyTorch Docker image.
+
+    .. code-block:: console
+
+        $ # Pull the Docker image with CUDA 11.8.
+        $ docker run --gpus all -it --rm --shm-size=8g nvcr.io/nvidia/pytorch:22.12-py3
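As a quick smoke test of the resulting install (whether from the pre-compiled wheel or from source), a short generation script confirms that the CUDA binaries load and run. This is a minimal sketch using vLLM's Python API (``LLM`` and ``SamplingParams``); the model name is only an example, and the first run will download the model weights.

.. code-block:: python

    from vllm import LLM, SamplingParams

    # Load a small model; this exercises the compiled CUDA kernels.
    llm = LLM(model="facebook/opt-125m")

    # Sample up to 16 tokens for a single prompt.
    sampling_params = SamplingParams(temperature=0.8, max_tokens=16)
    outputs = llm.generate(["Hello, my name is"], sampling_params)
    print(outputs[0].outputs[0].text)

If this prints a continuation of the prompt without import or CUDA errors, the installation is working.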