[Docs] use uv in CPU installation docs (#22089)
Signed-off-by: David Xia <david@davidxia.com>
This commit is contained in:
parent 3146519add
commit 97608dc276

@@ -1,6 +1,6 @@
# --8<-- [start:installation]

vLLM has experimental support for macOS with Apple silicon. For now, users must build from source to natively run on macOS.

Currently the CPU implementation for macOS supports FP32 and FP16 datatypes.
@@ -23,20 +23,20 @@ Currently the CPU implementation for macOS supports FP32 and FP16 datatypes.
# --8<-- [end:pre-built-wheels]
# --8<-- [start:build-wheel-from-source]

After installing Xcode and the Command Line Tools, which include Apple Clang, execute the following commands to build and install vLLM from source.

```bash
git clone https://github.com/vllm-project/vllm.git
cd vllm
uv pip install -r requirements/cpu.txt
uv pip install -e .
```
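These commands assume [`uv`](https://docs.astral.sh/uv/) is already on your `PATH`. If it is not, a minimal sketch of one common install path (see the uv documentation for alternatives):

```bash
# Standalone installer from the uv project; `pip install uv` works as well.
curl -LsSf https://astral.sh/uv/install.sh | sh
```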
!!! note
    On macOS the `VLLM_TARGET_DEVICE` is automatically set to `cpu`, which is currently the only supported device.
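If you prefer to be explicit, setting the variable yourself is equivalent (a sketch of the editable install from above):

```bash
# Same as the default behavior on macOS, just spelled out.
VLLM_TARGET_DEVICE=cpu uv pip install -e .
```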
!!! example "Troubleshooting"
    If the build fails with errors like the following snippet, where standard C++ headers cannot be found, try removing and reinstalling your
    [Command Line Tools for Xcode](https://developer.apple.com/download/all/).
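    A sketch of the reinstall, assuming the default install location:

    ```bash
    # Remove the existing Command Line Tools, then trigger a fresh install.
    sudo rm -rf /Library/Developer/CommandLineTools
    xcode-select --install
    ```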

    ```text

@@ -1,4 +1,4 @@
First, install the recommended compiler. We recommend using `gcc/g++ >= 12.3.0` as the default compiler to avoid potential problems. For example, on Ubuntu 22.04, you can run:

```bash
sudo apt-get update -y
@@ -6,28 +6,34 @@ sudo apt-get install -y --no-install-recommends ccache git curl wget ca-certific
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 10 --slave /usr/bin/g++ g++ /usr/bin/g++-12
```
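You can confirm which compiler is now the default (a quick sanity check):

```bash
# Both should report version 12.x after the update-alternatives step above.
gcc --version
g++ --version
```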

Second, clone the vLLM project:

```bash
git clone https://github.com/vllm-project/vllm.git vllm_source
cd vllm_source
```

Third, install the required dependencies:

```bash
uv pip install -r requirements/cpu-build.txt --torch-backend auto
uv pip install -r requirements/cpu.txt --torch-backend auto
```

??? console "pip"
    ```bash
    pip install --upgrade pip
    pip install -v -r requirements/cpu-build.txt --extra-index-url https://download.pytorch.org/whl/cpu
    pip install -v -r requirements/cpu.txt --extra-index-url https://download.pytorch.org/whl/cpu
    ```

Finally, build and install vLLM:

```bash
VLLM_TARGET_DEVICE=cpu python setup.py install
```
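After the build completes, a quick smoke test (a sketch; run it outside the source tree so Python picks up the installed package):

```bash
python -c 'import vllm; print(vllm.__version__)'
```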

If you want to develop vLLM, install it in editable mode instead.

```bash
VLLM_TARGET_DEVICE=cpu python setup.py develop
```

@@ -1,6 +1,6 @@
# --8<-- [start:installation]

vLLM has experimental support for the s390x architecture on the IBM Z platform. For now, users must build from source to natively run on IBM Z.

Currently the CPU implementation for the s390x architecture supports only the FP32 datatype.

@@ -40,12 +40,23 @@ curl https://sh.rustup.rs -sSf | sh -s -- -y && \
. "$HOME/.cargo/env"
```
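A quick check that the Rust toolchain from the previous step is active:

```bash
# Both commands should print a version if rustup set things up correctly.
rustc --version
cargo --version
```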

Execute the following commands to build and install vLLM from source.

!!! tip
    Please build the following dependencies, `torchvision` and `pyarrow`, from source before building vLLM.

```bash
sed -i '/^torch/d' requirements-build.txt # remove torch from requirements-build.txt since we use nightly builds
uv pip install -v \
    --torch-backend auto \
    -r requirements-build.txt \
    -r requirements-cpu.txt
VLLM_TARGET_DEVICE=cpu python setup.py bdist_wheel && \
uv pip install dist/*.whl
```

??? console "pip"
    ```bash
    sed -i '/^torch/d' requirements-build.txt # remove torch from requirements-build.txt since we use nightly builds
    pip install -v \
        --extra-index-url https://download.pytorch.org/whl/nightly/cpu \
@@ -53,7 +64,7 @@ Execute the following commands to build and install vLLM from the source.
        -r requirements-cpu.txt
    VLLM_TARGET_DEVICE=cpu python setup.py bdist_wheel && \
    pip install dist/*.whl
    ```

# --8<-- [end:build-wheel-from-source]
# --8<-- [start:pre-built-images]

@@ -65,16 +76,16 @@ Execute the following commands to build and install vLLM from the source.
docker build -f docker/Dockerfile.s390x \
        --tag vllm-cpu-env .

# Launch OpenAI server
docker run --rm \
    --privileged=true \
    --shm-size 4g \
    -p 8000:8000 \
    -e VLLM_CPU_KVCACHE_SPACE=<KV cache space> \
    -e VLLM_CPU_OMP_THREADS_BIND=<CPU cores for inference> \
    vllm-cpu-env \
    --model meta-llama/Llama-3.2-1B-Instruct \
    --dtype float \
    other vLLM OpenAI server arguments
```
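Once the server is up, you can sanity-check it with a completion request (a sketch; assumes the port mapping and model shown above):

```bash
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "meta-llama/Llama-3.2-1B-Instruct", "prompt": "Hello,", "max_tokens": 16}'
```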