
--8<-- [start:installation]

vLLM has experimental support for the s390x architecture on the IBM Z platform. For now, users must build from source to run vLLM natively on IBM Z.

Currently, the CPU implementation for the s390x architecture supports only the FP32 datatype.
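Once vLLM is installed, it can help to request FP32 explicitly when serving a model. A minimal sketch, assuming the model below (`float` is vLLM's shorthand for FP32):

```bash
# Start the OpenAI-compatible server with FP32 ("float" is shorthand for FP32).
vllm serve meta-llama/Llama-3.2-1B-Instruct --dtype float
```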

!!! warning
    There are no pre-built wheels or images for this device, so you must build vLLM from source.

--8<-- [end:installation]

--8<-- [start:requirements]

- OS: Linux
- SDK: gcc/g++ 12.3.0 or later with Command Line Tools
- Instruction Set Architecture (ISA): VXE support is required; works with z14 and above (see the check below)
- Python packages built from source during installation: pyarrow, torch, and torchvision
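As a quick sanity check before building, you can confirm that the CPU advertises the VXE facility; a minimal sketch that inspects the `features` line of `/proc/cpuinfo`:

```bash
# s390x lists "vxe" among the CPU features when VXE is available.
grep -q vxe /proc/cpuinfo && echo "VXE supported" || echo "VXE not found"
```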

--8<-- [end:requirements]

--8<-- [start:set-up-using-python]

--8<-- [end:set-up-using-python]

--8<-- [start:pre-built-wheels]

--8<-- [end:pre-built-wheels]

--8<-- [start:build-wheel-from-source]

Install the following packages from the package manager before building vLLM. For example, on RHEL 9.4:

```bash
dnf install -y \
    which procps findutils tar vim git gcc g++ make patch cython zlib-devel \
    libjpeg-turbo-devel libtiff-devel libpng-devel libwebp-devel freetype-devel harfbuzz-devel \
    openssl-devel openblas openblas-devel wget autoconf automake libtool cmake numactl-devel
```

Install Rust >= 1.80, which is needed to install the outlines-core and uvloop Python packages:

```bash
curl https://sh.rustup.rs -sSf | sh -s -- -y && \
    . "$HOME/.cargo/env"
```

Execute the following commands to build and install vLLM from source.

!!! tip
    Please build the dependencies torchvision and pyarrow from source before building vLLM.
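The exact steps for those source builds depend on your environment; as one hedged sketch, pip's `--no-binary` flag forces builds from sdists (note that pyarrow may additionally require the Arrow C++ libraries to be present):

```bash
# Force source builds (sdists) instead of pre-built wheels.
pip install --no-binary :all: pyarrow torchvision
```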

```bash
sed -i '/^torch/d' requirements/build.txt    # remove torch from requirements/build.txt since we use nightly builds
uv pip install -v \
    --torch-backend auto \
    -r requirements/build.txt \
    -r requirements/cpu.txt
VLLM_TARGET_DEVICE=cpu python setup.py bdist_wheel && \
    uv pip install dist/*.whl
```

??? console "pip"
    ```bash
    sed -i '/^torch/d' requirements/build.txt    # remove torch from requirements/build.txt since we use nightly builds
    pip install -v \
        --extra-index-url https://download.pytorch.org/whl/nightly/cpu \
        -r requirements/build.txt \
        -r requirements/cpu.txt
    VLLM_TARGET_DEVICE=cpu python setup.py bdist_wheel && \
        pip install dist/*.whl
    ```
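After installation, a quick import check verifies that the wheel is usable:

```bash
# Print the installed vLLM version as a sanity check.
python -c "import vllm; print(vllm.__version__)"
```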

--8<-- [end:build-wheel-from-source]

--8<-- [start:pre-built-images]

--8<-- [end:pre-built-images]

--8<-- [start:build-image-from-source]

```bash
docker build -f docker/Dockerfile.s390x \
    --tag vllm-cpu-env .

# Launch OpenAI server
docker run --rm \
    --privileged=true \
    --shm-size 4g \
    -p 8000:8000 \
    -e VLLM_CPU_KVCACHE_SPACE=<KV cache space> \
    -e VLLM_CPU_OMP_THREADS_BIND=<CPU cores for inference> \
    vllm-cpu-env \
    --model meta-llama/Llama-3.2-1B-Instruct \
    --dtype float \
    <other vLLM OpenAI server arguments>
```

!!! tip
    An alternative to `--privileged=true` is `--cap-add SYS_NICE --security-opt seccomp=unconfined`.
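Once the container is running, you can verify that the server responds. A minimal check against the OpenAI-compatible API, assuming the default port mapping above:

```bash
# List the models served by the running container.
curl http://localhost:8000/v1/models
```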

--8<-- [end:build-image-from-source]

--8<-- [start:extra-information]

--8<-- [end:extra-information]