# --8<-- [start:installation]

vLLM supports basic model inference and serving on x86 CPU platforms with the FP32, FP16, and BF16 data types.

# --8<-- [end:installation]
# --8<-- [start:requirements]

- OS: Linux
- CPU flags: `avx512f` (Recommended), `avx512_bf16` (Optional), `avx512_vnni` (Optional)

!!! tip
    Use `lscpu` to check the CPU flags.

# --8<-- [end:requirements]
# --8<-- [start:set-up-using-python]

# --8<-- [end:set-up-using-python]
# --8<-- [start:pre-built-wheels]

# --8<-- [end:pre-built-wheels]
# --8<-- [start:build-wheel-from-source]

--8<-- "docs/getting_started/installation/cpu/build.inc.md"

# --8<-- [end:build-wheel-from-source]
# --8<-- [start:pre-built-images]

[https://gallery.ecr.aws/q9t5s3a7/vllm-cpu-release-repo](https://gallery.ecr.aws/q9t5s3a7/vllm-cpu-release-repo)

!!! warning
    If you deploy the pre-built images on machines without `avx512f`, `avx512_bf16`, or `avx512_vnni` support, an `Illegal instruction` error may be raised. For these machines, build the image with the appropriate build arguments (e.g., `--build-arg VLLM_CPU_DISABLE_AVX512=true`, `--build-arg VLLM_CPU_AVX512BF16=false`, or `--build-arg VLLM_CPU_AVX512VNNI=false`) to disable the unsupported features. Note that without `avx512f`, AVX2 is used; this version is not recommended because it has only basic feature support.

# --8<-- [end:pre-built-images]
# --8<-- [start:build-image-from-source]

```bash
# Each of the three build arguments accepts false (default) or true.
docker build -f docker/Dockerfile.cpu \
    --build-arg VLLM_CPU_AVX512BF16=false \
    --build-arg VLLM_CPU_AVX512VNNI=false \
    --build-arg VLLM_CPU_DISABLE_AVX512=false \
    --tag vllm-cpu-env \
    --target vllm-openai .

# Launching OpenAI server
# VLLM_CPU_KVCACHE_SPACE and VLLM_CPU_OMP_THREADS_BIND are left blank here;
# set them to values that suit your machine.
docker run --rm \
    --security-opt seccomp=unconfined \
    --cap-add SYS_NICE \
    --shm-size=4g \
    -p 8000:8000 \
    -e VLLM_CPU_KVCACHE_SPACE= \
    -e VLLM_CPU_OMP_THREADS_BIND= \
    vllm-cpu-env \
    --model=meta-llama/Llama-3.2-1B-Instruct \
    --dtype=bfloat16
# Append other vLLM OpenAI server arguments after --dtype as needed.
```

# --8<-- [end:build-image-from-source]
# --8<-- [start:extra-information]
# --8<-- [end:extra-information]
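
The warning about unsupported AVX-512 features can be turned into a quick pre-build check. The following is a minimal sketch, not part of vLLM: the helper name `suggest_build_args` is hypothetical, and it assumes a Linux host where `/proc/cpuinfo` exposes a `flags` line (the same flags that `lscpu` reports). It only prints the build arguments named above; it does not run `docker build`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of vLLM): map CPU flags to the
# --build-arg options described in the pre-built images warning.

# Print suggested build arguments for a space-separated list of CPU flags.
suggest_build_args() {
    # Pad with spaces so each flag can be matched as a whole word.
    local flags=" $1 "
    local bf16=false
    local vnni=false

    if [[ "$flags" != *" avx512f "* ]]; then
        # No AVX-512 foundation support: fall back to the AVX2 build.
        echo "--build-arg VLLM_CPU_DISABLE_AVX512=true"
        return 0
    fi
    if [[ "$flags" == *" avx512_bf16 "* ]]; then
        bf16=true
    fi
    if [[ "$flags" == *" avx512_vnni "* ]]; then
        vnni=true
    fi
    echo "--build-arg VLLM_CPU_AVX512BF16=$bf16 --build-arg VLLM_CPU_AVX512VNNI=$vnni"
}

# Read the flags of the first CPU entry (assumes Linux; empty elsewhere).
cpu_flags="$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)"
suggest_build_args "$cpu_flags"
```

The printed arguments can be pasted into the `docker build -f docker/Dockerfile.cpu` command. If the flags cannot be read, the sketch conservatively suggests the AVX2 build, which, as noted above, has only basic feature support.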