mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-10 00:06:06 +08:00

docs: update fastsafetensors usage instructions (#22891)

Signed-off-by: Nir Levy <bhr166@gmail.com>

2025-08-14 19:56:54 +00:00

Loading Model weights with fastsafetensors

The fastsafetensors library loads model weights directly into GPU memory by leveraging GPUDirect Storage. See the fastsafetensors GitHub repository for more details.

To enable this feature, pass the --load-format fastsafetensors command-line argument.
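
As a minimal sketch of how the flag fits into a server launch, the helper below assembles a `vllm serve` command line that includes it. The function name `build_serve_command` and the model name `facebook/opt-125m` are illustrative placeholders, not part of the vLLM documentation, and the sketch assumes that both vLLM and the fastsafetensors package are installed in the environment where the command eventually runs.

```python
import shlex
from typing import List, Optional


def build_serve_command(model: str, extra_args: Optional[List[str]] = None) -> List[str]:
    """Build an argv list for `vllm serve` that selects the fastsafetensors loader.

    `model` is any model identifier or local path accepted by vLLM;
    `extra_args` are further command-line arguments appended unchanged.
    """
    cmd = ["vllm", "serve", model, "--load-format", "fastsafetensors"]
    if extra_args:
        cmd.extend(extra_args)
    return cmd


if __name__ == "__main__":
    # Placeholder model name, chosen only for illustration.
    cmd = build_serve_command("facebook/opt-125m")
    # Print the shell-quoted command instead of running it, so the sketch
    # works on machines without a GPU or a vLLM install.
    print(shlex.join(cmd))
    # To actually start the server, hand the list to subprocess:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when a model path contains spaces.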
