Loading Model Weights with fastsafetensors
The fastsafetensors library loads model weights directly into GPU memory by leveraging GPUDirect Storage. See the fastsafetensors GitHub repository for more details.
To enable this feature, pass the --load-format fastsafetensors command-line argument when starting vLLM.
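As a minimal sketch of how that argument fits into a launch command, the snippet below assembles the argument list for the `vllm serve` entry point. The helper name `build_serve_command` and the model name `facebook/opt-125m` are placeholders chosen for illustration and do not come from this page. The snippet only builds and prints the command; it does not start a server.

```python
import shlex


def build_serve_command(model: str) -> list[str]:
    """Return the argv for `vllm serve` with the fastsafetensors loader selected.

    `build_serve_command` is a hypothetical helper for this example only.
    """
    return ["vllm", "serve", model, "--load-format", "fastsafetensors"]


# Placeholder model name, used purely for illustration.
cmd = build_serve_command("facebook/opt-125m")

# Print the command as it would be typed in a shell.
print(shlex.join(cmd))
```

On a machine where vLLM is installed (and, presumably, the fastsafetensors package alongside it), the resulting list could be handed to `subprocess.run(cmd)` to start the server.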