Loading model weights with fastsafetensors
The fastsafetensors library loads model weights directly into GPU memory by leveraging GPU Direct Storage. See https://github.com/foundation-model-stack/fastsafetensors for more details.
To enable this feature, set the environment variable USE_FASTSAFETENSOR to true.
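
As a minimal sketch of the step above, the variable can be set from Python before vLLM starts. The helper name `fastsafetensors_env` is hypothetical, and the sketch assumes the variable must be present in the environment of the process that loads the weights; it only builds that environment and does not start vLLM itself.

```python
import os


def fastsafetensors_env(base_env=None):
    """Return a copy of an environment mapping with USE_FASTSAFETENSOR enabled.

    Hypothetical helper for illustration; the input mapping is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["USE_FASTSAFETENSOR"] = "true"
    return env


# Build the environment for a child process that will load the model.
env = fastsafetensors_env()
print(env["USE_FASTSAFETENSOR"])

# The mapping can then be handed to a launcher, for example:
#   subprocess.run(["vllm", "serve", "<model>"], env=env)
# where "<model>" is a placeholder for the model to load.
```

Copying the environment rather than mutating `os.environ` keeps the setting scoped to the launched process. Exporting the variable in the shell before starting vLLM has the same effect.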