vllm/worker at e433c115bce2bf27f7b1abdde7029566007d9eee - vllm - 丝路新云 Code Repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-08-01 12:47:54 +08:00

Zhuohan Li 537c9755a7

[Minor] Small fix to make distributed init logic in worker looks cleaner (#2905)

2024-02-18 14:39:00 -08:00

..

[Speculative decoding 2/9] Multi-step worker for draft model (#2424)

2024-01-21 16:31:47 -08:00

__init__.py

Change the name to vLLM (#150)

2023-06-17 03:07:40 -07:00

cache_engine.py

Remove hardcoded device="cuda" to support more devices (#2503)

2024-02-01 15:46:39 -08:00

model_runner.py

Don't use cupy NCCL for AMD backends (#2855)

2024-02-14 12:30:44 -08:00

worker.py

[Minor] Small fix to make distributed init logic in worker looks cleaner (#2905)

2024-02-18 14:39:00 -08:00