vllm/vllm/executor (mirror of https://git.datalinker.icu/vllm-project/vllm.git)

Latest commit: a58f24e590 by zifeitong, "[Bugfix] Fix torch.compile() error when using MultiprocessingGPUExecutor (#5229)", 2024-06-03 20:55:50 -07:00
File                         Last commit                                                                            Date
__init__.py                  Add distributed model executor abstraction (#3191)                                    2024-03-11 11:03:45 -07:00
cpu_executor.py              [Misc][Refactor] Introduce ExecuteModelData (#4540)                                   2024-05-03 17:47:07 -07:00
distributed_gpu_executor.py  [Core] Eliminate parallel worker per-step task scheduling overhead (#4894)            2024-05-23 06:17:27 +09:00
executor_base.py             [Core] Eliminate parallel worker per-step task scheduling overhead (#4894)            2024-05-23 06:17:27 +09:00
gpu_executor.py              [Speculative decoding][Re-take] Enable TP>1 speculative decoding (#4840)              2024-05-16 00:53:51 -07:00
multiproc_gpu_executor.py    [Bugfix] Fix torch.compile() error when using MultiprocessingGPUExecutor (#5229)      2024-06-03 20:55:50 -07:00
multiproc_worker_utils.py    [Misc] centralize all usage of environment variables (#4548)                          2024-05-02 11:13:25 -07:00
neuron_executor.py           [Misc][Refactor] Introduce ExecuteModelData (#4540)                                   2024-05-03 17:47:07 -07:00
ray_gpu_executor.py          [Core] Eliminate parallel worker per-step task scheduling overhead (#4894)            2024-05-23 06:17:27 +09:00
ray_utils.py                 [Core] Add MultiprocessingGPUExecutor (#4539)                                         2024-05-14 10:38:59 -07:00
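
This directory holds vLLM's distributed model executor abstraction introduced in #3191: executor_base.py defines the common interface, and the sibling files (gpu_executor.py, multiproc_gpu_executor.py, ray_gpu_executor.py, cpu_executor.py, neuron_executor.py) provide backend-specific implementations. For orientation, a minimal sketch of such an interface follows; the method names and signatures here are illustrative assumptions inferred from the listing, not vLLM's exact API.

    # Minimal sketch of an executor interface in the spirit of executor_base.py.
    # Names and signatures are illustrative assumptions, not vLLM's exact API.
    from abc import ABC, abstractmethod
    from typing import Any, List, Tuple


    class ExecutorBase(ABC):
        """Drives model workers for one backend (in-process GPU,
        multiprocessing, Ray, CPU, Neuron, ...)."""

        @abstractmethod
        def determine_num_available_blocks(self) -> Tuple[int, int]:
            """Profile memory and return (num_gpu_blocks, num_cpu_blocks)."""

        @abstractmethod
        def initialize_cache(self, num_gpu_blocks: int, num_cpu_blocks: int) -> None:
            """Allocate the KV cache on every worker."""

        @abstractmethod
        def execute_model(self, execute_model_req: Any) -> List[Any]:
            """Run one scheduler step on all workers and return sampler output."""

        @abstractmethod
        def check_health(self) -> None:
            """Raise an exception if any worker is unresponsive."""

Under this split, the engine talks only to the base interface, so choosing between a single GPU process, a multiprocessing pool, or a Ray cluster is a matter of instantiating a different subclass.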