vllm/parallel_utils at d27f4bae393214b4e7715fc3cb5754d4bf801bce - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-22 16:17:18 +08:00

explainerauthors a1125ad4df

Correct comments in parallel_state.py (#1818)

2023-11-28 10:19:35 -08:00

__init__.py

TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181)

2023-10-02 15:36:09 -07:00

communication_op.py

Implement prompt logprobs & Batched topk for computing logprobs (#1328)

2023-10-16 10:56:50 -07:00

parallel_state.py

Correct comments in parallel_state.py (#1818)

2023-11-28 10:19:35 -08:00

README.md

Change the name to vLLM (#150)

2023-06-17 03:07:40 -07:00

utils.py

TP/quantization/weight loading refactor part 2 - Refactor quantized linear logic and extend quantization support to all models (#1622)

2023-11-15 22:50:41 -08:00

README.md

The files in this folder are ported from Megatron-LM. We keep only the code that is used for inference.