vllm/model_loader at f775a07e30fdeafc14f53fe502b262b00540dd71 - vllm - 丝路新云 code repository

xinyun/vllm

Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-05-11 18:05:43 +08:00

chenqianfzh b9c0605a8e

[Feature][Kernel] Support bitsandbytes quantization and QLoRA (#4776)

2024-06-01 14:51:10 -06:00

__init__.py

[Misc] Enhance attention selector (#4751)

2024-05-13 10:47:25 -07:00

loader.py

[Feature][Kernel] Support bitsandbytes quantization and QLoRA (#4776)

2024-06-01 14:51:10 -06:00

neuron.py

[Typing] Mypy typing part 2 (#4043)

2024-04-17 17:28:43 -07:00

tensorizer.py

[Frontend] [Core] perf: Automatically detect vLLM-tensorized model, update tensorizer to version 2.9.0 (#4208)

2024-05-13 14:57:07 -07:00

utils.py

[Kernel] FP8 support for MoE kernel / Mixtral (#4244)

2024-04-24 01:18:23 +00:00

weight_utils.py

[Feature][Kernel] Support bitsandbytes quantization and QLoRA (#4776)

2024-06-01 14:51:10 -06:00