vllm/compressed_tensors at cd7edc4e8726d4b87e121f9ec671ecb6dd0c45d6 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-14 17:17:25 +08:00

Lucas Wilkinson cd7edc4e87

[Bugfix] Fix empty (nullptr) channelwise scales when loading wNa16 using compressed tensors (#6798)

2024-07-25 15:05:09 -07:00

__init__.py

[Kernel] Initial Activation Quantization Support (#4525)

2024-05-23 21:29:18 +00:00

compressed_tensors.py

[ Misc ] fp8-marlin channelwise via compressed-tensors (#6524)

2024-07-25 09:46:04 -07:00

utils.py

[Misc] Add ignored layers for fp8 quantization (#6657)

2024-07-23 14:04:04 -04:00