xinyun/vllm
mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-06-14 17:17:25 +08:00
vllm/vllm/model_executor/layers/quantization/compressed_tensors
Lucas Wilkinson | cd7edc4e87 | [Bugfix] Fix empty (nullptr) channelwise scales when loading wNa16 using compressed tensors (#6798) | 2024-07-25 15:05:09 -07:00
schemes | [Bugfix] Fix empty (nullptr) channelwise scales when loading wNa16 using compressed tensors (#6798) | 2024-07-25 15:05:09 -07:00
__init__.py | [Kernel] Initial Activation Quantization Support (#4525) | 2024-05-23 21:29:18 +00:00
compressed_tensors.py | [ Misc ] fp8-marlin channelwise via compressed-tensors (#6524) | 2024-07-25 09:46:04 -07:00
utils.py | [Misc] Add ignored layers for fp8 quantization (#6657) | 2024-07-23 14:04:04 -04:00