xinyun / vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-04-02 17:47:02 +08:00
vllm / vllm / model_executor / layers / quantization / compressed_tensors

History
Latest commit: e9fd658a73 by Bowen Wang, [Feature] Expert Parallelism Load Balancer (EPLB) (#18343), 2025-06-26 15:30:21 -07:00
Signed-off-by: Bowen Wang <abmfy@icloud.com>
schemes                     [Quantization] Add compressed-tensors emulations support for NVFP4 (#19879)     2025-06-25 14:28:19 -04:00
__init__.py                 [Kernel] Initial Activation Quantization Support (#4525)                         2024-05-23 21:29:18 +00:00
compressed_tensors_moe.py   [Feature] Expert Parallelism Load Balancer (EPLB) (#18343)                      2025-06-26 15:30:21 -07:00
compressed_tensors.py       [Quantization] Add compressed-tensors emulations support for NVFP4 (#19879)     2025-06-25 14:28:19 -04:00
triton_scaled_mm.py         [AMD][Kernel][BugFix] fix test_rocm_compressed_tensors_w8a8 for rocm (#19509)   2025-06-12 07:14:24 +00:00
utils.py                    [Quantization] Add compressed-tensors NVFP4 support (#18312)                     2025-06-08 09:05:55 -04:00