vllm / model_executor / layers / fused_moe
Latest commit: 27b78c73ca by Jinzhen Lin, 2025-01-29 09:07:09 -05:00
[Kernel] add triton fused moe kernel for gptq/awq (#12185)
configs/                  [ROCm][MoE] MI300 tuned configs Mixtral-8x(7B,22B) | fp16, fp8 (#12408)  2025-01-25 12:17:19 +08:00
__init__.py               [torch.compile] support moe models (#9632)  2024-10-27 21:58:04 -07:00
fused_marlin_moe.py       [optimization] remove python function call for custom op (#11750)  2025-01-07 17:04:28 +00:00
fused_moe.py              [Kernel] add triton fused moe kernel for gptq/awq (#12185)  2025-01-29 09:07:09 -05:00
layer.py                  [BugFix] Fix parameter names and process_after_weight_loading for W4A16 MoE Group Act Order (#11528)  2025-01-23 21:40:33 +00:00
moe_pallas.py             [Hardware][TPU] Support MoE with Pallas GMM kernel (#6457)  2024-07-16 09:56:28 -07:00
moe_torch_iterative.py    [Hardware][TPU] workaround fix for MoE on TPU (#11764)  2025-01-12 10:53:51 -05:00
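
For orientation, the sketch below shows the unfused computation that moe_torch_iterative.py and the fused Triton/Marlin/Pallas kernels in this directory all implement in some form: top-k routing over the router logits, a per-expert MLP, and a routing-weighted sum of expert outputs. The function name, argument shapes, and plain-SiLU experts here are illustrative assumptions, not vLLM's actual fused_moe signature; the fused kernels exist precisely to avoid this per-expert Python loop and its gather/scatter overhead.

```python
# Minimal, unfused reference of the computation a fused MoE kernel replaces.
# Illustrative sketch only: the function name, argument layout, and
# plain-SiLU experts are assumptions, not vLLM's fused_moe API.
import torch
import torch.nn.functional as F


def reference_moe(hidden_states: torch.Tensor,   # [num_tokens, hidden_size]
                  w_up: torch.Tensor,            # [num_experts, intermediate_size, hidden_size]
                  w_down: torch.Tensor,          # [num_experts, hidden_size, intermediate_size]
                  gating_output: torch.Tensor,   # [num_tokens, num_experts] router logits
                  topk: int,
                  renormalize: bool = True) -> torch.Tensor:
    # Route each token to its top-k experts.
    routing_weights = torch.softmax(gating_output, dim=-1)
    topk_weights, topk_ids = torch.topk(routing_weights, topk, dim=-1)
    if renormalize:
        topk_weights = topk_weights / topk_weights.sum(dim=-1, keepdim=True)

    out = torch.zeros_like(hidden_states)
    for expert_id in range(w_up.shape[0]):
        # Tokens whose top-k selection includes this expert.
        token_ids, slot = torch.where(topk_ids == expert_id)
        if token_ids.numel() == 0:
            continue
        x = hidden_states[token_ids]
        # Simple two-matmul expert MLP; real experts typically use a gated activation.
        y = F.silu(x @ w_up[expert_id].t()) @ w_down[expert_id].t()
        # Weight each expert's output by its routing probability and accumulate.
        contribution = topk_weights[token_ids, slot].unsqueeze(-1) * y
        out.index_add_(0, token_ids, contribution.to(out.dtype))
    return out
```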