Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2025-12-23 12:15:48 +08:00)
vllm/vllm/model_executor/layers/fused_moe
Latest commit: d6fc629f4d [Kernel][Minor] Re-fuse triton moe weight application (#16071)
Author: bnellnm (Signed-off-by: Bill Nell <bnell@redhat.com>)
Date: 2025-04-04 23:27:34 +00:00
File | Last commit | Last modified
configs/ | [Misc] Add tuned R1 w8a8 and MoE configs for NVIDIA L20 (#15322) | 2025-03-23 01:10:10 -07:00
__init__.py | [Minor] Fused experts refactor (#15914) | 2025-04-03 10:19:38 -07:00
cutlass_moe.py | [Minor] Fused experts refactor (#15914) | 2025-04-03 10:19:38 -07:00
deep_gemm_moe.py | [Minor] Fused experts refactor (#15914) | 2025-04-03 10:19:38 -07:00
fused_marlin_moe.py | [model][refactor] remove cuda hard code in models and layers (#13658) | 2025-02-24 06:10:14 -08:00
fused_moe.py | [Kernel][Minor] Re-fuse triton moe weight application (#16071) | 2025-04-04 23:27:34 +00:00
layer.py | [Hardware][Gaudi][BugFix] fix arguments of hpu fused moe (#15945) | 2025-04-04 09:38:55 -07:00
moe_align_block_size.py | [Minor] Fused experts refactor (#15914) | 2025-04-03 10:19:38 -07:00
moe_pallas.py | [Misc] Add SPDX-License-Identifier headers to python source files (#12628) | 2025-02-02 11:58:18 -08:00
moe_torch_iterative.py | Expert Parallelism (EP) Support for DeepSeek V2 (#12583) | 2025-02-24 07:33:20 -08:00
rocm_aiter_fused_moe.py | [Bugfix] Fix imports for MoE on CPU (#15841) | 2025-04-02 03:33:55 +00:00
utils.py | [Minor] Fused experts refactor (#15914) | 2025-04-03 10:19:38 -07:00
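
For orientation: the listing pairs the Triton-fused MoE kernel path (fused_moe.py, with moe_align_block_size.py grouping tokens by expert into block-aligned batches) with reference and backend-specific alternatives such as moe_torch_iterative.py, moe_pallas.py, rocm_aiter_fused_moe.py, cutlass_moe.py, and deep_gemm_moe.py. Below is a minimal, self-contained pure-PyTorch sketch of the kind of iterative expert routing a reference path like moe_torch_iterative.py provides; it is not the module's actual code, and the function and parameter names are assumptions chosen for the example. The fused Triton path computes the same gated-MLP result without the per-expert Python loop.

# Minimal sketch of iterative MoE routing in pure PyTorch.
# Illustrative only: names and shapes are assumptions, not the file's actual API.
import torch


def iterative_moe(hidden_states: torch.Tensor,   # [num_tokens, hidden_size]
                  w1: torch.Tensor,              # [num_experts, 2 * intermediate_size, hidden_size]
                  w2: torch.Tensor,              # [num_experts, hidden_size, intermediate_size]
                  gating_output: torch.Tensor,   # [num_tokens, num_experts]
                  topk: int) -> torch.Tensor:
    # Pick each token's top-k experts and renormalize the routing weights.
    topk_weights, topk_ids = torch.topk(torch.softmax(gating_output, dim=-1), topk, dim=-1)
    topk_weights = topk_weights / topk_weights.sum(dim=-1, keepdim=True)

    out = torch.zeros_like(hidden_states)
    for expert_id in range(w1.shape[0]):
        # (token, slot) pairs routed to this expert; a token appears at most once per expert.
        token_idx, slot_idx = torch.where(topk_ids == expert_id)
        if token_idx.numel() == 0:
            continue
        x = hidden_states[token_idx]
        # Gated MLP: w1 yields [gate, up]; apply SiLU(gate) * up, then project back with w2.
        gate, up = (x @ w1[expert_id].t()).chunk(2, dim=-1)
        expert_out = (torch.nn.functional.silu(gate) * up) @ w2[expert_id].t()
        out[token_idx] += topk_weights[token_idx, slot_idx].unsqueeze(-1) * expert_out
    return out


# Tiny CPU smoke test with random weights.
tokens, hidden, inter, experts, k = 4, 8, 16, 3, 2
h = torch.randn(tokens, hidden)
print(iterative_moe(h, torch.randn(experts, 2 * inter, hidden),
                    torch.randn(experts, hidden, inter),
                    torch.randn(tokens, experts), k).shape)  # torch.Size([4, 8])

Broadly, the per-expert Python loop above is what the fused Triton kernel replaces with batched, block-aligned GEMMs so that all experts are processed in a small number of kernel launches.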