xinyun / vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2026-03-25 05:29:12 +08:00)
vllm / vllm / model_executor (history)
Latest commit: c1ffcb55da by Wentao Ye
[Refactor] Optimize FP8 MOE Backend Choice and Log (#26044)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-10-03 15:23:42 -06:00
layers        | [Refactor] Optimize FP8 MOE Backend Choice and Log (#26044)                                                              | 2025-10-03 15:23:42 -06:00
model_loader  | [Quantization/NVFP4] Speed up TRTLLM NVFP4 MOE weight loading and fix K/V scale loading for MLA Attn (#25968)             | 2025-10-03 19:35:06 +00:00
models        | [BugFix][QWEN-VL] Fix wrong apply_rotary_emb_torch selection introduced by #24642 (#26123)                                | 2025-10-03 08:52:26 -07:00
warmup        | [V0 deprecation] Remove _VLLM_V1 suffixes from attention backend names (#25489)                                           | 2025-09-25 17:37:50 +00:00
__init__.py   | [V0 Deprecation] Remove V0 sampling metadata (#25345)                                                                     | 2025-09-21 10:37:11 -07:00
custom_op.py  | [V0 deprecation] Deprecate V0 Neuron backend (#21159)                                                                     | 2025-09-06 16:15:18 -07:00
parameter.py  | Revert "[Bug] Dynamo Unsupported due to BasevLLMParameter.torch_function calling disabled super()" (#25681)               | 2025-09-25 09:45:06 -07:00
utils.py      | [OOT] Support sync_model_loading for OOT (#25126)                                                                         | 2025-09-19 05:41:53 +00:00