xinyun / vllm
mirror of https://git.datalinker.icu/vllm-project/vllm.git
vllm / vllm / attention (directory history)
Latest commit: c42ff4f4fd by Adrian Abeyta: [BugFix][torch.compile] KV scale calculation issues with FP8 quantization (#25513)
Signed-off-by: adabeyta <aabeyta@redhat.com>
2025-09-29 15:52:04 -04:00
..
backends      [torch.compile] Make Query Quantization Fusable (#24914)                             2025-09-25 09:25:12 -04:00
layers        Directly get max encoder len from VLLM config in V1 (#24866)                         2025-09-16 17:52:31 +00:00
ops           [Misc] fix tests failure by using current_platform (#25825)                          2025-09-29 04:18:57 +00:00
utils         [Attention] FlashAttn MLA (#14258)                                                   2025-09-04 02:47:59 -07:00
__init__.py   [V0 Deprecation] Remove unused classes in attention (#25541)                         2025-09-24 13:24:40 -07:00
layer.py      [BugFix][torch.compile] KV scale calculation issues with FP8 quantization (#25513)   2025-09-29 15:52:04 -04:00
selector.py   [V0 deprecation] Remove _VLLM_V1 suffixes from attention backend names (#25489)      2025-09-25 17:37:50 +00:00
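
For orientation, the directory listed above corresponds to the public vllm.attention package. The sketch below shows how these files are typically imported from user code; it is a minimal illustration, not part of the listing, and the exact exports and signatures are assumptions that vary between vLLM releases.

```python
# Sketch only: how the files in vllm/attention map to imports in user code.
# These names exist in recent vLLM releases, but they change between versions.
from vllm.attention import Attention                  # attention layer, defined in layer.py
from vllm.attention.selector import get_attn_backend  # backend dispatch, defined in selector.py

# backends/ holds the per-kernel AttentionBackend implementations that
# get_attn_backend() chooses between; ops/ and utils/ contain the lower-level
# kernels and helpers those backends build on.
```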