xinyun/vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-04-08 21:37:10 +08:00
vllm/vllm/attention
Latest commit: 01a583fea4 by jvlunteren, 2025-09-18 14:27:01 +00:00
[Kernel] Decouple Tile Size from Block Size in Triton Unified Attention Kernel (#21197)
Signed-off-by: Jan van Lunteren <jvl@zurich.ibm.com>
Name         Last commit                                                                                 Date
backends     [Bug] Fix is_flashmla_supported Check Error (#24774)                                        2025-09-15 20:10:55 -06:00
layers       Directly get max encoder len from VLLM config in V1 (#24866)                                2025-09-16 17:52:31 +00:00
ops          [Kernel] Decouple Tile Size from Block Size in Triton Unified Attention Kernel (#21197)     2025-09-18 14:27:01 +00:00
utils        [Attention] FlashAttn MLA (#14258)                                                          2025-09-04 02:47:59 -07:00
__init__.py  Remove duplicate entry in vllm.attention.__all__ (#23296)                                   2025-08-20 17:14:59 -07:00
layer.py     [XPU] Whisper model support on XPU Platform (#25123)                                         2025-09-18 04:30:10 +00:00
selector.py  [gpt-oss] Enable gpt-oss on ampere (#22714)                                                  2025-08-12 03:21:44 -07:00