vllm / docs / design
Latest commit: 304419576a by Benjamin Chislett, 2025-11-13 01:56:40 +09:00
[Perf] Refactor cudagraph_support to enable full CUDA graphs for spec decoding with FlashInfer (#28479)
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
| File | Last commit | Last commit date |
| --- | --- | --- |
| arch_overview.md | … | |
| cuda_graphs.md | [Perf] Refactor cudagraph_support to enable full CUDA graphs for spec decoding with FlashInfer (#28479) | 2025-11-13 01:56:40 +09:00 |
| dbo.md | … | |
| debug_vllm_compile.md | [Docs] Add guide to debugging vLLM-torch.compile integration (#28094) | 2025-11-05 21:31:46 +00:00 |
| fused_moe_modular_kernel.md | … | |
| huggingface_integration.md | … | |
| hybrid_kv_cache_manager.md | … | |
| io_processor_plugins.md | [Frontend][Doc][5/N] Improve all pooling task \| Polish encode (pooling) api & Document. (#25524) | 2025-10-30 12:13:05 +00:00 |
| logits_processors.md | [Bugfix] Validate custom logits processor xargs for online serving (#27560) | 2025-11-05 16:53:33 +00:00 |
| metrics.md | [Doc] Fix minor issues in docs/design/metrics.md (#27436) | 2025-10-24 05:40:54 -07:00 |
| mm_processing.md | … | |
| moe_kernel_features.md | [RFC][ROCm][AITER] Keep all AITER kernels in `_aiter_ops` class like `_custom_ops` and `_ipex_ops` (#24490) | 2025-11-10 08:20:53 -08:00 |
| multiprocessing.md | … | |
| p2p_nccl_connector.md | … | |
| paged_attention.md | … | |
| plugin_system.md | … | |
| prefix_caching.md | [Doc] Fix numbering sequence in prefix caching (#27357) | 2025-10-22 17:35:47 +00:00 |
| torch_compile.md | [BUG] Make 'binary' default option for saving torch compile artifacts when using standalone_compile (#27616) | 2025-11-03 11:13:51 -05:00 |