vllm/design at c46b932df2b801ba0a6452e436268f086029d82b - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-11 20:27:22 +08:00

prefix caching design doc sha256 now default (#29261)

Signed-off-by: redwrasse <mail@redwrasse.io>

2025-12-06 07:39:56 +00:00

arch_overview.md

[Docs] Replace all explicit anchors with real links (#27087)

2025-10-17 02:22:06 -07:00

cuda_graphs.md

[Core] Refactor padding logic and pad for CUDA graphs before attention metadata building (#28579)

2025-11-26 14:07:13 -05:00

dbo.md

[UX] Replace VLLM_ALL2ALL_BACKEND with --all2all-backend (#26732)

2025-10-13 18:12:52 -07:00

debug_vllm_compile.md

[Frontend] Remove deprecated -O.xx flag (#29991)

2025-12-05 00:47:22 -08:00

fused_moe_modular_kernel.md

[Docs] Enable some more markdown lint rules for the docs (#28731)

2025-11-14 18:39:19 +00:00

huggingface_integration.md

[Misc] Update TokenizerLike interface and move get_cached_tokenizer (#29730)

2025-11-30 14:59:47 +08:00

hybrid_kv_cache_manager.md

…

io_processor_plugins.md

[examples] Resettle pooling examples. (#29365)

2025-12-02 15:54:28 +00:00

logits_processors.md

[Doc]: fix typos in various files (#28863)

2025-11-17 20:32:14 -08:00

lora_resolver_plugins.md

docs(lora_resolvers): clarify multi-resolver order and storage path requirement (#28153)

2025-11-14 18:08:30 +00:00

metrics.md

docs: update metrics design doc to use new vllm:kv_cache_usage_perc (#30041)

2025-12-04 23:37:14 +00:00

mm_processing.md

[Docs] Replace all explicit anchors with real links (#27087)

2025-10-17 02:22:06 -07:00

moe_kernel_features.md

[Kernels] Remove BatchedTritonOrDeepGemmExperts and default fallback to Triton (#29929)

2025-12-03 20:49:00 +00:00

multiprocessing.md

[Docs] Replace all explicit anchors with real links (#27087)

2025-10-17 02:22:06 -07:00

optimization_levels.md

[Frontend][torch.compile] CompilationConfig Overhaul (#20283): Set up -O infrastructure (#26847)

2025-11-27 01:55:58 -08:00

p2p_nccl_connector.md

…

paged_attention.md

…

plugin_system.md

[Doc]: fix code block rendering (#29728)

2025-11-29 13:46:48 +00:00

prefix_caching.md

prefix caching design doc sha256 now default (#29261)

2025-12-06 07:39:56 +00:00

torch_compile.md

[Frontend] Remap -O to -cc commandline flag (#29557)

2025-11-28 21:51:12 +00:00