vllm/examples/online_serving/disaggregated_serving
Reid 27d0952600
[Misc] extract parser.parse_args() (#18323)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-05-19 04:06:26 +00:00
..

Disaggregated Serving

This example contains scripts that demonstrate the disaggregated serving features of vLLM.

Files

  • disagg_proxy_demo.py - Demonstrates XpYd (X prefill instances, Y decode instances).
  • kv_events.sh - Demonstrates KV cache event publishing.