Disaggregated Prefill V1

This example contains scripts that demonstrate disaggregated prefill in the offline setting of vLLM.

Files

  • run.sh - A helper script that runs prefill_example.py and then decode_example.py, in that order.
    • Make sure you are in the examples/offline_inference/disaggregated-prefill-v1 directory before running run.sh.
  • prefill_example.py - A script that performs prefill only. It saves the KV state to the local_storage directory and the prompts to output.txt.
  • decode_example.py - A script that performs decode only. It loads the KV state from the local_storage directory and the prompts from output.txt.
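
The prompt handoff between the two scripts can be pictured with a small, vLLM-free sketch. It covers only the output.txt half of the exchange. The KV state in local_storage is not modeled, and no vLLM API is called. The helper names save_prompts and load_prompts are hypothetical. The one-prompt-per-line file layout is an assumption, not necessarily the format the example scripts use.

```python
import tempfile
from pathlib import Path


def save_prompts(prompts, path):
    """Prefill side (sketch): write one prompt per line.

    Assumption: prompts contain no embedded newlines.
    """
    Path(path).write_text("".join(p + "\n" for p in prompts), encoding="utf-8")


def load_prompts(path):
    """Decode side (sketch): read the prompts back, skipping blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line.strip()]


def demo():
    """Round-trip two prompts through a temporary output.txt."""
    prompts = ["Hello, my name is", "The capital of France is"]
    with tempfile.TemporaryDirectory() as workdir:
        handoff = Path(workdir) / "output.txt"  # stands in for the real output.txt
        save_prompts(prompts, handoff)
        return load_prompts(handoff)


if __name__ == "__main__":
    print(demo())
```

In the real example the decode step also needs the KV state that the prefill step saved, so both scripts must agree on the local_storage location as well as on the prompt file.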