
Disaggregated Prefill V1

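This example contains scripts that demonstrate disaggregated prefill with vLLM's offline inference. The prefill and decode phases run as separate scripts, and the KV state is passed between them through a directory on disk.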

Files

  • run.sh - A helper script that runs prefill_example.py and then decode_example.py, in that order.
    • Change into the examples/offline_inference/disaggregated-prefill-v1 directory before running run.sh.
  • prefill_example.py - Performs prefill only. It saves the KV state to the local_storage directory and writes the prompts to output.txt.
  • decode_example.py - Performs decode only. It loads the KV state from the local_storage directory and reads the prompts from output.txt.
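
The two scripts communicate only through files: the local_storage directory and output.txt. The following is a minimal sketch of that hand-off, not the code from prefill_example.py or decode_example.py. The helper names (save_prompts, load_prompts, ready_for_decode) are hypothetical, and the one-prompt-per-line layout of output.txt is an assumption made for illustration. No vLLM calls are involved.

```python
from pathlib import Path

# File names taken from the README; everything else here is illustrative.
PROMPTS_FILE = Path("output.txt")
KV_DIR = Path("local_storage")


def save_prompts(prompts, path=PROMPTS_FILE):
    """Prefill side: write one prompt per line (assumed file layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for prompt in prompts:
            # Assumption: each prompt must fit on one line, so embedded
            # newlines are replaced with spaces.
            f.write(prompt.replace("\n", " ") + "\n")


def load_prompts(path=PROMPTS_FILE):
    """Decode side: read the prompts back, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if line.strip()]


def ready_for_decode(prompts_file=PROMPTS_FILE, kv_dir=KV_DIR):
    """Return True once both prefill artifacts exist on disk.

    The KV directory must exist and contain at least one entry.
    """
    prompts_file, kv_dir = Path(prompts_file), Path(kv_dir)
    return (
        prompts_file.is_file()
        and kv_dir.is_dir()
        and any(kv_dir.iterdir())
    )
```

A check like ready_for_decode could guard the decode step when the two scripts are launched by hand instead of through run.sh, since decode depends on both artifacts from the prefill run.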