From a073be6d87c6480ecd725bd475cc4f30fd747aa4 Mon Sep 17 00:00:00 2001
From: Chen Zhang
Date: Fri, 22 Aug 2025 06:20:39 -0700
Subject: [PATCH] [Doc] Update the doc for log probs + prefix caching (#23399)

Signed-off-by: Chen Zhang
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---
 docs/usage/v1_guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/usage/v1_guide.md b/docs/usage/v1_guide.md
index b89768913681e..7fc615d4c042f 100644
--- a/docs/usage/v1_guide.md
+++ b/docs/usage/v1_guide.md
@@ -166,7 +166,7 @@ Processed means the values after applying all processors, including temperature
 
 ##### Prompt Logprobs with Prefix Caching
 
-Currently prompt logprobs are only supported when prefix caching is turned off via `--no-enable-prefix-caching`. In a future release, prompt logprobs will be compatible with prefix caching, but a recomputation will be triggered to recover the full prompt logprobs even upon a prefix cache hit. See details in [RFC #13414](gh-issue:13414).
+Logprobs are not cached. For a request requiring prompt logprobs, the engine will ignore the prefix cache and recompute the prefill of full prompt to generate the logprobs.
 
 #### Deprecated Features
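
Note on the new wording: the behavior described by the added line (logprobs are not cached, so a request for prompt logprobs skips the prefix cache and prefills the whole prompt) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not vLLM code: `Request`, `tokens_to_prefill`, and `cached_prefix_len` are hypothetical names invented for illustration, and the model ignores scheduler details such as block granularity.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request: prompt length in tokens, plus a prompt-logprobs flag."""
    prompt_len: int
    wants_prompt_logprobs: bool = False


def tokens_to_prefill(request: Request, cached_prefix_len: int) -> int:
    """Return how many prompt tokens must go through prefill in this toy model.

    cached_prefix_len is the number of leading prompt tokens already held
    in a (hypothetical) prefix cache.
    """
    if not 0 <= cached_prefix_len <= request.prompt_len:
        raise ValueError("cached_prefix_len must be between 0 and prompt_len")
    if request.wants_prompt_logprobs:
        # Logprobs are not cached, so the cache hit is ignored and the
        # full prompt is recomputed.
        return request.prompt_len
    # Without prompt logprobs, only the uncached suffix needs prefill.
    return request.prompt_len - cached_prefix_len


plain = Request(prompt_len=100)
with_logprobs = Request(prompt_len=100, wants_prompt_logprobs=True)
print(tokens_to_prefill(plain, cached_prefix_len=64))          # 36
print(tokens_to_prefill(with_logprobs, cached_prefix_len=64))  # 100
```

The practical consequence the sketch tries to capture is that requests asking for prompt logprobs pay the full prefill cost even when a long prefix is already cached.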