[Doc] Update the doc for log probs + prefix caching (#23399)

Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Chen Zhang 2025-08-22 06:20:39 -07:00 committed by GitHub
parent 695e7adcd2
commit a073be6d87

@@ -166,7 +166,7 @@ Processed means the values after applying all processors, including temperature
##### Prompt Logprobs with Prefix Caching
-Currently prompt logprobs are only supported when prefix caching is turned off via `--no-enable-prefix-caching`. In a future release, prompt logprobs will be compatible with prefix caching, but a recomputation will be triggered to recover the full prompt logprobs even upon a prefix cache hit. See details in [RFC #13414](gh-issue:13414).
+Logprobs are not cached. For a request requiring prompt logprobs, the engine will ignore the prefix cache and recompute the prefill of full prompt to generate the logprobs.
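
Reviewer note: the updated wording (a request that needs prompt logprobs ignores the prefix cache and prefills the full prompt again) can be illustrated with a toy model. This is a minimal sketch and not vLLM code: `ToyEngine`, `tokens_to_prefill`, and the set-based cache are hypothetical names invented for illustration, and the sketch only counts how many prompt tokens would need to be prefilled.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEngine:
    """Hypothetical stand-in for an engine with a prefix cache (not vLLM code)."""

    # Every prompt prefix whose state we pretend is cached.
    cache: set = field(default_factory=set)

    def tokens_to_prefill(self, prompt: tuple, want_prompt_logprobs: bool) -> int:
        # Length of the longest cached prefix of this prompt (0 if none).
        hit = max(
            (n for n in range(1, len(prompt) + 1) if prompt[:n] in self.cache),
            default=0,
        )
        if want_prompt_logprobs:
            # Logprobs are not cached, so the hit is ignored and the
            # full prompt is prefilled again.
            hit = 0
        # Remember all prefixes of this prompt for later requests.
        self.cache.update(prompt[:n] for n in range(1, len(prompt) + 1))
        return len(prompt) - hit


engine = ToyEngine()
shared = (1, 2, 3, 4)
cold = engine.tokens_to_prefill(shared, want_prompt_logprobs=False)
warm = engine.tokens_to_prefill(shared + (5, 6), want_prompt_logprobs=False)
with_logprobs = engine.tokens_to_prefill(shared + (7, 8), want_prompt_logprobs=True)
print(cold, warm, with_logprobs)
```

In this toy model the second request reuses the four shared tokens, while the third request pays for all six tokens because it asks for prompt logprobs. The practical takeaway matches the new doc text: prompt logprobs work with prefix caching enabled, but such requests do not benefit from cache hits.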
#### Deprecated Features