mirror of
https://git.datalinker.icu/vllm-project/vllm.git
synced 2026-01-23 20:54:36 +08:00
[Doc] Update the doc for log probs + prefix caching (#23399)
Signed-off-by: Chen Zhang <zhangch99@outlook.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
This commit is contained in:
parent
695e7adcd2
commit
a073be6d87
@ -166,7 +166,7 @@ Processed means the values after applying all processors, including temperature
|
||||
|
||||
##### Prompt Logprobs with Prefix Caching
|
||||
|
||||
Currently prompt logprobs are only supported when prefix caching is turned off via `--no-enable-prefix-caching`. In a future release, prompt logprobs will be compatible with prefix caching, but a recomputation will be triggered to recover the full prompt logprobs even upon a prefix cache hit. See details in [RFC #13414](gh-issue:13414).
|
||||
Logprobs are not cached. For a request requiring prompt logprobs, the engine will ignore the prefix cache and recompute the prefill of full prompt to generate the logprobs.
|
||||
|
||||
#### Deprecated Features
|
||||
|
||||
|
||||
Loading…
x
Reference in New Issue
Block a user