diff --git a/docs/features/spec_decode.md b/docs/features/spec_decode.md
index 4be6bd01a4eb..be4b91feda7a 100644
--- a/docs/features/spec_decode.md
+++ b/docs/features/spec_decode.md
@@ -256,12 +256,12 @@ speculative decoding, breaking down the guarantees into three key areas:
 2. **Algorithmic Losslessness** \- vLLM’s implementation of speculative decoding is algorithmically validated to be lossless. Key validation tests include:
- > - **Rejection Sampler Convergence**: Ensures that samples from vLLM’s rejection sampler align with the target
- > distribution. [View Test Code](https://github.com/vllm-project/vllm/blob/47b65a550866c7ffbd076ecb74106714838ce7da/tests/samplers/test_rejection_sampler.py#L252)
- > - **Greedy Sampling Equality**: Confirms that greedy sampling with speculative decoding matches greedy sampling
- > without it. This verifies that vLLM's speculative decoding framework, when integrated with the vLLM forward pass and the vLLM rejection sampler,
- > provides a lossless guarantee. Almost all of the tests in .
- > verify this property using [this assertion implementation](https://github.com/vllm-project/vllm/blob/b67ae00cdbbe1a58ffc8ff170f0c8d79044a684a/tests/spec_decode/e2e/conftest.py#L291)
+ > - **Rejection Sampler Convergence**: Ensures that samples from vLLM’s rejection sampler align with the target
+ > distribution. [View Test Code](https://github.com/vllm-project/vllm/blob/47b65a550866c7ffbd076ecb74106714838ce7da/tests/samplers/test_rejection_sampler.py#L252)
+ > - **Greedy Sampling Equality**: Confirms that greedy sampling with speculative decoding matches greedy sampling
+ > without it. This verifies that vLLM's speculative decoding framework, when integrated with the vLLM forward pass and the vLLM rejection sampler,
+ > provides a lossless guarantee. Almost all of the tests in .
+ > verify this property using [this assertion implementation](https://github.com/vllm-project/vllm/blob/b67ae00cdbbe1a58ffc8ff170f0c8d79044a684a/tests/spec_decode/e2e/conftest.py#L291)
 3. **vLLM Logprob Stability** \- vLLM does not currently guarantee stable token log probabilities (logprobs). This can result in different outputs for the