diff --git a/docs/features/spec_decode.md b/docs/features/spec_decode.md
index 89d5b489e1888..597a8e8644278 100644
--- a/docs/features/spec_decode.md
+++ b/docs/features/spec_decode.md
@@ -203,6 +203,7 @@ an [EAGLE (Extrapolation Algorithm for Greater Language-model Efficiency)](https
         "model": "yuhuili/EAGLE-LLaMA3-Instruct-8B",
         "draft_tensor_parallel_size": 1,
         "num_speculative_tokens": 2,
+        "method": "eagle",
     },
 )
 
@@ -231,6 +232,9 @@ A few important things to consider when using the EAGLE based draft models:
    reported in the reference implementation [here](https://github.com/SafeAILab/EAGLE).
    This issue is under investigation and tracked here: .
 
+4. When using EAGLE-3 based draft model, option "method" must be set to "eagle3".
+   That is, to specify `"method": "eagle3"` in `speculative_config`.
+
 A variety of EAGLE draft models are available on the Hugging Face hub:
 
 | Base Model | EAGLE on Hugging Face | # EAGLE Parameters |