From ba6011027dd68ce3bff9de9d8001354f095cee5c Mon Sep 17 00:00:00 2001
From: Russell Bryant
Date: Thu, 11 Sep 2025 04:50:08 -0400
Subject: [PATCH] [Docs] Update V1 doc to reflect whisper support (#24606)

Signed-off-by: Russell Bryant
---
 docs/models/supported_models.md | 2 +-
 docs/usage/v1_guide.md          | 7 ++++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/docs/models/supported_models.md b/docs/models/supported_models.md
index a3cc1cda5f866..de28ca37396eb 100644
--- a/docs/models/supported_models.md
+++ b/docs/models/supported_models.md
@@ -766,7 +766,7 @@ Speech2Text models trained specifically for Automatic Speech Recognition.
 
 | Architecture | Models | Example HF Models | [LoRA](../features/lora.md) | [PP](../serving/parallelism_scaling.md) | [V1](gh-issue:8779) |
 |--------------|--------|-------------------|----------------------|---------------------------|---------------------|
-| `WhisperForConditionalGeneration` | Whisper | `openai/whisper-small`, `openai/whisper-large-v3-turbo`, etc. | | | |
+| `WhisperForConditionalGeneration` | Whisper | `openai/whisper-small`, `openai/whisper-large-v3-turbo`, etc. | | | ✅︎ |
 | `VoxtralForConditionalGeneration` | Voxtral (Mistral format) | `mistralai/Voxtral-Mini-3B-2507`, `mistralai/Voxtral-Small-24B-2507`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `Gemma3nForConditionalGeneration` | Gemma3n | `google/gemma-3n-E2B-it`, `google/gemma-3n-E4B-it`, etc. | | | ✅︎ |
 
diff --git a/docs/usage/v1_guide.md b/docs/usage/v1_guide.md
index 525f740d12a7f..d404c87e8f5a7 100644
--- a/docs/usage/v1_guide.md
+++ b/docs/usage/v1_guide.md
@@ -83,7 +83,7 @@ based on assigned priority, with FCFS as a tie-breaker), configurable via the
 | Model Type | Status |
 |-----------------------------|------------------------------------------------------------------------------------|
 | **Decoder-only Models** | 🚀 Optimized |
-| **Encoder-Decoder Models** | 🟠 Delayed |
+| **Encoder-Decoder Models** | 🟢 Whisper only |
 | **Embedding Models** | 🟢 Functional |
 | **Mamba Models** | 🟢 (Mamba-2), 🟢 (Mamba-1) |
 | **Multimodal Models** | 🟢 Functional |
@@ -118,8 +118,9 @@ Please note that prefix caching is not yet supported for any of the above models
 
 #### Encoder-Decoder Models
 
-Models requiring cross-attention between separate encoder and decoder (e.g., `BartForConditionalGeneration`, `MllamaForConditionalGeneration`)
-are not yet supported.
+Whisper is supported. Other models requiring cross-attention between separate
+encoder and decoder (e.g., `BartForConditionalGeneration`,
+`MllamaForConditionalGeneration`) are not yet supported.
 
 ### Features
 