From 23472ff51cdf25c2f9c9bf9afa50a8d3cc6cc1d8 Mon Sep 17 00:00:00 2001
From: Roger Wang
Date: Fri, 8 Aug 2025 23:04:19 -0700
Subject: [PATCH] [Doc] Add usage of implicit text-only mode (#22561)

Signed-off-by: Roger Wang
Co-authored-by: Flora Feng <4florafeng@gmail.com>
---
 docs/models/supported_models.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/models/supported_models.md b/docs/models/supported_models.md
index b79650444a54c..afabfccb552f8 100644
--- a/docs/models/supported_models.md
+++ b/docs/models/supported_models.md
@@ -583,6 +583,9 @@ See [this page](../features/multimodal_inputs.md) on how to pass multi-modal inp
 
 **This is no longer required if you are using vLLM V1.**
 
+!!! tip
+    For hybrid-only models such as Llama-4, Step3 and Mistral-3, a text-only mode can be enabled by setting all supported multimodal modalities to 0 (e.g., `--limit-mm-per-prompt '{"image":0}'`) so that their multimodal modules will not be loaded, freeing up more GPU memory for the KV cache.
+
 !!! note
     vLLM currently only supports adding LoRA to the language backbone of multimodal models.