fix: handle None tokenizer in multimodal processor initialization

When skip_tokenizer_init=True is set, the tokenizer is None. Previously,
this None value was unconditionally passed to the processor, which
overrode the processor's ability to load its own tokenizer from the
model path. This caused crashes in multimodal models like gemma-3 that
require a tokenizer during processor initialization.

The fix is to only pass the tokenizer kwarg when it's not None, allowing
the processor to load its own tokenizer when skip_tokenizer_init=True.

Fixes #31123

Signed-off-by: yurekami <69337011+yurekami@users.noreply.github.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: yurekami <yurekami@users.noreply.github.com>
yurekami 2025-12-24 23:23:59 +09:00
parent 7cd288a4b3
commit 2843784c1c


@@ -1046,10 +1046,19 @@ class InputProcessingContext:
             typ = ProcessorMixin
+        # Only pass tokenizer if not None to allow the processor to
+        # load its own tokenizer from the model path when skip_tokenizer_init
+        # is True. Passing tokenizer=None would override the processor's
+        # tokenizer loading and cause crashes in multimodal models that
+        # require a tokenizer during processor initialization.
+        tokenizer_kwargs = {}
+        if self.tokenizer is not None:
+            tokenizer_kwargs["tokenizer"] = self.tokenizer
         return cached_processor_from_config(
             self.model_config,
             processor_cls=typ,
-            tokenizer=self.tokenizer,
+            **tokenizer_kwargs,
             **kwargs,
         )
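The distinction the fix relies on can be illustrated with a standalone sketch. `load_processor` below is a hypothetical stand-in, not vLLM's actual `cached_processor_from_config`; it only mimics the relevant behavior of a processor that falls back to loading its own tokenizer when the kwarg is omitted, but treats an explicit `tokenizer=None` as an override:

```python
# Hypothetical stand-in for a processor factory: omitting the tokenizer
# kwarg lets it load one from the model path, while an explicit None
# overrides that fallback and fails, mirroring the reported crash.
_LOAD_FROM_MODEL_PATH = object()  # sentinel: "kwarg not supplied"

def load_processor(model_path, tokenizer=_LOAD_FROM_MODEL_PATH, **kwargs):
    if tokenizer is _LOAD_FROM_MODEL_PATH:
        # Kwarg omitted: the processor loads its own tokenizer.
        tokenizer = f"tokenizer-from-{model_path}"
    if tokenizer is None:
        # Explicit None overrides the fallback and crashes.
        raise ValueError("multimodal processor requires a tokenizer")
    return {"model": model_path, "tokenizer": tokenizer}

tokenizer = None  # what skip_tokenizer_init=True produces

# Old behavior: None is passed unconditionally and the call fails.
try:
    load_processor("gemma-3", tokenizer=tokenizer)
except ValueError as exc:
    print(exc)

# Fixed pattern: only forward the kwarg when the tokenizer exists.
tokenizer_kwargs = {}
if tokenizer is not None:
    tokenizer_kwargs["tokenizer"] = tokenizer
proc = load_processor("gemma-3", **tokenizer_kwargs)
print(proc["tokenizer"])  # tokenizer-from-gemma-3
```

The conditional-kwargs dict is the key move: unlike a default-valued parameter, there is no way to "pass nothing" positionally, so the caller must omit the key entirely for the callee's fallback to engage.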