[Misc] Fix Qwen3-VL video_grid_thw typing (#25646)

Signed-off-by: Roger Wang <hey@rogerw.io>
Roger Wang 2025-09-25 03:16:45 -07:00 committed by GitHub
parent 393de22d2e
commit 7be9ffcd9f

@@ -1249,7 +1249,7 @@ class Qwen3VLForConditionalGeneration(nn.Module, SupportsMultiModal,
                                                          rope_type="rope_3d")
             else:
                 video_embeds = self.visual(pixel_values_videos,
-                                           grid_thw=grid_thw)
+                                           grid_thw=grid_thw_list)
         # Split concatenated embeddings for each video item.
         # Using prod on grid_thw_list instead of grid_thw.prod avoids CUDA sync