[Misc] [ROCm] Prevent surplus tensor reshape (#19803)

Signed-off-by: Zsolt Borbely <zsolt.borbely@htecgroup.com>
2026-06-22 10:47:19 +08:00 · 2025-06-19 07:57:16 +02:00 · 2025-06-19 07:57:16 +02:00 · aa20d10a91
commit aa20d10a91
parent 2de12be428
1 changed file with 1 addition and 1 deletion
--- a/vllm/v1/attention/backends/triton_attn.py
+++ b/vllm/v1/attention/backends/triton_attn.py
@@ -376,7 +376,7 @@ class TritonAttentionImpl(AttentionImpl):
                    query.reshape(
                        (num_tokens, num_heads * head_size)).contiguous(),
                    layer._q_scale)
-            query = query.reshape((num_tokens, num_heads, head_size))
+                query = query.reshape((num_tokens, num_heads, head_size))

        use_local_attn = \
            (self.use_irope and attn_metadata.local_attn_metadata is not None)
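
Note on the change: the hunk moves the 3-D `query.reshape(...)` one indentation level deeper, so it appears to run only in the branch that first flattened `query` to 2-D for quantization. The enclosing condition and the quantization call sit above the hunk and are not shown here, so the sketch below is an illustration under assumptions, not the vLLM code. `CountingTensor`, `fake_quant`, and the `quantize` flag are hypothetical stand-ins.

```python
class CountingTensor:
    """Hypothetical stand-in for a tensor; it only tracks shape and reshape calls."""

    def __init__(self, shape, counter):
        self.shape = tuple(shape)
        self.counter = counter

    def reshape(self, shape):
        # Record every reshape so the two variants can be compared.
        self.counter["reshape"] += 1
        return CountingTensor(shape, self.counter)

    def contiguous(self):
        return self


def fake_quant(x, scale):
    # Placeholder for the quantization call above the hunk (assumed to
    # return a (tensor, scale) pair, as the call site suggests).
    return x, scale


def prepare_query_before(query, quantize, scale=1.0):
    num_tokens, num_heads, head_size = query.shape
    if quantize:
        query, _ = fake_quant(
            query.reshape((num_tokens, num_heads * head_size)).contiguous(),
            scale)
    # Old placement: runs even when query was never flattened.
    query = query.reshape((num_tokens, num_heads, head_size))
    return query


def prepare_query_after(query, quantize, scale=1.0):
    num_tokens, num_heads, head_size = query.shape
    if quantize:
        query, _ = fake_quant(
            query.reshape((num_tokens, num_heads * head_size)).contiguous(),
            scale)
        # New placement: restore the 3-D shape only after flattening.
        query = query.reshape((num_tokens, num_heads, head_size))
    return query


def count_reshapes(fn, quantize):
    counter = {"reshape": 0}
    out = fn(CountingTensor((4, 8, 64), counter), quantize)
    return counter["reshape"], out.shape


if __name__ == "__main__":
    for flag in (False, True):
        print(flag,
              count_reshapes(prepare_query_before, flag),
              count_reshapes(prepare_query_after, flag))
```

In this sketch both variants return the same shape. They differ only on the path where quantization is skipped, where the new placement performs no reshape at all.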