[Bugfix][DCP] Set default CUDAGraphMode to PIECEWISE for DCP (#26574)

Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
Jaya Yuan 2025-10-12 17:58:38 +08:00 committed by GitHub
parent 045b396d09
commit b91d8db873

@@ -350,6 +350,15 @@ class VllmConfig:
or self.model_config.is_encoder_decoder
):
self.compilation_config.cudagraph_mode = CUDAGraphMode.PIECEWISE
# Decode context parallel (DCP) does not support full CUDA graphs yet.
if self.parallel_config.decode_context_parallel_size > 1:
logger.warning(
"Decode context parallel (DCP) is enabled, which is "
"incompatible with full CUDA graphs. Set "
"cudagraph_mode to PIECEWISE."
)
self.compilation_config.cudagraph_mode = CUDAGraphMode.PIECEWISE
else:
self.compilation_config.cudagraph_mode = CUDAGraphMode.NONE
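
For reference, the fallback added in this hunk can be exercised in isolation with a minimal sketch. The `CUDAGraphMode` enum below is a simplified stand-in (not vLLM's actual definition), and `resolve_default_cudagraph_mode` is a hypothetical helper that does not exist in vLLM. It only mirrors the new check: when `decode_context_parallel_size > 1`, warn and fall back to `PIECEWISE`.

```python
import enum
import logging

logger = logging.getLogger(__name__)


class CUDAGraphMode(enum.Enum):
    # Simplified stand-in for vLLM's enum; only the members this sketch needs.
    NONE = 0
    PIECEWISE = 1
    FULL = 2


def resolve_default_cudagraph_mode(
    current: CUDAGraphMode, decode_context_parallel_size: int
) -> CUDAGraphMode:
    """Hypothetical helper mirroring the DCP check added in this commit."""
    if decode_context_parallel_size > 1:
        # DCP is enabled, so full CUDA graphs are not usable; downgrade.
        logger.warning(
            "Decode context parallel (DCP) is enabled, which is "
            "incompatible with full CUDA graphs. Set "
            "cudagraph_mode to PIECEWISE."
        )
        return CUDAGraphMode.PIECEWISE
    # DCP is disabled: keep whatever default was chosen earlier.
    return current


if __name__ == "__main__":
    print(resolve_default_cudagraph_mode(CUDAGraphMode.FULL, 2))
    print(resolve_default_cudagraph_mode(CUDAGraphMode.FULL, 1))
```

As in the diff, the override is unconditional once DCP is enabled. The sketch does not model the surrounding branch, where the `else` arm sets `CUDAGraphMode.NONE`.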