diff --git a/docs/configuration/optimization.md b/docs/configuration/optimization.md
index 69d4de9d2f644..6c7c31f503c15 100644
--- a/docs/configuration/optimization.md
+++ b/docs/configuration/optimization.md
@@ -196,6 +196,13 @@ vllm serve Qwen/Qwen2.5-VL-3B-Instruct --api-server-count 4 -dp 2
 !!! note
     API server scale-out is only available for online inference.
 
+!!! warning
+    By default, each API server uses 8 CPU threads to load media items (e.g. images)
+    from request data.
+
+    If you enable API server scale-out, consider lowering `VLLM_MEDIA_LOADING_THREAD_COUNT`,
+    because the total thread count grows with the number of API servers and can exhaust CPU resources.
+
 !!! note
     [Multi-modal processor cache](#processor-cache) is disabled when API server scale-out is enabled
     because it requires a one-to-one correspondence between API and engine core processes.