diff --git a/docs/examples/README.md b/docs/examples/README.md
index 34e4dfd408a20..3cf93027f4209 100644
--- a/docs/examples/README.md
+++ b/docs/examples/README.md
@@ -2,6 +2,6 @@
 
 vLLM's examples are split into three categories:
 
-- If you are using vLLM from within Python code, see [Offline Inference](./offline_inference/)
-- If you are using vLLM from an HTTP application or client, see [Online Serving](./online_serving/)
-- For examples of using some of vLLM's advanced features (e.g. LMCache or Tensorizer) which are not specific to either of the above use cases, see [Others](./others/)
+- If you are using vLLM from within Python code, see [Offline Inference](./offline_inference)
+- If you are using vLLM from an HTTP application or client, see [Online Serving](./online_serving)
+- For examples of using some of vLLM's advanced features (e.g. LMCache or Tensorizer) which are not specific to either of the above use cases, see [Others](./others)
diff --git a/docs/models/generative_models.md b/docs/models/generative_models.md
index a64ecd31ebaef..d02522a6657de 100644
--- a/docs/models/generative_models.md
+++ b/docs/models/generative_models.md
@@ -19,7 +19,7 @@ Run a model in generation mode via the option `--runner generate`.
 
 ## Offline Inference
 
 The [LLM][vllm.LLM] class provides various methods for offline inference.
-See [configuration](../api/summary.md#configuration) for a list of options when initializing the model.
+See [configuration](../api/README.md#configuration) for a list of options when initializing the model.
 
 ### `LLM.generate`
diff --git a/docs/models/pooling_models.md b/docs/models/pooling_models.md
index 753d8bd0b8339..fbb5f6f6dd171 100644
--- a/docs/models/pooling_models.md
+++ b/docs/models/pooling_models.md
@@ -81,7 +81,7 @@ which takes priority over both the model's and Sentence Transformers's defaults.
 
 ## Offline Inference
 
 The [LLM][vllm.LLM] class provides various methods for offline inference.
-See [configuration](../api/summary.md#configuration) for a list of options when initializing the model.
+See [configuration](../api/README.md#configuration) for a list of options when initializing the model.
 
 ### `LLM.embed`
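
For reviewers unfamiliar with the pages these link fixes touch: both `generative_models.md` and `pooling_models.md` document offline inference through the `LLM` class. A minimal sketch of the two methods those sections cover, `LLM.generate` and `LLM.embed`; the model names and sampling settings below are illustrative placeholders, not part of this PR, and the `runner="pooling"` argument is assumed to mirror the `--runner` CLI option mentioned in `generative_models.md`:

```python
from vllm import LLM, SamplingParams

# LLM.generate — documented in docs/models/generative_models.md.
# Model name is a placeholder chosen for the sketch.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(
    ["The capital of France is"],
    SamplingParams(temperature=0.8, max_tokens=32),
)
print(outputs[0].outputs[0].text)

# LLM.embed — documented in docs/models/pooling_models.md.
# Assumes runner="pooling" selects the pooling runner, matching the
# --runner option referenced in the generative models page.
embedder = LLM(model="intfloat/e5-mistral-7b-instruct", runner="pooling")
embeddings = embedder.embed(["Hello, world!"])
print(len(embeddings[0].outputs.embedding))
```

Both pages open with the same "Offline Inference" boilerplate, which is why the `summary.md` → `README.md` link fix appears twice in this patch.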