From d14c4ebf08c9a6b6c4131eb3021d50da6fb0c212 Mon Sep 17 00:00:00 2001
From: Michael Yao <haifeng.yao@daocloud.io>
Date: Thu, 11 Sep 2025 16:50:12 +0800
Subject: [PATCH] [Docs] Use 1-2-3 list for deploy steps in
 deployment/frameworks/ (#24633)

Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
---
 docs/deployment/frameworks/autogen.md         | 16 ++---
 docs/deployment/frameworks/chatbox.md         | 24 ++++---
 docs/deployment/frameworks/dify.md            | 50 ++++++++-------
 docs/deployment/frameworks/haystack.md        | 12 ++--
 docs/deployment/frameworks/litellm.md         | 22 +++----
 .../retrieval_augmented_generation.md         | 64 +++++++++----------
 6 files changed, 98 insertions(+), 90 deletions(-)

diff --git a/docs/deployment/frameworks/autogen.md b/docs/deployment/frameworks/autogen.md
index c255a85d38401..7517ee771c097 100644
--- a/docs/deployment/frameworks/autogen.md
+++ b/docs/deployment/frameworks/autogen.md
@@ -4,9 +4,7 @@
 
 ## Prerequisites
 
-- Setup vLLM environment
-
-- Setup [AutoGen](https://microsoft.github.io/autogen/0.2/docs/installation/) environment
+Set up the vLLM and [AutoGen](https://microsoft.github.io/autogen/0.2/docs/installation/) environment:
 
 ```bash
 pip install vllm
@@ -18,14 +16,14 @@ pip install -U "autogen-agentchat" "autogen-ext[openai]"
 
 ## Deploy
 
-- Start the vLLM server with the supported chat completion model, e.g.
+1. Start the vLLM server with a supported chat completion model, for example:
 
-```bash
-python -m vllm.entrypoints.openai.api_server \
-    --model mistralai/Mistral-7B-Instruct-v0.2
-```
+    ```bash
+    python -m vllm.entrypoints.openai.api_server \
+        --model mistralai/Mistral-7B-Instruct-v0.2
+    ```
 
-- Call it with AutoGen:
+1. Call it with AutoGen:
 
 ??? code
 
diff --git a/docs/deployment/frameworks/chatbox.md b/docs/deployment/frameworks/chatbox.md
index cbca6e6282fc6..002935da56009 100644
--- a/docs/deployment/frameworks/chatbox.md
+++ b/docs/deployment/frameworks/chatbox.md
@@ -6,27 +6,31 @@ It allows you to deploy a large language model (LLM) server with vLLM as the bac
 
 ## Prerequisites
 
-- Setup vLLM environment
+Set up the vLLM environment:
+
+```bash
+pip install vllm
+```
 
 ## Deploy
 
-- Start the vLLM server with the supported chat completion model, e.g.
+1. Start the vLLM server with a supported chat completion model, for example:
 
-```bash
-vllm serve qwen/Qwen1.5-0.5B-Chat
-```
+    ```bash
+    vllm serve qwen/Qwen1.5-0.5B-Chat
+    ```
 
-- Download and install [Chatbox desktop](https://chatboxai.app/en#download).
+1. Download and install [Chatbox desktop](https://chatboxai.app/en#download).
 
-- On the bottom left of settings, Add Custom Provider
+1. At the bottom left of the settings, select Add Custom Provider and enter the following:
     - API Mode: `OpenAI API Compatible`
     - Name: vllm
     - API Host: `http://{vllm server host}:{vllm server port}/v1`
     - API Path: `/chat/completions`
     - Model: `qwen/Qwen1.5-0.5B-Chat`
 
-![](../../assets/deployment/chatbox-settings.png)
+    ![](../../assets/deployment/chatbox-settings.png)
 
-- Go to `Just chat`, and start to chat:
+1. Go to `Just chat` and start chatting:
 
-![](../../assets/deployment/chatbox-chat.png)
+    ![](../../assets/deployment/chatbox-chat.png)
diff --git a/docs/deployment/frameworks/dify.md b/docs/deployment/frameworks/dify.md
index 35f02c33cb02b..820ef0cbed9fa 100644
--- a/docs/deployment/frameworks/dify.md
+++ b/docs/deployment/frameworks/dify.md
@@ -8,44 +8,50 @@ This guide walks you through deploying Dify using a vLLM backend.
 
 ## Prerequisites
 
-- Setup vLLM environment
-- Install [Docker](https://docs.docker.com/engine/install/) and [Docker Compose](https://docs.docker.com/compose/install/)
+Set up the vLLM environment:
+
+```bash
+pip install vllm
+```
+
+Then install [Docker](https://docs.docker.com/engine/install/) and [Docker Compose](https://docs.docker.com/compose/install/).
 
 ## Deploy
 
-- Start the vLLM server with the supported chat completion model, e.g.
+1. Start the vLLM server with a supported chat completion model, for example:
 
-```bash
-vllm serve Qwen/Qwen1.5-7B-Chat
-```
+    ```bash
+    vllm serve Qwen/Qwen1.5-7B-Chat
+    ```
 
-- Start the Dify server with docker compose ([details](https://github.com/langgenius/dify?tab=readme-ov-file#quick-start)):
+1. Start the Dify server with Docker Compose ([details](https://github.com/langgenius/dify?tab=readme-ov-file#quick-start)):
 
-```bash
-git clone https://github.com/langgenius/dify.git
-cd dify
-cd docker
-cp .env.example .env
-docker compose up -d
-```
+    ```bash
+    git clone https://github.com/langgenius/dify.git
+    cd dify
+    cd docker
+    cp .env.example .env
+    docker compose up -d
+    ```
 
-- Open the browser to access `http://localhost/install`, config the basic login information and login.
+1. Open `http://localhost/install` in a browser, configure the basic login information, and log in.
 
-- In the top-right user menu (under the profile icon), go to Settings, then click `Model Provider`, and locate the `vLLM` provider to install it.
+1. In the top-right user menu (under the profile icon), go to Settings, then click `Model Provider`, and locate the `vLLM` provider to install it.
+
+1. Fill in the model provider details as follows:
 
-- Fill in the model provider details as follows:
     - **Model Type**: `LLM`
     - **Model Name**: `Qwen/Qwen1.5-7B-Chat`
     - **API Endpoint URL**: `http://{vllm_server_host}:{vllm_server_port}/v1`
     - **Model Name for API Endpoint**: `Qwen/Qwen1.5-7B-Chat`
     - **Completion Mode**: `Completion`
 
-![](../../assets/deployment/dify-settings.png)
+    ![](../../assets/deployment/dify-settings.png)
 
-- To create a test chatbot, go to `Studio → Chatbot → Create from Blank`, then select Chatbot as the type:
+1. To create a test chatbot, go to `Studio → Chatbot → Create from Blank`, then select Chatbot as the type:
 
-![](../../assets/deployment/dify-create-chatbot.png)
+    ![](../../assets/deployment/dify-create-chatbot.png)
 
-- Click the chatbot you just created to open the chat interface and start interacting with the model:
+1. Click the chatbot you just created to open the chat interface and start interacting with the model:
 
-![](../../assets/deployment/dify-chat.png)
+    ![](../../assets/deployment/dify-chat.png)
diff --git a/docs/deployment/frameworks/haystack.md b/docs/deployment/frameworks/haystack.md
index 70b4b48d4543e..836305cf15c42 100644
--- a/docs/deployment/frameworks/haystack.md
+++ b/docs/deployment/frameworks/haystack.md
@@ -6,7 +6,7 @@ It allows you to deploy a large language model (LLM) server with vLLM as the bac
 
 ## Prerequisites
 
-- Setup vLLM and Haystack environment
+Set up the vLLM and Haystack environment:
 
 ```bash
 pip install vllm haystack-ai
@@ -14,13 +14,13 @@ pip install vllm haystack-ai
 
 ## Deploy
 
-- Start the vLLM server with the supported chat completion model, e.g.
+1. Start the vLLM server with a supported chat completion model, for example:
 
-```bash
-vllm serve mistralai/Mistral-7B-Instruct-v0.1
-```
+    ```bash
+    vllm serve mistralai/Mistral-7B-Instruct-v0.1
+    ```
 
-- Use the `OpenAIGenerator` and `OpenAIChatGenerator` components in Haystack to query the vLLM server.
+1. Use the `OpenAIGenerator` and `OpenAIChatGenerator` components in Haystack to query the vLLM server.
 
 ??? code
 
diff --git a/docs/deployment/frameworks/litellm.md b/docs/deployment/frameworks/litellm.md
index c7e514f2276e0..0d6c3729911ad 100644
--- a/docs/deployment/frameworks/litellm.md
+++ b/docs/deployment/frameworks/litellm.md
@@ -13,7 +13,7 @@ And LiteLLM supports all models on VLLM.
 
 ## Prerequisites
 
-- Setup vLLM and litellm environment
+Set up the vLLM and litellm environment:
 
 ```bash
 pip install vllm litellm
@@ -23,13 +23,13 @@ pip install vllm litellm
 
 ### Chat completion
 
-- Start the vLLM server with the supported chat completion model, e.g.
+1. Start the vLLM server with a supported chat completion model, for example:
 
-```bash
-vllm serve qwen/Qwen1.5-0.5B-Chat
-```
+    ```bash
+    vllm serve qwen/Qwen1.5-0.5B-Chat
+    ```
 
-- Call it with litellm:
+1. Call it with litellm:
 
 ??? code
 
@@ -51,13 +51,13 @@ vllm serve qwen/Qwen1.5-0.5B-Chat
 
 ### Embeddings
 
-- Start the vLLM server with the supported embedding model, e.g.
+1. Start the vLLM server with a supported embedding model, for example:
 
-```bash
-vllm serve BAAI/bge-base-en-v1.5
-```
+    ```bash
+    vllm serve BAAI/bge-base-en-v1.5
+    ```
 
-- Call it with litellm:
+1. Call it with litellm:
 
 ```python
 from litellm import embedding   
diff --git a/docs/deployment/frameworks/retrieval_augmented_generation.md b/docs/deployment/frameworks/retrieval_augmented_generation.md
index d5f2ec302b6cd..d86ab1600f126 100644
--- a/docs/deployment/frameworks/retrieval_augmented_generation.md
+++ b/docs/deployment/frameworks/retrieval_augmented_generation.md
@@ -11,7 +11,7 @@ Here are the integrations:
 
 ### Prerequisites
 
-- Setup vLLM and langchain environment
+Set up the vLLM and langchain environment:
 
 ```bash
 pip install -U vllm \
@@ -22,33 +22,33 @@ pip install -U vllm \
 
 ### Deploy
 
-- Start the vLLM server with the supported embedding model, e.g.
+1. Start the vLLM server with a supported embedding model, for example:
 
-```bash
-# Start embedding service (port 8000)
-vllm serve ssmits/Qwen2-7B-Instruct-embed-base
-```
+    ```bash
+    # Start embedding service (port 8000)
+    vllm serve ssmits/Qwen2-7B-Instruct-embed-base
+    ```
 
-- Start the vLLM server with the supported chat completion model, e.g.
+1. Start the vLLM server with a supported chat completion model, for example:
 
-```bash
-# Start chat service (port 8001)
-vllm serve qwen/Qwen1.5-0.5B-Chat --port 8001
-```
+    ```bash
+    # Start chat service (port 8001)
+    vllm serve qwen/Qwen1.5-0.5B-Chat --port 8001
+    ```
 
-- Use the script: <gh-file:examples/online_serving/retrieval_augmented_generation_with_langchain.py>
+1. Use the script: <gh-file:examples/online_serving/retrieval_augmented_generation_with_langchain.py>
 
-- Run the script
+1. Run the script:
 
-```python
-python retrieval_augmented_generation_with_langchain.py
-```
+    ```bash
+    python retrieval_augmented_generation_with_langchain.py
+    ```
 
 ## vLLM + llamaindex
 
 ### Prerequisites
 
-- Setup vLLM and llamaindex environment
+Set up the vLLM and llamaindex environment:
 
 ```bash
 pip install vllm \
@@ -60,24 +60,24 @@ pip install vllm \
 
 ### Deploy
 
-- Start the vLLM server with the supported embedding model, e.g.
+1. Start the vLLM server with a supported embedding model, for example:
 
-```bash
-# Start embedding service (port 8000)
-vllm serve ssmits/Qwen2-7B-Instruct-embed-base
-```
+    ```bash
+    # Start embedding service (port 8000)
+    vllm serve ssmits/Qwen2-7B-Instruct-embed-base
+    ```
 
-- Start the vLLM server with the supported chat completion model, e.g.
+1. Start the vLLM server with a supported chat completion model, for example:
 
-```bash
-# Start chat service (port 8001)
-vllm serve qwen/Qwen1.5-0.5B-Chat --port 8001
-```
+    ```bash
+    # Start chat service (port 8001)
+    vllm serve qwen/Qwen1.5-0.5B-Chat --port 8001
+    ```
 
-- Use the script: <gh-file:examples/online_serving/retrieval_augmented_generation_with_llamaindex.py>
+1. Use the script: <gh-file:examples/online_serving/retrieval_augmented_generation_with_llamaindex.py>
 
-- Run the script
+1. Run the script:
 
-```python
-python retrieval_augmented_generation_with_llamaindex.py
-```
+    ```bash
+    python retrieval_augmented_generation_with_llamaindex.py
+    ```