[doc] fix "Other AI accelerators" getting started page (#19457)
Signed-off-by: David Xia <david@davidxia.com>
parent 497a91e9f7
commit 89b0f84e17
@@ -19,7 +19,8 @@ to set up the execution environment. To achieve the best performance,
please follow the methods outlined in the
[Optimizing Training Platform Guide](https://docs.habana.ai/en/latest/PyTorch/Model_Optimization_PyTorch/Optimization_in_Training_Platform.html).
## Configure a new environment
# --8<-- [end:requirements]
# --8<-- [start:configure-a-new-environment]
### Environment verification
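A minimal sketch of this verification step, assuming the Intel Gaudi driver stack and its `hl-smi` utility are installed on the host (the exact checks in the full guide may differ):

```bash
# List the Gaudi accelerators visible to the driver, along with driver and
# firmware versions; an empty listing points to an incomplete installation.
hl-smi
```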
@@ -56,7 +57,7 @@ docker run \
vault.habana.ai/gaudi-docker/1.18.0/ubuntu22.04/habanalabs/pytorch-installer-2.4.0:latest
```
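Only the image tag is visible in the hunk above. For orientation, a complete launch command might look roughly like the sketch below; the flags follow common Habana container-runtime conventions and may not match the guide's exact invocation:

```bash
# Sketch: start the Gaudi PyTorch container with the Habana runtime,
# exposing all HPUs and sharing the host network and IPC namespaces.
docker run -it \
  --runtime=habana \
  -e HABANA_VISIBLE_DEVICES=all \
  --cap-add=sys_nice \
  --net=host \
  --ipc=host \
  vault.habana.ai/gaudi-docker/1.18.0/ubuntu22.04/habanalabs/pytorch-installer-2.4.0:latest
```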
# --8<-- [end:requirements]
# --8<-- [end:configure-a-new-environment]
# --8<-- [start:set-up-using-python]
# --8<-- [end:set-up-using-python]
@@ -183,7 +184,6 @@ Currently in vLLM for HPU we support four execution modes, depending on selected
| 0 | 0 | torch.compile |
| 0 | 1 | PyTorch eager mode |
| 1 | 0 | HPU Graphs |
<figcaption>vLLM execution modes</figcaption>
!!! warning
    In 1.18.0, all modes utilizing `PT_HPU_LAZY_MODE=0` are highly experimental and should only be used for validating functional correctness. Their performance will be improved in upcoming releases. For the best performance in 1.18.0, use HPU Graphs or PyTorch lazy mode.
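To make the mode selection concrete, here is a hedged sketch of launching the server in two of the modes, assuming the table's columns are `PT_HPU_LAZY_MODE` and the `--enforce-eager` flag as described in the surrounding guide; the model name is only a placeholder:

```bash
# HPU Graphs (PT_HPU_LAZY_MODE=1, eager execution not enforced),
# the recommended configuration for 1.18.0.
PT_HPU_LAZY_MODE=1 vllm serve meta-llama/Llama-3.1-8B-Instruct

# PyTorch eager mode (PT_HPU_LAZY_MODE=0 with --enforce-eager), experimental in 1.18.0.
PT_HPU_LAZY_MODE=0 vllm serve meta-llama/Llama-3.1-8B-Instruct --enforce-eager
```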
@@ -17,7 +17,8 @@
- Accelerator: NeuronCore-v2 (in trn1/inf2 chips) or NeuronCore-v3 (in trn2 chips)
- AWS Neuron SDK 2.23
## Configure a new environment
# --8<-- [end:requirements]
# --8<-- [start:configure-a-new-environment]
### Launch a Trn1/Trn2/Inf2 instance and verify Neuron dependencies
@@ -37,7 +38,7 @@ for alternative setup instructions including using Docker and manually installin
NxD Inference is the default recommended backend to run inference on Neuron. If you are looking to use the legacy [transformers-neuronx](https://github.com/aws-neuron/transformers-neuronx)
library, refer to [Transformers NeuronX Setup](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/transformers-neuronx/setup/index.html).
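For the dependency-verification step above, a quick sketch assuming the Neuron SDK command-line tools (e.g. the `aws-neuron-tools` package) are installed on the instance:

```bash
# List the NeuronCore devices attached to this instance; an empty listing usually
# means the instance type has no Neuron accelerators or the driver is not loaded.
neuron-ls

# Show which Neuron-related Python packages (compiler, runtime, NxD libraries) are present.
pip list | grep -i neuron
```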
# --8<-- [end:requirements]
# --8<-- [end:configure-a-new-environment]
# --8<-- [start:set-up-using-python]
# --8<-- [end:set-up-using-python]
@@ -58,11 +58,13 @@ assigned to your Google Cloud project for your immediate exclusive use.
### Provision Cloud TPUs with GKE
For more information about using TPUs with GKE, see:
- <https://cloud.google.com/kubernetes-engine/docs/how-to/tpus>
- <https://cloud.google.com/kubernetes-engine/docs/concepts/tpus>
- <https://cloud.google.com/kubernetes-engine/docs/concepts/plan-tpus>
## Configure a new environment
# --8<-- [end:requirements]
# --8<-- [start:configure-a-new-environment]
### Provision a Cloud TPU with the queued resource API
@@ -81,12 +83,12 @@ gcloud alpha compute tpus queued-resources create QUEUED_RESOURCE_ID \
| Parameter name | Description |
|--------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| QUEUED_RESOURCE_ID | The user-assigned ID of the queued resource request. |
| TPU_NAME | The user-assigned name of the TPU which is created when the queued |
| TPU_NAME | The user-assigned name of the TPU which is created when the queued resource request is allocated. |
| PROJECT_ID | Your Google Cloud project |
| ZONE | The GCP zone where you want to create your Cloud TPU. The value you use |
| ACCELERATOR_TYPE | The TPU version you want to use. Specify the TPU version, for example |
| RUNTIME_VERSION | The TPU VM runtime version to use. For example, use `v2-alpha-tpuv6e` for a VM loaded with one or more v6e TPU(s). For more information see [TPU VM images](https://cloud.google.com/tpu/docs/runtimes). |
<figcaption>Parameter descriptions</figcaption>
| ZONE | The GCP zone where you want to create your Cloud TPU. The value you use depends on the version of TPUs you are using. For more information, see [TPU regions and zones] |
| ACCELERATOR_TYPE | The TPU version you want to use. Specify the TPU version, for example `v5litepod-4` specifies a v5e TPU with 4 cores, `v6e-1` specifies a v6e TPU with 1 core. For more information, see [TPU versions]. |
| RUNTIME_VERSION | The TPU VM runtime version to use. For example, use `v2-alpha-tpuv6e` for a VM loaded with one or more v6e TPU(s). For more information see [TPU VM images]. |
| SERVICE_ACCOUNT | The email address for your service account. You can find it in the IAM Cloud Console under *Service Accounts*. For example: `tpu-service-account@<your_project_ID>.iam.gserviceaccount.com` |
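Putting the parameters above together, here is a purely illustrative request for a single-core v6e TPU. Every value is a placeholder, the flag spellings reflect how these parameters are commonly passed and may differ from the guide's full command, and valid `ACCELERATOR_TYPE`, `RUNTIME_VERSION`, and zone combinations should be checked against the linked TPU docs:

```bash
# Illustrative sketch only: substitute your own IDs, project, zone, and service account.
gcloud alpha compute tpus queued-resources create my-queued-resource \
  --node-id my-tpu-vm \
  --project my-gcp-project \
  --zone us-east5-a \
  --accelerator-type v6e-1 \
  --runtime-version v2-alpha-tpuv6e \
  --service-account tpu-service-account@my-gcp-project.iam.gserviceaccount.com
```

The request is asynchronous: the TPU VM only becomes reachable once the queued resource has been allocated, at which point the SSH command below can be used.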
Connect to your TPU using SSH:
@@ -94,7 +96,11 @@ Connect to your TPU using SSH:
gcloud compute tpus tpu-vm ssh TPU_NAME --zone ZONE
```
# --8<-- [end:requirements]
[TPU versions]: https://cloud.google.com/tpu/docs/runtimes
[TPU VM images]: https://cloud.google.com/tpu/docs/runtimes
[TPU regions and zones]: https://cloud.google.com/tpu/docs/regions-zones
# --8<-- [end:configure-a-new-environment]
# --8<-- [start:set-up-using-python]
# --8<-- [end:set-up-using-python]