diff --git a/docs/.nav.yml b/docs/.nav.yml
index e679807f7534..06bfcc3f1eff 100644
--- a/docs/.nav.yml
+++ b/docs/.nav.yml
@@ -39,6 +39,7 @@ nav:
- models/generative_models.md
- models/pooling_models.md
- models/extensions
+ - Hardware Supported Models: models/hardware_supported_models
- Features:
- features/compatibility_matrix.md
- features/*
diff --git a/docs/features/compatibility_matrix.md b/docs/features/compatibility_matrix.md
index 5d448eb5c03d..4f475ee4db83 100644
--- a/docs/features/compatibility_matrix.md
+++ b/docs/features/compatibility_matrix.md
@@ -59,23 +59,23 @@ th:not(:first-child) {
## Feature x Hardware
-| Feature | Volta | Turing | Ampere | Ada | Hopper | CPU | AMD |
-|-----------------------------------------------------------|--------------------|----------|----------|-------|----------|--------------------|-------|
-| [CP][chunked-prefill] | [❌](gh-issue:2729) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| [APC][automatic-prefix-caching] | [❌](gh-issue:3687) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| [LoRA][lora-adapter] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| prmpt adptr | ✅ | ✅ | ✅ | ✅ | ✅ | [❌](gh-issue:8475) | ✅ |
-| [SD][spec-decode] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| CUDA graph | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
-| pooling | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❔ |
-| enc-dec | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
-| mm | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| logP | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| prmpt logP | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| async output | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
-| multi-step | ✅ | ✅ | ✅ | ✅ | ✅ | [❌](gh-issue:8477) | ✅ |
-| best-of | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| beam-search | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| Feature | Volta | Turing | Ampere | Ada | Hopper | CPU | AMD | TPU |
+|-----------------------------------------------------------|---------------------|-----------|-----------|--------|------------|--------------------|--------|-----|
+| [CP][chunked-prefill] | [❌](gh-issue:2729) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [APC][automatic-prefix-caching] | [❌](gh-issue:3687) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [LoRA][lora-adapter] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| prmpt adptr | ✅ | ✅ | ✅ | ✅ | ✅ | [❌](gh-issue:8475) | ✅ | ❌ |
+| [SD][spec-decode] | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
+| CUDA graph | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ |
+| pooling | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❔ | ❌ |
+| enc-dec | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
+| mm | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
+| logP | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
+| prmpt logP | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
+| async output | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
+| multi-step | ✅ | ✅ | ✅ | ✅ | ✅ | [❌](gh-issue:8477) | ✅ | ❌ |
+| best-of | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
+| beam-search | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
!!! note
    Please refer to [Feature support through NxD Inference backend][feature-support-through-nxd-inference-backend] for the features supported on AWS Neuron hardware.
diff --git a/docs/models/hardware_supported_models/tpu.md b/docs/models/hardware_supported_models/tpu.md
new file mode 100644
index 000000000000..dca5e20cb343
--- /dev/null
+++ b/docs/models/hardware_supported_models/tpu.md
@@ -0,0 +1,36 @@
+---
+title: TPU
+---
+[](){ #tpu-supported-models }
+
+# TPU Supported Models
+## Text-only Language Models
+
+| Model | Architecture | Supported |
+|-----------------------------------------------------|--------------------------------|-----------|
+| mistralai/Mixtral-8x7B-Instruct-v0.1 | MixtralForCausalLM | 🟨 |
+| mistralai/Mistral-Small-24B-Instruct-2501 | MistralForCausalLM | ✅ |
+| mistralai/Codestral-22B-v0.1 | MistralForCausalLM | ✅ |
+| mistralai/Mixtral-8x22B-Instruct-v0.1 | MixtralForCausalLM | ❌ |
+| meta-llama/Llama-3.3-70B-Instruct | LlamaForCausalLM | ✅ |
+| meta-llama/Llama-3.1-8B-Instruct | LlamaForCausalLM | ✅ |
+| meta-llama/Llama-3.1-70B-Instruct | LlamaForCausalLM | ✅ |
+| meta-llama/Llama-4-* | Llama4ForConditionalGeneration | ❌ |
+| microsoft/Phi-3-mini-128k-instruct | Phi3ForCausalLM | 🟨 |
+| microsoft/phi-4 | Phi3ForCausalLM | ❌ |
+| google/gemma-3-27b-it | Gemma3ForConditionalGeneration | 🟨 |
+| google/gemma-3-4b-it | Gemma3ForConditionalGeneration | ❌ |
+| deepseek-ai/DeepSeek-R1 | DeepseekV3ForCausalLM | ❌ |
+| deepseek-ai/DeepSeek-V3 | DeepseekV3ForCausalLM | ❌ |
+| RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8 | LlamaForCausalLM | ✅ |
+| RedHatAI/Meta-Llama-3.1-70B-Instruct-quantized.w8a8 | LlamaForCausalLM | ✅ |
+| Qwen/Qwen3-8B | Qwen3ForCausalLM | ✅ |
+| Qwen/Qwen3-32B | Qwen3ForCausalLM | ✅ |
+| Qwen/Qwen2.5-7B-Instruct | Qwen2ForCausalLM | ✅ |
+| Qwen/Qwen2.5-32B | Qwen2ForCausalLM | ✅ |
+| Qwen/Qwen2.5-14B-Instruct | Qwen2ForCausalLM | ✅ |
+| Qwen/Qwen2.5-1.5B-Instruct | Qwen2ForCausalLM | 🟨 |
+
+- ✅ Runs and is optimized.
+- 🟨 Runs correctly but is not yet fully optimized.
+- ❌ Does not run, or does not pass the accuracy test.