diff --git a/docs/deployment/integrations/kaito.md b/docs/deployment/integrations/kaito.md
new file mode 100644
index 0000000000000..ff050d3eeaf47
--- /dev/null
+++ b/docs/deployment/integrations/kaito.md
@@ -0,0 +1,5 @@
+# KAITO
+
+[KAITO](https://kaito-project.github.io/kaito/docs/) is a Kubernetes operator for deploying and serving LLMs with vLLM. It manages large models via container images, provides built-in OpenAI-compatible inference, auto-provisions GPU nodes, and ships curated model presets.
+
+Please refer to the [quick start](https://kaito-project.github.io/kaito/docs/quick-start) guide for more details.
diff --git a/docs/deployment/k8s.md b/docs/deployment/k8s.md
index ca23e0b9fd8af..d3fda7eb6fb6e 100644
--- a/docs/deployment/k8s.md
+++ b/docs/deployment/k8s.md
@@ -12,6 +12,7 @@ Alternatively, you can deploy vLLM to Kubernetes using any of the following:
 
 - [Helm](frameworks/helm.md)
 - [InftyAI/llmaz](integrations/llmaz.md)
+- [KAITO](integrations/kaito.md)
 - [KServe](integrations/kserve.md)
 - [KubeRay](integrations/kuberay.md)
 - [kubernetes-sigs/lws](frameworks/lws.md)