diff --git a/docs/source/deployment/integrations/index.md b/docs/source/deployment/integrations/index.md
index c286edb4d7bc1..a557456c086d2 100644
--- a/docs/source/deployment/integrations/index.md
+++ b/docs/source/deployment/integrations/index.md
@@ -6,4 +6,5 @@
 kserve
 kubeai
 llamastack
+llmaz
 :::
diff --git a/docs/source/deployment/integrations/llmaz.md b/docs/source/deployment/integrations/llmaz.md
new file mode 100644
index 0000000000000..cd4a76353d264
--- /dev/null
+++ b/docs/source/deployment/integrations/llmaz.md
@@ -0,0 +1,7 @@
+(deployment-llmaz)=
+
+# llmaz
+
+[llmaz](https://github.com/InftyAI/llmaz) is an easy-to-use and advanced inference platform for large language models on Kubernetes, aimed at production use. It uses vLLM as the default model serving backend.
+
+Please refer to the [Quick Start](https://github.com/InftyAI/llmaz?tab=readme-ov-file#quick-start) for more details.