From 07b8fae219b1fff51ef115c38c44b51395be5bb5 Mon Sep 17 00:00:00 2001 From: Kyle Yu <153807854+kyolebu@users.noreply.github.com> Date: Thu, 26 Jun 2025 18:22:12 -0400 Subject: [PATCH] [Doc] correct LoRA capitalization (#20135) Signed-off-by: kyolebu --- docs/README.md | 2 +- docs/models/supported_models.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/README.md b/docs/README.md index 0c6aff5fa07c3..9fb3137b31928 100644 --- a/docs/README.md +++ b/docs/README.md @@ -40,7 +40,7 @@ vLLM is flexible and easy to use with: - OpenAI-compatible API server - Support NVIDIA GPUs, AMD CPUs and GPUs, Intel CPUs, GaudiĀ® accelerators and GPUs, IBM Power CPUs, TPU, and AWS Trainium and Inferentia Accelerators. - Prefix caching support -- Multi-lora support +- Multi-LoRA support For more information, check out the following: diff --git a/docs/models/supported_models.md b/docs/models/supported_models.md index a435c59a3042b..04d9923f92105 100644 --- a/docs/models/supported_models.md +++ b/docs/models/supported_models.md @@ -427,7 +427,7 @@ Specified using `--task embed`. See [relevant issue on HF Transformers](https://github.com/huggingface/transformers/issues/34882). !!! note - `jinaai/jina-embeddings-v3` supports multiple tasks through lora, while vllm temporarily only supports text-matching tasks by merging lora weights. + `jinaai/jina-embeddings-v3` supports multiple tasks through LoRA, while vllm temporarily only supports text-matching tasks by merging LoRA weights. !!! note The second-generation GTE model (mGTE-TRM) is named `NewModel`. The name `NewModel` is too generic, you should set `--hf-overrides '{"architectures": ["GteNewModel"]}'` to specify the use of the `GteNewModel` architecture.