[Doc]: fix typos in .md files (including those of #23751) (#23825)

Signed-off-by: Didier Durand <durand.didier@gmail.com>
Didier Durand 2025-08-28 13:38:19 +02:00 committed by GitHub
parent 3462c1c522
commit d99c3a4f7b
16 changed files with 19 additions and 19 deletions


@@ -855,7 +855,7 @@ Examples:
### Custom HF processor
-Some models don't define a HF processor class on HF Hub. In that case, you can define a custom HF processor that has the same call signature as HF processors and pass it to [_call_hf_processor][vllm.multimodal.processing.BaseMultiModalProcessor._call_hf_processor].
+Some models don't define an HF processor class on HF Hub. In that case, you can define a custom HF processor that has the same call signature as HF processors and pass it to [_call_hf_processor][vllm.multimodal.processing.BaseMultiModalProcessor._call_hf_processor].
Examples:
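As a rough illustration of the idea in this hunk (the class name, tensor shapes, and wiring are illustrative assumptions, not taken from the vLLM docs), a custom processor only needs to mimic the call signature of an HF processor and return a `BatchFeature`:

```python
# Minimal sketch: a hand-written stand-in for a missing HF processor class.
# Names and shapes below are hypothetical; a real model needs its own
# tokenization and image preprocessing.
import torch
from transformers import BatchFeature


class MyCustomProcessor:
    def __call__(self, text=None, images=None, return_tensors="pt", **kwargs):
        # Produce the same kind of output an HF processor would return.
        data = {"input_ids": torch.tensor([[1, 2, 3]])}
        if images is not None:
            data["pixel_values"] = torch.zeros(1, 3, 224, 224)
        return BatchFeature(data=data, tensor_type=return_tensors)
```

An instance of such a class can then be supplied wherever the multimodal processor expects an HF processor object, so it is invoked just like a hub-provided one.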


@@ -6,6 +6,6 @@ Supports speech-synthesis, multi-modal, and extensible (function call) plugin system
One-click FREE deployment of your private OpenAI ChatGPT/Claude/Gemini/Groq/Ollama chat application.
-It supports vLLM as a AI model provider to efficiently serve large language models.
+It supports vLLM as an AI model provider to efficiently serve large language models.
For details, see the tutorial [Using vLLM in LobeChat](https://lobehub.com/docs/usage/providers/vllm).


@@ -99,11 +99,11 @@ http_request_duration_seconds_count{handler="/v1/completions",method="POST"} 201
### Multi-process Mode
-In v0, metrics are collected in the engine core process and we use multi-process mode to make them available in the API server process. See <gh-pr:7279>.
+In v0, metrics are collected in the engine core process and we use multiprocess mode to make them available in the API server process. See <gh-pr:7279>.
### Built in Python/Process Metrics
-The following metrics are supported by default by `prometheus_client`, but they are not exposed when multi-process mode is used:
+The following metrics are supported by default by `prometheus_client`, but they are not exposed when multiprocess mode is used:
- `python_gc_objects_collected_total`
- `python_gc_objects_uncollectable_total`
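For context on this hunk, here is a generic sketch of how `prometheus_client`'s multiprocess mode aggregates metrics (illustrative of the library's mechanism, not vLLM's actual wiring; the directory path is a placeholder):

```python
# Generic prometheus_client multiprocess sketch (not vLLM's exact code).
# PROMETHEUS_MULTIPROC_DIR must point to a writable directory before the
# worker processes import prometheus_client.
import os

os.environ.setdefault("PROMETHEUS_MULTIPROC_DIR", "/tmp/prometheus_multiproc")
os.makedirs(os.environ["PROMETHEUS_MULTIPROC_DIR"], exist_ok=True)

from prometheus_client import CollectorRegistry, generate_latest, multiprocess

# In the process that serves /metrics: aggregate the per-process metric files.
registry = CollectorRegistry()
multiprocess.MultiProcessCollector(registry)

# The default python_gc_* / process_* collectors are tied to a single process
# and are not part of this registry, which is why they disappear in
# multiprocess mode.
print(generate_latest(registry).decode())
```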


@@ -140,7 +140,7 @@ Alternatively, users can directly call the NxDI library to trace and compile you
- `NEURON_COMPILED_ARTIFACTS`: set this environment variable to point to your pre-compiled model artifacts directory to avoid
compilation time upon server initialization. If this variable is not set, the Neuron module will perform compilation and save the
-artifacts under `neuron-compiled-artifacts/{unique_hash}/` sub-directory in the model path. If this environment variable is set,
+artifacts under `neuron-compiled-artifacts/{unique_hash}/` subdirectory in the model path. If this environment variable is set,
but the directory does not exist, or the contents are invalid, Neuron will also fall back to a new compilation and store the artifacts
under this specified path.
- `NEURON_CONTEXT_LENGTH_BUCKETS`: Bucket sizes for context encoding. (Only applicable to `transformers-neuronx` backend).
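As a small sketch of the `NEURON_COMPILED_ARTIFACTS` behavior described above (paths and the model name are placeholders, and backend-specific arguments are omitted):

```python
# Hypothetical sketch: point vLLM's Neuron backend at pre-compiled artifacts
# so initialization can skip compilation. If the directory is missing or its
# contents are invalid, Neuron falls back to a fresh compilation.
import os

os.environ["NEURON_COMPILED_ARTIFACTS"] = "/path/to/neuron-compiled-artifacts/abc123"

from vllm import LLM

llm = LLM(model="/path/to/model")  # placeholder model path
```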


@@ -1,6 +1,6 @@
# --8<-- [start:installation]
-vLLM has experimental support for macOS with Apple silicon. For now, users must build from source to natively run on macOS.
+vLLM has experimental support for macOS with Apple Silicon. For now, users must build from source to natively run on macOS.
Currently the CPU implementation for macOS supports FP32 and FP16 datatypes.


@@ -48,7 +48,7 @@ uv pip install https://github.com/vllm-project/vllm/releases/download/v${VLLM_VE
#### Install the latest code
-LLM inference is a fast-evolving field, and the latest code may contain bug fixes, performance improvements, and new features that are not released yet. To allow users to try the latest code without waiting for the next release, vLLM provides wheels for Linux running on a x86 platform with CUDA 12 for every commit since `v0.5.3`.
+LLM inference is a fast-evolving field, and the latest code may contain bug fixes, performance improvements, and new features that are not released yet. To allow users to try the latest code without waiting for the next release, vLLM provides wheels for Linux running on an x86 platform with CUDA 12 for every commit since `v0.5.3`.
```bash
uv pip install -U vllm \


@@ -258,4 +258,4 @@ Expected output:
{"id":"embd-5c21fc9a5c9d4384a1b021daccaf9f64","object":"list","created":1745476417,"model":"jinaai/jina-embeddings-v3","data":[{"index":0,"object":"embedding","embedding":[-0.3828125,-0.1357421875,0.03759765625,0.125,0.21875,0.09521484375,-0.003662109375,0.1591796875,-0.130859375,-0.0869140625,-0.1982421875,0.1689453125,-0.220703125,0.1728515625,-0.2275390625,-0.0712890625,-0.162109375,-0.283203125,-0.055419921875,-0.0693359375,0.031982421875,-0.04052734375,-0.2734375,0.1826171875,-0.091796875,0.220703125,0.37890625,-0.0888671875,-0.12890625,-0.021484375,-0.0091552734375,0.23046875]}],"usage":{"prompt_tokens":8,"total_tokens":8,"completion_tokens":0,"prompt_tokens_details":null}}
```
-A openai client example can be found here: <gh-file:examples/online_serving/openai_embedding_matryoshka_fy.py>
+An OpenAI client example can be found here: <gh-file:examples/online_serving/openai_embedding_matryoshka_fy.py>
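A condensed sketch of such a client call might look like the following (the local server URL and dummy API key are assumptions about a default `vllm serve` deployment):

```python
# Query vLLM's OpenAI-compatible embeddings endpoint with the official client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.embeddings.create(
    model="jinaai/jina-embeddings-v3",
    input=["Follow the white rabbit."],
    dimensions=32,  # Matryoshka-style truncation, matching the 32-dim output above
)
print(resp.data[0].embedding)
```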


@@ -40,7 +40,7 @@ If it is `TransformersForCausalLM` or `TransformersForMultimodalLM` then it mean
#### Custom models
-If a model is neither supported natively by vLLM or Transformers, it can still be used in vLLM!
+If a model is neither supported natively by vLLM nor Transformers, it can still be used in vLLM!
For a model to be compatible with the Transformers backend for vLLM it must:


@@ -51,7 +51,7 @@ tail ~/.config/vllm/usage_stats.json
## Opting out
-You can opt-out of usage stats collection by setting the `VLLM_NO_USAGE_STATS` or `DO_NOT_TRACK` environment variable, or by creating a `~/.config/vllm/do_not_track` file:
+You can opt out of usage stats collection by setting the `VLLM_NO_USAGE_STATS` or `DO_NOT_TRACK` environment variable, or by creating a `~/.config/vllm/do_not_track` file:
```bash
# Any of the following methods can disable usage stats collection