# Custom Arguments
You can use vLLM custom arguments to pass arguments that are not part of the vLLM `SamplingParams` or REST API specifications. Adding or removing a custom argument does not require recompiling vLLM, since custom arguments are passed in as a dictionary.

Custom arguments can be useful if, for example, you want to use a custom logits processor without modifying the vLLM source code.
## Offline Custom Arguments
Custom arguments passed to `SamplingParams.extra_args` as a `dict` will be visible to any code which has access to `SamplingParams`:

```python
SamplingParams(extra_args={"your_custom_arg_name": 67})
```

This allows arguments which are not already part of `SamplingParams` to be passed into `LLM` as part of a request.
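Below is a minimal offline sketch. The model name and the custom argument key `your_custom_arg_name` are placeholders; beyond `extra_args` itself, the only vLLM API used is the standard `LLM.generate` offline interface.

```python
from vllm import LLM, SamplingParams

# Placeholder model and custom argument key; substitute your own.
llm = LLM(model="Qwen/Qwen2.5-1.5B-Instruct")
params = SamplingParams(max_tokens=16, extra_args={"your_custom_arg_name": 67})

# extra_args travels with the request, so any code that receives this request's
# SamplingParams (e.g. a custom logits processor) can read the value back.
outputs = llm.generate(["Hello, my name is"], params)
print(outputs[0].outputs[0].text)
```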
## Online Custom Arguments
The vLLM REST API allows custom arguments to be passed to the vLLM server via `vllm_xargs`. The example below integrates custom arguments into a vLLM REST API request:
```bash
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-1.5B-Instruct",
    ...
    "vllm_xargs": {"your_custom_arg": 67}
  }'
```
Furthermore, OpenAI SDK users can access `vllm_xargs` via the `extra_body` argument:
```python
batch = await client.completions.create(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    ...,
    extra_body={
        "vllm_xargs": {
            "your_custom_arg": 67
        }
    }
)
```
!!! note
    `vllm_xargs` is assigned to `SamplingParams.extra_args` under the hood, so code which uses `SamplingParams.extra_args` is compatible with both offline and online scenarios.
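For illustration, the sketch below shows a hypothetical helper that reads a custom argument back out of `SamplingParams`. `read_custom_arg` is not part of vLLM; the only vLLM attribute it relies on is `SamplingParams.extra_args`, so the same helper works whether the value arrived offline or online via `vllm_xargs`.

```python
from vllm import SamplingParams

def read_custom_arg(params: SamplingParams, name: str, default=None):
    # extra_args may be None when no custom arguments were supplied,
    # regardless of whether the request was submitted offline or online.
    return (params.extra_args or {}).get(name, default)

params = SamplingParams(extra_args={"your_custom_arg": 67})
print(read_custom_arg(params, "your_custom_arg"))  # 67
```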