mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2025-12-09 17:55:01 +08:00)
[Doc]Add asynchronous engine arguments to documentation. (#3810)
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
parent c391e4b68e
commit 78107fa091
@@ -118,3 +118,19 @@ Below, you can find an explanation of every engine argument for vLLM:

.. option:: --quantization (-q) {awq,squeezellm,None}

    Method used to quantize the weights.

Async Engine Arguments
----------------------

Below are the additional arguments related to the asynchronous engine:

.. option:: --engine-use-ray

    Use Ray to start the LLM engine in a process separate from the server process.

.. option:: --disable-log-requests

    Disable logging of requests.

.. option:: --max-log-len

    Maximum number of prompt characters or prompt token IDs printed in the log. Defaults to unlimited.

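
To make the three flags more concrete, here is a minimal sketch of how they could be parsed and applied. It is an illustration, not vLLM's own parser or logging code: ``build_parser``, ``truncate_for_log``, and ``describe_request`` are hypothetical names, and the assumption that ``--max-log-len`` is an integer that truncates both the prompt text and the prompt token IDs is inferred from the descriptions above.

```python
import argparse
from typing import List, Optional, Sequence, Union


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser that mirrors the three documented flags (hypothetical)."""
    parser = argparse.ArgumentParser(description="async engine flag sketch")
    parser.add_argument(
        "--engine-use-ray",
        action="store_true",
        help="start the engine in a process separate from the server process",
    )
    parser.add_argument(
        "--disable-log-requests",
        action="store_true",
        help="disable logging of requests",
    )
    parser.add_argument(
        "--max-log-len",
        type=int,
        default=None,  # None stands for "unlimited" in this sketch
        help="max number of prompt characters or prompt token IDs to log",
    )
    return parser


def truncate_for_log(
    value: Optional[Union[str, Sequence[int]]],
    max_log_len: Optional[int],
) -> Optional[Union[str, Sequence[int]]]:
    """Hypothetical helper: cut a prompt string or ID list to max_log_len items."""
    if value is None or max_log_len is None:
        return value
    return value[:max_log_len]


def describe_request(
    args: argparse.Namespace,
    prompt: Optional[str],
    prompt_token_ids: Optional[List[int]],
) -> Optional[str]:
    """Build the text that would be logged, or None when logging is disabled."""
    if args.disable_log_requests:
        return None
    shown_prompt = truncate_for_log(prompt, args.max_log_len)
    shown_ids = truncate_for_log(prompt_token_ids, args.max_log_len)
    return f"prompt={shown_prompt!r}, prompt_token_ids={shown_ids}"


if __name__ == "__main__":
    args = build_parser().parse_args(["--max-log-len", "5"])
    print(describe_request(args, "Hello, world", [1, 2, 3, 4, 5, 6, 7]))
```

In this sketch, passing ``--max-log-len 5`` shortens both the logged prompt and the logged ID list to five items, while ``--disable-log-requests`` suppresses the log line entirely. ``--engine-use-ray`` is only parsed here, since starting a Ray worker is outside the scope of a self-contained example.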