diff --git a/docs/source/models/engine_args.rst b/docs/source/models/engine_args.rst
index 9f5f672ae4f34..d8a7ac72e0175 100644
--- a/docs/source/models/engine_args.rst
+++ b/docs/source/models/engine_args.rst
@@ -118,3 +118,19 @@ Below, you can find an explanation of every engine argument for vLLM:
 .. option:: --quantization (-q) {awq,squeezellm,None}
 
     Method used to quantize the weights.
+
+Async Engine Arguments
+----------------------
+Below are the additional arguments related to the asynchronous engine:
+
+.. option:: --engine-use-ray
+
+    Use Ray to start the LLM engine in a separate process as the server process.
+
+.. option:: --disable-log-requests
+
+    Disable logging requests.
+
+.. option:: --max-log-len
+
+    Max number of prompt characters or prompt ID numbers being printed in log. Defaults to unlimited.
\ No newline at end of file
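
The three flags documented in this hunk can be illustrated with a small stand-alone sketch. It does not use vLLM's own parser or engine code: `build_parser` and `shorten_for_log` are hypothetical names, and the truncation rule (keep only the first N prompt characters or token IDs) is an assumption inferred from the `--max-log-len` description, not a confirmed implementation.

```python
import argparse
from typing import List, Optional, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in parser; only the flag names come from the diff."""
    parser = argparse.ArgumentParser(description="async engine flags (sketch)")
    parser.add_argument("--engine-use-ray", action="store_true")
    parser.add_argument("--disable-log-requests", action="store_true")
    # None models the documented default of "unlimited".
    parser.add_argument("--max-log-len", type=int, default=None)
    return parser


def shorten_for_log(
    prompt: Optional[str],
    prompt_token_ids: Optional[List[int]],
    max_log_len: Optional[int],
) -> Tuple[Optional[str], Optional[List[int]]]:
    """Assumed behaviour: cap what gets logged at max_log_len items."""
    if max_log_len is None:
        # No limit configured, so log everything unchanged.
        return prompt, prompt_token_ids
    short_prompt = prompt[:max_log_len] if prompt is not None else None
    short_ids = (
        prompt_token_ids[:max_log_len] if prompt_token_ids is not None else None
    )
    return short_prompt, short_ids


if __name__ == "__main__":
    args = build_parser().parse_args(["--max-log-len", "5"])
    if not args.disable_log_requests:
        print(shorten_for_log("Hello, world!", [1, 2, 3, 4, 5, 6, 7], args.max_log_len))
```

With `--max-log-len 5`, the sketch would log only the first five characters of the prompt and the first five token IDs; passing `--disable-log-requests` skips the log line entirely.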