Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2026-04-13 16:27:04 +08:00)
[Misc] add AutoGen integration (#18712)
Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Parent: e76be06550
Commit: 0665e29998
docs/deployment/frameworks/autogen.md (new file, 83 lines)
@@ -0,0 +1,83 @@
---
title: AutoGen
---
[](){ #deployment-autogen }

[AutoGen](https://github.com/microsoft/autogen) is a framework for creating multi-agent AI applications that can act autonomously or work alongside humans.

## Prerequisites

- Set up the vLLM environment
- Set up the [AutoGen](https://microsoft.github.io/autogen/0.2/docs/installation/) environment

```console
pip install vllm

# Install AgentChat and OpenAI client from Extensions
# AutoGen requires Python 3.10 or later.
pip install -U "autogen-agentchat" "autogen-ext[openai]"
```

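As a quick sanity check after installation, the sketch below verifies the Python version requirement stated above and reports which of the installed packages cannot be found. The helper names (`python_is_supported`, `missing_modules`) are illustrative and not part of vLLM or AutoGen; the module names are assumed to be the import names that the three pip packages provide.

```python
import importlib.util
import sys

# Import names assumed to correspond to the pip packages
# vllm, autogen-agentchat and autogen-ext.
REQUIRED_MODULES = ["vllm", "autogen_agentchat", "autogen_ext"]


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the interpreter is Python 3.10 or later."""
    return tuple(version_info[:2]) >= (3, 10)


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_is_supported():
        print("AutoGen requires Python 3.10 or later.")
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

`importlib.util.find_spec` only locates the modules without importing them, so the check stays fast even though importing `vllm` itself can be slow.
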
## Deploy

- Start the vLLM server with a supported chat completion model, e.g.

```console
python -m vllm.entrypoints.openai.api_server \
    --model mistralai/Mistral-7B-Instruct-v0.2
```

- Call it with AutoGen:

```python
import asyncio

from autogen_core.models import ModelFamily, UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    # Create a model client
    model_client = OpenAIChatCompletionClient(
        model="mistralai/Mistral-7B-Instruct-v0.2",
        base_url="http://{your-vllm-host-ip}:{your-vllm-host-port}/v1",
        api_key="EMPTY",
        model_info={
            "vision": False,
            "function_calling": False,
            "json_output": False,
            "family": ModelFamily.MISTRAL,
            "structured_output": True,
        },
    )

    messages = [UserMessage(content="Write a very short story about a dragon.", source="user")]

    # Create a stream.
    stream = model_client.create_stream(messages=messages)

    # Iterate over the stream and print the responses.
    print("Streamed responses:")
    async for response in stream:
        if isinstance(response, str):
            # A partial response is a string.
            print(response, flush=True, end="")
        else:
            # The last response is a CreateResult object with the complete message.
            print("\n\n------------\n")
            print("The complete response:", flush=True)
            print(response.content, flush=True)

    # Close the client when done.
    await model_client.close()


asyncio.run(main())
```

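If the AutoGen script fails to connect, it can help to confirm first that the vLLM server is reachable and serving the expected model. The sketch below queries the OpenAI-compatible `/v1/models` endpoint using only the standard library. The helper names are illustrative, and `localhost` with port `8000` (vLLM's default port) are assumptions; substitute the host and port of your own deployment.

```python
import json
import urllib.request


def models_url(host: str, port: int) -> str:
    """Build the URL of the OpenAI-compatible model listing endpoint."""
    return f"http://{host}:{port}/v1/models"


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from a /v1/models style response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_served_models(host: str, port: int, timeout: float = 5.0) -> list[str]:
    """Ask the server which models it serves."""
    with urllib.request.urlopen(models_url(host, port), timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))


if __name__ == "__main__":
    # Assumed host and port; replace with your own values.
    try:
        print(list_served_models("localhost", 8000))
    except OSError as exc:
        print(f"vLLM server not reachable: {exc}")
```

The id returned by the server should match the `model` argument passed to `OpenAIChatCompletionClient`.
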
For details, see the tutorials:

- [Using vLLM in AutoGen](https://microsoft.github.io/autogen/0.2/docs/topics/non-openai-models/local-vllm/)
- [OpenAI-compatible API examples](https://microsoft.github.io/autogen/stable/reference/python/autogen_ext.models.openai.html#autogen_ext.models.openai.OpenAIChatCompletionClient)