vllm/openai at df04dffade84c87cafd74de4c39e6fd7cb95c24f - vllm - 丝路新云 Code Repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-04-15 04:27:03 +08:00


Robert Shaw df04dffade

[V1] [4/N] API Server: ZMQ/MP Utilities (#11541)

2024-12-28 01:45:08 +00:00

..

[Model] IBM Granite 3.1 (#11307)

2024-12-19 11:27:24 +08:00

__init__.py

Change the name to vLLM (#150)

2023-06-17 03:07:40 -07:00

api_server.py

[V1] [4/N] API Server: ZMQ/MP Utilities (#11541)

2024-12-28 01:45:08 +00:00

cli_args.py

[1/N] API Server (Remove Proxy) (#11529)

2024-12-26 23:03:43 +00:00

logits_processors.py

[Bugfix] using len(tokenizer) instead of tokenizer.vocab_size in AllowedTokenIdsLogitsProcessor (#11156)

2024-12-13 15:56:19 +00:00

protocol.py

[Frontend] Online Pooling API (#11457)

2024-12-24 17:54:30 +08:00

run_batch.py

[Frontend] Online Pooling API (#11457)

2024-12-24 17:54:30 +08:00

serving_chat.py

[Feature] Add load generation config from model (#11164)

2024-12-19 10:50:38 +00:00

serving_completion.py

[Feature] Add load generation config from model (#11164)

2024-12-19 10:50:38 +00:00

serving_embedding.py

[Frontend] Online Pooling API (#11457)

2024-12-24 17:54:30 +08:00

serving_engine.py

[Frontend] Separate pooling APIs in offline inference (#11129)

2024-12-13 10:40:07 +00:00

serving_pooling.py

[Frontend] Online Pooling API (#11457)

2024-12-24 17:54:30 +08:00

serving_score.py

[Frontend] Online Pooling API (#11457)

2024-12-24 17:54:30 +08:00

serving_tokenization.py

[Frontend] Use request id from header (#10968)

2024-12-10 13:46:29 +08:00