xinyun/vllm - vllm - 丝路新云 (Silk Road New Cloud) code repository

Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-05-24 14:44:27 +08:00

Author	SHA1	Message	Date
Federico Cassano	66d18a7fb0	add support for tokenizer revision (#1163) Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>	2023-10-02 19:19:46 -07:00
Woosuk Kwon	f936657eb6	Provide default max model length (#1224)	2023-09-28 14:44:02 -07:00
Dan Lord	20f7cc4cde	Add `skip_special_tokens` sampling params (#1186)	2023-09-27 19:21:42 -07:00
Wen Sun	bbbf86565f	Align `max_tokens` behavior with openai (#852)	2023-09-23 18:10:13 -07:00
Ricardo Lu	f98b745a81	feat: support stop_token_ids parameter. (#1097)	2023-09-21 15:34:02 -07:00
Roy	2d1e86f1b1	clean api code, remove redundant background task. (#1102)	2023-09-21 13:25:05 -07:00
Woosuk Kwon	bc0644574c	Add gpu_memory_utilization and swap_space to LLM (#1090)	2023-09-19 22:16:04 -07:00
orellavie1212	fbe66e1d0b	added support for quantize on LLM module (#1080)	2023-09-18 11:04:21 -07:00
Lukas Kreussel	b5f93d0631	Only fail if logit_bias has actual values (#1045)	2023-09-14 17:33:01 -07:00
Jasmond L	ab019eea75	Add Model Revision Support (#1014) Co-authored-by: Jasmond Loh <Jasmond.Loh@hotmail.com> Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>	2023-09-13 15:20:02 -07:00
Antoni Baum	080438477f	Start background task in `AsyncLLMEngine.generate` (#988) Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>	2023-09-08 00:03:39 -07:00
Antoni Baum	c07ece5ca4	Make `AsyncLLMEngine` more robust & fix batched abort (#969) Signed-off-by: Antoni Baum <antoni.baum@protonmail.com> Co-authored-by: Avnish Narayan <38871737+avnishn@users.noreply.github.com>	2023-09-07 13:43:45 -07:00
Antoni Baum	1696725879	Initialize AsyncLLMEngine bg loop correctly (#943)	2023-09-04 17:41:22 -07:00
lplcor	becd7a56f1	Enable request body OpenAPI spec for OpenAI endpoints (#865)	2023-08-29 21:54:08 -07:00
WanMok	e06f504a76	Supports tokens and arrays of tokens as inputs to the OpenAI completion API (#715)	2023-08-11 12:14:34 -07:00
Nicolas Basile	66c54aa9c3	Check the max prompt length for the OpenAI completions API (#472)	2023-08-08 17:43:49 -07:00
YHPeter	e8ddc08ec8	[BUG FIX] upgrade fschat version to 0.2.23 (#650) Co-authored-by: hao.yu <hao.yu@cn-c017.server.mila.quebec>	2023-08-02 14:05:59 -07:00
Zhuohan Li	58a072be15	[Fix] Add model sequence length into model config (#575)	2023-07-25 23:46:30 -07:00
Zhuohan Li	82ad323dee	[Fix] Add chat completion Example and simplify dependencies (#576)	2023-07-25 23:45:48 -07:00
Ricardo Lu	8c4b2592fb	fix: enable trust-remote-code in api server & benchmark. (#509)	2023-07-19 17:06:15 -07:00
Woosuk Kwon	b6fbb9a565	Sort the outputs before return (#402)	2023-07-08 14:48:18 -07:00
codethazine	a945fcc2ae	Add trust-remote-code flag to handle remote tokenizers (#364)	2023-07-07 11:04:58 -07:00
Nicolas Frenay	be54f8e5c4	[Fix] Change /generate response-type to json for non-streaming (#374)	2023-07-06 18:15:17 -07:00
Ricardo Lu	b396cb4998	fix: only response [DONE] once when streaming response. (#378)	2023-07-06 18:08:40 -07:00
akxxsb	3d64cf019e	[Server] use fastchat.model.model_adapter.get_conversation_template method to get model template (#357)	2023-07-04 21:39:59 -07:00
Zhuohan Li	98fe8cb542	[Server] Add option to specify chat template for chat endpoint (#345)	2023-07-03 23:01:56 -07:00
Zhuohan Li	42e0c1df78	[Quality] Add CI for formatting (#343)	2023-07-03 14:50:56 -07:00
Zhuohan Li	d6fa1be3a8	[Quality] Add code formatter and linter (#326)	2023-07-03 11:31:55 -07:00
Zhuohan Li	0ffded812a	[Fix] Better error message for batched prompts (#342)	2023-07-03 09:27:31 -07:00
Michele Catalano	0bd2a573a5	Allow send list of str for the Prompt on openai demo endpoint /v1/completions (#323) * allow str or List[str] for prompt * Update vllm/entrypoints/openai/api_server.py Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>	2023-07-03 09:17:50 -07:00
Ricardo Lu	49b26e2cec	feat: add ChatCompletion endpoint in OpenAI demo server. (#330)	2023-07-02 22:54:33 -07:00
Woosuk Kwon	998d9d1509	[Tokenizer] Add tokenizer mode (#298)	2023-06-28 14:19:22 -07:00
Woosuk Kwon	4338cc4750	[Tokenizer] Add an option to specify tokenizer (#284)	2023-06-28 09:46:58 -07:00
Jishnu Ray Chowdhury	bdd6b4c8bc	Add LLM.set_tokenizer (#283)	2023-06-28 00:28:29 -07:00
Woosuk Kwon	14f0b39cda	[Bugfix] Fix a bug in RequestOutput.finished (#202)	2023-06-22 00:17:24 -07:00
Woosuk Kwon	0b98ba15c7	Change the name to vLLM (#150)	2023-06-17 03:07:40 -07:00


736 Commits