xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-10 23:25:34 +08:00

Author	SHA1	Message	Date
Cody Yu	d11bf435a0	[MISC] Consolidate cleanup() and refactor offline_inference_with_prefix.py (#9510 )	2024-10-18 14:30:55 -07:00
Cyrus Leung	26a68d5d7e	[CI/Build] Add test decorator for minimum GPU memory (#8925 )	2024-09-29 02:50:51 +00:00
alexeykondrat	d1dec64243	[CI/Build][ROCm] Enabling LoRA tests on ROCm (#7369 ) Co-authored-by: Simon Mo <simon.mo@hey.com>	2024-09-04 11:57:54 -07:00
Jee Jee Li	7ecee34321	[Kernel][RFC] Refactor the punica kernel based on Triton (#5036 )	2024-07-31 17:12:24 -07:00
Cyrus Leung	0e9164b40a	[mypy] Enable type checking for test directory (#5017 )	2024-06-15 04:45:31 +00:00
Jee Li	1096717ae9	[Core] Support LoRA on quantized models (#4012 )	2024-04-11 21:02:44 -07:00