xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-28 08:57:05 +08:00

Author	SHA1	Message	Date
Woosuk Kwon	e67b4f2c2a	Use FP32 in RoPE initialization (#1004) Co-authored-by: One <imone@tuta.io>	2023-09-11 00:26:35 -07:00
Antoni Baum	a62de9ecfd	Fix wrong dtype in PagedAttentionWithALiBi bias (#996) --------- Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>	2023-09-09 14:58:35 -07:00
Robert Irvine	4b5bcf8906	faster startup of vLLM (#982) * update --------- Co-authored-by: Robert Irvine <robert@seamlessml.com>	2023-09-08 14:48:54 +09:00
Woosuk Kwon	320a622ec4	[BugFix] Implement RoPE for GPT-J (#941)	2023-09-06 11:54:33 +09:00
Zhuohan Li	002800f081	Align vLLM's beam search implementation with HF generate (#857)	2023-09-04 17:29:42 -07:00
Dong-Yong Lee	e11222333f	fix: bug fix when penalties are negative (#913) Co-authored-by: dongyong-lee <dongyong.lee@navercorp.com>	2023-09-01 00:37:17 +09:00
Aman Gupta Karmani	28873a2799	Improve _prune_hidden_states micro-benchmark (#707)	2023-08-31 13:28:43 +09:00
Aman Gupta Karmani	75471386de	use flash-attn via xformers (#877)	2023-08-29 21:52:13 -07:00
Woosuk Kwon	94d2f59895	Set replacement=True in torch.multinomial (#858)	2023-08-25 12:22:01 +09:00
Woosuk Kwon	2a4ec90854	Fix for breaking changes in xformers 0.0.21 (#834)	2023-08-23 17:44:21 +09:00
Woosuk Kwon	d64bf1646c	Implement approximate GELU kernels (#828)	2023-08-23 07:43:21 +09:00
Abraham-Xu	d1744376ae	Align with huggingface Top K sampling (#753)	2023-08-15 16:44:33 -07:00
Woosuk Kwon	55fe8a81ec	Refactor scheduler (#658)	2023-08-02 16:42:01 -07:00
Zhuohan Li	1b0bd0fe8a	Add Falcon support (new) (#592)	2023-08-02 14:04:39 -07:00
Zhuohan Li	6fc2a38b11	Add support for LLaMA-2 (#505)	2023-07-20 11:38:27 -07:00
Song	bda41c70dd	hotfix attn alibi wo head mapping (#496) Co-authored-by: oliveryuan <oliveryuan@basemind.com>	2023-07-18 11:31:48 -07:00
Zhuohan Li	96853af5a8	Optimize MQA Kernel (#452)	2023-07-14 20:06:40 -04:00
Andre Slavescu	c894836108	[Model] Add support for GPT-J (#226) Co-authored-by: woWoosuk Kwon <woosuk.kwon@berkeley.edu>	2023-07-08 17:55:16 -07:00
Woosuk Kwon	404422f42e	[Model] Add support for MPT (#334)	2023-07-03 16:47:53 -07:00
Woosuk Kwon	e41f06702c	Add support for BLOOM (#331)	2023-07-03 13:12:35 -07:00
Zhuohan Li	d6fa1be3a8	[Quality] Add code formatter and linter (#326)	2023-07-03 11:31:55 -07:00
Lily Liu	425040d4c1	remove floats == 0 comparison (#285)	2023-06-28 14:11:51 -07:00
Michael Feil	298695b766	GPTBigCode (StarCoder, SantaCoder Support) (#209)	2023-06-23 01:49:27 +08:00
Woosuk Kwon	0b98ba15c7	Change the name to vLLM (#150)	2023-06-17 03:07:40 -07:00

1274 Commits