14 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Luka Govedič | 71c60491f2 | [Kernel] Build flash-attn from source (#8245) | 2024-09-20 23:27:10 -07:00 |
| tomeras91 | 386087970a | [CI/Build] build on empty device for better dev experience (#4773) | 2024-08-11 13:09:44 -07:00 |
| Woosuk Kwon | 805a8a75f2 | [Misc] Support attention logits soft-capping with flash-attn (#7022) | 2024-08-01 13:14:37 -07:00 |
| Sage Moore | 7e0861bd0b | [CI/Build] Update PyTorch to 2.4.0 (#6951) (Co-authored-by: Michael Goin <michael@neuralmagic.com>) | 2024-08-01 11:11:24 -07:00 |
| Cody Yu | aa48e502fb | [MISC] Upgrade dependency to PyTorch 2.3.1 (#5327) | 2024-07-12 12:04:26 -07:00 |
| Isotr0py | edd5fe5fa2 | [Bugfix] Add phi3v resize for dynamic shape and fix torchvision requirement (#5772) | 2024-06-24 12:11:53 +08:00 |
| Antoni Baum | 0ab278ca31 | [Core] Remove unnecessary copies in flash attn backend (#5138) | 2024-06-03 09:39:31 -07:00 |
| youkaichao | 5bd3c65072 | [Core][Optimization] remove vllm-nccl (#5091) | 2024-05-29 05:13:52 +00:00 |
| Woosuk Kwon | b57e6c5949 | [Kernel] Add flash-attn back (#4907) | 2024-05-19 18:11:30 -07:00 |
| Woosuk Kwon | 89579a201f | [Misc] Use vllm-flash-attn instead of flash-attn (#4686) | 2024-05-08 13:15:34 -07:00 |
| Michael Goin | d627a3d837 | [Misc] Upgrade to torch==2.3.0 (#4454) | 2024-04-29 20:05:47 -04:00 |
| youkaichao | e4bf860a54 | [CI][Build] change pynvml to nvidia-ml-py (#4302) | 2024-04-23 18:33:12 -07:00 |
| Roy | 8db1bf32f8 | [Misc] Upgrade triton to 2.2.0 (#4061) | 2024-04-14 17:43:54 -07:00 |
| Woosuk Kwon | cfaf49a167 | [Misc] Define common requirements (#3841) | 2024-04-05 00:39:17 -07:00 |