| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| bnellnm | 3cb07a36a2 | [Misc] Upgrade to pytorch 2.5 (#9588) (Signed-off-by: Bill Nell <bill@neuralmagic.com>; Signed-off-by: youkaichao <youkaichao@gmail.com>; Co-authored-by: youkaichao <youkaichao@gmail.com>) | 2024-10-27 09:44:24 +00:00 |
| Luka Govedič | 71c60491f2 | [Kernel] Build flash-attn from source (#8245) | 2024-09-20 23:27:10 -07:00 |
| tomeras91 | 386087970a | [CI/Build] build on empty device for better dev experience (#4773) | 2024-08-11 13:09:44 -07:00 |
| Woosuk Kwon | 805a8a75f2 | [Misc] Support attention logits soft-capping with flash-attn (#7022) | 2024-08-01 13:14:37 -07:00 |
| Sage Moore | 7e0861bd0b | [CI/Build] Update PyTorch to 2.4.0 (#6951) (Co-authored-by: Michael Goin <michael@neuralmagic.com>) | 2024-08-01 11:11:24 -07:00 |
| Cody Yu | aa48e502fb | [MISC] Upgrade dependency to PyTorch 2.3.1 (#5327) | 2024-07-12 12:04:26 -07:00 |
| Isotr0py | edd5fe5fa2 | [Bugfix] Add phi3v resize for dynamic shape and fix torchvision requirement (#5772) | 2024-06-24 12:11:53 +08:00 |
| Antoni Baum | 0ab278ca31 | [Core] Remove unnecessary copies in flash attn backend (#5138) | 2024-06-03 09:39:31 -07:00 |
| youkaichao | 5bd3c65072 | [Core][Optimization] remove vllm-nccl (#5091) | 2024-05-29 05:13:52 +00:00 |
| Woosuk Kwon | b57e6c5949 | [Kernel] Add flash-attn back (#4907) | 2024-05-19 18:11:30 -07:00 |
| Woosuk Kwon | 89579a201f | [Misc] Use vllm-flash-attn instead of flash-attn (#4686) | 2024-05-08 13:15:34 -07:00 |
| Michael Goin | d627a3d837 | [Misc] Upgrade to torch==2.3.0 (#4454) | 2024-04-29 20:05:47 -04:00 |
| youkaichao | e4bf860a54 | [CI][Build] change pynvml to nvidia-ml-py (#4302) | 2024-04-23 18:33:12 -07:00 |
| Roy | 8db1bf32f8 | [Misc] Upgrade triton to 2.2.0 (#4061) | 2024-04-14 17:43:54 -07:00 |
| Woosuk Kwon | cfaf49a167 | [Misc] Define common requirements (#3841) | 2024-04-05 00:39:17 -07:00 |