| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| chenqianfzh | 2f4117c38e | support bitsandbytes quantization with more models (#9148) | 2024-10-08 19:52:19 -06:00 |
| Cyrus Leung | b22b798471 | [Model] PP support for embedding models and update docs (#9090)<br>Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com> | 2024-10-06 16:35:27 +08:00 |
| Xin Yang | 15986f598c | [Model] Support Gemma2 embedding model (#9004) | 2024-10-05 06:57:05 +00:00 |
| Murali Andoorveedu | 0f6d7a9a34 | [Models] Add remaining model PP support (#7168)<br>Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai><br>Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai><br>Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> | 2024-10-04 10:56:58 +08:00 |
| Blueyo0 | 1bf2dd9df0 | [Gemma2] add bitsandbytes support for Gemma2 (#8338) | 2024-09-11 21:53:12 -07:00 |
| afeldman-nm | 428dd1445e | [Core] Logprobs support in Multi-step (#7652) | 2024-08-29 19:19:08 -07:00 |
| Zijian Hu | f4fc7337bf | [Bugfix] support tie_word_embeddings for all models (#5724) | 2024-08-19 20:00:04 -07:00 |
| Woosuk Kwon | df845b2b46 | [Misc] Remove Gemma RoPE (#7638) | 2024-08-19 09:29:31 -07:00 |
| Cyrus Leung | 7025b11d94 | [Bugfix] Fix weight loading for Chameleon when TP>1 (#7410) | 2024-08-13 05:33:41 +00:00 |
| Woosuk Kwon | 805a8a75f2 | [Misc] Support attention logits soft-capping with flash-attn (#7022) | 2024-08-01 13:14:37 -07:00 |
| Michael Goin | f4fd390f5d | [Bugfix] Lower gemma's unloaded_params exception to warning (#7002) | 2024-08-01 12:01:07 -07:00 |
| Lily Liu | 69ec3ca14c | [Kernel][Model] logits_soft_cap for Gemma2 with flashinfer (#6051)<br>Co-authored-by: Simon Mo <simon.mo@hey.com> | 2024-07-04 16:35:51 -07:00 |
| Qubitium-ModelCloud | ee93f4f92a | [CORE] Quantized lm-head Framework (#4442)<br>Co-authored-by: Robert Shaw <rshaw@neuralmagic.com><br>Co-authored-by: ZX <zx@lbx.dev> | 2024-07-02 22:25:17 +00:00 |
| Murali Andoorveedu | c5832d2ae9 | [Core] Pipeline Parallel Support (#4412)<br>Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai> | 2024-07-02 10:58:08 -07:00 |
| Woosuk Kwon | 79c92c7c8a | [Model] Add Gemma 2 (#5908) | 2024-06-27 13:33:56 -07:00 |