Asaf Joseph Gardin | 00b31a36a2 | [V1] [Hybrid] Mamba1 Automatic Prefix Caching (#26377) Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com> | 2025-11-02 04:16:23 -08:00 |
Asaf Joseph Gardin | 2b41cbbf03 | [V1][Mamba1] - FP32 SSM Kernel Support (#23506) Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com> | 2025-09-01 20:53:00 -07:00 |
Asaf Joseph Gardin | 46a13949d5 | [v1] - Mamba1 Attention Metadata (#21249) Signed-off-by: asafg <asafg@ai21.com> Co-authored-by: asafg <asafg@ai21.com> | 2025-08-06 17:03:42 -07:00 |
TJian | c488b928a7 | [ROCm] [Bugfix] [Critical]: Fix mamba compilation bug (#20883) Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> Co-authored-by: vllmellm <vllm.ellm@embeddedllm.com> | 2025-07-14 15:23:28 +08:00 |
Congcong Chen | 2c11a738b3 | [Model] New model support for microsoft/Phi-4-mini-flash-reasoning (#20702) Signed-off-by: Congcong Chen <congcongchen@microsoft.com> | 2025-07-12 06:02:10 -07:00 |
Tyler Michael Smith | e8c3bd2cd1 | [Bugfix] Fix some narrowing conversion warnings (#20141) Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com> | 2025-06-27 09:01:28 -07:00 |
almersawi | a547aeb828 | feat(rocm-support): support mamba2 on rocm (#18565) Signed-off-by: Islam Almersawi <islam.almersawi@openinnovation.ai> Co-authored-by: Islam Almersawi <islam.almersawi@openinnovation.ai> | 2025-05-27 00:07:53 -07:00 |
Lu Fang | 4068f4b5b5 | [MISC] Replace c10::optional with std::optional (#11730) Signed-off-by: Lu Fang <lufang@fb.com> | 2025-01-05 10:20:34 +09:00 |
Mor Zusman | fb60ae9b91 | [Kernel][Model] Improve continuous batching for Jamba and Mamba (#9189) | 2024-10-16 12:12:43 -04:00 |
Mor Zusman | f13a07b1f8 | [Kernel][Model] Varlen prefill + Prefill chunking support for mamba kernels and Jamba model (#8533) | 2024-09-29 17:35:58 -04:00 |
bnellnm | 300da09177 | [Kernel] Fullgraph and opcheck tests (#8479) | 2024-09-25 08:35:52 -06:00 |
Mor Zusman | fdd9daafa3 | [Kernel/Model] Migrate mamba_ssm and causal_conv1d kernels to vLLM (#7651) | 2024-08-28 15:06:52 -07:00 |