vllm/csrc at c45f3c3ab60f4bf4eaab791a76028b8b07ffe9bd - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-19 12:17:16 +08:00

Woosuk Kwon 88c0268a18

Implement custom kernel for LLaMA rotary embedding (#14)

2023-03-30 11:04:21 -07:00

attention_kernels.cu

Implement single_query_cached_kv_attention kernel (#3)

2023-03-01 15:02:19 -08:00

attention_utils.h

Implement single_query_cached_kv_attention kernel (#3)

2023-03-01 15:02:19 -08:00

attention.cpp

Implement single_query_cached_kv_attention kernel (#3)

2023-03-01 15:02:19 -08:00

cache_kernels.cu

Implement custom kernel for LLaMA rotary embedding (#14)

2023-03-30 11:04:21 -07:00

cache.cpp

Support beam search & parallel generation (#7)

2023-03-10 09:58:21 -08:00

cuda_primitives.h

Implement single_query_cached_kv_attention kernel (#3)

2023-03-01 15:02:19 -08:00

pos_encoding_kernels.cu

Implement custom kernel for LLaMA rotary embedding (#14)

2023-03-30 11:04:21 -07:00

pos_encoding.cpp

Implement custom kernel for LLaMA rotary embedding (#14)

2023-03-30 11:04:21 -07:00