vllm/csrc at 74d55c065b104f816fca9c177e044415802796a1 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-18 03:37:09 +08:00

Chip Kerchner 38a1674abb

Support CPU inference with VSX PowerPC ISA (#5652)

2024-06-26 21:53:04 +00:00

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

Support CPU inference with VSX PowerPC ISA (#5652)

2024-06-26 21:53:04 +00:00

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

[Kernel] Add punica dimension for Qwen2 LoRA (#5441)

2024-06-20 17:55:41 -07:00

[Kernel] Adding bias epilogue support for cutlass_scaled_mm (#5560)

2024-06-26 15:16:00 +00:00

activation_kernels.cu

[Model] Port over CLIPVisionModel for VLMs (#5591)

2024-06-20 11:52:09 +00:00

cache_kernels.cu

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

cache.h

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

cuda_compat.h

[Kernel][ROCm][AMD] enable fused topk_softmax kernel for moe layer (#4927)

2024-06-02 14:13:26 -07:00

cuda_utils_kernels.cu

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

cuda_utils.h

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

custom_all_reduce_test.cu

[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722)

2024-05-22 07:18:41 +00:00

custom_all_reduce.cu

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

custom_all_reduce.cuh

[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722)

2024-05-22 07:18:41 +00:00

dispatch_utils.h

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

layernorm_kernels.cu

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

moe_align_block_size_kernels.cu

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

ops.h

Support CPU inference with VSX PowerPC ISA (#5652)

2024-06-26 21:53:04 +00:00

pos_encoding_kernels.cu

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

reduction_utils.cuh

[Kernel] Dynamic Per-Token Activation Quantization (#5037)

2024-06-07 09:36:26 -07:00

registration.h

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

torch_bindings.cpp

[Kernel] Adding bias epilogue support for cutlass_scaled_mm (#5560)

2024-06-26 15:16:00 +00:00