vllm/kernels at dd3b865854c21c99ebc5d1bd34c12936002174c2 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-24 13:37:19 +08:00

Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>

2025-03-13 16:12:42 +08:00

deepgemm

Add benchmark for DeepGEMM and vLLM Block FP8 Dense GEMM (#13917)

2025-03-05 17:08:51 -08:00

benchmark_aqlm.py

[Misc] Add SPDX-License-Identifier headers to python source files (#12628)

2025-02-02 11:58:18 -08:00

benchmark_layernorm.py

[Bugfix] Correctly call cudaProfilerStop in benchmarks script (#14183)

2025-03-07 00:42:49 +00:00

benchmark_lora.py

[V1] LoRA - Add triton kernels for V1 (#13096)

2025-03-10 17:27:53 -04:00

benchmark_machete.py

[Bugfix] Correctly call cudaProfilerStop in benchmarks script (#14183)

2025-03-07 00:42:49 +00:00

benchmark_marlin.py

Update deprecated Python 3.8 typing (#13971)

2025-03-02 17:34:51 -08:00

benchmark_moe.py

[Bugfix] fix benchmark moe (#14653)

2025-03-13 16:12:42 +08:00

benchmark_paged_attention.py

[Bugfix] Correctly call cudaProfilerStop in benchmarks script (#14183)

2025-03-07 00:42:49 +00:00

benchmark_quant.py

[Bugfix] Correctly call cudaProfilerStop in benchmarks script (#14183)

2025-03-07 00:42:49 +00:00

benchmark_rmsnorm.py

Correct capitalisation: VLLM -> vLLM (#14562)

2025-03-10 16:36:21 +00:00

benchmark_rope.py

Update deprecated Python 3.8 typing (#13971)

2025-03-02 17:34:51 -08:00

benchmark_shapes.py

[Misc] Add SPDX-License-Identifier headers to python source files (#12628)

2025-02-02 11:58:18 -08:00

graph_machete_bench.py

Update deprecated Python 3.8 typing (#13971)

2025-03-02 17:34:51 -08:00

requirements.txt

[Kernel] (2/N) Machete - Integrate into CompressedTensorsWNA16 and GPTQMarlin (#7701)

2024-09-23 13:46:26 -04:00

utils.py

Update deprecated Python 3.8 typing (#13971)

2025-03-02 17:34:51 -08:00

weight_shapes.py

[Misc] Add SPDX-License-Identifier headers to python source files (#12628)

2025-02-02 11:58:18 -08:00