vllm/model_executor at 082cc07ef8f810bea61eaed77a60137684ca78f8 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-04 03:11:21 +08:00

Yongye Zhu 082cc07ef8

DP/EP Support for gpt-oss with deepep-ht comm kernel on SM100 (#23608)

2025-08-27 17:33:21 -04:00

DP/EP Support for gpt-oss with deepep-ht comm kernel on SM100 (#23608)

2025-08-27 17:33:21 -04:00

[Quantization] Allow GGUF quantization to skip unquantized layer (#23188)

2025-08-22 13:04:22 -06:00

[V1][Mamba] - Enable V1 by default for Mamba Models (#23650)

2025-08-27 20:53:30 +00:00

[Kernel] Add nvfp4 gemm flashinfer backends (#22346)

2025-08-14 16:03:55 -04:00

__init__.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

custom_op.py

Optimize configuration access with LRU cache in custom ops (#22204)

2025-08-04 21:43:24 -07:00

parameter.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

pooling_metadata.py

[Performance] V1 Pooling Models E2E Performance Optimization (#23162)

2025-08-21 13:26:09 +00:00

sampling_metadata.py

Revert "Update sampling_metadata.py (#21937)" (#22088)

2025-08-01 05:24:46 -07:00

utils.py

[Quantization] Enable BNB support for InternS1 (#21953)

2025-08-01 11:09:54 +00:00