vllm/model_executor at 432870829d5143840c45296b8c1f34e5f561fa85 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-27 15:07:53 +08:00

Lucia Fang 432870829d

[Bugfix] Fix missing per_act_token parameter in compressed_tensors_moe (#20509)

Signed-off-by: Lu Fang <fanglu@fb.com>

2025-07-06 12:08:30 +08:00

guided_decoding

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

[Bugfix] Fix missing per_act_token parameter in compressed_tensors_moe (#20509)

2025-07-06 12:08:30 +08:00

[v1] Re-add fp32 support to v1 engine through FlexAttention (#19754)

2025-07-05 09:41:10 +00:00

Enable V1 for Hybrid SSM/Attention Models (#20016)

2025-07-04 17:46:53 +00:00

__init__.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

custom_op.py

[Fix][torch.compile] Enable custom ops by default when Inductor off (#20102)

2025-06-27 09:00:42 -06:00

parameter.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

pooling_metadata.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

sampling_metadata.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

utils.py

[Quant] [Bugfix] Fix quantization config matching with hf_to_vllm_mapper (#20046)

2025-07-01 19:20:34 +09:00