vllm/model_executor at 01513a334a451e53162a2526ae28caba7fa868d4 - vllm - 丝路新云 Code Repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-04-27 21:37:04 +08:00

Nir David 01513a334a

Support FP8 Quantization and Inference Run on Intel Gaudi (HPU) using INC (Intel Neural Compressor) (#12010)

Signed-off-by: Nir David <ndavid@habana.ai>
Signed-off-by: Uri Livne <ulivne@habana.ai>
Co-authored-by: Uri Livne <ulivne@habana.ai>

2025-07-16 15:33:41 -04:00

guided_decoding

[V0][V1][Core] Add outlines integration for V1, and update V0 integration. (#15975)

2025-07-10 15:30:26 -04:00

Support FP8 Quantization and Inference Run on Intel Gaudi (HPU) using INC (Intel Neural Compressor) (#12010)

2025-07-16 15:33:41 -04:00

Support FP8 Quantization and Inference Run on Intel Gaudi (HPU) using INC (Intel Neural Compressor) (#12010)

2025-07-16 15:33:41 -04:00

[Model] Remove model sampler (#21059)

2025-07-16 19:03:37 +00:00

__init__.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

custom_op.py

[Fix][torch.compile] Enable custom ops by default when Inductor off (#20102)

2025-06-27 09:00:42 -06:00

parameter.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

pooling_metadata.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

sampling_metadata.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

utils.py

[Quant] [Bugfix] Fix quantization config matching with hf_to_vllm_mapper (#20046)

2025-07-01 19:20:34 +09:00