vllm/model_executor at c5b4b11d7f4b69160d6a0d99771cb5c04e923a8d - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-08-01 10:34:27 +08:00

Isotr0py c5b4b11d7f

[Bugfix] Fix k_proj's bias for whisper self attention (#12342)

Signed-off-by: Isotr0py <2037008807@qq.com>

2025-01-23 10:15:33 +00:00

guided_decoding

[bugfix] catch xgrammar unsupported array constraints (#12210)

2025-01-20 16:42:02 -08:00

[AMD][Quantization] Add TritonScaledMMLinearKernel since int8 is broken for AMD (#12282)

2025-01-23 00:10:37 +00:00

[Misc] Improve the readability of BNB error messages (#12320)

2025-01-22 16:56:54 +00:00

[Bugfix] Fix k_proj's bias for whisper self attention (#12342)

2025-01-23 10:15:33 +00:00

__init__.py

[Performance] Optimize e2e overheads: Reduce python allocations (#7162)

2024-08-08 21:34:28 -07:00

custom_op.py

[platform] support pytorch custom op pluggable (#11328)

2025-01-10 10:02:38 +00:00

parameter.py

[Misc][Quark] Upstream Quark format to VLLM (#10765)

2025-01-15 11:05:15 -05:00

pooling_metadata.py

[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)

2024-05-11 11:30:37 -07:00

sampling_metadata.py

[Misc] typo find in sampling_metadata.py (#10740)

2024-11-29 05:17:57 +00:00

utils.py

[platforms] enable platform plugins (#11602)

2024-12-30 20:24:45 +08:00