vllm/model_executor at b09c755be89edaaca7c9e010f423545f0cd014b4 - vllm - 丝路新云 code repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-26 01:37:16 +08:00

Isotr0py b09c755be8

[Bugfix] Fix phi3v incorrect image_idx when using async engine (#7916)

2024-08-27 17:36:09 +00:00

guided_decoding

[misc][core] lazy import outlines (#7831)

2024-08-24 00:51:38 -07:00

[Misc] Update compressed tensors lifecycle to remove prefix from create_weights (#7825)

2024-08-26 18:09:34 -06:00

Fix ShardedStateLoader for vllm fp8 quantization (#7708)

2024-08-22 08:25:04 -04:00

[Bugfix] Fix phi3v incorrect image_idx when using async engine (#7916)

2024-08-27 17:36:09 +00:00

__init__.py

[Performance] Optimize e2e overheads: Reduce python allocations (#7162)

2024-08-08 21:34:28 -07:00

custom_op.py

[XPU] fallback to native implementation for xpu custom op (#7670)

2024-08-20 00:26:09 -07:00

parameter.py

[Misc] update fp8 to use vLLMParameter (#7437)

2024-08-22 08:36:18 -04:00

pooling_metadata.py

[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)

2024-05-11 11:30:37 -07:00

sampling_metadata.py

[Core] Optimize SPMD architecture with delta + serialization optimization (#7109)

2024-08-18 17:57:20 -07:00

utils.py

[Hardware][Neuron] Refactor neuron support (#3471)

2024-03-22 01:22:17 +00:00