vllm/model_executor at 9f68e00d27b0f8252549be3adbb47c5b735a8103 - vllm - 丝路新云 Code Repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-18 09:16:58 +08:00

Cyrus Leung 9f68e00d27

[Bugfix] Fix broken OpenAI tensorizer test (#8258)

2024-09-07 08:02:39 +00:00

guided_decoding

[Feature] OpenAI-Compatible Tools API + Streaming for Hermes & Mistral models (#5649)

2024-09-04 13:18:13 -07:00

[Misc] Remove SqueezeLLM (#8220)

2024-09-06 16:29:03 -06:00

[Bugfix] Fix broken OpenAI tensorizer test (#8258)

2024-09-07 08:02:39 +00:00

[Model] Multi-input support for LLaVA (#8238)

2024-09-07 02:57:24 +00:00

__init__.py

[Performance] Optimize e2e overheads: Reduce python allocations (#7162)

2024-08-08 21:34:28 -07:00

custom_op.py

[XPU] fallback to native implementation for xpu custom op (#7670)

2024-08-20 00:26:09 -07:00

parameter.py

[Misc] Update GPTQ to use vLLMParameters (#7976)

2024-09-03 17:21:44 -04:00

pooling_metadata.py

[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)

2024-05-11 11:30:37 -07:00

sampling_metadata.py

[Core] Optimize SPMD architecture with delta + serialization optimization (#7109)

2024-08-18 17:57:20 -07:00

utils.py

[Hardware][Neuron] Refactor neuron support (#3471)

2024-03-22 01:22:17 +00:00