vllm/model_executor at f775a07e30fdeafc14f53fe502b262b00540dd71 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-11 10:54:45 +08:00

Breno Faria f775a07e30

[FRONTEND] OpenAI tools support named functions (#5032)

2024-06-03 18:25:29 -05:00

guided_decoding

[FRONTEND] OpenAI tools support named functions (#5032)

2024-06-03 18:25:29 -05:00

[Kernel] Pass a device pointer into the quantize kernel for the scales (#5159)

2024-06-03 09:52:30 -07:00

[Feature][Kernel] Support bitsandbytes quantization and QLoRA (#4776)

2024-06-01 14:51:10 -06:00

[Core] Support image processor (#4197)

2024-06-02 22:56:41 -07:00

__init__.py

[Core] Refactor Attention Take 2 (#3462)

2024-03-25 04:39:33 +00:00

pooling_metadata.py

[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)

2024-05-11 11:30:37 -07:00

sampling_metadata.py

[Core][Model runner refactoring 1/N] Refactor attn metadata term (#4518)

2024-05-03 10:20:12 -07:00

utils.py

[Hardware][Neuron] Refactor neuron support (#3471)

2024-03-22 01:22:17 +00:00