vllm/model_executor at b9c0605a8e7d558f595bd59ba6e6c95578dc0f1e - vllm - 丝路新云 Code Repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-11 02:51:21 +08:00

chenqianfzh b9c0605a8e

[Feature][Kernel] Support bitsandbytes quantization and QLoRA (#4776)

2024-06-01 14:51:10 -06:00

guided_decoding

Allow user to define whitespace pattern for outlines (#4305)

2024-04-30 20:48:39 -07:00

[Feature][Kernel] Support bitsandbytes quantization and QLoRA (#4776)

2024-06-01 14:51:10 -06:00

[Feature][Kernel] Support bitsandbytes quantization and QLoRA (#4776)

2024-06-01 14:51:10 -06:00

[Feature][Kernel] Support bitsandbytes quantization and QLoRA (#4776)

2024-06-01 14:51:10 -06:00

__init__.py

[Core] Refactor Attention Take 2 (#3462)

2024-03-25 04:39:33 +00:00

pooling_metadata.py

[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)

2024-05-11 11:30:37 -07:00

sampling_metadata.py

[Core][Model runner refactoring 1/N] Refactor attn metadata term (#4518)

2024-05-03 10:20:12 -07:00

utils.py

[Hardware][Neuron] Refactor neuron support (#3471)

2024-03-22 01:22:17 +00:00