vllm/lora at af9ad46fca6e594797b83e5ecb2e1f31ca5e9fac - vllm - 丝路新云 code repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-24 06:37:12 +08:00

SangBin Cho f5e73c9f1b

[Lora] Use safetensor keys instead of adapter_config.json to find unexpected modules. (#5909)

Co-authored-by: sang <sangcho@anyscale.com>

2024-06-30 17:11:15 +00:00

__init__.py

[Experimental] Add multi-LoRA support (#1804)

2024-01-23 15:26:37 -08:00

fully_sharded_layers.py

[Bugfix] Add fully sharded layer for QKVParallelLinearWithLora (#5665)

2024-06-21 04:46:28 +00:00

layers.py

[Model] Add Gemma 2 (#5908)

2024-06-27 13:33:56 -07:00

lora.py

[Model] Add base class for LoRA-supported models (#5018)

2024-06-27 16:03:04 +08:00

models.py

[Lora] Use safetensor keys instead of adapter_config.json to find unexpected modules. (#5909)

2024-06-30 17:11:15 +00:00

punica.py

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

request.py

[Lora] Support long context lora (#4787)

2024-05-18 16:05:23 +09:00

utils.py

[Bugfix] Add fully sharded layer for QKVParallelLinearWithLora (#5665)

2024-06-21 04:46:28 +00:00

worker_manager.py

[LoRA] Add support for pinning lora adapters in the LRU cache (#5603)

2024-06-21 15:42:46 -07:00