vllm/vllm/lora
Latest commit: 4d6ada947c by Swapnil Parekh, 2024-07-09 13:26:36 -07:00
[CORE] Adding support for insertion of soft-tuned prompts (#4645)
Co-authored-by: Swapnil Parekh <swapnilp@ibm.com>
Co-authored-by: Joe G <joseph.granados@h2o.ai>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
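That latest commit (#4645) threads "soft-tuned prompt" (prompt adapter) support through this package. A rough sketch of the request-side API it appears to introduce follows; PromptAdapterRequest, enable_prompt_adapter, max_prompt_adapter_token, and the prompt_adapter_request keyword are assumptions based on that PR, and the model name and adapter path are placeholders, not anything prescribed by this listing.

from vllm import LLM, SamplingParams
from vllm.prompt_adapter.request import PromptAdapterRequest  # assumed module added by #4645

# The engine must opt in to prompt adapters; max_prompt_adapter_token caps
# how many virtual tokens a soft prompt may prepend (assumed flag names).
llm = LLM(model="bigscience/bloomz-560m",
          enable_prompt_adapter=True,
          max_prompt_adapter_token=8)

# A request names the adapter, assigns a unique positive integer id, points
# at the trained soft-prompt weights on disk, and states the virtual-token
# count ("/path/to/prompt_adapter" is a placeholder).
pa = PromptAdapterRequest("twitter-complaints", 1, "/path/to/prompt_adapter", 8)

outputs = llm.generate(["Tweet text: I hate this airline. Label:"],
                       SamplingParams(max_tokens=4),
                       prompt_adapter_request=pa)
print(outputs[0].outputs[0].text)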
File                     Last commit                                                              Date
__init__.py              [Experimental] Add multi-LoRA support (#1804)                            2024-01-23 15:26:37 -08:00
fully_sharded_layers.py  [Bugfix] Add fully sharded layer for QKVParallelLinearWithLora (#5665)  2024-06-21 04:46:28 +00:00
layers.py                [CORE] Adding support for insertion of soft-tuned prompts (#4645)       2024-07-09 13:26:36 -07:00
lora.py                  [Model] Add base class for LoRA-supported models (#5018)                2024-06-27 16:03:04 +08:00
models.py                [CORE] Adding support for insertion of soft-tuned prompts (#4645)       2024-07-09 13:26:36 -07:00
punica.py                [hardware][misc] introduce platform abstraction (#6080)                 2024-07-02 20:12:22 -07:00
request.py               [CORE] Adding support for insertion of soft-tuned prompts (#4645)       2024-07-09 13:26:36 -07:00
utils.py                 [Bugfix] Add fully sharded layer for QKVParallelLinearWithLora (#5665)  2024-06-21 04:46:28 +00:00
worker_manager.py        [CORE] Adding support for insertion of soft-tuned prompts (#4645)       2024-07-09 13:26:36 -07:00
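For orientation, a minimal sketch of how the files above fit together at the user level: request.py defines LoRARequest, which the engine resolves into loaded adapter weights via the manager machinery in models.py and worker_manager.py. This mirrors vLLM's documented multi-LoRA usage (#1804) rather than this directory's internals; the model name and adapter path are placeholders.

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest  # defined in request.py above

# enable_lora switches on the multi-LoRA path (#1804); max_loras caps how
# many adapters can be active in a single batch.
llm = LLM(model="meta-llama/Llama-2-7b-hf", enable_lora=True, max_loras=2)

# Each adapter gets a human-readable name, a unique positive integer id,
# and a local path to its weights ("/path/to/sql_adapter" is a placeholder).
sql_lora = LoRARequest("sql-adapter", 1, "/path/to/sql_adapter")

outputs = llm.generate(["Translate to SQL: list all users"],
                       SamplingParams(temperature=0.0, max_tokens=64),
                       lora_request=sql_lora)
print(outputs[0].outputs[0].text)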