Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2025-12-09 15:44:57 +08:00)
[Doc] Update docs for New Model Implementation (#20115)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
parent 65397e40f5
commit 1d7c29f5fe
@@ -48,7 +48,12 @@ nav:
     - General:
       - glob: contributing/*
         flatten_single_child_sections: true
-    - Model Implementation: contributing/model
+    - Model Implementation:
+      - contributing/model/README.md
+      - contributing/model/basic.md
+      - contributing/model/registration.md
+      - contributing/model/tests.md
+      - contributing/model/multimodal.md
     - Design Documents:
       - V0: design
       - V1: design/v1
@@ -1,21 +1,23 @@
 ---
-title: Adding a New Model
+title: Summary
 ---
 [](){ #new-model }
 
-This section provides more information on how to integrate a [PyTorch](https://pytorch.org/) model into vLLM.
+!!! important
+    Many decoder language models can now be automatically loaded using the [Transformers backend][transformers-backend] without having to implement them in vLLM. See if `vllm serve <model>` works first!
 
-Contents:
+vLLM models are specialized [PyTorch](https://pytorch.org/) models that take advantage of various [features][compatibility-matrix] to optimize their performance.
 
-- [Basic](basic.md)
-- [Registration](registration.md)
-- [Tests](tests.md)
-- [Multimodal](multimodal.md)
+The complexity of integrating a model into vLLM depends heavily on the model's architecture.
+The process is considerably straightforward if the model shares a similar architecture with an existing model in vLLM.
+However, this can be more complex for models that include new operators (e.g., a new attention mechanism).
 
-!!! note
-    The complexity of adding a new model depends heavily on the model's architecture.
-    The process is considerably straightforward if the model shares a similar architecture with an existing model in vLLM.
-    However, for models that include new operators (e.g., a new attention mechanism), the process can be a bit more complex.
+Read through these pages for a step-by-step guide:
+
+- [Implementing a Basic Model](basic.md)
+- [Registering a Model to vLLM](registration.md)
+- [Writing Unit Tests](tests.md)
+- [Multi-Modal Support](multimodal.md)
 
 !!! tip
     If you are encountering issues while integrating your model into vLLM, feel free to open a [GitHub issue](https://github.com/vllm-project/vllm/issues)