diff --git a/README.md b/README.md
index 67fd8900d013..be0f23677728 100644
--- a/README.md
+++ b/README.md
@@ -39,6 +39,13 @@ vLLM is flexible and easy to use with:
 - Streaming outputs
 - OpenAI-compatible API server
 
+vLLM seamlessly supports many HuggingFace models, including the following architectures:
+
+- GPT-2 (e.g., `gpt2`, `gpt2-xl`, etc.)
+- GPTNeoX (e.g., `EleutherAI/gpt-neox-20b`, `databricks/dolly-v2-12b`, `stabilityai/stablelm-tuned-alpha-7b`, etc.)
+- LLaMA (e.g., `lmsys/vicuna-13b-v1.3`, `young-geng/koala`, `openlm-research/open_llama_13b`, etc.)
+- OPT (e.g., `facebook/opt-66b`, `facebook/opt-iml-max-30b`, etc.)
+
 Install vLLM with pip or [from source](https://llm-serving-cacheflow.readthedocs-hosted.com/en/latest/getting_started/installation.html#build-from-source):
 
 ```bash
diff --git a/docs/source/conf.py b/docs/source/conf.py
index c20e8075f79b..74d4aceacbfc 100644
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
@@ -53,7 +53,9 @@ copybutton_prompt_is_regexp = True
 # html_title = project
 
 html_theme = 'sphinx_book_theme'
+html_logo = 'assets/logos/vllm-logo-text-light.png'
 html_theme_options = {
+    'logo_only': True,
     'path_to_docs': 'docs/source',
     'repository_url': 'https://github.com/WoosukKwon/vllm',
     'use_repository_button': True,
diff --git a/docs/source/models/supported_models.rst b/docs/source/models/supported_models.rst
index cdbca3788259..b5ce4eb66811 100644
--- a/docs/source/models/supported_models.rst
+++ b/docs/source/models/supported_models.rst
@@ -8,19 +8,24 @@ The following is the list of model architectures that are currently supported by
 Alongside each architecture, we include some popular models that use it.
 
 .. list-table::
-  :widths: 25 75
+  :widths: 25 25 50
   :header-rows: 1
 
   * - Architecture
     - Models
+    - Example HuggingFace Models
   * - :code:`GPT2LMHeadModel`
     - GPT-2
+    - :code:`gpt2`, :code:`gpt2-xl`, etc.
   * - :code:`GPTNeoXForCausalLM`
     - GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM
+    - :code:`EleutherAI/gpt-neox-20b`, :code:`EleutherAI/pythia-12b`, :code:`OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5`, :code:`databricks/dolly-v2-12b`, :code:`stabilityai/stablelm-tuned-alpha-7b`, etc.
   * - :code:`LlamaForCausalLM`
     - LLaMA, Vicuna, Alpaca, Koala, Guanaco
+    - :code:`openlm-research/open_llama_13b`, :code:`lmsys/vicuna-13b-v1.3`, :code:`young-geng/koala`, :code:`JosephusCheung/Guanaco`, etc.
   * - :code:`OPTForCausalLM`
     - OPT, OPT-IML
+    - :code:`facebook/opt-66b`, :code:`facebook/opt-iml-max-30b`, etc.
 
 If your model uses one of the above model architectures, you can seamlessly run your model with vLLM.
 Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` for instructions on how to implement support for your model.
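For reference, any checkpoint in the tables above can be run through vLLM's offline `LLM` / `SamplingParams` entry points. The following is a minimal sketch, not part of the patch itself; `gpt2` (from the list above) stands in for any supported model, and the prompts are placeholders.

```python
# Minimal sketch of serving one of the supported architectures with vLLM.
# Assumes vLLM is installed (pip install vllm); `gpt2` is taken from the
# supported-models table and can be swapped for any other listed checkpoint.
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The capital of France is",
]
# Nucleus sampling; vLLM batches and schedules these requests internally.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Downloads the HuggingFace weights on first use and loads them into the engine.
llm = LLM(model="gpt2")

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(f"Prompt: {output.prompt!r}, Generated: {output.outputs[0].text!r}")
```

Because vLLM dispatches on the model's `config.json` architecture string (e.g., `GPT2LMHeadModel`), the same snippet works unchanged for any checkpoint whose architecture appears in the table.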