docs: add instruction for langchain (#1162)
commit 05a38612b0 (parent d27f4bae39)

@@ -66,6 +66,7 @@ Documentation
    serving/run_on_sky
    serving/deploying_with_triton
    serving/deploying_with_docker
+   serving/serving_with_langchain
 
 .. toctree::
    :maxdepth: 1

docs/source/serving/serving_with_langchain.rst (new file, 31 lines)
@@ -0,0 +1,31 @@
.. _run_on_langchain:

Serving with Langchain
============================

vLLM is also available via `Langchain <https://github.com/langchain-ai/langchain>`_.

To install langchain, run

.. code-block:: console

    $ pip install langchain -q
To run inference on a single or multiple GPUs, use the ``VLLM`` class from ``langchain``.

.. code-block:: python

    from langchain.llms import VLLM

    llm = VLLM(model="mosaicml/mpt-7b",
               trust_remote_code=True,  # mandatory for hf models
               max_new_tokens=128,
               top_k=10,
               top_p=0.95,
               temperature=0.8,
               # tensor_parallel_size=... # for distributed inference
    )

    print(llm("What is the capital of France ?"))
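
To shard the model across several GPUs, set the ``tensor_parallel_size`` argument mentioned in the comment above. The snippet below is a minimal sketch; the value ``4`` is only illustrative, use the number of GPUs actually available on your machine.

.. code-block:: python

    from langchain.llms import VLLM

    # Tensor-parallel inference across 4 GPUs (illustrative value).
    llm = VLLM(model="mosaicml/mpt-7b",
               trust_remote_code=True,  # mandatory for hf models
               tensor_parallel_size=4,
    )

    print(llm("What is the capital of France ?"))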

Please refer to this `Tutorial <https://github.com/langchain-ai/langchain/blob/master/docs/extras/integrations/llms/vllm.ipynb>`_ for more details.
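
Because the ``VLLM`` object behaves like any other langchain LLM, it can also be composed with standard langchain components. The snippet below is a minimal sketch, assuming langchain's ``PromptTemplate`` and ``LLMChain`` interfaces:

.. code-block:: python

    from langchain.chains import LLMChain
    from langchain.llms import VLLM
    from langchain.prompts import PromptTemplate

    # The same vLLM-backed model as above.
    llm = VLLM(model="mosaicml/mpt-7b", trust_remote_code=True)

    # A simple prompt with a single input variable.
    prompt = PromptTemplate(
        template="Question: {question}\nAnswer:",
        input_variables=["question"],
    )

    # Chain the prompt and the model together, then run a query.
    llm_chain = LLMChain(prompt=prompt, llm=llm)
    print(llm_chain.run("What is the capital of France ?"))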