vllm/docs/source/serving
Latest commit: Cyrus Leung 06386a64dd [Frontend] Chat-based Embeddings API (#9759), 2024-11-01 08:13:35 +00:00
| File | Last commit | Date |
|------|-------------|------|
| compatibility_matrix.rst | [Bugfix][Frontend] Reject guided decoding in multistep mode (#9892) | 2024-11-01 01:09:46 +00:00 |
| deploying_with_bentoml.rst | … | |
| deploying_with_cerebrium.rst | … | |
| deploying_with_docker.rst | … | |
| deploying_with_dstack.rst | … | |
| deploying_with_k8s.rst | … | |
| deploying_with_kserve.rst | … | |
| deploying_with_lws.rst | … | |
| deploying_with_nginx.rst | [Hardware][Intel CPU][DOC] Update docs for CPU backend (#6212) | 2024-10-22 10:38:04 -07:00 |
| deploying_with_triton.rst | … | |
| distributed_serving.rst | [doc] update pp support (#9853) | 2024-10-30 13:36:51 -07:00 |
| env_vars.rst | … | |
| faq.rst | … | |
| integrations.rst | … | |
| metrics.rst | … | |
| openai_compatible_server.md | [Frontend] Chat-based Embeddings API (#9759) | 2024-11-01 08:13:35 +00:00 |
| run_on_sky.rst | [Misc][OpenAI] deprecate max_tokens in favor of new max_completion_tokens field for chat completion endpoint (#9837) | 2024-10-30 18:15:56 -07:00 |
| serving_with_langchain.rst | … | |
| serving_with_llamaindex.rst | … | |
| tensorizer.rst | [Doc]: Update tensorizer docs to include vllm[tensorizer] (#7889) | 2024-10-22 15:43:25 -07:00 |
| usage_stats.md | … | |