mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-05-31 06:37:04 +08:00
[Docs] Improve docstring for ray data llm example (#20597)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
parent 0d914c81a2
commit e60d422f19
@@ -3,17 +3,19 @@
 """
 This example shows how to use Ray Data for data parallel batch inference.
 
-Ray Data is a data processing framework that can handle large datasets
-and integrates tightly with vLLM for data-parallel inference.
-
-As of Ray 2.44, Ray Data has a native integration with
-vLLM (under ray.data.llm).
+Ray Data is a data processing framework that can process very large datasets
+with first-class support for vLLM.
 
 Ray Data provides functionality for:
-* Reading and writing to cloud storage (S3, GCS, etc.)
-* Automatic sharding and load-balancing across a cluster
-* Optimized configuration of vLLM using continuous batching
-* Compatible with tensor/pipeline parallel inference as well.
+* Reading and writing to most popular file formats and cloud object storage.
+* Streaming execution, so you can run inference on datasets that far exceed
+  the aggregate RAM of the cluster.
+* Scale up the workload without code changes.
+* Automatic sharding, load-balancing, and autoscaling across a Ray cluster,
+  with built-in fault-tolerance and retry semantics.
+* Continuous batching that keeps vLLM replicas saturated and maximizes GPU
+  utilization.
+* Compatible with tensor/pipeline parallel inference.
 
 Learn more about Ray Data's LLM integration:
 https://docs.ray.io/en/latest/data/working-with-llms.html
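The "streaming execution" bullet in the new docstring can be illustrated with a small, library-free sketch. This is not Ray Data's implementation and does not use the `ray.data.llm` API; it only shows the general idea of pulling rows lazily and processing them in fixed-size batches so that the full dataset is never held in memory. The names `iter_batches`, `stream_inference`, and `fake_generate` are hypothetical, and `fake_generate` is a stand-in for a real model call.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def iter_batches(rows: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield fixed-size batches lazily; only one batch is materialized at a time."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def stream_inference(
    rows: Iterable[str],
    generate: Callable[[List[str]], List[str]],
    batch_size: int = 4,
) -> Iterator[str]:
    """Apply `generate` batch by batch, yielding outputs as they are produced."""
    for batch in iter_batches(rows, batch_size):
        yield from generate(batch)


def fake_generate(prompts: List[str]) -> List[str]:
    # Hypothetical stand-in for a model call; a real pipeline would call vLLM here.
    return [p.upper() for p in prompts]


if __name__ == "__main__":
    # A generator source: prompts are created on demand, never all at once.
    prompts = (f"prompt {i}" for i in range(10))
    for out in stream_inference(prompts, fake_generate, batch_size=4):
        print(out)
```

Because both the source and the output are iterators, peak memory in this sketch is bounded by one batch rather than by the dataset size. Distribution across a cluster, autoscaling, and fault tolerance, which the docstring attributes to Ray Data, are outside the scope of this sketch.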