vllm/metrics at fe69f331f84d99541564dfe4852dd45220ed7875 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-01 12:47:24 +08:00

[Metrics] Fix KV cache usage percent metric multiproc (#28792)

The `vllm:kv_cache_usage_perc` Gauge metric is missing `multiprocess_mode="mostrecent"`, so in multiprocess mode it ends up returning one sample per process:

```
vllm:kv_cache_usage_perc{engine="0",model_name="Qwen/Qwen3-VL-8B-Instruct",pid="277"} 0.0
vllm:kv_cache_usage_perc{engine="0",model_name="Qwen/Qwen3-VL-8B-Instruct",pid="275"} 0.0
vllm:kv_cache_usage_perc{engine="0",model_name="Qwen/Qwen3-VL-8B-Instruct",pid="273"} 0.6530455880475035
...
```

The deprecated `vllm:gpu_cache_usage_perc` Gauge metric already sets `multiprocess_mode="mostrecent"`, so it does not have this problem.

Signed-off-by: Jae-Won Chung <jwnchung@umich.edu>

2025-11-17 09:54:15 +00:00
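
The effect described in the commit message above can be illustrated with a small, simplified model. This is a sketch under stated assumptions, not the code from the PR and not `prometheus_client`'s implementation: the `GaugeSample` type, the `aggregate` helper, and the timestamps are hypothetical, and only the metric values and pids come from the output quoted above. It assumes that a process which never set the gauge reports `0.0` with a timestamp of `0.0`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GaugeSample:
    """Hypothetical record of one process's last write to a gauge."""
    pid: int          # process that holds the value
    value: float      # last value that process set (0.0 if never set)
    timestamp: float  # time of the last write (0.0 if never set; assumption)


def aggregate(samples: list[GaugeSample], mode: str) -> dict[str, float]:
    """Simplified model of exposing per-process gauge values.

    "all":        one series per process, distinguished by a pid label.
    "mostrecent": a single series holding the most recently written value.
    """
    if mode == "all":
        return {f'pid="{s.pid}"': s.value for s in samples}
    if mode == "mostrecent":
        latest = max(samples, key=lambda s: s.timestamp)
        return {"": latest.value}
    raise ValueError(f"unsupported mode in this sketch: {mode!r}")


# Values and pids taken from the output above; timestamps are made up.
samples = [
    GaugeSample(pid=277, value=0.0, timestamp=0.0),
    GaugeSample(pid=275, value=0.0, timestamp=0.0),
    GaugeSample(pid=273, value=0.6530455880475035, timestamp=1763373255.0),
]

per_process = aggregate(samples, "all")        # three series, two of them 0.0
single = aggregate(samples, "mostrecent")      # one series with the live value

for labels, value in per_process.items():
    print(f"vllm:kv_cache_usage_perc{{{labels}}} {value}")
print(f"vllm:kv_cache_usage_perc {single['']}")
```

In this model, the per-process view reproduces the three `pid`-labelled series from the report, while the most-recent view collapses them into the single value written by the process that actually updates the gauge.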

__init__.py

[V1][Core][1/n] Logging and Metrics (#11962)

2025-01-12 21:02:02 +00:00

loggers.py

[Metrics] Fix KV cache usage percent metric multiproc (#28792)

2025-11-17 09:54:15 +00:00

prometheus.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00

ray_wrappers.py

[KVConnector] Add metrics to Prometheus-Grafana dashboard (#26811)