Seiji Eicher
b2e65cb4a7
[benchmark] Make request IDs unique across clients by default ( #27723 )
...
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
2025-10-30 17:40:35 -07:00
Harry Mellor
6c9fdbf725
[Docs] Replace rst style double-backtick with md single-backtick ( #27091 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-17 02:47:34 -07:00
Tomas Ruiz
965c5f4914
vllm bench serve shows num of failed requests ( #26478 )
...
Signed-off-by: Tomas Ruiz <tomas.ruiz.te@gmail.com>
2025-10-16 19:55:09 -07:00
kimbochen
013abde6ef
Adding Warmup to Benchmark Serving ( #26943 )
...
Signed-off-by: Kimbo Chen <chentenghung@gmail.com>
2025-10-16 12:44:32 -07:00
Cyrus Leung
334535b6fb
[Benchmark] Show E2EL by default for pooling models ( #27014 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-16 12:47:09 +00:00
rongfu.leng
a27b288e4a
[Feature] default --extra-body param to disable thinking in vllm bench serve ( #26784 )
...
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-10-15 04:23:44 +00:00
Harry Mellor
8fcaaf6a16
Update Optional[x] -> x | None and Union[x, y] to x | y ( #26633 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-12 09:51:31 -07:00
Cyrus Leung
dc7976dd9f
[Misc] Upgrade more code to Python 3.10 ( #26463 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-09 10:43:53 +01:00
Cyrus Leung
44b9af5bb2
[Benchmark] Enable MM Embedding benchmarks ( #26310 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-06 19:51:58 +00:00
Harry Mellor
d6953beb91
Convert formatting to use ruff instead of yapf + isort ( #26247 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
Sergei Skvortsov
b71fcd4905
[Misc] Add penalties sampling parameters to serve tool ( #25974 )
...
Signed-off-by: Sergei Skvortsov <sergeyskv@nebius.com>
Co-authored-by: Sergei Skvortsov <sergeyskv@nebius.com>
2025-10-03 15:43:14 -07:00
Nathan Scott
f9e714813a
[Benchmark] Finish documented v0.11.0 deprecation of --endpoint-type ( #26007 )
...
Signed-off-by: Nathan Scott <nathans@redhat.com>
2025-10-01 12:41:57 +00:00
Lucia Fang
eea1783989
[benchmarks]allow skip ready check for bench serve ( #25420 )
...
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: Lucia Fang <116399278+luccafong@users.noreply.github.com>
Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com>
2025-09-23 03:21:48 +00:00
Roger Wang
21da73343a
[Misc] Clean up flags in vllm bench serve ( #25138 )
...
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-09-18 12:43:33 +00:00
Simon Mo
a904ea78ea
[benchmark] add peak throughput metrics and plot ( #23867 )
...
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-09-17 22:30:02 -07:00
samzong
47f670b03b
[Docs] improve code formatting and comments for eliminate griffe build warning. ( #25010 )
...
Signed-off-by: samzong <samzong.lu@gmail.com>
2025-09-17 07:31:20 -07:00
Clayton Coleman
bc636f21a6
[Benchmark] Allow arbitrary headers to be passed to benchmarked endpoints ( #23937 )
...
Signed-off-by: Clayton Coleman <smarterclayton@gmail.com>
2025-09-12 13:57:53 -07:00
co63oc
1bd007f234
fix some typos ( #24071 )
...
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-09-02 20:44:50 -07:00
Jiangyun Zhu
3ecbb14b81
[Benchmarks] add benchmark for embedding models ( #23000 )
...
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-08-25 23:57:08 -07:00
hustxiayang
31436e8b4f
[Misc] Add request_id into benchmark_serve.py ( #23065 )
...
Signed-off-by: yangxia <yangxiast@gmail.com>
2025-08-19 08:32:18 +00:00
Breno Baldas Skuk
65a7917be4
Fix(benchmarks): allow multiple mm contents in OpenAI Chat Completion Benchmarks ( #22534 )
...
Signed-off-by: breno.skuk <breno.skuk@hcompany.ai>
2025-08-10 09:03:15 -07:00
lkchen
808a7b69df
[bench] Fix benchmark/serve.py to ignore unavailable results ( #22382 )
...
Signed-off-by: Linkun <github@lkchen.net>
2025-08-07 23:15:50 -07:00
lkchen
4d4297e8fe
[Bench] Split serve.py:main into async/async versions ( #22405 )
...
Signed-off-by: Linkun <github@lkchen.net>
2025-08-06 23:05:07 -07:00
Seiji Eicher
6f5478298d
Use aiohttp connection pool for benchmarking ( #21981 )
...
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
2025-08-03 19:23:32 -07:00
Ye (Charlotte) Qi
3f36c325fa
[Benchmark] Support ready check timeout in vllm bench serve ( #21696 )
...
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
Co-authored-by: Roger Wang <hey@rogerw.me>
2025-08-03 00:52:38 -07:00
Peter Pan
533db0935d
[benchmark] add max-concurrency in result table ( #21095 )
...
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2025-07-30 01:15:43 -07:00
rongfu.leng
18cc33dd60
[bugfix] fix profile impact benchmark results ( #21507 )
...
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-07-27 22:44:24 -07:00
Jialin Ouyang
10904e6d75
[benchmark] Port benchmark request sent optimization to benchmark_serving ( #21209 )
...
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
2025-07-22 05:28:00 -07:00
Jialin Ouyang
1bf65138f6
[benchmark] Sending request strictly follows the random intervals ( #21108 )
...
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
2025-07-18 06:22:08 +00:00
Kebe
b1c1fe35a5
[Misc] remove redundant char ( #20287 )
...
Signed-off-by: Kebe <mail@kebe7jun.com>
2025-07-01 15:33:22 +08:00
Ekagra Ranjan
9502c38138
[Benchmark][Bug] Fix multiple bugs in bench and add args to spec_decode offline ( #20083 )
2025-06-25 22:06:27 -07:00
d.transposed
c635c5f744
[Misc][Benchmarking] Add variable request-rate ("ramp-up") to the benchmarking client. ( #19423 )
...
Signed-off-by: dtransposed <damian@damian-ml-machine.europe-west3-b.c.jetbrains-grazie.internal>
Co-authored-by: dtransposed <damian@damian-ml-machine.europe-west3-b.c.jetbrains-grazie.internal>
Co-authored-by: Roger Wang <hey@rogerw.me>
2025-06-24 18:41:49 +00:00
Ekagra Ranjan
017ef648e9
[Spec Decode][Benchmark] Generalize spec decode offline benchmark to more methods and datasets ( #18847 )
2025-06-12 10:30:56 -07:00
Simon Mo
02f0c7b220
[Misc] Add SPDX-FileCopyrightText ( #19100 )
...
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-06-03 11:20:17 -07:00
Ekagra Ranjan
bbfa0c61d1
[Misc][Benchmark] Add support for CustomDataset ( #18511 )
2025-05-31 19:07:38 +00:00
Michael Goin
e56f44d9ec
Support datasets in vllm bench serve and sync with benchmark_[serving,datasets].py ( #18566 )
2025-05-27 19:59:48 -04:00
yihong
04149cce27
[BugFix] fix some typos found by typos. ( #16314 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-04-09 03:43:59 -07:00
Reid
26df46ee59
[Misc] cli auto show default value ( #15582 )
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
2025-03-28 22:23:00 +00:00
Randy Chen
36e0c8f7da
[Feature] Add vllm bench CLI ( #13993 )
...
Signed-off-by: Randy Chen <acad.randyjhc@gmail.com>
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
2025-03-12 00:31:48 +00:00