Huy Do
6ace2f72b0
Fix writing benchmark results with tuple keys (#23633)
...
Signed-off-by: Huy Do <huydhn@gmail.com>
2025-08-26 19:16:09 +08:00
Jiangyun Zhu
3ecbb14b81
[Benchmarks] add benchmark for embedding models (#23000)
...
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-08-25 23:57:08 -07:00
Breno Baldas Skuk
0cb7b065c3
Feature/benchmark/random mm data/images (#23119)
...
Signed-off-by: breno.skuk <breno.skuk@hcompany.ai>
2025-08-25 01:28:35 -07:00
Jared O'Connell
31282401b6
[BugFix] Fix Python 3.9 Support (#23306)
...
Signed-off-by: Jared O'Connell <46976761+jaredoconnell@users.noreply.github.com>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-08-20 23:23:56 -07:00
Cyrus Leung
0c31e28e95
[Bugfix] Fix extra whitespace in strings caused by newline (#23272)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-20 22:03:00 -07:00
Zhewen Li
f729023272
[CI/Build] Also check DP in benchmarks throughput script (#23038)
...
Co-authored-by: Simon Mo <simon.mo@hey.com>
2025-08-20 04:09:27 +00:00
Chenheli Hua
1630cc8d0f
[Benchmarks] Add video inputs to ShareGPTDataset. (#23199)
...
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
2025-08-19 23:42:31 +00:00
Ruixiang Tan
03d4235fd2
[Misc] Fix the benchmark's README and improve the error messages for the benchmark's argument checks (#22654)
...
Signed-off-by: tanruixiang <tanruixiang0104@gmail.com>
2025-08-19 10:18:51 -07:00
hustxiayang
31436e8b4f
[Misc] Add request_id into benchmark_serve.py (#23065)
...
Signed-off-by: yangxia <yangxiast@gmail.com>
2025-08-19 08:32:18 +00:00
Seiji Eicher
de9cb61763
Add docs for PrefixRepetitionDataset + enable usage with vllm bench throughput (#23012)
...
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Co-authored-by: Roger Wang <hey@rogerw.me>
2025-08-16 10:21:20 +00:00
Seiji Eicher
00d6cba0cf
Add PrefixRepetitionRandomDataset to vllm bench serve datasets (#20638)
...
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
2025-08-15 14:09:23 -07:00
Chenheli Hua
993d3d122b
[Benchmarks] Include image data when ShareGPT4V dataset is used. (#22955)
...
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
2025-08-15 18:23:06 +00:00
Harry Mellor
bc1d02ac85
[Docs] Add comprehensive CLI reference for all large vllm subcommands (#22601)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-11 00:13:33 -07:00
Breno Baldas Skuk
65a7917be4
Fix(benchmarks): allow multiple mm contents in OpenAI Chat Completion Benchmarks (#22534)
...
Signed-off-by: breno.skuk <breno.skuk@hcompany.ai>
2025-08-10 09:03:15 -07:00
lkchen
808a7b69df
[bench] Fix benchmark/serve.py to ignore unavailable results (#22382)
...
Signed-off-by: Linkun <github@lkchen.net>
2025-08-07 23:15:50 -07:00
lkchen
4d4297e8fe
[Bench] Split serve.py:main into async/async versions (#22405)
...
Signed-off-by: Linkun <github@lkchen.net>
2025-08-06 23:05:07 -07:00
Lionel Villard
ad6c655dde
preload heavy modules when mp method is forkserver (#22214)
...
Signed-off-by: Lionel Villard <villard@us.ibm.com>
2025-08-06 20:33:24 -07:00
Seiji Eicher
6f5478298d
Use aiohttp connection pool for benchmarking (#21981)
...
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
2025-08-03 19:23:32 -07:00
Ye (Charlotte) Qi
3f36c325fa
[Benchmark] Support ready check timeout in vllm bench serve (#21696)
...
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
Co-authored-by: Roger Wang <hey@rogerw.me>
2025-08-03 00:52:38 -07:00
Peter Pan
533db0935d
[benchmark] add max-concurrency in result table (#21095)
...
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2025-07-30 01:15:43 -07:00
rongfu.leng
18cc33dd60
[bugfix] fix profile impact benchmark results (#21507)
...
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-07-27 22:44:24 -07:00
Huy Do
971948b846
Handle non-serializable objects in vllm bench (#21665)
2025-07-27 03:35:22 +00:00
Cyrus Leung
34ddcf9ff4
[Frontend] run-batch supports V1 (#21541)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-24 20:05:55 -07:00
Jialin Ouyang
10904e6d75
[benchmark] Port benchmark request sent optimization to benchmark_serving (#21209)
...
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
2025-07-22 05:28:00 -07:00
Jialin Ouyang
1bf65138f6
[benchmark] Sending request strictly follows the random intervals (#21108)
...
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
2025-07-18 06:22:08 +00:00
Michael Goin
8bb43b9c9e
Add benchmark dataset for mlperf llama tasks (#20338)
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-14 19:10:07 +00:00
Li Wang
9ff2af6d2b
[Benchmark] Parameterization of streaming loading of multimodal datasets (#20528)
...
Signed-off-by: wangli <wangli858794774@gmail.com>
2025-07-09 13:35:16 +00:00
Kebe
b1c1fe35a5
[Misc] remove redundant char (#20287)
...
Signed-off-by: Kebe <mail@kebe7jun.com>
2025-07-01 15:33:22 +08:00
Ekagra Ranjan
9502c38138
[Benchmark][Bug] Fix multiple bugs in bench and add args to spec_decode offline (#20083)
2025-06-25 22:06:27 -07:00
d.transposed
c635c5f744
[Misc][Benchmarking] Add variable request-rate ("ramp-up") to the benchmarking client. (#19423)
...
Signed-off-by: dtransposed <damian@damian-ml-machine.europe-west3-b.c.jetbrains-grazie.internal>
Co-authored-by: dtransposed <damian@damian-ml-machine.europe-west3-b.c.jetbrains-grazie.internal>
Co-authored-by: Roger Wang <hey@rogerw.me>
2025-06-24 18:41:49 +00:00
Wang, Yi
202c5df935
[Benchmark] fix request loss if "ping" is returned (#19535)
...
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-06-22 07:21:04 +00:00
Brayden Zhong
5aa4a015ce
[Benchmark] Fix Value of type "SampleRequest" is not indexable (#18032)
...
Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca>
2025-06-19 21:28:55 -07:00
Ekagra Ranjan
017ef648e9
[Spec Decode][Benchmark] Generalize spec decode offline benchmark to more methods and datasets (#18847)
2025-06-12 10:30:56 -07:00
Isotr0py
8711bc5e68
[Misc] Add packages for benchmark as extra dependency (#19089)
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-06-04 04:18:48 -07:00
Ekagra Ranjan
135cf55cd1
[V1][Spec Decode][Ngram] 1.35x gain -> 1.95x gain on InstructCoder with prompt fix (#18971)
2025-06-03 15:26:33 -07:00
Simon Mo
02f0c7b220
[Misc] Add SPDX-FileCopyrightText (#19100)
...
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-06-03 11:20:17 -07:00
Michael Goin
cc977286e7
Reduce logs in CLI scripts and plugin loader (#18970)
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-06-03 06:00:45 +00:00
Ekagra Ranjan
bbfa0c61d1
[Misc][Benchmark] Add support for CustomDataset (#18511)
2025-05-31 19:07:38 +00:00
Divakar Verma
774c5fde30
[V1] fix torch profiling for V1 offline scenarios (#18445)
...
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
2025-05-28 04:16:30 +00:00
cascade
51e98e4ffd
[Bugfix] Disable prefix caching by default for benchmark (#18771)
...
Signed-off-by: cascade812 <cascade812@outlook.com>
2025-05-28 08:18:09 +08:00
Michael Goin
e56f44d9ec
Support datasets in vllm bench serve and sync with benchmark_[serving,datasets].py (#18566)
2025-05-27 19:59:48 -04:00
cascade
aaa4ac1c95
Disable prefix cache by default for benchmark (#18639)
...
Signed-off-by: cascade812 <cascade812@outlook.com>
2025-05-27 20:06:34 +08:00
Cyrus Leung
273cb3b4d9
[Doc] Fix top-level API links/docs (#18621)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-23 09:46:56 -07:00
Chenheli Hua
04eb88dc80
Re-submit: Fix: Proper RGBA -> RGB conversion for PIL images. (#18569)
...
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
2025-05-23 01:59:18 +00:00
Brayden Zhong
891b9d33de
[Fix] Benchmark "EngineClient" has no attribute "model_config" (#17976)
...
Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca>
2025-05-11 22:55:53 -07:00
d.transposed
d456aea71f
[Misc] Add Next Edit Prediction (NEP) datasets support in benchmark_serving.py (#16839)
...
Signed-off-by: dtransposed <damian@damian-ml-machine.europe-west3-b.c.jetbrains-grazie.internal>
Signed-off-by: dtransposed <>
Co-authored-by: dtransposed <damian@damian-ml-machine.europe-west3-b.c.jetbrains-grazie.internal>
2025-05-06 15:38:45 -04:00
Christian Heimes
65e262b93b
Fix Python packaging edge cases (#17159)
...
Signed-off-by: Christian Heimes <christian@python.org>
2025-04-26 06:15:07 +08:00
Michael Goin
b4fe16c75b
Add vllm bench [latency, throughput] CLI commands (#16508)
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-04-14 23:10:35 -07:00
yihong
04149cce27
[BugFix] fix some typos found by typos. (#16314)
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-04-09 03:43:59 -07:00
Reid
26df46ee59
[Misc] cli auto show default value (#15582)
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
2025-03-28 22:23:00 +00:00