xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-01-27 09:31:48 +08:00

Author	SHA1	Message	Date
Harry Mellor	992e5c3d34	Merge similar examples in `offline_inference` into single `basic` example (#12737 )	2025-02-20 04:53:51 -08:00
Varun Sundar Rabindranath	b69692a2d8	[Kernel] LoRA - Refactor sgmv kernels (#13110 )	2025-02-20 07:28:06 -05:00
Kevin H. Luu	a64a84433d	[2/n][ci] S3: Use full model path (#13564 ) Signed-off-by: <>	2025-02-20 01:20:15 -08:00
Kevin H. Luu	aa1e62d0db	[ci] Fix spec decode test (#13600 )	2025-02-20 16:56:00 +08:00
Michael Goin	497bc83124	[CI/Build] Use uv in the Dockerfile (#13566 )	2025-02-19 23:05:44 -08:00
Yuan Tang	3738e6fa80	[API Server] Add port number range validation (#13506 ) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-02-20 15:05:13 +08:00
Gregory Shtrasberg	0023cd2b9d	[ROCm] MI300A compile targets deprecation (#13560 )	2025-02-19 23:05:00 -08:00
燃	041e294716	[Misc] add mm_processor_kwargs to extra_body for Qwen2.5-VL (#13533 )	2025-02-19 23:04:30 -08:00
Alex Brooks	9621667874	[Misc] Warn if the vLLM version can't be retrieved (#13501 ) Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>	2025-02-20 06:24:48 +00:00
Simon Mo	8c755c3b6d	[bugfix] spec decode worker get tp group only when initialized (#13578 )	2025-02-20 04:46:28 +00:00
youkaichao	ba81163997	[core] add sleep and wake up endpoint and v1 support (#12987 ) Signed-off-by: youkaichao <youkaichao@gmail.com> Signed-off-by: cennn <2523403608@qq.com> Co-authored-by: cennn <2523403608@qq.com>	2025-02-20 12:41:17 +08:00
Divakar Verma	0d243f2a54	[ROCm][MoE] mi300 mixtral8x7B perf for specific BS (#13577 ) Signed-off-by: Divakar Verma <divakar.verma@amd.com>	2025-02-20 04:01:02 +00:00
Kevin H. Luu	88f6ba3281	[ci] Add AWS creds for AMD (#13572 )	2025-02-20 03:56:06 +00:00
Jee Jee Li	512368e34a	[Misc] Qwen2.5 VL support LoRA (#13261 )	2025-02-19 18:37:55 -08:00
Kevin H. Luu	473f51cfd9	[3/n][CI] Load Quantization test models with S3 (#13570 ) Signed-off-by: <> Co-authored-by: EC2 Default User <ec2-user@ip-172-31-20-117.us-west-2.compute.internal>	2025-02-20 10:12:30 +08:00
Nick Hill	a4c402a756	[BugFix] Avoid error traceback in logs when V1 `LLM` terminates (#13565 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-02-20 00:49:01 +00:00
Isotr0py	550d97eb58	[Misc] Avoid calling unnecessary `hf_list_repo_files` for local model path (#13348 ) Signed-off-by: isotr0py <2037008807@qq.com>	2025-02-19 18:57:48 +00:00
Cody Yu	fbbe1fbac6	[MISC] Logging the message about Ray teardown (#13502 ) Signed-off-by: Cody Yu <hao.yu.cody@gmail.com> Co-authored-by: Rui Qiao <161574667+ruisearch42@users.noreply.github.com>	2025-02-19 09:40:50 -08:00
Wilson Wu	01c184b8f3	Fix copyright year to auto get current year (#13561 )	2025-02-19 16:55:34 +00:00
youkaichao	ad5a35c21b	[doc] clarify multi-node serving doc (#13558 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-19 22:32:17 +08:00
shangmingc	5ae9f26a5a	[Bugfix] Fix device ordinal for multi-node spec decode (#13269 ) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-02-19 22:13:15 +08:00
Cyrus Leung	377d10bd14	[VLM][Bugfix] Pass processor kwargs properly on init (#13516 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-02-19 13:13:50 +00:00
youkaichao	52ce14d31f	[doc] clarify profiling is only for developers (#13554 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-19 20:55:58 +08:00
Daniele	81dabf24a8	[CI/Build] force writing version file (#13544 ) Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>	2025-02-19 18:48:03 +08:00
Yannick Schnider	423330263b	[Feature] Pluggable platform-specific scheduler (#13161 ) Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com> Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>	2025-02-19 17:16:38 +08:00
Nick Hill	caf7ff4456	[V1][Core] Generic mechanism for handling engine utility (#13060 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-02-19 17:09:22 +08:00
Lucia Fang	f525c0be8b	[Model][Speculative Decoding] DeepSeek MTP spec decode (#12755 ) Signed-off-by: Lu Fang <fanglu@fb.com> Co-authored-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>	2025-02-19 17:06:23 +08:00
Alex Brooks	983a40a8bb	[Bugfix] Fix Positive Feature Layers in Llava Models (#13514 ) Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>	2025-02-19 08:50:07 +00:00
Zhe Zhang	fdc5df6f54	use device param in load_model method (#13037 )	2025-02-19 16:05:02 +08:00
Kevin H. Luu	3b05cd4555	[perf-benchmark] Fix ECR path for premerge benchmark (#13512 ) Signed-off-by: <> Co-authored-by: EC2 Default User <ec2-user@ip-172-31-20-117.us-west-2.compute.internal>	2025-02-19 07:56:11 +00:00
Kevin H. Luu	d5d214ac7f	[1/n][CI] Load models in CI from S3 instead of HF (#13205 ) Signed-off-by: <> Co-authored-by: EC2 Default User <ec2-user@ip-172-31-20-117.us-west-2.compute.internal>	2025-02-19 07:34:59 +00:00
Roger Wang	fd84857f64	[Doc] Add clarification note regarding paligemma (#13511 )	2025-02-18 22:24:03 -08:00
Divakar Verma	8aada19dfc	[ROCm][MoE configs] mi325 mixtral & mi300 qwen_moe (#13503 )	2025-02-18 22:23:24 -08:00
Kevin H. Luu	9aa95b0e6a	[perf-benchmark] Allow premerge ECR (#13509 ) Signed-off-by: <> Co-authored-by: EC2 Default User <ec2-user@ip-172-31-20-117.us-west-2.compute.internal>	2025-02-19 05:13:41 +00:00
Yu-Zhou	d0a7a2769d	[Hardware][Gaudi][Feature] Support Contiguous Cache Fetch (#12139 ) Signed-off-by: yuzhou <yuzhou@habana.ai> Signed-off-by: zhouyu5 <yu.zhou@intel.com> Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>	2025-02-18 19:40:19 -08:00
Harry Mellor	00b69c2d27	[Misc] Remove dangling references to `--use-v2-block-manager` (#13492 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-02-19 03:37:26 +00:00
Woosuk Kwon	4c82229898	[V1][Spec Decode] Optimize N-gram matching with Numba (#13365 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-02-18 13:19:58 -08:00
Woosuk Kwon	c8d70e2437	Pin Ray version to 2.40.0 (#13490 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-02-18 12:50:31 -08:00
Nick Hill	30172b4947	[V1] Optimize handling of sampling metadata and req_ids list (#13244 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-02-18 12:15:33 -08:00
Murali Andoorveedu	a4d577b379	[V1][Tests] Adding additional testing for multimodal models to V1 (#13308 ) Signed-off-by: andoorve <37849411+andoorve@users.noreply.github.com>	2025-02-18 09:53:14 -08:00
youkaichao	7b203b7694	[misc] fix debugging code (#13487 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-18 09:37:11 -08:00
Woosuk Kwon	4fb8142a0e	[V1][PP] Enable true PP with Ray executor (#13472 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-02-18 09:15:32 -08:00
Daniele	a02c86b4dd	[CI/Build] migrate static project metadata from setup.py to pyproject.toml (#8772 )	2025-02-18 08:02:49 -08:00
Liangfu Chen	3809458456	[Bugfix] Fix invalid rotary embedding unit test (#13431 ) Signed-off-by: Liangfu Chen <liangfc@amazon.com>	2025-02-18 11:52:03 +00:00
zifeitong	d3231cb436	[Bugfix] Handle content type with optional parameters (#13383 ) Signed-off-by: Zifei Tong <zifeitong@gmail.com>	2025-02-18 11:29:13 +00:00
Cyrus Leung	435b502a6e	[ROCm] Make amdsmi import optional for other platforms (#13460 )	2025-02-18 03:15:56 -08:00
Isotr0py	29fc5772c4	[Bugfix] Remove noisy error logging during local model loading (#13458 )	2025-02-18 03:15:48 -08:00
Harry Mellor	2358ca527b	[Doc]: Improve feature tables (#13224 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-02-18 18:52:39 +08:00
Isotr0py	8cf97f8661	[Bugfix] Fix failing transformers dynamic module resolving with spawn multiproc method (#13403 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-02-18 10:25:53 +00:00
Yuan Tang	e2603fefb8	[Bugfix] Ensure LoRA path from the request can be included in err msg (#13450 ) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-02-18 16:19:15 +08:00

4707 Commits