xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-26 01:07:21 +08:00

Author	SHA1	Message	Date
Jee Jee Li	4ff79a136e	[Misc] Set the minimum openai version (#20539 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-07-07 09:15:26 +00:00
Abirdcfly	448acad31e	[Misc] remove unused jinaai_serving_reranking (#18878 ) Signed-off-by: Abirdcfly <fp544037857@gmail.com>	2025-07-07 09:14:12 +00:00
Michael Yao	eb0b2d2f08	[Docs] Clean up tables in supported_models.md (#20552 ) Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2025-07-07 01:46:31 -07:00
Yan Ma	3112271f6e	[XPU] log clean up for XPU platform (#20553 ) Signed-off-by: yan <yan.ma@intel.com>	2025-07-07 01:38:22 -07:00
Michael Yao	1fd471e957	Add docstrings to url_schemes.py to improve readability (#20545 ) Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2025-07-07 08:31:49 +00:00
Liangliang Ma	2c5ebec064	[XPU][CI] add v1/core test in xpu hardware ci (#20537 ) Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>	2025-07-07 01:16:40 -07:00
Jee Jee Li	2e610deb72	[CI/Build] Enable phi2 lora test (#20540 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-07-07 05:10:41 +00:00
Yang Yang	6e2c19ce22	[Refactor]Abstract Platform Interface for Distributed Backend and Add xccl Support for Intel XPU (#19410 ) Signed-off-by: dbyoung18 <yang5.yang@intel.com> Signed-off-by: Kunshang Ji <kunshang.ji@intel.com> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>	2025-07-07 04:32:32 +00:00
Reid	47db8c2c15	[Misc] add a tip for pre-commit (#20536 ) Signed-off-by: reidliu41 <reid201711@gmail.com>	2025-07-06 19:42:06 -07:00
Woosuk Kwon	462b269280	Implement OpenAI Responses API [1/N] (#20504 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-07-06 18:32:13 -07:00
Cyrus Leung	c18b3b8e8b	[Bugfix] Add `use_cross_encoder` flag to use correct activation in `ClassifierPooler` (#20527 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-06 14:01:48 -07:00
Woosuk Kwon	9528e3a05e	[BugFix][Spec Decode] Fix spec token ids in model runner (#20530 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-07-06 19:44:52 +00:00
Cyrus Leung	9fb52e523a	[V1] Support any head size for FlexAttention backend (#20467 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-06 09:54:36 -07:00
Woosuk Kwon	e202dd2736	[V0 deprecation] Remove V0 CPU/XPU/TPU backends (#20412 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: jiang1.li <jiang1.li@intel.com> Co-authored-by: Li, Jiang <jiang1.li@intel.com>	2025-07-06 08:48:13 -07:00
Reid	43813e6361	[Misc] call the pre-defined func (#20518 ) Signed-off-by: reidliu41 <reid201711@gmail.com>	2025-07-06 10:25:29 +00:00
Brayden Zhong	cede942b87	[Benchmark] Add support for multiple batch size benchmark through CLI in `benchmark_moe.py` (#20516 ) Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca>	2025-07-06 09:20:11 +00:00
Flora Feng	fe1e924811	[Frontend] Support image object in llm.chat (#19635 ) Signed-off-by: sfeng33 <4florafeng@gmail.com> Signed-off-by: Flora Feng <4florafeng@gmail.com>	2025-07-06 06:47:13 +00:00
Chengji Yao	4548c03c50	[TPU][Bugfix] fix the MoE OOM issue (#20339 ) Signed-off-by: Chengji Yao <chengjiyao@google.com>	2025-07-05 21:19:09 -07:00
Lucas Wilkinson	40b86aa05e	[BugFix] Fix: ImportError when building on hopper systems (#20513 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-07-06 12:17:30 +08:00
Lucia Fang	432870829d	[Bugfix] Fix missing per_act_token parameter in compressed_tensors_moe (#20509 ) Signed-off-by: Lu Fang <fanglu@fb.com>	2025-07-06 12:08:30 +08:00
Vadim Gimpelson	f73d02aadc	[BUG] Fix #20484 . Support empty sequence in cuda penalty kernel (#20491 ) Signed-off-by: Vadim Gimpelson <vadim.gimpelson@centml.ai>	2025-07-05 19:38:02 -07:00
Jeremy Reizenstein	c5ebe040ac	test_attention compat with coming xformers change (#20487 ) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-07-05 19:37:59 -07:00
Reid	8d763cb891	[Misc] remove unused import (#20517 ) Signed-off-by: reidliu41 <reid201711@gmail.com>	2025-07-05 19:17:06 -07:00
Reid	cf4cd53982	[Misc] Add logger.exception for TPU information collection failures (#20510 ) Signed-off-by: reidliu41 <reid201711@gmail.com>	2025-07-05 07:24:32 -07:00
Isotr0py	32c9be2200	[v1] Re-add fp32 support to v1 engine through FlexAttention (#19754 ) Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-07-05 09:41:10 +00:00
Lucia Fang	8aeaa910a2	Fix unknown attribute of topk_indices_dtype in CompressedTensorsW8A8Fp8MoECutlassMethod (#20507 ) Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com>	2025-07-05 14:03:20 +08:00
Jee Jee Li	906e05d840	[Misc] Remove the unused LoRA test code (#20494 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-07-05 13:48:16 +08:00
Reid	ef9a2990ae	[doc] small fix (#20506 ) Signed-off-by: reidliu41 <reid201711@gmail.com>	2025-07-04 20:56:39 -07:00
Reid	7e90870491	[Misc] Add security warning for development mode endpoints (#20508 ) Signed-off-by: reidliu41 <reid201711@gmail.com>	2025-07-04 20:52:13 -07:00
Guy Stone	d3f05c9248	[Doc] fix mutltimodal_inputs.md gh examples link (#20497 ) Signed-off-by: Guy Stone <guys@spotify.com>	2025-07-04 16:41:35 -07:00
Michael Goin	c108781c85	[CI Bugfix] Fix pre-commit failures on main (#20502 )	2025-07-04 14:17:30 -07:00
Duncan Moss	3d184b95b8	[feat]: CUTLASS block scaled group gemm for SM100 (#19757 ) Signed-off-by: Duncan Moss <djm.moss@gmail.com> Co-authored-by: Duncan Moss <dmoss@nvidia.com>	2025-07-04 12:58:04 -06:00
Thomas Parnell	2f35a022e6	Enable V1 for Hybrid SSM/Attention Models (#20016 ) Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com> Co-authored-by: Stanislaw Wozniak <stw@zurich.ibm.com> Co-authored-by: Tyler Michael Smith <tysmith@redhat.com> Co-authored-by: Chen Zhang <zhangch99@outlook.com>	2025-07-04 17:46:53 +00:00
Chenheli Hua	ffe00ef77a	[Misc] Small: Remove global media connector. Each test should have its own test connector object. (#20395 ) Signed-off-by: Chenheli Hua <huachenheli@outlook.com>	2025-07-04 08:15:03 -07:00
Peter Pan	5561681d04	[CI] add kvcache-connector dependency definition and add into CI build (#18193 ) Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>	2025-07-04 06:49:18 -07:00
Cyrus Leung	fbd62d8750	[Doc] Fix classification table in list of supported models (#20489 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-04 06:08:02 -07:00
wang.yuqi	2e26f9156a	[Model][3/N] Automatic conversion of CrossEncoding model (#20168 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-07-04 05:47:39 -07:00
sangbumlikeagod	9e5452ee34	[Bug][Frontend] Fix structure of transcription's decoder_prompt (#18809 ) Signed-off-by: sangbumlikeagod <oironese@naver.com>	2025-07-04 11:28:07 +00:00
Michael Goin	0e3fe896e2	Support Llama 4 for fused_marlin_moe (#20457 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-07-04 07:55:10 +00:00
Jee Jee Li	1caca5a589	[Misc] Add SPDX-FileCopyrightText (#20428 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-07-04 07:40:42 +00:00
Wentao Ye	783921d889	[Perf] Optimize Vectorization Utils for Int 8 Quantization Kernels (#20331 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-07-04 15:06:24 +08:00
Aaron Pham	4a98edff1f	[Structured Outputs][V1] Skipping with models doesn't contain tokenizers (#20365 ) Signed-off-by: Aaron Pham <contact@aarnphm.xyz> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-07-04 15:05:49 +08:00
Reid	a7bab0c9e5	[Misc] small update (#20462 ) Signed-off-by: reidliu41 <reid201711@gmail.com>	2025-07-03 20:33:44 -07:00
汪志鹏	25950dca9b	Add ignore consolidated file in mistral example code (#20420 ) Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>	2025-07-04 02:55:07 +00:00
Gabriel Marinho	a4113b035c	[Platform] Add custom default max tokens (#18557 ) Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>	2025-07-04 10:50:17 +08:00
Michael Goin	7e1665b089	[Misc] Change warn_for_unimplemented_methods to debug (#20455 )	2025-07-04 02:35:08 +00:00
Seiji Eicher	8d1096e7db	[Bugfix] Register reducer even if transformers_modules not available (#19510 ) Signed-off-by: Seiji Eicher <seiji@anyscale.com>	2025-07-03 22:08:12 +00:00
Nicolò Lucchesi	8d775dd30a	[Misc] Fix `Unable to detect current VLLM config. Defaulting to NHD kv cache layout` warning (#20400 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-07-03 14:56:09 -07:00
bnellnm	78fe77534b	[Kernel] Enable fp8 support for pplx and BatchedTritonExperts. (#18864 ) Signed-off-by: Bill Nell <bnell@redhat.com>	2025-07-03 14:55:40 -07:00
Yuxuan Zhang	2f2fcb31b8	[Misc] Remove _maybe_ignore_quant_config from GLM4.1v (#20432 ) Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> [tag: v0.9.2rc1]	2025-07-03 21:41:13 +00:00


7511 Commits