xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-30 00:57:27 +08:00

Author	SHA1	Message	Date
xinli-centml	90d0a74b60	[Bugfix] Add revision to `transformers.Auto*.from_pretrained` processors (#17948 ) Signed-off-by: Xin Li <xin@centml.ai>	2025-05-11 07:52:44 +00:00
Jinzhen Lin	d74e5f37bc	[Kernel] fp4 marlin kernel (#17687 ) Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>	2025-05-10 19:58:49 -07:00
Chen Zhang	ca66a1674c	[v1] Rename specialized_manager.py to single_type_kv_cache_manager.py (#17946 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-05-10 16:14:12 -07:00
Chen Zhang	950751a987	[v1] Pass BlockTable and KVCacheSpec to AttentionMetadataBuilders (#17483 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-05-10 16:12:04 -07:00
Reid	4c31218f80	[Misc] remove --model from vllm serve usage (#17944 ) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-10 13:23:31 +00:00
Harry Mellor	68311891f5	Don't default construct `ModelConfig` when default constructing `VllmConfig` (#17943 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-10 13:23:00 +00:00
Ximo Guanter	fc4441a4ee	Add missing content type headers to /ping and /health (#17036 ) (#17786 ) Signed-off-by: Ximo Guanter <ximo.guanter@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-10 07:13:32 +01:00
tracelogfb	246e3e0a36	fix broken test vllm:test_kernels - test_attention_selector.py::test_flash_attn (#17873 ) Co-authored-by: Stephen Chen <tracelog@meta.com>	2025-05-10 10:46:54 +08:00
Mark McLoughlin	7042cc96b0	[V1][Spec Decoding] Log accumulated metrics after system goes idle (#17913 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-05-09 18:23:07 -07:00
Pavani Majety	0c0fdae84f	[Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362 )	2025-05-09 16:24:41 -07:00
Alexei-V-Ivanov-AMD	3b602cdea7	AMD conditional all test execution // new test groups (#17556 ) Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com> Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu>	2025-05-09 15:35:58 -07:00
Harry Mellor	4b2ed7926a	Improve configs - the rest! (#17562 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-09 15:18:44 -07:00
Mark McLoughlin	7e3571134f	[V1][Spec Decoding] Include bonus tokens in mean acceptance length (#17908 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-05-09 13:32:36 -07:00
Richard Zou	ea2236bf95	Add option to use torch._inductor.standalone_compile (#17057 ) Signed-off-by: rzou <zou3519@gmail.com>	2025-05-09 12:59:04 -07:00
Harry Mellor	7d4aedae7c	Handle error when `str` passed to `/v1/audio/transcriptions` (#17909 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-09 19:23:59 +00:00
Michael Goin	22481fbfa3	Update CT WNA16MarlinMoE integration (#16666 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-05-09 13:19:45 -04:00
Isotr0py	5c4c08f6f1	[Misc] Auto fallback to float16 for pre-Ampere GPUs when detected bfloat16 config (#17265 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-05-09 17:16:12 +00:00
Rui Qiao	c44c384b1c	[Misc] Add references in ray_serve_deepseek example (#17907 ) Signed-off-by: Rui Qiao <ruisearch42@gmail.com>	2025-05-09 16:59:36 +00:00
Michael Goin	85b72cb7b1	Revert "[BugFix][AMD] Compatible patch for latest AITER(05/07/2025)" (#17910 )	2025-05-09 08:58:18 -07:00
Cyrus Leung	6e5595ca39	[CI/Build] Automatically retry flaky tests (#17856 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-09 09:55:17 -06:00
Chen Zhang	200da9a517	[v1] Move block management logic from KVCacheManager to SpecializedManager (#17474 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-05-09 15:25:34 +00:00
qli88	9f64e93415	[BugFix][AMD] Compatible patch for latest AITER(05/07/2025) (#17864 ) Signed-off-by: Qiang Li <qiang.li2@amd.com>	2025-05-09 08:59:36 -06:00
Reid	ec61ea20a8	[Misc] add dify integration (#17895 ) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-09 03:42:39 -07:00
Harry Mellor	c6798baa9c	Change `top_k` to be disabled with `0` (still accept `-1` for now) (#17773 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-09 10:01:49 +00:00
inkcherry	5b2dcbf0b8	Fix Whisper crash caused by invalid `max_num_batched_tokens` config (#17853 ) Signed-off-by: inkcherry <mingzhi.liu@intel.com>	2025-05-09 09:16:26 +00:00
Isotr0py	6e4a93e3f7	[Bugfix][CPU] Fix broken AVX2 CPU TP support (#17252 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-05-09 08:55:14 +00:00
vllmellm	217db4baa6	[Bugfix][ROCm] Fix AITER MLA V1 (#17880 ) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>	2025-05-09 08:38:21 +00:00
Yan Ma	ff8c400502	[Doc] remove visible token in doc (#17884 ) Signed-off-by: yan <yanma1@habana.ai>	2025-05-09 01:21:31 -07:00
Michael Yao	89a0315f4c	[Doc] Update several links in reasoning_outputs.md (#17846 ) Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2025-05-09 01:20:55 -07:00
Simon Mo	3d1e387652	[Docs] Add Slides from NYC Meetup (#17879 ) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-05-08 21:46:54 -07:00
Ning Xie	d310e6de98	[BUGFIX]: return fast when request requires prompt logprobs (#17251 )	2025-05-08 21:25:41 -07:00
Lucas Wilkinson	5e6f939484	[Attention] MLA move rotary embedding to cuda-graph region (#17668 ) Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>	2025-05-09 11:14:42 +08:00
Shanshan Shen	760e3ecc8f	[V1][Structured Output] Update llguidance (`>= 0.7.11`) to avoid AttributeError (no `StructTag`) (#17839 ) Signed-off-by: shen-shanshan <467638484@qq.com>	2025-05-08 20:14:18 -07:00
vllmellm	3c9396a64f	[FEAT][ROCm]: Support AITER MLA on V1 Engine (#17523 ) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com> Co-authored-by: qli88 <qiang.li2@amd.com> Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>	2025-05-09 10:42:05 +08:00
Shu Wang	376786fac1	Add cutlass support for blackwell fp8 blockwise gemm (#14383 ) Signed-off-by: Shu Wang <shuw@nvidia.com>	2025-05-08 15:09:55 -07:00
Michael Goin	4f605a6de5	Fix noisy warning for uncalibrated q_scale/p_scale (#17414 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-05-08 15:56:59 -04:00
Michael Goin	8342e3abd1	[CI] Prune down lm-eval small tests (#17012 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-05-08 19:00:26 +00:00
yarongmu-google	a83a0f92b5	[Test] Attempt all TPU V1 tests, even if some of them fail. (#17334 ) Signed-off-by: Yarong Mu <ymu@google.com>	2025-05-08 17:20:54 +00:00
Russell Bryant	226a4272cf	[V1] Improve VLLM_ALLOW_INSECURE_SERIALIZATION logging (#17860 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-05-08 16:57:35 +00:00
Russell Bryant	ec54d73c31	[CI] Fix test_collective_rpc (#17858 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-05-08 16:47:12 +00:00
Jee Jee Li	a944f8ede7	[Misc] Delete LoRA-related redundancy code (#17841 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-05-08 06:02:21 -07:00
Cyrus Leung	015815fe01	[Bugfix] `use_fast` failing to be propagated to Qwen2-VL image processor (#17838 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-08 05:39:21 -07:00
Harry Mellor	e4ca6e3a99	Fix transient dependency error in docs build (#17848 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-05-08 03:42:03 -07:00
Reid	53d0cb7423	[Misc] add chatbox integration (#17828 ) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-05-08 10:05:26 +00:00
Lu Fang	f50dcb7c21	[Easy] Eliminate c10::optional usage in vllm/csrc (#17819 )	2025-05-08 03:05:10 -07:00
Cyrus Leung	a1e19b635d	[Doc] Fix a typo in the file name (#17836 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-05-08 18:04:18 +08:00
fxmarty-amd	bb239a730f	[Bugfix] Fix quark fp8 format loading on AMD GPUs (#12612 ) Signed-off-by: Felix Marty <felmarty@amd.com> Signed-off-by: kewang2 <kewang2@amd.com> Co-authored-by: kewang2 <kewang2@amd.com>	2025-05-08 02:53:53 -07:00
Jevin Jiang	a463555dee	[TPU] Fix the test_sampler (#17820 )	2025-05-08 05:51:33 -04:00
Rick Yuan	ca04b97c93	[Bugfix] Fix tool call template validation for Mistral models (#17644 ) Signed-off-by: Rick Yuan <yuan821120@gmail.com> Signed-off-by: RIck Yuan <yuan821120@gmail.com> Co-authored-by: Aaron Pham <Aaronpham0103@gmail.com>	2025-05-08 09:47:19 +00:00
xsank	0a9bbaa104	[Misc] support model prefix & add deepseek vl2 tiny fused moe config (#17763 ) Signed-off-by: 唯勤 <xsank.mz@alibaba-inc.com> Co-authored-by: 唯勤 <xsank.mz@alibaba-inc.com>	2025-05-08 07:50:22 +00:00

6796 Commits