xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-26 11:37:06 +08:00

Author	SHA1	Message	Date
Harry Mellor	88edf5994c	[Docs] Reduce the size of the built docs (#21920) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-07-30 07:35:08 -07:00
Michael Goin	a33ea28b1b	Add `flashinfer_python` to CUDA wheel requirements (#21389) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-07-29 12:51:58 -07:00
Isotr0py	31084b3b1f	[Bugfix][CI/Build] Update peft version in test requirement (#21729) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-07-28 06:17:43 -07:00
Harry Mellor	1395dd9c28	[Docs] Add revision date to rendered docs (#21752) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-07-28 06:12:46 -07:00
Chengji Yao	f1b286b2fb	[TPU] Update ptxla nightly version to 20250724 (#21555) Signed-off-by: Chengji Yao <chengjiyao@google.com>	2025-07-25 17:09:00 -07:00
Kebe	396ee94180	[CI] Unifying Dockerfiles for ARM and X86 Builds (#21343) Signed-off-by: Kebe <mail@kebe7jun.com>	2025-07-25 07:33:56 -07:00
Juncheng Gu	6066284914	[P/D] Support CPU Transfer in NixlConnector (#18293) Signed-off-by: Juncheng Gu <juncgu@gmail.com> Signed-off-by: Richard Liu <ricliu@google.com> Co-authored-by: Richard Liu <39319471+richardsliu@users.noreply.github.com> Co-authored-by: Richard Liu <ricliu@google.com>	2025-07-24 17:58:42 +01:00
elvischenv	5a19a6c670	[Fix] Update mamba_ssm to 2.2.5 (#21421) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>	2025-07-24 03:25:41 -07:00
Chauncey	6da0078523	[Feat] Allow custom naming of vLLM processes (#21445) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-07-24 03:15:23 -07:00
Julien Denize	6d8d0a24c0	Add think chunk (#21333) Signed-off-by: Julien Denize <julien.denize@mistral.ai>	2025-07-23 21:51:32 -07:00
Christian Pinto	8560a5b258	[Core][Model] PrithviMAE Enablement on vLLM v1 engine (#20577) Signed-off-by: Christian Pinto <christian.pinto@ibm.com>	2025-07-23 11:00:23 -07:00
Li, Jiang	e3a0e43d7f	[bugfix] Fix auto thread-binding when world_size > 1 in CPU backend and refactor code (#21032) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-07-19 05:13:55 -07:00
Woosuk Kwon	4de7146351	[V0 deprecation] Remove V0 HPU backend (#21131) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-07-17 16:37:36 -07:00
kYLe	4ef00b5cac	[VLM] Add Nemotron-Nano-VL-8B-V1 support (#20349) Signed-off-by: Kyle Huang <kylhuang@nvidia.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-07-17 03:07:55 -07:00
XiongfeiWei	58760e12b1	[TPU] Start using python 3.12 (#21000) Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>	2025-07-16 19:37:44 -07:00
Michael Goin	4e7dfbe7b4	Update PyTorch to `torch==2.7.1` for CUDA (#21011) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-07-17 02:30:44 +00:00
Chauncey	b5c3b68359	[Misc] bump xgrammar version to v0.1.21 (#20992) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-07-15 19:42:16 -07:00
Harry Mellor	b637e9dcb8	Add full serve CLI reference back to docs (#20978) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-07-15 17:42:30 +00:00
Patrick von Platen	e7e3e6d263	Voxtral (#20970) Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-07-15 07:35:30 -07:00
Reid	37e2ecace2	feat: add image zoom to improve image viewing experience (#20763) Signed-off-by: reidliu41 <reid201711@gmail.com>	2025-07-14 20:14:23 -07:00
XiongfeiWei	d4170fad39	Use w8a8 quantized matmul Pallas kernel (#19170) Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>	2025-07-15 03:06:33 +00:00
22quinn	f326ab9c88	[Bugfix] Bump up mistral_common to support v13 tokenizer (#20905) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>	2025-07-14 10:45:03 +00:00
Maroon Ayoub	66f6fbd393	[Prefix Cache] Add reproducible prefix-cache block hashing using SHA-256 + CBOR (64bit) (#20511) Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>	2025-07-14 02:45:31 +00:00
Woosuk Kwon	f45a332886	[Sched] Enhance the logic to remove stopped requests from queues (#20739)	2025-07-12 15:33:13 -07:00
Isotr0py	01cae37713	[CI/Build] Ensure compatability with Transformers v4.53 (#20541) Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-07-11 20:53:07 -07:00
Luka Govedič	762be26a8e	[Bugfix] Upgrade depyf to 0.19 and streamline custom pass logging (#20777) Signed-off-by: Luka Govedic <lgovedic@redhat.com> Signed-off-by: luka <lgovedic@redhat.com>	2025-07-11 00:15:22 -07:00
Nathan Hoos	d6902ce79f	[V0][V1][Core] Add outlines integration for V1, and update V0 integration. (#15975) Signed-off-by: Nathan Hoos <thwackyy.y@gmail.com>	2025-07-10 15:30:26 -04:00
Harry Mellor	3482fd7e4e	[Doc] Add engine args back in to the docs (#20674) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-07-10 08:02:40 -07:00
Jacob Manning	bf03ff3575	[Kernel] Add Conch backend for mixed-precision linear layer (#19818) Signed-off-by: Jacob Manning <jmanning+oss@stackav.com>	2025-07-09 13:17:55 -07:00
XiongfeiWei	849590a2a7	Update torch/xla pin to 20250703 (#20589) Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>	2025-07-08 07:44:02 -07:00
Sanger Steel	72d14d0eed	[Frontend] [Core] Integrate Tensorizer in to S3 loading machinery, allow passing arbitrary arguments during save/load (#19619) Signed-off-by: Sanger Steel <sangersteel@gmail.com> Co-authored-by: Eta <esyra@coreweave.com>	2025-07-07 22:47:43 -07:00
Jee Jee Li	4ff79a136e	[Misc] Set the minimum openai version (#20539) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-07-07 09:15:26 +00:00
Peter Pan	5561681d04	[CI] add kvcache-connector dependency definition and add into CI build (#18193) Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>	2025-07-04 06:49:18 -07:00
Nicolò Lucchesi	d1b689c445	[Bugfix] Fix flaky `test_streaming_response` test (#20363) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-07-03 14:46:24 +00:00
Jee Jee Li	1819fbda63	[Quantization] Bump to use latest bitsandbytes (#20424) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-07-03 21:58:46 +08:00
Prashant Gupta	22e9d42040	[Misc] add xgrammar for arm64 (#18359) Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>	2025-07-01 07:02:20 +00:00
Yang Wang	8b64c895c0	[CI] Sync test dependency with test.in for torch nightly (#19632) Signed-off-by: Yang Wang <elainewy@meta.com> Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Concurrensee <yida.wu@amd.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-06-26 20:55:25 -07:00
Dipika Sikka	a57d57fa72	[Quantization] Bump to use latest `compressed-tensors` (#20033) Signed-off-by: Dipika <dipikasikka1@gmail.com> Co-authored-by: Kyle Sayers <kylesayrs@gmail.com>	2025-06-26 20:50:06 -07:00
Kunshang Ji	b69781f107	[Hardware][Intel GPU] Add v1 Intel GPU support with Flash attention backend. (#19560) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2025-06-26 09:27:18 -07:00
Li, Jiang	0567c8249f	[CPU] Fix torch version in x86 CPU backend (#19258) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-06-26 03:34:47 -07:00
h-avsha	3443aaf8dd	Move to a faster base64 implementation (#19984) Signed-off-by: h-avsha <avshalom.manevich@hcompany.ai>	2025-06-24 20:33:51 -07:00
Li, Jiang	01220ce89a	[CI][CPU] Improve dummy Triton interfaces and fix the CPU CI (#19838) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-06-19 15:46:09 +00:00
QiliangCui	04fefe7c9a	[TPU] Update torch-xla version to include paged attention tuned block change (#19813) Signed-off-by: Qiliang Cui <derrhein@gmail.com>	2025-06-18 22:41:13 +00:00
Chenyaaang	dac8cc49f4	[TPU] Update torch version to include paged attention kernel change (#19706) Signed-off-by: Chenyaaang <chenyangli@google.com>	2025-06-17 22:24:49 +00:00
Ning Xie	c3fec47bb7	[MISC] bump huggingface_hub pkg to 0.33.0 (#19547) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-06-16 05:22:28 -07:00
wang.yuqi	f40f763f12	[CI] Add mteb testing for rerank models (#19344)	2025-06-16 01:36:43 -07:00
Richard Zou	91b2c17a55	[CI/Build] Fix torch nightly CI dependencies part 2 (#19589)	2025-06-15 20:01:10 +08:00
汪志鹏	ace5cdaff0	[Fix] bump mistral common to support magistral (#19533) Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>	2025-06-12 22:28:12 -07:00
kourosh hakhamaneshi	e6aab5de29	Revert "[Build/CI] Add tracing deps to vllm container image (#15224)" (#19378)	2025-06-12 17:26:40 -07:00
Richard Zou	42f52cc95b	[CI/Build] Fix torch nightly CI dependencies (#19505) Signed-off-by: Richard Zou <zou3519@gmail.com>	2025-06-11 14:40:42 -07:00

166 Commits