xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-27 14:17:21 +08:00

Author	SHA1	Message	Date
Aleksandr Malyshev	812c981fa0	Splitting attention kernel file (#10091) Signed-off-by: maleksan85 <maleksan@amd.com> Co-authored-by: Aleksandr Malyshev <maleksan@amd.com>	2024-11-11 22:55:07 -08:00
Jee Jee Li	7f5edb5900	[Misc][LoRA] Replace hardcoded cuda device with configurable argument (#10223) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2024-11-12 11:10:15 +08:00
youkaichao	eea55cca5b	[1/N] torch.compile user interface design (#10237) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-11 18:01:06 -08:00
Russell Bryant	9cdba9669c	[Doc] Update help text for `--distributed-executor-backend` (#10231) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2024-11-12 09:55:09 +08:00
youkaichao	d1c6799b88	[doc] update debugging guide (#10236) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-11 15:21:12 -08:00
Robert Shaw	6ace6fba2c	[V1] `AsyncLLM` Implementation (#9826) Signed-off-by: Nick Hill <nickhill@us.ibm.com> Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Nick Hill <nickhill@us.ibm.com> Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>	2024-11-11 23:05:38 +00:00
Nikolai Shcheglov	08f93e7439	Make shutil rename in python_only_dev (#10233) Signed-off-by: shcheglovnd <shcheglovnd@avride.ai>	2024-11-11 14:29:19 -08:00
Woosuk Kwon	9d5b4e4dea	[V1] Enable custom ops with piecewise CUDA graphs (#10228) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-11-11 11:58:07 -08:00
youkaichao	8a7fe47d32	[misc][distributed] auto port selection and disable tests (#10226) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-11 11:54:59 -08:00
Yuan Tang	4800339c62	Add docs on serving with Llama Stack (#10183) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Co-authored-by: Russell Bryant <rbryant@redhat.com>	2024-11-11 11:28:55 -08:00
Woosuk Kwon	fe15729a2b	[V1] Use custom ops for piecewise CUDA graphs (#10227) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-11-11 11:26:48 -08:00
youkaichao	330e82d34a	[v1][torch.compile] support managing cudagraph buffer (#10203) Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-11-11 11:10:27 -08:00
Woosuk Kwon	d7a4f2207b	[V1] Do not use inductor for piecewise CUDA graphs (#10225) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-11-11 11:05:57 -08:00
Woosuk Kwon	f9dadfbee3	[V1] Fix detokenizer ports (#10224) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-11-11 10:42:07 -08:00
dependabot[bot]	25144ceed0	Bump actions/setup-python from 5.2.0 to 5.3.0 (#10209) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-11-11 17:24:10 +00:00
youkaichao	e6de9784d2	[core][distributed] add stateless process group (#10216) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-11 09:02:14 -08:00
Yangcheng Li	36fc439de0	[Doc] fix doc string typo in block_manager `swap_out` function (#10212)	2024-11-11 08:53:07 -08:00
harrywu	874f551b36	[Metrics] add more metrics (#4464) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: Robert Shaw <rshaw@neuralmagic.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-12 00:17:38 +08:00
Isotr0py	2cebda42bb	[Bugfix][Hardware][CPU] Fix broken encoder-decoder CPU runner (#10218) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-11-11 12:37:58 +00:00
Roger Wang	5fb1f935b0	[V1] Allow `tokenizer_mode` and `trust_remote_code` for Detokenizer (#10211) Signed-off-by: Roger Wang <ywang@roblox.com>	2024-11-11 18:01:18 +08:00
Jee Jee Li	36e4acd02a	[LoRA][Kernel] Remove the unused libentry module (#10214) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2024-11-11 09:43:23 +00:00
Isotr0py	58170d6503	[Hardware][CPU] Add embedding models support for CPU backend (#10193) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-11-11 08:54:28 +00:00
dependabot[bot]	9804ac7c7c	Bump the patch-update group with 5 updates (#10210) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-11-11 07:22:40 +00:00
youkaichao	f89d18ff74	[6/N] pass whole config to inner model (#10205) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-11 06:41:46 +00:00
youkaichao	f0f2e5638e	[doc] improve debugging code (#10206) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-10 17:49:40 -08:00
yansh97	ad9a78bf64	[Doc] Fix typo error in vllm/entrypoints/openai/cli_args.py (#10196)	2024-11-11 00:14:22 +00:00
youkaichao	73b9083e99	[misc] improve cloudpickle registration and tests (#10202) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-11 00:10:53 +00:00
Shawn Du	20cf2f553c	[Misc] small fixes to function tracing file path (#9543) Signed-off-by: Shawn Du <shawnd200@outlook.com> Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: youkaichao <youkaichao@gmail.com>	2024-11-10 15:21:06 -08:00
Yongzao	bfb7d61a7c	[doc] Polish the integration with huggingface doc (#10195) Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: youkaichao <youkaichao@gmail.com>	2024-11-10 10:22:04 -08:00
FuryMartin	19682023b6	[Doc] Fix typo error in CONTRIBUTING.md (#10190) Signed-off-by: FuryMartin <furymartin9910@outlook.com>	2024-11-10 07:47:24 +00:00
youkaichao	9fa4bdde9d	[ci][build] limit cmake version (#10188) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-09 16:27:26 -08:00
Cyrus Leung	51c2e1fcef	[CI/Build] Split up models tests (#10069) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-09 11:39:14 -08:00
Krishna Mandal	b09895a618	[Frontend][Core] Override HF `config.json` via CLI (#5836) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-09 16:19:27 +00:00
cjackal	d88bff1b96	[Frontend] add `add_request_id` middleware (#9594) Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>	2024-11-09 10:18:29 +00:00
Zhao Yingzhuo	9e37266420	bugfix: fix the bug that stream generate not work (#2756)	2024-11-09 10:09:48 +00:00
youkaichao	8a4358ecb5	[doc] explaining the integration with huggingface (#10173) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-09 01:02:54 -08:00
youkaichao	bd46357ad9	[bugfix] fix broken tests of mlp speculator (#10177) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-09 00:04:50 -08:00
bnellnm	f192aeba74	[Bugfix] Enable some fp8 and quantized fullgraph tests (#10171) Signed-off-by: Bill Nell <bill@neuralmagic.com>	2024-11-09 08:01:27 +00:00
Chendi.Xue	8e1529dc57	[CI/Build] Add run-hpu-test.sh script (#10167) Signed-off-by: Chendi.Xue <chendi.xue@intel.com>	2024-11-09 06:26:52 +00:00
youkaichao	1a95f10ee7	[5/N] pass the whole config to model (#9983) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-09 14:17:28 +08:00
Cyrus Leung	49d2a41a86	[Doc] Adjust RunLLM location (#10176) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-08 20:07:10 -08:00
Isotr0py	47672f38b5	[CI/Build] Fix VLM broadcast tests `tensor_parallel_size` passing (#10161) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-11-09 04:02:59 +00:00
Michael Goin	f83feccd7f	[Bugfix] Ignore GPTQ quantization of Qwen2-VL visual module (#10169) Signed-off-by: mgoin <michael@neuralmagic.com>	2024-11-09 03:36:46 +00:00
Cyrus Leung	e0191a95d8	[0/N] Rename `MultiModalInputs` to `MultiModalKwargs` (#10040) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-09 11:31:02 +08:00
Li, Jiang	d7edca1dee	[CI/Build] Adding timeout in CPU CI to avoid CPU test queue blocking (#6892) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-09 03:27:11 +00:00
rasmith	127c07480e	[Kernel][Triton] Add Triton implementation for scaled_mm_triton to support fp8 and int8 SmoothQuant, symmetric case (#9857) Signed-off-by: Randall Smith <Randall.Smith@amd.com>	2024-11-08 19:59:22 -05:00
bnellnm	10b67d865d	[Bugfix] SymIntArrayRef expected to contain concrete integers (#10170) Signed-off-by: Bill Nell <bill@neuralmagic.com>	2024-11-08 14:44:18 -08:00
Luka Govedič	4f93dfe952	[torch.compile] Fuse RMSNorm with quant (#9138) Signed-off-by: luka <luka@neuralmagic.com> Co-authored-by: youkaichao <youkaichao@126.com>	2024-11-08 21:20:08 +00:00
Florian Zimmermeister	e1b5a82179	Rename vllm.logging to vllm.logging_utils (#10134)	2024-11-08 20:53:24 +00:00
Luka Govedič	87713c6053	[CI/Build] Ignore .gitignored files for shellcheck (#10162) Signed-off-by: luka <luka@neuralmagic.com>	2024-11-08 19:53:36 +00:00

3384 Commits