xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-10 21:59:08 +08:00

Author	SHA1	Message	Date
Ryan McConville	6c11ecf8d3	[Bugfix] Validate logit biases to prevent out of vocab ids crashing engine (#16529 ) Signed-off-by: Ryan McConville <ryan@ryanmcconville.com>	2025-04-12 20:19:19 +00:00
Nick Hill	35fad35a48	[V1][Sampler] Faster top-k only implementation (#15478 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-03-26 10:56:47 -07:00
Chen1022	0893567db9	[V1][Minor] fix comments (#15392 ) Signed-off-by: chenjincong <chenjincong@baidu.com> Signed-off-by: Chen-0210 <chenjincong11@gmail.com> Co-authored-by: chenjincong <chenjincong@baidu.com>	2025-03-24 08:45:32 -07:00
Woosuk Kwon	05ccd0aa35	[V1] Ensure using int64 for sampled token ids (#15065 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-03-18 23:52:19 -07:00
Chujie Zheng	027827cc1d	fix long dtype in topk sampling (#15049 )	2025-03-18 15:57:31 -07:00
Lily Liu	8d6cf89526	[V1] [Spec Decode] Support random sampling for spec decode (#13933 ) Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-03-16 22:00:20 -07:00
22quinn	eb8b5eb183	[V1] Support bad_words in sampler (#13376 ) Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-03-08 14:50:26 -08:00
Lily Liu	5629f26df7	[V1][Spec Decode] Change Spec Decode Rejection Sampling API (#13729 )	2025-02-25 18:14:48 -08:00
Lu Fang	bb78fb318e	[v1] Support allowed_token_ids in v1 Sampler (#13210 ) Signed-off-by: Lu Fang <lufang@fb.com>	2025-02-22 14:13:05 +08:00
Nick Hill	31aa045c11	[V1][Sampler] Avoid an operation during temperature application (#13587 )	2025-02-20 22:05:56 -08:00
Nick Hill	30172b4947	[V1] Optimize handling of sampling metadata and req_ids list (#13244 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-02-18 12:15:33 -08:00
Lily Liu	80f63a3966	[V1][Spec Decode] Ngram Spec Decode (#12193 ) Signed-off-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>	2025-02-15 18:05:11 -08:00
Nick Hill	6a854c7a2b	[V1][Sampler] Don't apply temp for greedy-only (#13311 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-02-14 18:10:53 -08:00
Aoyu	a12934d3ec	[V1][Core] min_p sampling support (#13191 ) Signed-off-by: Aoyu <aoyuzhan@amazon.com> Co-authored-by: Aoyu <aoyuzhan@amazon.com>	2025-02-14 15:50:05 -08:00
Lu Fang	6224a9f620	Support logit_bias in v1 Sampler (#13079 )	2025-02-14 04:34:59 -08:00
afeldman-nm	0630d4537a	[V1] Logprobs and prompt logprobs support (#9880 ) This PR is adding support for sample logprobs & prompt logprobs to vLLM v1. New behavior: - During model execution, model runner computes sample logprobs (if user-provided logprobs setting is not None) and prompt logprobs (if user-provided prompt_logprobs setting is not None). For both sample and prompt logprobs, the engine core returns 3 vectors: token ids, token logprob values, token ranks. Ranks reflect tokens' 1-indexed positions in the vocabulary vector after sorting the vocabulary by log probability in descending order. - In scheduler.update_from_output(), sample and prompt logprobs are incorporated into the EngineCoreOutput data structure which is transferred to the engine client. If multiprocessing is enabled, then sample and prompt logprobs will be (de)serialized when the EngineCoreOutput data structure is (de)serialized. - During output processing, the LogprobsProcessor transforms the triplet of token ids, token logprobs values, and token ranks into the OpenAI-compatible List[Dict[token id,Logprob]] format (for sample and prompt logprobs respectively.) - Each Logprob instance (whether sample- or prompt-) consists of a token's log-probability, rank, and detokenized string representation. Note that logprob detokenization is handled by the LogprobsProcessor not the detokenizer. Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com> Signed-off-by: Nick Hill <nhill@redhat.com> Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com> Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-02-07 07:26:20 -08:00
Russell Bryant	e489ad7a21	[Misc] Add SPDX-License-Identifier headers to python source files (#12628 ) - Add SPDX license headers to python source files - Check for SPDX headers using pre-commit commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745 Author: Russell Bryant <rbryant@redhat.com> Date: Fri Jan 31 14:18:24 2025 -0500 Add SPDX license headers to python source files This commit adds SPDX license headers to python source files as recommended to the project by the Linux Foundation. These headers provide a concise way that is both human and machine readable for communicating license information for each source file. It helps avoid any ambiguity about the license of the code and can also be easily used by tools to help manage license compliance. The Linux Foundation runs license scans against the codebase to help ensure we are in compliance with the licenses of the code we use, including dependencies. Having these headers in place helps that tool do its job. More information can be found on the SPDX site: - https://spdx.dev/learn/handling-license-info/ Signed-off-by: Russell Bryant <rbryant@redhat.com> commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea Author: Russell Bryant <rbryant@redhat.com> Date: Fri Jan 31 14:36:32 2025 -0500 Check for SPDX headers using pre-commit Signed-off-by: Russell Bryant <rbryant@redhat.com> --------- Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-02-02 11:58:18 -08:00
Keyun Tong	fa63e710c7	[V1][Perf] Reduce scheduling overhead in model runner after cuda sync (#12094 ) Signed-off-by: Keyun Tong <tongkeyun@gmail.com>	2025-01-26 00:42:37 -08:00
Woosuk Kwon	81b979f2a8	[V1] Fix yapf (#11538 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-12-27 09:47:10 +09:00
Woosuk Kwon	371d04d39b	[V1] Use FlashInfer Sampling Kernel for Top-P & Top-K Sampling (#11394 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-12-27 09:32:38 +09:00
sroy745	dcb1a944d4	[V1] Adding min tokens/repetition/presence/frequence penalties to V1 sampler (#10681 ) Signed-off-by: Sourashis Roy <sroy@roblox.com> Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-12-26 19:02:58 +09:00
Tyler Michael Smith	28b3a1c7e5	[V1] Multiprocessing Tensor Parallel Support for v1 (#9856 ) Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>	2024-12-10 06:28:14 +00:00
Nick Hill	1f1b6d6eda	[V1] Support per-request seed (#9945 ) Signed-off-by: Nick Hill <nickhill@us.ibm.com>	2024-11-03 09:14:17 -08:00
Woosuk Kwon	6c5af09b39	[V1] Implement vLLM V1 [1/N] (#9289 )	2024-10-22 01:24:07 -07:00

24 Commits