vllm/v1 at af51d80fa14ca8e01c6be36232170683f3e47f09 - vllm - 丝路新云 code repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-28 06:37:12 +08:00

Roger Wang af51d80fa1

Revert "[V1] Scatter and gather placeholders in the model runner" (#16075)

2025-04-04 14:50:57 -07:00

Revert "[V1] Scatter and gather placeholders in the model runner" (#16075)

2025-04-04 14:50:57 -07:00

[V1] Implement sliding window attention in kv_cache_manager (#14097)

2025-04-01 00:33:17 -07:00

[V1] AsyncLLM data parallel (#13923)

2025-03-27 16:14:41 -07:00

[V1] Fix json_object support with xgrammar (#15488)

2025-04-02 02:00:08 -07:00

[V1][Sampler] Faster top-k only implementation (#15478)

2025-03-26 10:56:47 -07:00

[V1][Spec Decode] Respect prompt_lookup_max (#15348)

2025-03-23 10:41:44 -07:00

structured_output

[CI] xgrammar structured output supports Enum. (#15757)

2025-03-29 20:20:02 -07:00

[TPU] Support sliding window and logit soft capping in the paged attention kernel for TPU. (#15732)

2025-04-03 14:23:28 -07:00

[V1] Scheduler Refactoring [1/N] - Add Scheduler Interface (#15250)

2025-03-20 17:50:43 -07:00

__init__.py

[V1] AsyncLLM Implementation (#9826)

2024-11-11 23:05:38 +00:00

test_async_llm_dp.py

[V1] AsyncLLM data parallel (#13923)

2025-03-27 16:14:41 -07:00

test_oracle.py

[Misc] Enable V1 LoRA by default (#15320)

2025-04-01 16:53:56 +08:00

test_stats.py

[Misc] Add SPDX-License-Identifier headers to python source files (#12628)

2025-02-02 11:58:18 -08:00

test_utils.py

Update deprecated Python 3.8 typing (#13971)

2025-03-02 17:34:51 -08:00