Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-04-28 18:57:13 +08:00
vllm/vllm/v1/engine
Latest commit: cd4a72a28d by Woosuk Kwon, 2025-02-17 15:40:12 -08:00
[V1][Spec decode] Move drafter to model runner (#13363)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
File                 Last commit date            Commit
__init__.py          2025-02-14 14:21:12 +08:00  [V1] LoRA - Enable Serving Usecase (#12883)
async_llm.py         2025-02-15 03:56:19 -08:00  [V1][Metrics] Add iteration_tokens_total histogram from V0 (#13288)
core_client.py       2025-02-14 14:21:12 +08:00  [V1] LoRA - Enable Serving Usecase (#12883)
core.py              2025-02-17 15:40:12 -08:00  [V1][Spec decode] Move drafter to model runner (#13363)
detokenizer.py       2025-02-07 07:26:20 -08:00  [V1] Logprobs and prompt logprobs support (#9880)
llm_engine.py        2025-02-11 10:14:00 -05:00  [V1][Metrics] Add several request timing histograms (#12644)
logprobs.py          2025-02-07 07:26:20 -08:00  [V1] Logprobs and prompt logprobs support (#9880)
mm_input_cache.py    2025-02-13 20:19:03 -08:00  [V1] Consolidate MM cache size to vllm.envs (#13239)
output_processor.py  2025-02-12 02:39:16 -08:00  [Bug] [V1] Try fetching stop_reason from EngineOutput before checking the request (#13108)
processor.py         2025-02-13 03:43:24 -08:00  [V1] Clarify input processing and multimodal feature caching logic (#13211)