vllm/core at 18e9e1f7b34c46857466fe24e9f9bdee17542f2c - vllm - 丝路新云 code repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-23 23:47:25 +08:00

Cody Yu e3580537a4

[Performance] Enable chunked prefill and prefix caching together (#7753)

2024-08-28 00:36:31 -07:00

..

[Performance][BlockManagerV2] Mark prefix cache block as computed after schedule (#7822)

2024-08-26 11:24:53 -07:00

__init__.py

[Tests] Add block manager and scheduler tests (#3108)

2024-03-05 18:23:34 -08:00

test_block_manager.py

[Performance] Enable chunked prefill and prefix caching together (#7753)

2024-08-28 00:36:31 -07:00

test_chunked_prefill_scheduler.py

[Performance] Enable chunked prefill and prefix caching together (#7753)

2024-08-28 00:36:31 -07:00

test_scheduler_encoder_decoder.py

[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)

2024-08-06 16:51:47 -04:00

test_scheduler.py

[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)

2024-08-06 16:51:47 -04:00

test_serialization.py

[Core] Optimize SPMD architecture with delta + serialization optimization (#7109)

2024-08-18 17:57:20 -07:00

utils.py

[Core] Asynchronous Output Processor (#7049)

2024-08-26 20:53:20 -07:00