vllm/e2e at 99ded1e1c4dc00baa77beae74602ebafe4921176 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-20 11:47:15 +08:00

Abhinav Goyal 2416b26e11

[Speculative Decoding] Medusa Implementation with Top-1 proposer (#4978)

2024-07-09 18:34:02 -07:00

__init__.py

[Speculative decoding 7/9] Speculative decoding end-to-end correctness tests. (#3951)

2024-04-23 08:02:36 +00:00

conftest.py

[CORE] Adding support for insertion of soft-tuned prompts (#4645)

2024-07-09 13:26:36 -07:00

test_compatibility.py

[Speculative decoding][Re-take] Enable TP>1 speculative decoding (#4840)

2024-05-16 00:53:51 -07:00

test_integration_dist_tp2.py

[Speculative Decoding] MLPSpeculator Tensor Parallel support (1/2) (#6050)

2024-07-02 07:20:29 -07:00

test_integration_dist_tp4.py

[Speculative Decoding] Support draft model on different tensor-parallel size than target model (#5414)

2024-06-25 09:56:06 +00:00

test_integration.py

[Speculative decoding][Re-take] Enable TP>1 speculative decoding (#4840)

2024-05-16 00:53:51 -07:00

test_logprobs.py

[Speculative decoding] Support target-model logprobs (#4378)

2024-05-03 15:52:01 -07:00

test_medusa_correctness.py

[Speculative Decoding] Medusa Implementation with Top-1 proposer (#4978)

2024-07-09 18:34:02 -07:00

test_mlp_correctness.py

[CORE] Quantized lm-head Framework (#4442)

2024-07-02 22:25:17 +00:00

test_multistep_correctness.py

[Speculative Decoding 2/2 ] Integrate typical acceptance sampler into Spec Decode Worker (#5348)

2024-07-01 00:33:05 -07:00

test_ngram_correctness.py

[Dynamic Spec Decoding] Minor fix for disabling speculative decoding (#5000)

2024-05-25 10:00:14 -07:00