vllm/e2e at 1856aff4d66833b258ce64132413ab8a18cc18a6 - vllm - 丝路新云 (Silk Road New Cloud) code repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-23 16:27:14 +08:00

Travis Johnson cc0eaf12b1

[Bugfix] spec decode handle None entries in topk args in create_sequence_group_output (#7232)

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

2024-08-22 09:33:48 -04:00

__init__.py

[Speculative decoding 7/9] Speculative decoding end-to-end correctness tests. (#3951)

2024-04-23 08:02:36 +00:00

conftest.py

[Speculative Decoding] Fixing hidden states handling in batch expansion (#7508)

2024-08-19 17:58:14 -07:00

test_compatibility.py

[Speculative decoding][Re-take] Enable TP>1 speculative decoding (#4840)

2024-05-16 00:53:51 -07:00

test_eagle_correctness.py

[Speculative Decoding] EAGLE Implementation with Top-1 proposer (#6830)

2024-08-22 02:42:24 -07:00

test_integration_dist_tp2.py

[Model] RowParallelLinear: pass bias to quant_method.apply (#6327)

2024-07-19 07:15:22 -06:00

test_integration_dist_tp4.py

[BUGFIX] Raise an error for no draft token case when draft_tp>1 (#6369)

2024-07-19 06:01:09 -07:00

test_integration.py

[Misc] Add quantization config support for speculative model. (#7343)

2024-08-15 19:34:28 -07:00

test_logprobs.py

[Bugfix] spec decode handle None entries in topk args in create_sequence_group_output (#7232)

2024-08-22 09:33:48 -04:00

test_medusa_correctness.py

[Speculative Decoding] EAGLE Implementation with Top-1 proposer (#6830)

2024-08-22 02:42:24 -07:00

test_mlp_correctness.py

[Speculative Decoding] Fixing hidden states handling in batch expansion (#7508)

2024-08-19 17:58:14 -07:00

test_multistep_correctness.py

[Misc] Log spec decode metrics (#6454)

2024-07-16 20:37:10 +00:00

test_ngram_correctness.py

[Dynamic Spec Decoding] Minor fix for disabling speculative decoding (#5000)

2024-05-25 10:00:14 -07:00

test_seed.py

[BugFix] Fix use of per-request seed with pipeline parallel (#6698)

2024-07-30 10:40:08 -07:00