vllm/spec_decode at a04720bc36401d831cb048c3917b9e58173d9c1d - vllm - 丝路新云 Code Repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-09 12:27:20 +08:00

Ekagra Ranjan a04720bc36

[V1][Spec Decode][Bugfix] Load quantize weights for EAGLE (#18290)

2025-05-22 15:17:33 -07:00

__init__.py

[V1][BugFix] Add __init__.py to v1/spec_decode/ (#13359)

2025-02-16 09:39:08 -08:00

eagle.py

[V1][Spec Decode][Bugfix] Load quantize weights for EAGLE (#18290)

2025-05-22 15:17:33 -07:00

medusa.py

[Model] vLLM v1 supports Medusa (#17956)

2025-05-15 21:05:31 -07:00

metadata.py

[V1][Spec Decode] Optimize Rejection Sampler with Triton Kernels (#14930)

2025-03-18 14:31:54 -07:00

metrics.py

[Misc] Add Ray Prometheus logger to V1 (#17925)

2025-05-16 01:02:42 -07:00

ngram_proposer.py

[V1][Spec Decode] Handle draft tokens beyond max_model_len (#16087)

2025-04-21 12:38:50 -07:00

utils.py

[V1][Spec Decode] Enable spec decode for top-p & top-k sampling (#15063)

2025-03-24 17:16:46 -07:00