xinyun/vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2025-12-21 00:55:01 +08:00)
vllm/tests/spec_decode/e2e
Latest commit: ee93f4f92a by Qubitium-ModelCloud, 2024-07-02 22:25:17 +00:00
[CORE] Quantized lm-head Framework (#4442)
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: ZX <zx@lbx.dev>
File | Last commit | Date
__init__.py | [Speculative decoding 7/9] Speculative decoding end-to-end correctness tests. (#3951) | 2024-04-23 08:02:36 +00:00
conftest.py | [VLM] Remove image_input_type from VLM config (#5852) | 2024-07-02 07:57:09 +00:00
test_compatibility.py | [Speculative decoding][Re-take] Enable TP>1 speculative decoding (#4840) | 2024-05-16 00:53:51 -07:00
test_integration_dist_tp2.py | [Speculative Decoding] MLPSpeculator Tensor Parallel support (1/2) (#6050) | 2024-07-02 07:20:29 -07:00
test_integration_dist_tp4.py | [Speculative Decoding] Support draft model on different tensor-parallel size than target model (#5414) | 2024-06-25 09:56:06 +00:00
test_integration.py | [Speculative decoding][Re-take] Enable TP>1 speculative decoding (#4840) | 2024-05-16 00:53:51 -07:00
test_logprobs.py | [Speculative decoding] Support target-model logprobs (#4378) | 2024-05-03 15:52:01 -07:00
test_mlp_correctness.py | [CORE] Quantized lm-head Framework (#4442) | 2024-07-02 22:25:17 +00:00
test_multistep_correctness.py | [Speculative Decoding 2/2 ] Integrate typical acceptance sampler into Spec Decode Worker (#5348) | 2024-07-01 00:33:05 -07:00
test_ngram_correctness.py | [Dynamic Spec Decoding] Minor fix for disabling speculative decoding (#5000) | 2024-05-25 10:00:14 -07:00