xinyun/vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2026-03-18 14:57:21 +08:00)
vllm / tests
Latest commit: Jinzhen Lin, 33e0823de5, [Bugfix] fix rope error when load models with different dtypes (#4835), 2024-05-17 18:43:34 +09:00
async_engine                  …
basic_correctness             …
core                          …
distributed                   [Core][Distributed] remove graph mode function (#4818)                          2024-05-16 10:59:52 -07:00
engine                        [Build/CI] Extending the set of AMD tests with Regression, Basic Correctness, Distributed, Engine, Llava Tests (#4797)    2024-05-16 20:58:25 -07:00
entrypoints                   [Frontend] Support OpenAI batch file format (#4794)                             2024-05-15 19:13:36 -04:00
fp8_kv                        …
kernels                       [Bugfix] fix rope error when load models with different dtypes (#4835)          2024-05-17 18:43:34 +09:00
lora                          [Kernel] Add punica dimension for Qwen1.5-32B LoRA (#4850)                      2024-05-16 11:16:09 -07:00
metrics                       …
model_executor                …
models                        Add GPTQ Marlin 2:4 sparse structured support (#4790)                           2024-05-16 12:56:15 -04:00
prefix_caching                …
prompts                       …
quantization                  …
samplers                      …
spec_decode                   [Speculative decoding][Re-take] Enable TP>1 speculative decoding (#4840)        2024-05-16 00:53:51 -07:00
tensorizer_loader             …
tokenization                  …
worker                        …
__init__.py                   …
conftest.py                   …
test_cache_block_hashing.py   …
test_config.py                …
test_logger.py                …
test_logits_processor.py      …
test_regression.py            …
test_sampling_params.py       …
test_sequence.py              …
test_sharded_state_loader.py  [Core] Implement sharded state loader (#4690)                                   2024-05-15 22:11:54 -07:00
utils.py                      …