xinyun/vllm, a mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2025-12-20 06:55:01 +08:00)
vllm/tests/entrypoints
Latest commit: de4008e2ab by Joe Runde, 2024-10-17 22:47:27 -04:00
[Bugfix][Core] Use torch.cuda.memory_stats() to profile peak memory usage (#9352)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Name                 Last commit                                                                                 Date
llm/                 [Bugfix][Core] Use torch.cuda.memory_stats() to profile peak memory usage (#9352)          2024-10-17 22:47:27 -04:00
offline_mode/        [Bugfix][Core] Use torch.cuda.memory_stats() to profile peak memory usage (#9352)          2024-10-17 22:47:27 -04:00
openai/              [Bugfix] Fix vLLM UsageInfo and logprobs None AssertionError with empty token_ids (#9034)  2024-10-15 15:40:43 -07:00
__init__.py          [CI/Build] Move test_utils.py to tests/utils.py (#4425)                                    2024-05-13 23:50:09 +09:00
conftest.py          Support for guided decoding for offline LLM (#6878)                                        2024-08-04 03:12:09 +00:00
test_chat_utils.py   [Frontend] Multimodal support in offline chat (#8098)                                      2024-09-04 05:22:17 +00:00
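
Note: the latest commit's title names torch.cuda.memory_stats() as the mechanism for profiling peak memory. As a hedged illustration of that PyTorch API only (a minimal sketch, not the actual profiling code from #9352), assuming a CUDA-capable device:

```python
# Minimal sketch of reading peak GPU memory via torch.cuda.memory_stats().
# Illustrative only; not the profiling code from vllm-project/vllm PR #9352.
import torch

def peak_allocated_bytes(device: int = 0) -> int:
    """High-water mark of bytes allocated by the CUDA caching allocator."""
    stats = torch.cuda.memory_stats(device)
    # "allocated_bytes.all.peak" is the peak of live tensor allocations;
    # "reserved_bytes.all.peak" also counts allocator-reserved overhead.
    return stats["allocated_bytes.all.peak"]

if torch.cuda.is_available():
    torch.cuda.reset_peak_memory_stats()        # clear the high-water mark
    x = torch.empty(1024, 1024, device="cuda")  # ~4 MiB of float32
    print(f"peak allocated: {peak_allocated_bytes() / 2**20:.1f} MiB")
```

Unlike torch.cuda.max_memory_allocated(), which returns only this single peak value, memory_stats() exposes the full statistics dictionary, letting a profiler distinguish allocated from reserved memory.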