xinyun/vllm, a mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2025-12-20 06:55:01 +08:00)
vllm/tests/entrypoints
Latest commit: de4008e2ab by Joe Runde, 2024-10-17 22:47:27 -04:00
[Bugfix][Core] Use torch.cuda.memory_stats() to profile peak memory usage (#9352)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Name                 Last commit                                                                                 Date
llm/                 [Bugfix][Core] Use torch.cuda.memory_stats() to profile peak memory usage (#9352)          2024-10-17 22:47:27 -04:00
offline_mode/        [Bugfix][Core] Use torch.cuda.memory_stats() to profile peak memory usage (#9352)          2024-10-17 22:47:27 -04:00
openai/              [Bugfix] Fix vLLM UsageInfo and logprobs None AssertionError with empty token_ids (#9034)  2024-10-15 15:40:43 -07:00
__init__.py          [CI/Build] Move test_utils.py to tests/utils.py (#4425)                                    2024-05-13 23:50:09 +09:00
conftest.py          Support for guided decoding for offline LLM (#6878)                                        2024-08-04 03:12:09 +00:00
test_chat_utils.py   [Frontend] Multimodal support in offline chat (#8098)                                      2024-09-04 05:22:17 +00:00
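
Note: the latest commit's title names torch.cuda.memory_stats() as the mechanism for profiling peak memory. As a hedged illustration of that PyTorch API only (a minimal sketch, not the actual profiling code from #9352), assuming a CUDA-capable device:

```python
# Minimal sketch of reading peak GPU memory via torch.cuda.memory_stats().
# Illustrative only; not the profiling code from vllm-project/vllm PR #9352.
import torch

def peak_allocated_bytes(device: int = 0) -> int:
    """High-water mark of bytes allocated by the CUDA caching allocator."""
    stats = torch.cuda.memory_stats(device)
    # "allocated_bytes.all.peak" is the peak of live tensor allocations;
    # "reserved_bytes.all.peak" also counts allocator-reserved overhead.
    return stats["allocated_bytes.all.peak"]

if torch.cuda.is_available():
    torch.cuda.reset_peak_memory_stats()        # clear the high-water mark
    x = torch.empty(1024, 1024, device="cuda")  # ~4 MiB of float32
    print(f"peak allocated: {peak_allocated_bytes() / 2**20:.1f} MiB")
```

Unlike torch.cuda.max_memory_allocated(), which returns only this single peak value, memory_stats() exposes the full statistics dictionary, letting a profiler distinguish allocated from reserved memory.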