Logo
Explore Help
Sign In
xinyun/vllm
1
0
Fork 0
You've already forked vllm
mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-24 21:55:38 +08:00
Code Issues Packages Projects Releases Wiki Activity
4,666 Commits 155 Branches 89 Tags
Commit Graph

57 Commits

Author SHA1 Message Date
SangBin Cho
0d62fe58db
[Bug fix][Core] assert num_new_tokens == 1 fails when SamplingParams.n is not 1 and max_tokens is large & Add tests for preemption (#4451) 2024-05-01 19:24:13 -07:00
SangBin Cho
36729bac13
[Test] Test multiple attn backend for chunked prefill. (#4023) 2024-04-12 09:56:57 -07:00
SangBin Cho
e42df7227d
[Test] Add xformer and flash attn tests (#3961)
Co-authored-by: Simon Mo <simon.mo@hey.com>
2024-04-11 03:09:50 +00:00
SangBin Cho
67b4221a61
[Core][5/N] Fully working chunked prefill e2e (#3884) 2024-04-10 17:56:48 -07:00
SangBin Cho
26422e477b
[Test] Make model tests run again and remove --forked from pytest (#3631)
Co-authored-by: Simon Mo <simon.mo@hey.com>
2024-03-28 21:06:40 -07:00
SangBin Cho
6e435de766
[1/n][Chunked Prefill] Refactor input query shapes (#3236) 2024-03-20 14:46:05 -07:00
Zhuohan Li
a61f0521b8
[Test] Add basic correctness test (#2908) 2024-02-18 16:44:50 -08:00
First Previous 1 2 Next Last
Powered by Gitea Version: 1.23.1 Page: 357ms Template: 4ms
English
Bahasa Indonesia Deutsch English Español Français Gaeilge Italiano Latviešu Magyar nyelv Nederlands Polski Português de Portugal Português do Brasil Suomi Svenska Türkçe Čeština Ελληνικά Български Русский Українська فارسی മലയാളം 日本語 简体中文 繁體中文(台灣) 繁體中文(香港) 한국어
Licenses API