Cody Yu
|
bc8ad68455
|
[Misc][Refactor] Introduce ExecuteModelData (#4540)
|
2024-05-03 17:47:07 -07:00 |
|
leiwen83
|
b38e42fbca
|
[Speculative decoding] Add ngram prompt lookup decoding (#4237)
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
|
2024-05-01 11:13:03 -07:00 |
|
Nick Hill
|
2e240c69a9
|
[Core] Centralize GPU Worker construction (#4419)
|
2024-05-01 01:06:34 +00:00 |
|
leiwen83
|
4bb53e2dde
|
[BugFix] fix num_lookahead_slots missing in async executor (#4165)
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
|
2024-04-30 10:12:59 -07:00 |
|
Nick Hill
|
ba4be44c32
|
[BugFix] Fix return type of executor execute_model methods (#4402)
|
2024-04-27 11:17:45 -07:00 |
|
SangBin Cho
|
a88081bf76
|
[CI] Disable non-lazy string operation on logging (#4326)
Co-authored-by: Danny Guinther <dguinther@neuralmagic.com>
|
2024-04-26 00:16:58 -07:00 |
|
Cade Daniel
|
62b8aebc6f
|
[Speculative decoding 7/9] Speculative decoding end-to-end correctness tests. (#3951)
|
2024-04-23 08:02:36 +00:00 |
|
Cade Daniel
|
d150e4f89f
|
[Misc] [CI] Fix CI failure caught after merge (#4126)
|
2024-04-16 17:56:01 -07:00 |
|
Cade Daniel
|
e95cd87959
|
[Speculative decoding 6/9] Integrate speculative decoding with LLMEngine (#3894)
|
2024-04-16 13:09:21 -07:00 |
|
Antoni Baum
|
69e1d2fb69
|
[Core] Refactor model loading code (#4097)
|
2024-04-16 11:34:39 -07:00 |
|
Nick Hill
|
eb46fbfda2
|
[Core] Simplifications to executor classes (#4071)
|
2024-04-15 13:05:09 -07:00 |
|
Sanger Steel
|
711a000255
|
[Frontend] [Core] feat: Add model loading using tensorizer (#3476)
|
2024-04-13 17:13:01 -07:00 |
|
SangBin Cho
|
09473ee41c
|
[mypy] Add mypy type annotation part 1 (#4006)
|
2024-04-12 14:35:50 -07:00 |
|
Cade Daniel
|
e7c7067b45
|
[Misc] [Core] Implement RFC "Augment BaseExecutor interfaces to enable hardware-agnostic speculative decoding" (#3837)
|
2024-04-09 11:44:15 -07:00 |
|
Cade Daniel
|
5757d90e26
|
[Speculative decoding] Adding configuration object for speculative decoding (#3706)
Co-authored-by: Lily Liu <lilyliupku@gmail.com>
|
2024-04-03 00:40:57 +00:00 |
|
Cade Daniel
|
14ccd94c89
|
[Core][Bugfix]Refactor block manager for better testability (#3492)
|
2024-03-27 23:59:28 -07:00 |
|
xwjiang2010
|
64172a976c
|
[Feature] Add vision language model support. (#3042)
|
2024-03-25 14:16:30 -07:00 |
|
SangBin Cho
|
01bfb22b41
|
[CI] Try introducing isort. (#3495)
|
2024-03-25 07:59:47 -07:00 |
|
Zhuohan Li
|
e90fc21f2e
|
[Hardware][Neuron] Refactor neuron support (#3471)
|
2024-03-22 01:22:17 +00:00 |
|
Zhuohan Li
|
4c922709b6
|
Add distributed model executor abstraction (#3191)
|
2024-03-11 11:03:45 -07:00 |
|