wangxiyuan
|
405eb8e396
|
[platform] Allow platform specify attention backend (#11609)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
|
2025-01-09 21:46:50 +08:00 |
|
Cyrus Leung
|
65097ca0af
|
[Doc] Add model development API Reference (#11884)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-09 09:43:40 +00:00 |
|
Ye (Charlotte) Qi
|
1d967acb45
|
[Bugfix] fix beam search input errors and latency benchmark script (#11875)
Signed-off-by: Ye Qi <yeq@meta.com>
Co-authored-by: yeq <yeq@devgpu004.lla3.facebook.com>
|
2025-01-09 17:36:39 +08:00 |
|
Cyrus Leung
|
0bd1ff4346
|
[Bugfix] Override dunder methods of placeholder modules (#11882)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-09 09:02:53 +00:00 |
|
youkaichao
|
310aca88c9
|
[perf]fix current stream (#11870)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-09 07:18:21 +00:00 |
|
Guspan Tanadi
|
a732900efc
|
[Doc] Intended links Python multiprocessing library (#11878)
|
2025-01-09 05:39:39 +00:00 |
|
Cyrus Leung
|
d848800e88
|
[Misc] Move print_*_once from utils to logger (#11298)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>
Co-authored-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>
|
2025-01-09 12:48:12 +08:00 |
|
Michael Goin
|
730e9592e9
|
[Doc] Recommend uv and python 3.12 for quickstart guide (#11849)
Signed-off-by: mgoin <michael@neuralmagic.com>
|
2025-01-09 11:37:48 +08:00 |
|
Maximilien de Bayser
|
1fe554bac3
|
treat do_lower_case in the same way as the sentence-transformers library (#11815)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
|
2025-01-09 11:05:43 +08:00 |
|
Tyler Michael Smith
|
615e4a5401
|
[CI] Turn on basic correctness tests for V1 (#10864)
|
2025-01-08 21:20:44 -05:00 |
|
Simon Mo
|
3db0cafdf1
|
[Docs] Add Google Cloud Meetup (#11864)
|
2025-01-08 12:38:28 -08:00 |
|
rasmith
|
526de822d5
|
[Kernel][Triton][AMD] Use block size heuristic for avg 2.8x speedup for int8 models (#11698)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
|
2025-01-08 20:23:15 +00:00 |
|
Robert Shaw
|
56fe4c297c
|
[TPU][Quantization] TPU W8A8 (#11785)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-01-08 19:33:29 +00:00 |
|
WangErXiao
|
47de8821d3
|
[Misc]add some explanations for BlockHashType (#11847)
|
2025-01-08 18:21:30 +00:00 |
|
Cyrus Leung
|
5984499e47
|
[Doc] Expand Multimodal API Reference (#11852)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 17:14:14 +00:00 |
|
Cyrus Leung
|
ca47e176af
|
[Misc] Move some model utils into vision file (#11848)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 17:04:46 +00:00 |
|
Yan Ma
|
78f4590b60
|
[Bugfix][XPU] fix silu_and_mul (#11823)
Signed-off-by: yan ma <yan.ma@intel.com>
|
2025-01-09 00:11:50 +08:00 |
|
Li, Jiang
|
2f7024987e
|
[CI/Build][Bugfix] Fix CPU CI image clean up (#11836)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-01-08 15:18:28 +00:00 |
|
Cyrus Leung
|
6cd40a5bfe
|
[Doc][4/N] Reorganize API Reference (#11843)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 21:34:44 +08:00 |
|
Harry Mellor
|
aba8d6ee00
|
[Doc] Move examples into categories (#11840)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-08 13:09:53 +00:00 |
|
Cyrus Leung
|
2a0596bc48
|
[VLM] Reorganize profiling/processing-related code (#11812)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 18:59:58 +08:00 |
|
youkaichao
|
f12141170a
|
[torch.compile] consider relevant code in compilation cache (#11614)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-08 10:46:43 +00:00 |
|
Wallas Henrique
|
cfd3219f58
|
[Hardware][Apple] Native support for macOS Apple Silicon (#11696)
Signed-off-by: Wallas Santos <wallashss@ibm.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
|
2025-01-08 16:35:49 +08:00 |
|
Simon Mo
|
a1b2b8606e
|
[Docs] Update sponsor name: 'Novita' to 'Novita AI' (#11833)
|
2025-01-07 23:05:46 -08:00 |
|
youkaichao
|
ad9f1aa679
|
[doc] update wheels url (#11830)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-08 14:36:49 +08:00 |
|
youkaichao
|
889e662eae
|
[misc] improve memory profiling (#11809)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-01-08 06:36:03 +00:00 |
|
Cyrus Leung
|
ef68eb28d8
|
[Bug] Fix pickling of ModelConfig when RunAI Model Streamer is used (#11825)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 13:40:09 +08:00 |
|
Simon Mo
|
259abd8953
|
[Docs] reorganize sponsorship page (#11639)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-01-07 21:16:08 -08:00 |
|
Jee Jee Li
|
f645eb6954
|
[Bugfix] Add checks for LoRA and CPU offload (#11810)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-01-08 13:08:48 +08:00 |
|
Ilya Lavrenov
|
f4923cb8bc
|
[OpenVINO] Fixed Docker.openvino build (#11732)
Signed-off-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
|
2025-01-08 13:08:30 +08:00 |
|
Nishidha
|
b640b19cc0
|
Fixed docker build for ppc64le (#11518)
Signed-off-by: Nishidha Panpaliya <nishidha.panpaliya@partner.ibm.com>
|
2025-01-08 13:05:37 +08:00 |
|
WangErXiao
|
dc71af0a71
|
Remove the duplicate imports of MultiModalKwargs and PlaceholderRange… (#11824)
|
2025-01-08 04:09:25 +00:00 |
|
Divakar Verma
|
4d29e91be8
|
[Misc] sort torch profiler table by kernel timing (#11813)
|
2025-01-08 10:57:04 +08:00 |
|
Cyrus Leung
|
91445c7bc8
|
[Bugfix] Fix image input for Pixtral-HF (#11741)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 10:17:16 +08:00 |
|
Harry Mellor
|
5950f555a1
|
[Doc] Group examples into categories (#11782)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-08 09:20:12 +08:00 |
|
Jie Fu (傅杰)
|
a4e2b26856
|
[Bugfix] Significant performance drop on CPUs with --num-scheduler-steps > 1 (#11794)
|
2025-01-07 16:15:50 -08:00 |
|
sroy745
|
973f5dc581
|
[Doc]Add documentation for using EAGLE in vLLM (#11417)
Signed-off-by: Sourashis Roy <sroy@roblox.com>
|
2025-01-07 19:19:12 +00:00 |
|
jiangjiadi
|
c994223d56
|
[Bugfix] update the prefix for qwen2 (#11795)
Co-authored-by: jiadi.jjd <jiadi.jjd@antgroup.com>
|
2025-01-07 18:36:34 +00:00 |
|
youkaichao
|
869579a702
|
[optimization] remove python function call for custom op (#11750)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-07 17:04:28 +00:00 |
|
Cyrus Leung
|
c0efe92d8b
|
[Doc] Add note to gte-Qwen2 models (#11808)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-07 21:50:58 +08:00 |
|
youkaichao
|
d9fa1c05ad
|
[doc] update how pip can install nightly wheels (#11806)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-07 21:42:58 +08:00 |
|
Roger Wang
|
2de197bdd4
|
[V1] Support audio language models on V1 (#11733)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2025-01-07 19:47:36 +08:00 |
|
youkaichao
|
869e829b85
|
[doc] add doc to explain how to use uv (#11773)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-01-07 18:41:17 +08:00 |
|
Cyrus Leung
|
8f37be38eb
|
[Bugfix] Comprehensively test and fix LLaVA-NeXT feature size calculation (#11800)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-07 18:25:02 +08:00 |
|
Roger Wang
|
8082ad7950
|
[V1][Doc] Update V1 support for LLaVa-NeXT-Video (#11798)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2025-01-07 09:55:39 +00:00 |
|
Yuan
|
1e4ce295ae
|
[CI][CPU] adding build number to docker image name (#11788)
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
|
2025-01-07 07:28:01 +00:00 |
|
Russell Bryant
|
ce1917fcf2
|
[Doc] Create a vulnerability management team (#9925)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-01-06 22:57:32 -08:00 |
|
XiaobingZhang
|
e512f76a89
|
fix init error for MessageQueue when n_local_reader is zero (#11768)
|
2025-01-07 06:12:48 +00:00 |
|
Liangfu Chen
|
898cdf033e
|
[CI] Fix neuron CI and run offline tests (#11779)
Signed-off-by: Liangfu Chen <liangfc@amazon.com>
|
2025-01-06 21:36:10 -08:00 |
|
Roger Wang
|
0f3f3c86ec
|
[Bugfix] Update attention interface in Whisper (#11784)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2025-01-07 04:36:24 +00:00 |
|