Tyler Michael Smith
615e4a5401
[CI] Turn on basic correctness tests for V1 ( #10864 )
2025-01-08 21:20:44 -05:00
Simon Mo
3db0cafdf1
[Docs] Add Google Cloud Meetup ( #11864 )
2025-01-08 12:38:28 -08:00
rasmith
526de822d5
[Kernel][Triton][AMD] Use block size heuristic for avg 2.8x speedup for int8 models ( #11698 )
...
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
2025-01-08 20:23:15 +00:00
Robert Shaw
56fe4c297c
[TPU][Quantization] TPU W8A8 ( #11785 )
...
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-01-08 19:33:29 +00:00
WangErXiao
47de8821d3
[Misc]add some explanations for BlockHashType ( #11847 )
2025-01-08 18:21:30 +00:00
Cyrus Leung
5984499e47
[Doc] Expand Multimodal API Reference ( #11852 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-08 17:14:14 +00:00
Cyrus Leung
ca47e176af
[Misc] Move some model utils into vision file ( #11848 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-08 17:04:46 +00:00
Yan Ma
78f4590b60
[Bugfix][XPU] fix silu_and_mul ( #11823 )
...
Signed-off-by: yan ma <yan.ma@intel.com>
2025-01-09 00:11:50 +08:00
Li, Jiang
2f7024987e
[CI/Build][Bugfix] Fix CPU CI image clean up ( #11836 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-01-08 15:18:28 +00:00
Cyrus Leung
6cd40a5bfe
[Doc][4/N] Reorganize API Reference ( #11843 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-08 21:34:44 +08:00
Harry Mellor
aba8d6ee00
[Doc] Move examples into categories ( #11840 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-08 13:09:53 +00:00
Cyrus Leung
2a0596bc48
[VLM] Reorganize profiling/processing-related code ( #11812 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-08 18:59:58 +08:00
youkaichao
f12141170a
[torch.compile] consider relevant code in compilation cache ( #11614 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-08 10:46:43 +00:00
Wallas Henrique
cfd3219f58
[Hardware][Apple] Native support for macOS Apple Silicon ( #11696 )
...
Signed-off-by: Wallas Santos <wallashss@ibm.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2025-01-08 16:35:49 +08:00
Simon Mo
a1b2b8606e
[Docs] Update sponsor name: 'Novita' to 'Novita AI' ( #11833 )
2025-01-07 23:05:46 -08:00
youkaichao
ad9f1aa679
[doc] update wheels url ( #11830 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-08 14:36:49 +08:00
youkaichao
889e662eae
[misc] improve memory profiling ( #11809 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-01-08 06:36:03 +00:00
Cyrus Leung
ef68eb28d8
[Bug] Fix pickling of ModelConfig when RunAI Model Streamer is used ( #11825 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-08 13:40:09 +08:00
Simon Mo
259abd8953
[Docs] reorganize sponsorship page ( #11639 )
...
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-01-07 21:16:08 -08:00
Jee Jee Li
f645eb6954
[Bugfix] Add checks for LoRA and CPU offload ( #11810 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-01-08 13:08:48 +08:00
Ilya Lavrenov
f4923cb8bc
[OpenVINO] Fixed Docker.openvino build ( #11732 )
...
Signed-off-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
2025-01-08 13:08:30 +08:00
Nishidha
b640b19cc0
Fixed docker build for ppc64le ( #11518 )
...
Signed-off-by: Nishidha Panpaliya <nishidha.panpaliya@partner.ibm.com>
2025-01-08 13:05:37 +08:00
WangErXiao
dc71af0a71
Remove the duplicate imports of MultiModalKwargs and PlaceholderRange… ( #11824 )
2025-01-08 04:09:25 +00:00
Divakar Verma
4d29e91be8
[Misc] sort torch profiler table by kernel timing ( #11813 )
2025-01-08 10:57:04 +08:00
Cyrus Leung
91445c7bc8
[Bugfix] Fix image input for Pixtral-HF ( #11741 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-08 10:17:16 +08:00
Harry Mellor
5950f555a1
[Doc] Group examples into categories ( #11782 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-08 09:20:12 +08:00
Jie Fu (傅杰)
a4e2b26856
[Bugfix] Significant performance drop on CPUs with --num-scheduler-steps > 1 ( #11794 )
2025-01-07 16:15:50 -08:00
sroy745
973f5dc581
[Doc]Add documentation for using EAGLE in vLLM ( #11417 )
...
Signed-off-by: Sourashis Roy <sroy@roblox.com>
2025-01-07 19:19:12 +00:00
jiangjiadi
c994223d56
[Bugfix] update the prefix for qwen2 ( #11795 )
...
Co-authored-by: jiadi.jjd <jiadi.jjd@antgroup.com>
2025-01-07 18:36:34 +00:00
youkaichao
869579a702
[optimization] remove python function call for custom op ( #11750 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-07 17:04:28 +00:00
Cyrus Leung
c0efe92d8b
[Doc] Add note to gte-Qwen2 models ( #11808 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-07 21:50:58 +08:00
youkaichao
d9fa1c05ad
[doc] update how pip can install nightly wheels ( #11806 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-07 21:42:58 +08:00
Roger Wang
2de197bdd4
[V1] Support audio language models on V1 ( #11733 )
...
Signed-off-by: Roger Wang <ywang@roblox.com>
2025-01-07 19:47:36 +08:00
youkaichao
869e829b85
[doc] add doc to explain how to use uv ( #11773 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-01-07 18:41:17 +08:00
Cyrus Leung
8f37be38eb
[Bugfix] Comprehensively test and fix LLaVA-NeXT feature size calculation ( #11800 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-07 18:25:02 +08:00
Roger Wang
8082ad7950
[V1][Doc] Update V1 support for LLaVa-NeXT-Video ( #11798 )
...
Signed-off-by: Roger Wang <ywang@roblox.com>
2025-01-07 09:55:39 +00:00
Yuan
1e4ce295ae
[CI][CPU] adding build number to docker image name ( #11788 )
...
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
2025-01-07 07:28:01 +00:00
Russell Bryant
ce1917fcf2
[Doc] Create a vulnerability management team ( #9925 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-01-06 22:57:32 -08:00
XiaobingZhang
e512f76a89
fix init error for MessageQueue when n_local_reader is zero ( #11768 )
2025-01-07 06:12:48 +00:00
Liangfu Chen
898cdf033e
[CI] Fix neuron CI and run offline tests ( #11779 )
...
Signed-off-by: Liangfu Chen <liangfc@amazon.com>
2025-01-06 21:36:10 -08:00
Roger Wang
0f3f3c86ec
[Bugfix] Update attention interface in Whisper ( #11784 )
...
Signed-off-by: Roger Wang <ywang@roblox.com>
2025-01-07 04:36:24 +00:00
Jee Jee Li
b278557935
[Kernel][LoRA]Punica prefill kernels fusion ( #11234 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: Abatom <abzhonghua@gmail.com>
Co-authored-by: Zhonghua Deng <abatom@163.com>
2025-01-07 04:01:39 +00:00
Cyrus Leung
8ceffbf315
[Doc][3/N] Reorganize Serving section ( #11766 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-07 11:20:01 +08:00
YiSheng5
d93d2d74fd
[XPU] Make pp group initilized for pipeline-parallelism ( #11648 )
...
Signed-off-by: yisheng <yi.sheng@intel.com>
2025-01-07 11:09:58 +08:00
Cyrus Leung
d0169e1b0f
[Model] Future-proof Qwen2-Audio multi-modal processor ( #11776 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-07 11:05:17 +08:00
Cyrus Leung
08fb75c72e
[Bugfix] Fix LLaVA-NeXT feature size precision error (for real) ( #11772 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-07 01:10:54 +00:00
Roger Wang
91b361ae89
[V1] Extend beyond image modality and support mixed-modality inference with Llava-OneVision ( #11685 )
...
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-06 19:58:16 +00:00
Chen Zhang
e20c92bb61
[Kernel] Move attn_type to Attention.__init__() ( #11690 )
...
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-01-07 00:11:28 +08:00
Jee Jee Li
32c9eff2ff
[Bugfix][V1] Fix molmo text-only inputs ( #11676 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-01-06 15:22:25 +00:00
youkaichao
4ca5d40adc
[doc] explain how to add interleaving sliding window support ( #11771 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-06 21:57:44 +08:00