Harry Mellor
|
d85c47d6ad
|
Replace "online inference" with "online serving" (#11923)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-10 12:05:56 +00:00 |
|
Li, Jiang
|
2f7024987e
|
[CI/Build][Bugfix] Fix CPU CI image clean up (#11836)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-01-08 15:18:28 +00:00 |
|
Harry Mellor
|
aba8d6ee00
|
[Doc] Move examples into categories (#11840)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-08 13:09:53 +00:00 |
|
Yuan
|
1e4ce295ae
|
[CI][CPU] adding build number to docker image name (#11788)
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
|
2025-01-07 07:28:01 +00:00 |
|
Li, Jiang
|
63f1fde277
|
[Hardware][CPU] Support chunked-prefill and prefix-caching on CPU (#10355)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2024-11-20 10:57:39 +00:00 |
|
Yuan
|
b4614656b8
|
[CI][CPU] adding numa node number as container name suffix (#10441)
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
|
2024-11-19 13:16:43 +00:00 |
|
Cyrus Leung
|
b40cf6402e
|
[Model] Support Qwen2 embeddings and use tags to select model tests (#10184)
|
2024-11-14 20:23:09 -08:00 |
|
Cyrus Leung
|
675d603400
|
[CI/Build] Make shellcheck happy (#10285)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-14 09:47:53 +00:00 |
|
Isotr0py
|
03025c023f
|
[CI/Build] Fix CPU CI online inference timeout (#10314)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-14 16:45:32 +08:00 |
|
Yuan
|
d201d41973
|
[CI][CPU]refactor CPU tests to allow to bind with different cores (#10222)
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
|
2024-11-12 10:07:32 +00:00 |
|
Isotr0py
|
2cebda42bb
|
[Bugfix][Hardware][CPU] Fix broken encoder-decoder CPU runner (#10218)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-11 12:37:58 +00:00 |
|
Isotr0py
|
58170d6503
|
[Hardware][CPU] Add embedding models support for CPU backend (#10193)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-11 08:54:28 +00:00 |
|
Li, Jiang
|
d7edca1dee
|
[CI/Build] Adding timeout in CPU CI to avoid CPU test queue blocking (#6892)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-09 03:27:11 +00:00 |
|
Cyrus Leung
|
b489fc3c91
|
[CI/Build] Update CPU tests to include all "standard" tests (#5481)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-08 23:30:04 +08:00 |
|
Russell Bryant
|
3be5b26a76
|
[CI/Build] Add shell script linting using shellcheck (#7925)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-07 18:17:29 +00:00 |
|
Li, Jiang
|
a4b3e0c1e9
|
[Hardware][CPU] Update torch 2.5 (#9911)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2024-11-07 04:43:08 +00:00 |
|
Li, Jiang
|
5eda21e773
|
[Hardware][CPU] compressed-tensor INT8 W8A8 AZP support (#9344)
|
2024-10-17 12:21:04 -04:00 |
|
Tyler Michael Smith
|
7342a7d7f8
|
[Model] Support Mamba (#6484)
|
2024-10-11 15:40:06 +00:00 |
|
Li, Jiang
|
ca77dd7a44
|
[Hardware][CPU] Support AWQ for CPU backend (#7515)
|
2024-10-09 10:28:08 -06:00 |
|
Isotr0py
|
4f95ffee6f
|
[Hardware][CPU] Cross-attention and Encoder-Decoder models support on CPU backend (#9089)
|
2024-10-07 06:50:35 +00:00 |
|
ywfang
|
8a0cf1ddc3
|
[Model] support minicpm3 (#8297)
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-09-14 14:50:26 +00:00 |
|
Cyrus Leung
|
a84e598e21
|
[CI/Build] Reorganize models tests (#7820)
|
2024-09-13 10:20:06 -07:00 |
|
Li, Jiang
|
0b952af458
|
[Hardware][Intel] Support compressed-tensor W8A8 for CPU backend (#7257)
|
2024-09-11 09:46:46 -07:00 |
|
Cody Yu
|
2ad2e5608e
|
[MISC] Consolidate FP8 kv-cache tests (#8131)
|
2024-09-04 18:53:25 +00:00 |
|
Alex Brooks
|
40e1360bb6
|
[CI/Build] Add text-only test for Qwen models (#7475)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2024-08-19 07:43:46 +08:00 |
|
PHILO-HE
|
f4da5f7b6d
|
[Misc] Update dockerfile for CPU to cover protobuf installation (#7182)
|
2024-08-15 10:03:01 -07:00 |
|
youkaichao
|
ea49e6a3c8
|
[misc][ci] fix cpu test with plugins (#7489)
|
2024-08-13 19:27:46 -07:00 |
|
Joe
|
14dbd5a767
|
[Model] H2O Danube3-4b (#6451)
|
2024-07-26 20:47:50 -07:00 |
|
Li, Jiang
|
3bbb4936dc
|
[Hardware] [Intel] Enable Multiprocessing and tensor parallel in CPU backend and update documentation (#6125)
|
2024-07-26 13:50:10 -07:00 |
|
Yuan
|
81d7a50f24
|
[Hardware][Intel CPU] Adding intel openmp tunings in Docker file (#6008)
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
|
2024-07-04 15:22:12 -07:00 |
|
Mor Zusman
|
9d6a8daa87
|
[Model] Jamba support (#4115)
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
Co-authored-by: Erez Schwartz <erezs@ai21.com>
Co-authored-by: Mor Zusman <morz@ai21.com>
Co-authored-by: tomeras91 <57313761+tomeras91@users.noreply.github.com>
Co-authored-by: Tomer Asida <tomera@ai21.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
Co-authored-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
|
2024-07-02 23:11:29 +00:00 |
|
Roger Wang
|
4ad7b53e59
|
[CI/Build][Misc] Update Pytest Marker for VLMs (#5623)
|
2024-06-18 13:10:04 +00:00 |
|
Jie Fu (傅杰)
|
ab66536dbf
|
[CI/BUILD] Support non-AVX512 vLLM building and testing (#5574)
|
2024-06-17 14:36:10 -04:00 |
|
Cyrus Leung
|
d47af2bc02
|
[CI/Build] Disable LLaVA-NeXT CPU test (#5529)
|
2024-06-14 09:27:30 -07:00 |
|
Li, Jiang
|
45c35f0d58
|
[CI/Build] Reducing CPU CI execution time (#5241)
|
2024-06-04 10:26:40 -07:00 |
|
Yuan
|
cafb8e06c5
|
[CI/BUILD] enable intel queue for longer CPU tests (#4113)
|
2024-06-03 10:39:50 -07:00 |
|
Letian Li
|
2ba80bed27
|
[Bugfix] Update Dockerfile.cpu to fix NameError: name 'vllm_ops' is not defined (#5009)
|
2024-05-23 09:08:58 -07:00 |
|
bigPYJ1151
|
77a6572aa5
|
[HotFix] [CI/Build] Minor fix for CPU backend CI (#3787)
|
2024-04-01 22:50:53 -07:00 |
|
bigPYJ1151
|
0e3f06fe9c
|
[Hardware][Intel] Add CPU inference backend (#3634)
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Yuan Zhou <yuan.zhou@intel.com>
|
2024-04-01 22:07:30 -07:00 |
|