Sage Moore
|
82ae694de6
|
comments cleanup etc
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 20:47:39 +00:00 |
|
Ning Xie
|
1dba2c4ebe
|
[Misc] adjust for ipv6 for mooncake url parse (#20107)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-07-03 20:27:17 +00:00 |
|
Sage Moore
|
10ca263058
|
split some of the ubatching logic out of _run_model
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 20:26:56 +00:00 |
|
Isotr0py
|
71d6de3a26
|
[Misc] Clean up InternVL family config registration (#19992)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-07-03 20:01:47 +00:00 |
|
Sage Moore
|
908e9f8f54
|
cleanup
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 19:52:41 +00:00 |
|
Sage Moore
|
06cc133a63
|
cleanup
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 17:51:08 +00:00 |
|
Sage Moore
|
3a41a3dcff
|
cleanup
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 17:23:30 +00:00 |
|
Sage Moore
|
bb0645c644
|
separate ubatch and normal runs
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 17:07:58 +00:00 |
|
Sage Moore
|
510e839429
|
more cleanup
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 16:35:52 +00:00 |
|
Sage Moore
|
f7b6e600b8
|
gpu_model_runner cleanup
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 16:23:11 +00:00 |
|
Alexei-V-Ivanov-AMD
|
536fd33003
|
[CI] Trimming some failing test groups from AMDPRODUCTION. (#20390)
|
2025-07-03 08:21:31 -07:00 |
|
Reid
|
619b9f5c7e
|
[Frontend] fix duplicate output for bench subcmd (#20446)
Signed-off-by: reidliu41 <reid201711@gmail.com>
|
2025-07-03 08:02:06 -07:00 |
|
Nicolò Lucchesi
|
d1b689c445
|
[Bugfix] Fix flaky test_streaming_response test (#20363)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-07-03 14:46:24 +00:00 |
|
Sage Moore
|
0056be26f6
|
less ARs
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 14:33:53 +00:00 |
|
Sage Moore
|
7cc5a549ad
|
cleanup some of the should_ubatch logic
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 14:22:53 +00:00 |
|
Reid
|
9854dc9040
|
[Frontend] improve vllm bench <bench_type> --help display (#20430)
Signed-off-by: reidliu41 <reid201711@gmail.com>
|
2025-07-03 14:22:16 +00:00 |
|
Isotr0py
|
ff5c60fad8
|
[Misc] Automatically tag PRs to add new models (#20222)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-07-03 07:11:03 -07:00 |
|
wang.yuqi
|
6f1229f91d
|
[Model][2/N] Automatic conversion of CrossEncoding model (#19978)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-07-03 13:59:23 +00:00 |
|
Jee Jee Li
|
1819fbda63
|
[Quantization] Bump to use latest bitsandbytes (#20424)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-07-03 21:58:46 +08:00 |
|
Sage Moore
|
83caef8bac
|
cleanups for ubatching.py
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:50:19 +00:00 |
|
Sage Moore
|
2f3461ad23
|
cleanup flashmla.py
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:45:52 +00:00 |
|
Sage Moore
|
7e2ff2620e
|
cleanup flashmla.py
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:45:07 +00:00 |
|
Sage Moore
|
1d75a029a9
|
remove cudagraph logic from flashmla.py
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:41:49 +00:00 |
|
Sage Moore
|
17a7ceef27
|
cleanup deepep ll
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:35:21 +00:00 |
|
Sage Moore
|
6e2a3c0841
|
minor changes
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:29:32 +00:00 |
|
Sage Moore
|
631be12edb
|
refactoring pplx_prepare_finalize.py
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:16:34 +00:00 |
|
Sage Moore
|
a9d47e8652
|
remove always_microbatch_if_enabled
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:09:33 +00:00 |
|
Sage Moore
|
fc562e22e2
|
cleanup gpu_worker.py
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:07:59 +00:00 |
|
Sage Moore
|
1ca65412b8
|
cleanup backends/utils.py
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:07:33 +00:00 |
|
Sage Moore
|
3112714bdc
|
cleanup logger.py
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:05:38 +00:00 |
|
Sage Moore
|
0c03d154b5
|
cleanup config.py
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:03:26 +00:00 |
|
Sage Moore
|
9b7edc0343
|
cleanup data_parallel.py
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:02:12 +00:00 |
|
Sage Moore
|
be2e1632fd
|
delete basic-ub.py
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-03 13:01:01 +00:00 |
|
Li, Jiang
|
7f0367109e
|
[CI/Build][CPU] Enable cross compilation in CPU release pipeline (#20423)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-07-03 05:26:12 -07:00 |
|
Ning Xie
|
fb14d53cf6
|
[Kernel] refactor cpu worker v0 cache dtype (#20080)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-07-03 08:39:14 +00:00 |
|
Cyrus Leung
|
b024a42e93
|
[Core] Move multimodal placeholder from chat utils to model definition (#20355)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-03 08:18:30 +00:00 |
|
Michael Yao
|
cb97f2bfc5
|
[Docs] Replace two lists with tables in intel_gaudi.md (#20414)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
|
2025-07-03 00:48:25 -07:00 |
|
Reid
|
359200f6ac
|
[doc] fix link (#20417)
Signed-off-by: reidliu41 <reid201711@gmail.com>
|
2025-07-03 00:21:57 -07:00 |
|
Lifans
|
220aee902a
|
[Misc] Add rules to label Speculative Decoding Related PRs (#20406)
Signed-off-by: Lifan Shen <lifans@meta.com>
|
2025-07-02 23:56:49 -07:00 |
|
Nick Hill
|
67d25eca05
|
[Tests] Update online DP tests to verify that requests are balanced (#20157)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-07-03 14:49:13 +08:00 |
|
qscqesze
|
363528de27
|
[Feature] Support MiniMax-M1 function calls features (#20297)
Signed-off-by: QscQ <qscqesze@gmail.com>
Signed-off-by: qingjun <qingjun@minimaxi.com>
|
2025-07-03 06:48:27 +00:00 |
|
QiliangCui
|
4ff61ababa
|
[TPU] Add a case to cover RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8 (#20385)
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
|
2025-07-03 06:46:41 +00:00 |
|
Li, Jiang
|
0ec3779df7
|
[Bugfix][CI/CD][CPU] Fix CPU CI tests (#20383)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-07-02 20:11:36 -07:00 |
|
Chenheli Hua
|
b616f6a53d
|
[Misc] Small: Fix video loader return type annotations. (#20389)
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
|
2025-07-03 03:10:39 +00:00 |
|
bnellnm
|
2e25bb12a8
|
[Bugfix] Fix import of CutlassExpertsFp8 in compressed_tensors_moe.py (#20381)
Signed-off-by: Bill Nell <bnell@redhat.com>
|
2025-07-03 02:07:43 +00:00 |
|
Louie Tsai
|
9965c47d0d
|
Enable CPU nightly performance benchmark and its Markdown report (#18444)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
|
2025-07-02 17:50:25 -07:00 |
|
Nick Hill
|
059d4cdb49
|
[BugFix] Fix DP headless mode arg validation (#20398)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-07-02 17:15:32 -07:00 |
|
Tyler Michael Smith
|
bdb84e26b0
|
[Bugfix] Fixes for FlashInfer's TORCH_CUDA_ARCH_LIST (#20136)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
|
2025-07-02 17:15:11 -07:00 |
|
Nicolò Lucchesi
|
3dd359147d
|
[Docs] Update EAGLE example (#20375)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-07-02 17:13:51 -07:00 |
|
Sage Moore
|
ce3ef95c11
|
turn yields on for pplx
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-07-02 22:34:02 +00:00 |
|