inkcherry
|
63e6cff196
|
update proxy path
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:59 +00:00 |
|
inkcherry
|
77321502e7
|
update lock
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:59 +00:00 |
|
inkcherry
|
374cc25e0f
|
format
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:59 +00:00 |
|
inkcherry
|
16d2a7a343
|
update finished request collection
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:59 +00:00 |
|
inkcherry
|
1c10f47dc6
|
tp write single pass
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:59 +00:00 |
|
inkcherry
|
b29f405aa5
|
update
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:59 +00:00 |
|
inkcherry
|
4776e2ddcf
|
more
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:59 +00:00 |
|
inkcherry
|
72ccb5d77c
|
remove handle_proxy_request
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:59 +00:00 |
|
inkcherry
|
38d51f6dd8
|
refine code
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:59 +00:00 |
|
inkcherry
|
fd63437837
|
update
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
0a3ae0b0cc
|
update
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
9d29f361fb
|
update
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
96da87bfe0
|
refine
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
857d93cbfb
|
fix all commit
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
795a305b1b
|
fix format
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
e0885e52d9
|
break long line
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
f75eecde0a
|
fix all mypy
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
3f7120368e
|
fix mypy and tp test pass
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
4c79f34e8a
|
fix mypy
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
9b90f5ddb2
|
update
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
a0d74ebf7f
|
fix format error
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
08cd2efbb6
|
refine
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
bba4c89ca4
|
format
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
4034937733
|
remove port
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
b60ee86585
|
format
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
4f592ae696
|
format
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
245b71a891
|
refine
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
64694c3e76
|
refine
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
70ea1b2460
|
refine code
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
68a2333339
|
fix dp proxy
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:58 +00:00 |
|
inkcherry
|
f8e9adfea8
|
refine
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:57 +00:00 |
|
inkcherry
|
ecbad2a70b
|
add proxy example
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:57 +00:00 |
|
inkcherry
|
e0f4336a5b
|
format
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:57 +00:00 |
|
inkcherry
|
675943e018
|
fix dp router
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:57 +00:00 |
|
inkcherry
|
a7ea23d16d
|
fix with new main branch
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:57 +00:00 |
|
inkcherry
|
b3e31b42d8
|
update gitignore
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:57 +00:00 |
|
inkcherry
|
9a15ae9f72
|
initial commit
Signed-off-by: inkcherry <mingzhi.liu@amd.com>
|
2025-11-27 07:30:57 +00:00 |
|
Matthew Bonanni
|
4c23690f43
|
[Attention] FlashAttention ViT support, make default backend (#28763)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-11-18 20:06:21 -08:00 |
|
Strahinja Stamenkovic
|
814843e021
|
Enable bitsandbytes quantization on AMD GPUs that use warp size 32 (#27307)
Signed-off-by: sstamenk <strahinja.stamenkovic@amd.com>
|
2025-11-19 03:12:31 +00:00 |
|
Li, Jiang
|
20852c8f4c
|
[CPU] Refactor CPU WNA16 (#28826)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-11-19 10:32:00 +08:00 |
|
Jialin Ouyang
|
40b6b38f2c
|
[Core] Switch Flat logprob control from environment variable to SamplingParams (#28914)
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
|
2025-11-19 02:10:02 +00:00 |
|
Jerry Zhang
|
da94c7c0eb
|
Move online quantization to model.load_weights (#26327)
Signed-off-by: Jerry Zhang <jerryzh168@gmail.com>
|
2025-11-18 16:52:41 -08:00 |
|
tomeras91
|
1395461f5f
|
[Hybrid][torch.compile] Refactor mamba2 forward to avoid obscuring linear projections under custom op (#28587)
Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
|
2025-11-18 16:49:36 -08:00 |
|
Varun Sundar Rabindranath
|
9912b8ccb8
|
[Build] Add OpenAI triton_kernels (#28788)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
|
2025-11-18 16:45:20 -08:00 |
|
Johnny
|
49ef847aa8
|
[NVIDIA] Guard SM100 CUTLASS MoE macro to SM100 builds v2 (#28938)
Signed-off-by: johnnynunez <johnnynuca14@gmail.com>
Signed-off-by: Johnny <johnnynuca14@gmail.com>
|
2025-11-18 16:44:27 -08:00 |
|
Michael Goin
|
67745d189f
|
Suppress verbose logs from model_hosting_container_standards (#28949)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-11-18 12:29:06 -08:00 |
|
Kunshang Ji
|
2a2d5d2780
|
Replace torch.cuda.Event with torch.Event for better hardware compatibility (#26985)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-11-18 11:34:36 -08:00 |
|
Chendi.Xue
|
c3e2978620
|
[NIXL] fix cpu PD after physical <> logical block_size PR (#28904)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
|
2025-11-18 14:03:23 -05:00 |
|
Isotr0py
|
e4bb2684bc
|
[Models] Replace all nn.Conv2d with vLLM's Conv2dLayer (#28842)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-11-18 18:56:04 +00:00 |
|
Kevin H. Luu
|
c64c0b78de
|
[chore] Move the rest of wikimedia url to S3 (#28921)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-11-18 09:44:18 -08:00 |
|