youkaichao
|
f12141170a
|
[torch.compile] consider relevant code in compilation cache (#11614)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-08 10:46:43 +00:00 |
|
youkaichao
|
3682e33f9f
|
[v1] fix compilation cache (#11598)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-30 04:24:12 +00:00 |
|
youkaichao
|
dba4d9dec6
|
[v1][bugfix] fix cudagraph with inplace buffer assignment (#11596)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-29 09:03:49 +00:00 |
|
youkaichao
|
eb881ed006
|
[misc] fix typing (#11540)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-27 11:05:08 +08:00 |
|
Lucas Tucker
|
dbeac95dbb
|
Mypy checking for vllm/compilation (#11496)
Signed-off-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: lucast2021 <lucast2021@headroyce.org>
|
2024-12-26 05:04:07 +00:00 |
|
youkaichao
|
88a412ed3d
|
[torch.compile] fast inductor (#11108)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2024-12-16 16:15:22 -08:00 |
|
Luka Govedič
|
30870b4f66
|
[torch.compile] Dynamic fp8 + rms_norm fusion (#10906)
Signed-off-by: luka <luka@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
|
2024-12-13 03:19:23 +00:00 |
|
youkaichao
|
66aaa7722d
|
[torch.compile] remove graph logging in ci (#11110)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-11 10:59:50 -08:00 |
|
youkaichao
|
91642db952
|
[torch.compile] use depyf to dump torch.compile internals (#10972)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-11 10:43:05 -08:00 |
|
youkaichao
|
d1c2e15eb3
|
[torch.compile] add dynamo time tracking (#11005)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-08 23:09:04 -08:00 |
|
youkaichao
|
a1887f2c96
|
[torch.compile] fix deprecated code (#10948)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-06 11:01:23 +00:00 |
|
youkaichao
|
b031a455a9
|
[torch.compile] add logging for compilation time (#10941)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-12-06 10:07:15 +00:00 |
|
youkaichao
|
db87eb6c67
|
[torch.compile] use size tuning for specific sizes (#10933)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-05 20:30:41 -08:00 |
|
youkaichao
|
dc5ce861bf
|
[torch.compile] remove compilation_context and simplify code (#10838)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-03 06:19:02 +00:00 |
|
Cyrus Leung
|
f877a7d12a
|
[Misc] Improve type annotations for support_torch_compile (#10763)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-30 17:48:35 -08:00 |
|
youkaichao
|
05d1f8c9c6
|
[misc] move functions to config.py (#10624)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-25 09:27:30 +00:00 |
|
youkaichao
|
65813781a2
|
[torch.compile] add warning for unsupported models (#10622)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-24 23:27:51 -08:00 |
|
Luka Govedič
|
8b0fe06c89
|
[torch.compile] Inductor code caching fix (#10273)
Signed-off-by: luka <luka@neuralmagic.com>
Signed-off-by: Luka Govedic <luka.govedic@gmail.com>
|
2024-11-20 21:44:57 -08:00 |
|
youkaichao
|
0cd3d9717e
|
[7/N] torch.compile, reduce compilation time (#10460)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-20 11:20:38 -08:00 |
|
youkaichao
|
51bb12d17b
|
[4/N][torch.compile] clean up set_torch_compile_backend (#10401)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-17 23:57:20 -08:00 |
|
youkaichao
|
4fd9375028
|
[2/N][torch.compile] make compilation cfg part of vllm cfg (#10383)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-16 18:02:14 -08:00 |
|
Tyler Michael Smith
|
2885ba0e24
|
[Misc] Change RedundantReshapesPass and FusionPass logging from info to debug (#10308)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2024-11-15 02:44:26 +00:00 |
|
youkaichao
|
eea55cca5b
|
[1/N] torch.compile user interface design (#10237)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-11 18:01:06 -08:00 |
|
youkaichao
|
330e82d34a
|
[v1][torch.compile] support managing cudagraph buffer (#10203)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-11-11 11:10:27 -08:00 |
|
bnellnm
|
10b67d865d
|
[Bugfix] SymIntArrayRef expected to contain concrete integers (#10170)
Signed-off-by: Bill Nell <bill@neuralmagic.com>
|
2024-11-08 14:44:18 -08:00 |
|
Luka Govedič
|
4f93dfe952
|
[torch.compile] Fuse RMSNorm with quant (#9138)
Signed-off-by: luka <luka@neuralmagic.com>
Co-authored-by: youkaichao <youkaichao@126.com>
|
2024-11-08 21:20:08 +00:00 |
|
Woosuk Kwon
|
4089985552
|
[V1] Integrate Piecewise CUDA graphs (#10058)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-11-05 22:16:04 -08:00 |
|
youkaichao
|
c4cacbaa7f
|
[v1] reduce graph capture time for piecewise cudagraph (#10059)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-05 18:19:50 -08:00 |
|
youkaichao
|
ca9844b340
|
[bugfix] fix weak ref in piecewise cudagraph and tractable test (#10048)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-05 14:49:20 -08:00 |
|
youkaichao
|
aff1fd8188
|
[torch.compile] use interpreter with stable api from pytorch (#9889)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-01 11:50:37 -07:00 |
|
youkaichao
|
ff5ed6e1bc
|
[torch.compile] rework compile control with piecewise cudagraph (#9715)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-10-29 23:03:49 -07:00 |
|
youkaichao
|
17c79f3c36
|
[torch.compile] auto infer dynamic_arg_dims from type annotation (#9589)
|
2024-10-22 13:43:37 -07:00 |
|
Russell Bryant
|
776dbd74f1
|
[CI/Build] mypy: Resolve some errors from checking vllm/engine (#9267)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-10-16 22:55:59 +00:00 |
|
youkaichao
|
e00c094f15
|
[torch.compile] generic decorators (#9258)
|
2024-10-10 15:54:23 -07:00 |
|
youkaichao
|
e4d652ea3e
|
[torch.compile] integration with compilation control (#9058)
|
2024-10-10 12:39:36 -07:00 |
|
youkaichao
|
a36e070dad
|
[torch.compile] fix functionalization (#8480)
|
2024-09-14 09:46:04 -07:00 |
|
youkaichao
|
ce2702a923
|
[tpu][misc] fix typo (#8260)
|
2024-09-06 22:40:46 -07:00 |
|
youkaichao
|
ce6bf3a2cf
|
[torch.compile] avoid Dynamo guard evaluation overhead (#7898)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-08-28 16:10:12 -07:00 |
|