38 Commits

Author SHA1 Message Date
youkaichao
f12141170a
[torch.compile] consider relevant code in compilation cache (#11614)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-08 10:46:43 +00:00
youkaichao
3682e33f9f
[v1] fix compilation cache (#11598)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-30 04:24:12 +00:00
youkaichao
dba4d9dec6
[v1][bugfix] fix cudagraph with inplace buffer assignment (#11596)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-29 09:03:49 +00:00
youkaichao
eb881ed006
[misc] fix typing (#11540)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-27 11:05:08 +08:00
Lucas Tucker
dbeac95dbb
Mypy checking for vllm/compilation (#11496)
Signed-off-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: lucast2021 <lucast2021@headroyce.org>
2024-12-26 05:04:07 +00:00
youkaichao
88a412ed3d
[torch.compile] fast inductor (#11108)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
2024-12-16 16:15:22 -08:00
Luka Govedič
30870b4f66
[torch.compile] Dynamic fp8 + rms_norm fusion (#10906)
Signed-off-by: luka <luka@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
2024-12-13 03:19:23 +00:00
youkaichao
66aaa7722d
[torch.compile] remove graph logging in ci (#11110)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-11 10:59:50 -08:00
youkaichao
91642db952
[torch.compile] use depyf to dump torch.compile internals (#10972)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-11 10:43:05 -08:00
youkaichao
d1c2e15eb3
[torch.compile] add dynamo time tracking (#11005)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-08 23:09:04 -08:00
youkaichao
a1887f2c96
[torch.compile] fix deprecated code (#10948)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-06 11:01:23 +00:00
youkaichao
b031a455a9
[torch.compile] add logging for compilation time (#10941)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2024-12-06 10:07:15 +00:00
youkaichao
db87eb6c67
[torch.compile] use size tuning for specific sizes (#10933)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-05 20:30:41 -08:00
youkaichao
dc5ce861bf
[torch.compile] remove compilation_context and simplify code (#10838)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-03 06:19:02 +00:00
Cyrus Leung
f877a7d12a
[Misc] Improve type annotations for support_torch_compile (#10763)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-30 17:48:35 -08:00
youkaichao
05d1f8c9c6
[misc] move functions to config.py (#10624)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-25 09:27:30 +00:00
youkaichao
65813781a2
[torch.compile] add warning for unsupported models (#10622)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-24 23:27:51 -08:00
Luka Govedič
8b0fe06c89
[torch.compile] Inductor code caching fix (#10273)
Signed-off-by: luka <luka@neuralmagic.com>
Signed-off-by: Luka Govedic <luka.govedic@gmail.com>
2024-11-20 21:44:57 -08:00
youkaichao
0cd3d9717e
[7/N] torch.compile, reduce compilation time (#10460)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-20 11:20:38 -08:00
youkaichao
51bb12d17b
[4/N][torch.compile] clean up set_torch_compile_backend (#10401)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-17 23:57:20 -08:00
youkaichao
4fd9375028
[2/N][torch.compile] make compilation cfg part of vllm cfg (#10383)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-16 18:02:14 -08:00
Tyler Michael Smith
2885ba0e24
[Misc] Change RedundantReshapesPass and FusionPass logging from info to debug (#10308)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2024-11-15 02:44:26 +00:00
youkaichao
eea55cca5b
[1/N] torch.compile user interface design (#10237)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-11 18:01:06 -08:00
youkaichao
330e82d34a
[v1][torch.compile] support managing cudagraph buffer (#10203)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2024-11-11 11:10:27 -08:00
bnellnm
10b67d865d
[Bugfix] SymIntArrayRef expected to contain concrete integers (#10170)
Signed-off-by: Bill Nell <bill@neuralmagic.com>
2024-11-08 14:44:18 -08:00
Luka Govedič
4f93dfe952
[torch.compile] Fuse RMSNorm with quant (#9138)
Signed-off-by: luka <luka@neuralmagic.com>
Co-authored-by: youkaichao <youkaichao@126.com>
2024-11-08 21:20:08 +00:00
Woosuk Kwon
4089985552
[V1] Integrate Piecewise CUDA graphs (#10058)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2024-11-05 22:16:04 -08:00
youkaichao
c4cacbaa7f
[v1] reduce graph capture time for piecewise cudagraph (#10059)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-05 18:19:50 -08:00
youkaichao
ca9844b340
[bugfix] fix weak ref in piecewise cudagraph and tractable test (#10048)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-05 14:49:20 -08:00
youkaichao
aff1fd8188
[torch.compile] use interpreter with stable api from pytorch (#9889)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-01 11:50:37 -07:00
youkaichao
ff5ed6e1bc
[torch.compile] rework compile control with piecewise cudagraph (#9715)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-10-29 23:03:49 -07:00
youkaichao
17c79f3c36
[torch.compile] auto infer dynamic_arg_dims from type annotation (#9589)
2024-10-22 13:43:37 -07:00
Russell Bryant
776dbd74f1
[CI/Build] mypy: Resolve some errors from checking vllm/engine (#9267)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-10-16 22:55:59 +00:00
youkaichao
e00c094f15
[torch.compile] generic decorators (#9258)
2024-10-10 15:54:23 -07:00
youkaichao
e4d652ea3e
[torch.compile] integration with compilation control (#9058)
2024-10-10 12:39:36 -07:00
youkaichao
a36e070dad
[torch.compile] fix functionalization (#8480)
2024-09-14 09:46:04 -07:00
youkaichao
ce2702a923
[tpu][misc] fix typo (#8260)
2024-09-06 22:40:46 -07:00
youkaichao
ce6bf3a2cf
[torch.compile] avoid Dynamo guard evaluation overhead (#7898)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2024-08-28 16:10:12 -07:00