76 Commits

Author SHA1 Message Date
Charlie Fu
7b2f28deba
[AMD][torch.compile] Enable silu+fp8_quant fusion for rocm (#18082)
Signed-off-by: charlifu <charlifu@amd.com>
2025-05-13 22:13:56 -07:00
Harry Mellor
19324d660c
Update deprecated type hinting in vllm/compilation (#18072)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-13 08:32:48 -07:00
Aaron Pham
cb528d0585
[Fix] check to make sure processor has chat templates (#18047)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
2025-05-13 03:04:10 -07:00
Harry Mellor
4b2ed7926a
Improve configs - the rest! (#17562)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-09 15:18:44 -07:00
Richard Zou
ea2236bf95
Add option to use torch._inductor.standalone_compile (#17057)
Signed-off-by: rzou <zou3519@gmail.com>
2025-05-09 12:59:04 -07:00
Richard Zou
edbf2d609e
[easy] Fix logspam on PiecewiseBackend errors (#17138)
Signed-off-by: rzou <zou3519@gmail.com>
2025-05-05 23:46:11 -07:00
Harry Mellor
d6484ef3c3
Add full API docs and improve the UX of navigating them (#17485)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-03 19:42:43 -07:00
Sage Moore
460a2b1100
[torch.compile] Add torch inductor pass for fusing silu_and_mul with subsequent scaled_fp8_quant operations (#10867)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-05-01 07:59:28 -07:00
Keyun Tong
26bc4bbcd8
Avoid overwriting vllm_compile_cache.py (#17418)
Signed-off-by: Keyun Tong <tongkeyun@gmail.com>
2025-05-01 07:30:57 +00:00
Bryan Lu
70788bdbdc
[V1][Spec Decode] Apply torch.compile & cudagraph to EAGLE (#17211)
Signed-off-by: Bryan Lu <yuzhelu@amazon.com>
2025-04-29 21:10:00 +00:00
Richard Zou
165cb56329
Ignore '<string>' filepath (#17330)
Signed-off-by: rzou <zou3519@gmail.com>
2025-04-28 19:23:29 -07:00
cascade
690fe019f0
[Feature] support sequence parallelism using compilation pass (#16155)
Signed-off-by: cascade812 <cascade812@outlook.com>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-04-27 06:29:35 -07:00
Lu Fang
4d17e20310
Disable the torch.compile cache checks when VLLM_DISABLE_COMPILE_CACHE=1 (#16573)
Signed-off-by: Lu Fang <lufang@fb.com>
2025-04-26 09:17:58 -07:00
James Wu
a6e72e1e4f
[Bugfix] [pytorch] Patch AOTAutogradCache._get_shape_env (#17142)
Signed-off-by: James Wu <jjwu@meta.com>
2025-04-26 11:28:20 +08:00
Richard Zou
fe742aef5a
[easy] Pass compile_fx only the config patches (#16845)
Signed-off-by: rzou <zou3519@gmail.com>
2025-04-20 12:25:19 +08:00
Richard Zou
682e0b6d2f
Log how much time loading a compiled artifact takes (#16848)
Signed-off-by: rzou <zou3519@gmail.com>
2025-04-19 16:50:46 +00:00
Richard Zou
966c742ed2
Disable remote caching when calling compile_fx (#16611)
Signed-off-by: rzou <zou3519@gmail.com>
2025-04-15 22:18:28 -07:00
Lu Fang
7678fcd5b6
Fix the torch version parsing logic (#15857) 2025-04-10 07:37:47 -07:00
Charlie Fu
e85829450d
[Feature][ROCm]Enable fusion pass for torch.compile on ROCm (#15050)
Signed-off-by: charlifu <charlifu@amd.com>
2025-03-31 04:42:18 -07:00
Luka Govedič
04437e313d
[Bugfix] [torch.compile] Add Dynamo metrics context during compilation (#15639)
Signed-off-by: luka <luka@neuralmagic.com>
2025-03-28 14:01:09 -06:00
Jee Jee Li
70f2c2a709
[Bugfix] Fix 'InductorAdaptor object has no attribute 'cache_dir' (#15674)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-03-28 17:10:40 +08:00
Richard Zou
6278bc829e
Fix incorrect filenames in vllm_compile_cache.py (#15494)
Signed-off-by: <zou3519@gmail.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2025-03-27 18:33:41 +08:00
Matthew Vine
7a6d45bc8a
Support FIPS enabled machines with MD5 hashing (#15299)
Signed-off-by: Matthew Vine <32849887+MattTheCuber@users.noreply.github.com>
2025-03-26 20:19:46 -04:00
Luka Govedič
f622dbcf39
[Fix] [torch.compile] Improve UUID system for custom passes (#15249)
Signed-off-by: luka <luka@neuralmagic.com>
2025-03-24 01:54:07 +00:00
DefTruth
6ebaf9ac71
[Bugfix] consider related env vars for torch.compiled cache hash (#14953)
Signed-off-by: DefTruth <31974251+DefTruth@users.noreply.github.com>
2025-03-23 15:53:09 +00:00
Jee Jee Li
2fa0e1396b
[Bugfix] Fix torch.compile raise FileNotFoundError (#15278)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-03-22 13:49:34 +08:00
Harry Mellor
3b352a2f92
Correct capitalisation: VLLM -> vLLM (#14562)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-10 16:36:21 +00:00
youkaichao
c6359e8ca6
[v1] torch.compile integration explanation (#14437)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-03-08 01:55:50 +08:00
Tyler Michael Smith
72c62eae5f
[V1] EP/TP MoE + DP Attention (#13931) 2025-03-04 21:27:26 -08:00
Luka Govedič
bd56c983d6
[torch.compile] Fix RMSNorm + quant fusion in the non-cutlass-fp8 case, rename RedundantReshapesPass to NoopEliminationPass (#10902)
Signed-off-by: luka <luka@neuralmagic.com>
2025-02-28 16:20:11 -07:00
youkaichao
09b95e36ab
[torch.compile] PyTorch 2.6 and nightly compatibility (#12393)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-07 01:09:07 +08:00
Russell Bryant
e489ad7a21
[Misc] Add SPDX-License-Identifier headers to python source files (#12628)
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**

commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com>
Date:   Fri Jan 31 14:18:24 2025 -0500

    Add SPDX license headers to python source files
    
This commit adds SPDX license headers to python source files as
recommended to
the project by the Linux Foundation. These headers provide a concise way
that is
both human and machine readable for communicating license information
for each
source file. It helps avoid any ambiguity about the license of the code
and can
    also be easily used by tools to help manage license compliance.
    
The Linux Foundation runs license scans against the codebase to help
ensure
    we are in compliance with the licenses of the code we use, including
dependencies. Having these headers in place helps that tool do its job.
    
    More information can be found on the SPDX site:
    
    - https://spdx.dev/learn/handling-license-info/
    
    Signed-off-by: Russell Bryant <rbryant@redhat.com>

commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com>
Date:   Fri Jan 31 14:36:32 2025 -0500

    Check for SPDX headers using pre-commit
    
    Signed-off-by: Russell Bryant <rbryant@redhat.com>

---------

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-02-02 11:58:18 -08:00
youkaichao
6e650f56a1
[torch.compile] decouple compile sizes and cudagraph sizes (#12243)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-24 02:01:30 +08:00
youkaichao
c81081fece
[torch.compile] transparent compilation with more logging (#12246)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-21 19:32:55 +08:00
youkaichao
51ef828f10
[torch.compile] fix sym_tensor_indices (#12191)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-20 11:37:50 +08:00
Roger Wang
81763c58a0
[V1] Add V1 support of Qwen2-VL (#12128)
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: imkero <kerorek@outlook.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-19 19:52:13 +08:00
youkaichao
e66faf4809
[torch.compile] store inductor compiled Python file (#12182)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-19 16:27:26 +08:00
youkaichao
7b98a65ae6
[torch.compile] disable logging when cache is disabled (#12043)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-17 20:29:31 +00:00
youkaichao
f12141170a
[torch.compile] consider relevant code in compilation cache (#11614)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-08 10:46:43 +00:00
youkaichao
3682e33f9f
[v1] fix compilation cache (#11598)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-30 04:24:12 +00:00
youkaichao
dba4d9dec6
[v1][bugfix] fix cudagraph with inplace buffer assignment (#11596)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-29 09:03:49 +00:00
youkaichao
eb881ed006
[misc] fix typing (#11540)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-27 11:05:08 +08:00
Lucas Tucker
dbeac95dbb
Mypy checking for vllm/compilation (#11496)
Signed-off-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: lucast2021 <lucast2021@headroyce.org>
2024-12-26 05:04:07 +00:00
youkaichao
88a412ed3d
[torch.compile] fast inductor (#11108)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
2024-12-16 16:15:22 -08:00
Luka Govedič
30870b4f66
[torch.compile] Dynamic fp8 + rms_norm fusion (#10906)
Signed-off-by: luka <luka@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
2024-12-13 03:19:23 +00:00
youkaichao
66aaa7722d
[torch.compile] remove graph logging in ci (#11110)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-11 10:59:50 -08:00
youkaichao
91642db952
[torch.compile] use depyf to dump torch.compile internals (#10972)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-11 10:43:05 -08:00
youkaichao
d1c2e15eb3
[torch.compile] add dynamo time tracking (#11005)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-08 23:09:04 -08:00
youkaichao
a1887f2c96
[torch.compile] fix deprecated code (#10948)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-06 11:01:23 +00:00
youkaichao
b031a455a9
[torch.compile] add logging for compilation time (#10941)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2024-12-06 10:07:15 +00:00