[CI/Build][AMD] Use float16 in test_reset_prefix_cache_e2e to avoid accuracy issues (#29997)
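
The commit does not say which dtype the test ran in before, or what the accuracy issue looked like. One plausible reading, labeled here as an assumption, is that the test previously used bfloat16, which keeps 7 explicit mantissa bits against float16's 10. The sketch below uses only the standard library and two hypothetical helpers (`round_to_float16`, `round_to_bfloat16`, neither is part of vLLM) to show how much coarser bfloat16 rounding is near 1.0. It ignores NaN and infinity handling.

```python
import struct


def round_to_float16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (10 mantissa bits)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_to_bfloat16(x: float) -> float:
    """Round x to bfloat16 (7 mantissa bits) via its float32 bit pattern.

    Illustrative only: assumes x is finite and within float32 range.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest, ties to even, on the 16 bits that bfloat16 drops.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


x = 1.001
fp16_error = abs(round_to_float16(x) - x)
bf16_error = abs(round_to_bfloat16(x) - x)
print(f"float16 error:  {fp16_error:.3e}")
print(f"bfloat16 error: {bf16_error:.3e}")
```

With errors of this size accumulating across layers, two runs that should produce identical greedy tokens can diverge, which is the kind of flakiness a dtype pin in a test is typically meant to avoid.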

Signed-off-by: Randall Smith <ransmith@amd.com>
Co-authored-by: Randall Smith <ransmith@amd.com>
rasmith 2025-12-05 02:42:25 -06:00 committed by GitHub
parent 6038b1b04b
commit feecba09af


@@ -21,6 +21,7 @@ def test_reset_prefix_cache_e2e(monkeypatch):
         max_num_batched_tokens=32,
         max_model_len=2048,
         compilation_config={"mode": 0},
+        dtype="float16",
     )
     engine = LLMEngine.from_engine_args(engine_args)
     sampling_params = SamplingParams(