vllm/offline_inference at 53415653ff24be03e7c90f5b42ef9cb3f72aad71

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-08-03 17:17:04 +08:00

[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

2025-08-22 10:56:57 +08:00

basic

[Kernel/Quant] Remove AQLM (#22943)

2025-08-16 19:38:21 +00:00

disaggregated-prefill-v1

[Docs] Switch to better markdown linting pre-commit hook (#21851)

2025-07-29 19:45:08 -07:00

openai_batch

[Docs] Switch to better markdown linting pre-commit hook (#21851)

2025-07-29 19:45:08 -07:00

profiling_tpu

[Misc] small update (#20462)

2025-07-03 20:33:44 -07:00

qwen2_5_omni

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

async_llm_streaming.py

[Example] Add async_llm_streaming.py example for AsyncLLM streaming in python (#21763)

2025-07-30 18:39:46 -06:00

audio_language.py

[Model] Gemma3n MM (#20495)

2025-08-09 09:56:25 -07:00

automatic_prefix_caching.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

batch_llm_inference.py

[Docs] Improve docstring for ray data llm example (#20597)

2025-07-07 20:06:26 -07:00

chat_with_tools.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

context_extension.py

[Misc] refactor context extension (#19246)

2025-06-07 05:13:21 +00:00

convert_model_to_seq_cls.py

[Model][Last/4] Automatic conversion of CrossEncoding model (#19675)

2025-07-07 14:46:04 +00:00

data_parallel.py

[Kernels] Clean up FusedMoeMethodBase and modular kernel setup. Remove extra arguments from modular kernel methods. (#22035)

2025-08-15 14:46:00 -04:00

disaggregated_prefill.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

embed_jina_embeddings_v3.py

[Deprecation][2/N] Replace --task with --runner and --convert (#21470)

2025-07-27 19:42:40 -07:00

embed_matryoshka_fy.py

[Deprecation][2/N] Replace --task with --runner and --convert (#21470)

2025-07-27 19:42:40 -07:00

encoder_decoder_multimodal.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

encoder_decoder.py

[New Model] mBART model (#22883)

2025-08-16 12:16:58 +00:00

llm_engine_example.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

load_sharded_state.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

logits_processor.py

[V1] Logits processors extensibility (#19912)

2025-08-16 12:59:17 -07:00

lora_with_quantization_inference.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

metrics.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

mistral-small.py

[Frontend] Use engine argument to control MM cache size (#22441)

2025-08-07 09:47:10 -07:00

mlpspeculator.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

multilora_inference.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

neuron_eagle.py

[bugfix] fix syntax warning caused by backslash (#21251)

2025-07-20 17:12:10 +00:00

neuron_int8_quantization.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

neuron_multimodal.py

[Misc] refactor neuron_multimodal and profiling (#19397)

2025-06-10 06:12:42 +00:00

neuron_speculation.py

[Misc] Remove deprecated args in v0.10 (#21349)

2025-07-22 05:26:39 -07:00

neuron.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

prefix_caching.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

prithvi_geospatial_mae.py

Support encoder-only models without KV-Cache (#21270)

2025-07-26 21:09:52 +08:00

profiling.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

prompt_embed_inference.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

qwen3_reranker.py

[Deprecation][2/N] Replace --task with --runner and --convert (#21470)

2025-07-27 19:42:40 -07:00

qwen_1m.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

reproducibility.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

rlhf_colocate.py

[Docs] Improve docs for RLHF co-location example (#20599)

2025-07-09 08:06:43 -07:00

rlhf_utils.py

[RLHF] Fix torch.dtype not serializable in example (#22158)

2025-08-04 02:43:33 +00:00

rlhf.py

[RLHF] Fix torch.dtype not serializable in example (#22158)

2025-08-04 02:43:33 +00:00

save_sharded_state.py

[Bugfix] fix max-file-size type from str to int (#21675)

2025-07-28 00:06:52 -07:00

simple_profiling.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

skip_loading_weights_in_engine_init.py

[Doc] Add inplace weights loading example (#19640)

2025-07-17 21:12:23 -07:00

spec_decode.py

[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)

2025-08-22 10:56:57 +08:00

structured_outputs.py

[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)

2025-08-22 10:56:57 +08:00

torchrun_example.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

tpu.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

vision_language_multi_image.py

[Model][VLM] Support R-4B Model (#23246)

2025-08-21 04:08:52 +00:00

vision_language_pooling.py

[Deprecation][2/N] Replace --task with --runner and --convert (#21470)

2025-07-27 19:42:40 -07:00

vision_language.py

[Bugfix] Fix extra whitespace in strings caused by newline (#23272)

2025-08-20 22:03:00 -07:00