mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-08-03 18:57:07 +08:00

[Doc: ]fix various typos in multiple files (#23487)

Signed-off-by: Didier Durand <durand.didier@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

2025-08-25 00:04:04 +00:00

auto_awq.md

Stop using title frontmatter and fix doc that can only be reached by search (#20623)

2025-07-08 03:27:40 -07:00

auto_round.md

[Docs] Switch to better markdown linting pre-commit hook (#21851)

2025-07-29 19:45:08 -07:00

bitblas.md

[Docs] Fix hardcoded links in docs (#21287)

2025-07-21 02:23:57 -07:00

bnb.md

Stop using title frontmatter and fix doc that can only be reached by search (#20623)

2025-07-08 03:27:40 -07:00

fp8.md

[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)

2025-08-22 10:56:57 +08:00

gguf.md

Stop using title frontmatter and fix doc that can only be reached by search (#20623)

2025-07-08 03:27:40 -07:00

gptqmodel.md

Stop using title frontmatter and fix doc that can only be reached by search (#20623)

2025-07-08 03:27:40 -07:00

inc.md

[Doc: ]fix various typos in multiple files (#23487)

2025-08-25 00:04:04 +00:00

int4.md

[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)

2025-08-22 10:56:57 +08:00

int8.md

[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)

2025-08-22 10:56:57 +08:00

modelopt.md

Make distinct code and console admonitions so readers are less likely to miss them (#20585)

2025-07-07 19:55:28 -07:00

quantized_kvcache.md

[Docs] Switch to better markdown linting pre-commit hook (#21851)

2025-07-29 19:45:08 -07:00

quark.md

[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)

2025-08-22 10:56:57 +08:00

README.md

[Docs] add auto-round quantization readme (#21600)

2025-07-25 08:52:42 -07:00

supported_hardware.md

[Kernel/Quant] Remove AQLM (#22943)

2025-08-16 19:38:21 +00:00

torchao.md

[Docs] Switch to better markdown linting pre-commit hook (#21851)

2025-07-29 19:45:08 -07:00

README.md

Quantization

Quantization trades model precision for a smaller memory footprint, allowing large models to run on a wider range of devices.
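To make the trade-off concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is an illustration only, not how vLLM or any of the methods listed on this page (AWQ, GPTQ, FP8, and so on) implement quantization; the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical.

```python
import array


def quantize_int8(weights):
    """Map floats to int8 values plus one shared float scale (symmetric, per-tensor)."""
    max_abs = max(abs(w) for w in weights)
    # One int8 step represents `scale` in the original float range.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = array.array("b", (round(w / scale) for w in weights))
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only for illustration.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Storage cost: 4 bytes per float32 value versus 1 byte per int8 value.
fp32_bytes = array.array("f", weights).itemsize * len(weights)
int8_bytes = q.itemsize * len(q)

# Precision cost: rounding moves each value by at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"float32: {fp32_bytes} bytes, int8: {int8_bytes} bytes")
print(f"max round-trip error: {max_error:.5f} (step size {scale:.5f})")
```

The int8 copy takes a quarter of the float32 storage, and each restored weight differs from the original by at most half a quantization step. Practical schemes typically use per-channel or per-group scales and calibration data to keep that error small.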

Contents: