[doc] update wrong hf model links (#17184)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
parent 423e9f1cbe
commit df5c879527
@@ -6,7 +6,7 @@ To create a new 4-bit quantized model, you can leverage [AutoAWQ](https://github
 Quantization reduces the model's precision from BF16/FP16 to INT4 which effectively reduces the total model memory footprint.
 The main benefits are lower latency and memory usage.
 
-You can quantize your own models by installing AutoAWQ or picking one of the [6500+ models on Huggingface](https://huggingface.co/models?sort=trending&search=awq).
+You can quantize your own models by installing AutoAWQ or picking one of the [6500+ models on Huggingface](https://huggingface.co/models?search=awq).
 
 ```console
 pip install autoawq
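For context on the AWQ page touched above: the workflow it describes ends with serving a pre-quantized checkpoint in vLLM. A minimal sketch of that step follows; the repo id is only an illustrative placeholder picked from the Hugging Face AWQ search linked in the diff, not something this commit adds.

```python
# Minimal sketch (not part of the diff): loading a pre-quantized AWQ checkpoint
# with vLLM. The repo id is an illustrative placeholder from the AWQ search page.
from vllm import LLM, SamplingParams

llm = LLM(model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ", quantization="awq")
outputs = llm.generate(["What does 4-bit quantization save?"],
                       SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)
```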
@@ -20,8 +20,8 @@ vLLM reads the model's config file and supports pre-quantized checkpoints.
 
 You can find pre-quantized models on:
 
-- [Hugging Face (BitBLAS)](https://huggingface.co/models?other=bitblas)
-- [Hugging Face (GPTQ)](https://huggingface.co/models?other=gptq)
+- [Hugging Face (BitBLAS)](https://huggingface.co/models?search=bitblas)
+- [Hugging Face (GPTQ)](https://huggingface.co/models?search=gptq)
 
 Usually, these repositories have a `quantize_config.json` file that includes a `quantization_config` section.
 
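The `quantization_config` section mentioned in the hunk above is what lets vLLM pick up a pre-quantized checkpoint without extra flags. A minimal sketch, assuming a GPTQ-style placeholder repo id:

```python
# Minimal sketch (not part of the diff): vLLM reads the quantization settings
# from the checkpoint's config, so a pre-quantized repo loads without extra flags.
# The repo id is an illustrative placeholder.
from vllm import LLM

llm = LLM(model="TheBloke/Llama-2-7B-Chat-GPTQ")
print(llm.generate(["Hello"])[0].outputs[0].text)
```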
@@ -14,7 +14,7 @@ pip install bitsandbytes>=0.45.3
 
 vLLM reads the model's config file and supports both in-flight quantization and pre-quantized checkpoint.
 
-You can find bitsandbytes quantized models on <https://huggingface.co/models?other=bitsandbytes>.
+You can find bitsandbytes quantized models on <https://huggingface.co/models?search=bitsandbytes>.
 And usually, these repositories have a config.json file that includes a quantization_config section.
 
 ## Read quantized checkpoint
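For the bitsandbytes page, "Read quantized checkpoint" amounts to pointing vLLM at a bnb repo. A minimal sketch, assuming the placeholder repo id below and that `quantization="bitsandbytes"` matches your vLLM version:

```python
# Minimal sketch (not part of the diff): reading a bitsandbytes-quantized
# checkpoint. The repo id is an illustrative placeholder from the search page
# linked above; the settings come from the repo's config.json.
from vllm import LLM

llm = LLM(model="unsloth/tinyllama-bnb-4bit", quantization="bitsandbytes")
print(llm.generate(["Hello"])[0].outputs[0].text)
```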
@@ -18,7 +18,7 @@ for more details on this and other advanced features.
 
 ## Installation
 
-You can quantize your own models by installing [GPTQModel](https://github.com/ModelCloud/GPTQModel) or picking one of the [5000+ models on Huggingface](https://huggingface.co/models?sort=trending&search=gptq).
+You can quantize your own models by installing [GPTQModel](https://github.com/ModelCloud/GPTQModel) or picking one of the [5000+ models on Huggingface](https://huggingface.co/models?search=gptq).
 
 ```console
 pip install -U gptqmodel --no-build-isolation -v
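In the GPTQ page, the install step above is followed by an actual quantization run with GPTQModel. A rough sketch of that flow follows; the base model, calibration slice, and output path are placeholders, and argument names can vary between GPTQModel releases.

```python
# Minimal sketch (not part of the diff) of a GPTQModel quantization run.
# Model id, calibration slice, and output path are illustrative placeholders.
from datasets import load_dataset
from gptqmodel import GPTQModel, QuantizeConfig

model_id = "meta-llama/Llama-3.2-1B-Instruct"
quant_path = "Llama-3.2-1B-Instruct-gptq-4bit"

# A small calibration slice is enough for a smoke test; real runs use more text.
calibration = load_dataset(
    "allenai/c4",
    data_files="en/c4-train.00001-of-01024.json.gz",
    split="train",
).select(range(256))["text"]

model = GPTQModel.load(model_id, QuantizeConfig(bits=4, group_size=128))
model.quantize(calibration, batch_size=2)
model.save(quant_path)
```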
@@ -30,5 +30,4 @@ tokenizer.push_to_hub(hub_repo)
 quantized_model.push_to_hub(hub_repo, safe_serialization=False)
 ```
 
-Alternatively, you can use the TorchAO Quantization space for quantizing models with a simple UI.
-See: https://huggingface.co/spaces/medmekk/TorchAO_Quantization
+Alternatively, you can use the [TorchAO Quantization space](https://huggingface.co/spaces/medmekk/TorchAO_Quantization) for quantizing models with a simple UI.
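Once the TorchAO-quantized model and tokenizer have been pushed as in the hunk above, the resulting Hub repo can be loaded back with vLLM. A minimal sketch, assuming `your-user/your-repo` stands in for `hub_repo`; depending on the vLLM version, passing `quantization="torchao"` explicitly may or may not be needed on top of the checkpoint config.

```python
# Minimal sketch (not part of the diff): serving the pushed TorchAO checkpoint.
# "your-user/your-repo" stands in for the hub_repo value used above.
from vllm import LLM

llm = LLM(model="your-user/your-repo", quantization="torchao")
print(llm.generate(["Hello"])[0].outputs[0].text)
```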