mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2026-05-28 05:37:04 +08:00)
[Docs] Improve API docs (+small tweaks) (#22459)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
parent ccdae737a0
commit 7be7f3824a
@@ -58,10 +58,9 @@ nav:
  - CI: contributing/ci
  - Design Documents: design
  - API Reference:
- - Summary: api/README.md
+ - Summary: api/summary.md
  - Contents:
- - glob: api/vllm/*
- preserve_directory_names: true
+ - api/vllm/*
  - CLI Reference:
  - Summary: cli/README.md
  - Community:
@@ -1,7 +1,4 @@
- ---
- title: FP8 INC
- ---
- [](){ #inc }
+ # FP8 INC

  vLLM supports FP8 (8-bit floating point) weight and activation quantization using Intel® Neural Compressor (INC) on Intel® Gaudi® 2 and Intel® Gaudi® 3 AI accelerators.
  Currently, quantization is validated only in Llama models.
@@ -105,7 +105,7 @@ class Example:
  return fix_case(self.path.stem.replace("_", " ").title())

  def generate(self) -> str:
- content = f"---\ntitle: {self.title}\n---\n\n"
+ content = f"# {self.title}\n\n"
  content += f"Source <gh-file:{self.path.relative_to(ROOT_DIR)}>.\n\n"

  # Use long code fence to avoid issues with
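
The change to `generate` above replaces a YAML front-matter title with a top-level Markdown heading. As a minimal standalone sketch of the two forms (the helper names `front_matter_title` and `heading_title` are hypothetical and do not appear in the repository), the difference in emitted text might look like this:

```python
def front_matter_title(title: str) -> str:
    """Old form (hypothetical helper): title carried in YAML front matter."""
    return f"---\ntitle: {title}\n---\n\n"


def heading_title(title: str) -> str:
    """New form (hypothetical helper): title carried as a Markdown H1 heading."""
    return f"# {title}\n\n"


if __name__ == "__main__":
    # repr() makes the newlines visible so the two prefixes can be compared.
    print(repr(front_matter_title("FP8 INC")))
    print(repr(heading_title("FP8 INC")))
```

The same substitution appears in the FP8 INC page earlier in this commit, where the front-matter block is replaced by a `# FP8 INC` heading.
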
@@ -40,6 +40,7 @@ theme:
  - navigation.sections
  - navigation.prune
  - navigation.top
+ - navigation.indexes
  - search.highlight
  - search.share
  - toc.follow
@@ -51,11 +52,6 @@ hooks:
  - docs/mkdocs/hooks/generate_argparse.py
  - docs/mkdocs/hooks/url_schemes.py

- # Required to stop api-autonav from raising an error
- # https://github.com/tlambert03/mkdocs-api-autonav/issues/16
- nav:
- - api
-
  plugins:
  - meta
  - search