Cyrus Leung
8896eb72eb
[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed ( #18800 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-22 10:56:57 +08:00
Kevinzz
16bff144be
[Misc] fix typo in the multimodal doc ( #23051 )
2025-08-17 01:56:20 -07:00
Michael Goin
4fc722eca4
[Kernel/Quant] Remove AQLM ( #22943 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-08-16 19:38:21 +00:00
633WHU
3f52738dce
[Doc] Add max_lora_rank configuration guide ( #22782 )
...
Signed-off-by: chiliu <cliu_whu@yeah.net>
2025-08-13 04:10:07 -07:00
Hongsheng Liu
3a7e3bbdd2
[Doc] Added unmentioned required option "method" in the usage of EAGLE-3 based models ( #21737 )
...
Signed-off-by: Dilute-l <dilu2333@163.com>
Co-authored-by: Dilute-l <dilu2333@163.com>
2025-08-12 00:14:51 -07:00
Harry Mellor
7be7f3824a
[Docs] Improve API docs (+small tweaks) ( #22459 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-08 03:02:51 -07:00
iAmir97
099c046463
[Doc] Sleep mode documentation ( #22310 )
...
Signed-off-by: iAmir97 <Amir.balwel@embeddedllm.com>
Signed-off-by: iAmir97 <71513472+iAmir97@users.noreply.github.com>
Co-authored-by: iAmir97 <Amir.balwel@embeddedllm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Hong Hanh <hanh.usth@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2025-08-08 12:25:18 +08:00
WeiQing Chen
289b18e670
[Docs] Update features/disagg_prefill, add v1 examples and development ( #22165 )
...
Signed-off-by: David Chen <530634352@qq.com>
2025-08-07 00:59:23 -07:00
Yong Hoon Shin
5e8398805e
[Doc] Fix link to prefix caching design ( #22384 )
...
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
2025-08-07 00:28:15 -07:00
Gamhang
0a6d305e0f
feat(multimodal): Add customizable background color for RGBA to RGB conversion ( #22052 )
...
Signed-off-by: Jinheng Li <ahengljh@gmail.com>
Co-authored-by: Jinheng Li <ahengljh@gmail.com>
2025-08-01 06:07:33 -07:00
WeiQing Chen
4931486988
[Doc] Added warning of speculating with draft model ( #22047 )
...
Signed-off-by: Dilute-l <dilu2333@163.com>
Co-authored-by: Dilute-l <dilu2333@163.com>
2025-08-01 02:11:56 -07:00
Hongsheng Liu
79731a79f0
[Doc] Fix a syntax error of example code in structured_outputs.md ( #22045 )
...
Signed-off-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: wangzi <3220100013@zju.edu.cn>
2025-08-01 00:01:22 -07:00
Hongsheng Liu
5c8fe389d6
[Docs] Fix the example code of streaming chat completions in reasoning ( #21825 )
...
Signed-off-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: wangzi <3220100013@zju.edu.cn>
Co-authored-by: Zi Wang <66560864+BruceW-07@users.noreply.github.com>
2025-07-30 12:11:58 +00:00
Cyrus Leung
5bbaf492a6
[Doc] Update partial support ( #21916 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-30 01:32:39 -07:00
Harry Mellor
ba5c5e5404
[Docs] Switch to better markdown linting pre-commit hook ( #21851 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-29 19:45:08 -07:00
Cyrus Leung
ab714131e4
[Doc] Update compatibility matrix for pooling and multimodal models ( #21831 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-29 06:29:51 -07:00
Michael Goin
947e982ede
[Docs] Minimize spacing for supported_hardware.md table ( #21779 )
2025-07-28 18:46:39 -07:00
Cyrus Leung
86ae693f20
[Deprecation][2/N] Replace --task with --runner and --convert ( #21470 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-27 19:42:40 -07:00
WeiQing Chen
97349fe2bc
[Docs] add offline serving multi-modal video input expamle Qwen2.5-VL ( #21530 )
...
Signed-off-by: David Chen <530634352@qq.com>
2025-07-25 18:37:32 -07:00
Wenhua Cheng
5ac3168ee3
[Docs] add auto-round quantization readme ( #21600 )
...
Signed-off-by: Wenhua Cheng <wenhua.cheng@intel.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-25 08:52:42 -07:00
Shintarou Okada
6eca337ce0
Replace --expand-tools-even-if-tool-choice-none with --exclude-tools-when-tool-choice-none for v0.10.0 ( #20544 )
...
Signed-off-by: okada <kokuzen@gmail.com>
Signed-off-by: okada shintarou <okada@preferred.jp>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-24 02:56:36 -07:00
Michael Goin
82ec66f514
[V0 Deprecation] Remove Prompt Adapters ( #20588 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-23 16:36:48 -07:00
Michael Yao
23637dcdef
[Docs] Fix bullets and grammars in tool_calling.md ( #21440 )
...
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-07-23 01:23:20 -07:00
Ning Xie
d97841078b
[Misc] unify variable for LLM instance ( #20996 )
...
Signed-off-by: Andy Xie <andy.xning@gmail.com>
2025-07-21 12:18:33 +01:00
Harry Mellor
be54a951a3
[Docs] Fix hardcoded links in docs ( #21287 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-21 02:23:57 -07:00
Asher
5a7fb3ab9e
[Model] Add ToolParser and MoE Config for Hunyuan A13B ( #20820 )
...
Signed-off-by: Asher Zhang <asherszhang@tencent.com>
2025-07-17 09:10:09 +00:00
Nir David
01513a334a
Support FP8 Quantization and Inference Run on Intel Gaudi (HPU) using INC (Intel Neural Compressor) ( #12010 )
...
Signed-off-by: Nir David <ndavid@habana.ai>
Signed-off-by: Uri Livne <ulivne@habana.ai>
Co-authored-by: Uri Livne <ulivne@habana.ai>
2025-07-16 15:33:41 -04:00
Harry Mellor
313ae8c16a
[Deprecation] Remove everything scheduled for removal in v0.10.0 ( #20979 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-15 15:57:53 +00:00
bigmoyan
5f0af36af5
Update kimi-k2 tool calling docs, enable unit tests ( #20821 )
...
Signed-off-by: wangzhengtao <wangzhengtao@moonshot.cn>
Co-authored-by: wangzhengtao <wangzhengtao@moonshot.cn>
Co-authored-by: wangzhengtao <wangzhengtao@msh.team>
2025-07-11 20:16:14 +00:00
Reid
6fb162447b
[doc] fix ordered list issue ( #20819 )
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
2025-07-11 06:49:46 -07:00
Reid
6a9e6b2abf
[doc] fold long code block ( #20795 )
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
2025-07-10 23:16:41 -07:00
Alex Brooks
41060c6e08
[Core] Add Support for Default Modality Specific LoRAs [generate / chat completions] ( #19126 )
...
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
2025-07-10 21:09:37 +01:00
fxmarty-amd
332d4cb17b
[Feature][Quantization] MXFP4 support for MOE models ( #17888 )
...
Signed-off-by: Felix Marty <felmarty@amd.com>
Signed-off-by: Bowen Bao <bowenbao@amd.com>
Signed-off-by: Felix Marty <Felix.Marty@amd.com>
Co-authored-by: Bowen Bao <bowenbao@amd.com>
2025-07-09 13:19:02 -07:00
Cyrus Leung
70ca5484f5
[Doc] Update notes ( #20668 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-09 03:46:36 -07:00
qscqesze
f95570a52d
[Docs] fix minimax tool_calling docs error ( #20667 )
...
Signed-off-by: qingjun <qingjun@minimaxi.com>
2025-07-09 00:37:07 -07:00
Harry Mellor
b942c094e3
Stop using title frontmatter and fix doc that can only be reached by search ( #20623 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-08 03:27:40 -07:00
Harry Mellor
b4bab81660
Remove unnecessary explicit title anchors and use relative links instead ( #20620 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-08 02:49:13 -07:00
Harry Mellor
af107d5a0e
Make distinct code and console admonitions so readers are less likely to miss them ( #20585 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-07 19:55:28 -07:00
Harry Mellor
923147b5e8
[Doc] Fix internal links so they don't always point to latest ( #20563 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-07 04:15:50 -07:00
Harry Mellor
45877ef740
[Doc] Use gh-pr and gh-issue everywhere we can in the docs ( #20564 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-07 03:54:22 -07:00
Flora Feng
fe1e924811
[Frontend] Support image object in llm.chat ( #19635 )
...
Signed-off-by: sfeng33 <4florafeng@gmail.com>
Signed-off-by: Flora Feng <4florafeng@gmail.com>
2025-07-06 06:47:13 +00:00
Guy Stone
d3f05c9248
[Doc] fix mutltimodal_inputs.md gh examples link ( #20497 )
...
Signed-off-by: Guy Stone <guys@spotify.com>
2025-07-04 16:41:35 -07:00
Jee Jee Li
1819fbda63
[Quantization] Bump to use latest bitsandbytes ( #20424 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-03 21:58:46 +08:00
qscqesze
363528de27
[Feature] Support MiniMax-M1 function calls features ( #20297 )
...
Signed-off-by: QscQ <qscqesze@gmail.com>
Signed-off-by: qingjun <qingjun@minimaxi.com>
2025-07-03 06:48:27 +00:00
Nicolò Lucchesi
3dd359147d
[Docs] Update EAGLE example ( #20375 )
...
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-07-02 17:13:51 -07:00
cronoik-inceptionai
b95877509b
Documentation update tool_calling: mapping back to function from response ( #20373 )
2025-07-02 05:55:49 -07:00
QiliangCui
b205e8467d
[Doc][TPU] Add models and features supporting matrix. ( #20230 )
...
Signed-off-by: Qiliang Cui <cuiq@google.com>
2025-07-02 06:33:20 +00:00
yyzxw
be0cfb2b68
fix[Docs]: link anchor is incorrect #20309 ( #20315 )
...
Signed-off-by: zxw <1020938856@qq.com>
2025-07-02 06:32:34 +00:00
Shintarou Okada
3d19d47d91
[Frontend] Expand tools even if tool_choice="none" ( #17177 )
...
Signed-off-by: okada shintarou <okada@preferred.jp>
2025-07-01 12:47:38 -04:00
Lukas Geiger
c3649e4fee
[Docs] Fix syntax highlighting of shell commands ( #19870 )
...
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
2025-06-23 17:59:09 +00:00