Harry Mellor
|
a1fe24d961
|
Migrate docs from Sphinx to MkDocs (#18145)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-23 02:09:53 -07:00 |
|
Michael Goin
|
54af915949
|
[Doc] Update quickstart and install for cu128 using --torch-backend=auto (#18505)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-05-23 08:36:37 +00:00 |
|
Harry Mellor
|
4b0da7b60e
|
Enable hybrid attention models for Transformers backend (#18494)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-23 10:12:08 +08:00 |
|
Kai Wu
|
c91fe7b1b9
|
[Frontend][Bug Fix] Update llama4 pythonic jinja template and llama4_pythonic parser (#17917)
Signed-off-by: Kai Wu <kaiwu@meta.com>
|
2025-05-22 16:44:08 -07:00 |
|
Reid
|
cb506ecb5a
|
[Misc] improve Automatic Prefix Caching example (#18554)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-22 14:50:46 +00:00 |
|
Cyrus Leung
|
23b67b37b2
|
[Doc] Fix invalid JSON in example args (#18527)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-22 07:11:46 +00:00 |
|
Dhia Eddine Rhaiem
|
eca18691d2
|
[MODEL] FalconH1 (#18406)
Signed-off-by: dhia.rhaiem <dhia.rhaiem@tii.ae>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Ilyas Chahed <ilyas.chahed@tii.ae>
Co-authored-by: Jingwei Zuo <jingwei.zuo@tii.ae>
|
2025-05-21 04:59:06 -07:00 |
|
Kebe
|
5d7f545204
|
[Frontend] deprecate --device arg (#18399)
Signed-off-by: Kebe <mail@kebe7jun.com>
|
2025-05-21 01:21:17 -07:00 |
|
Reid
|
8f55962a7f
|
[Misc] refactor prompt embedding examples (#18405)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-20 15:26:12 +00:00 |
|
Reid
|
1b1e8e05ff
|
[doc] update env variable export (#18391)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-20 08:53:27 +00:00 |
|
Elad Segal
|
84ab4feb7e
|
[Doc] Fix typo (#18355)
|
2025-05-19 16:05:16 +00:00 |
|
Cyrus Leung
|
43b5f61dce
|
[Doc] Move input-related docs to Features (#18353)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-19 15:08:39 +00:00 |
|
Li Wang
|
c5bb0ebdc6
|
[Doc] Fix prompt embedding examples (#18350)
Signed-off-by: wangli <wangli858794774@gmail.com>
|
2025-05-19 06:48:16 -07:00 |
|
Nan Qin
|
221cfc2fea
|
Feature/vllm/input embedding completion api (#17590)
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
Signed-off-by: Nan2018 <nan@protopia.ai>
Co-authored-by: 临景 <linjing.yx@alibaba-inc.com>
Co-authored-by: Bryce1010 <bryceyx@gmail.com>
Co-authored-by: Andrew Sansom <andrew@protopia.ai>
Co-authored-by: Andrew Sansom <qthequartermasterman@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-05-18 20:18:05 -07:00 |
|
Robin
|
d1211f8794
|
[Doc] Add doc to explain the usage of Qwen3 thinking (#18291)
Signed-off-by: WangErXiao <863579016@qq.com>
|
2025-05-18 23:04:07 +00:00 |
|
Reid
|
b6a6e7a529
|
[Misc] add litellm integration (#18320)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-18 15:32:30 +00:00 |
|
Reid
|
1a8f68bb90
|
[doc] update reasoning doc (#18306)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-18 06:59:14 -07:00 |
|
Trevor Royer
|
55f1a468d9
|
Move cli args docs to its own page (#18228) (#18264)
Signed-off-by: Trevor Royer <troyer@redhat.com>
|
2025-05-16 19:43:45 -07:00 |
|
Reid
|
2dff093574
|
[Misc] add lobe-chat support (#18177)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-15 05:02:23 +00:00 |
|
Aaron Pham
|
afe3236e90
|
[Chore] astral's ty (#18116)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-05-15 05:00:43 +00:00 |
|
Aaron Pham
|
2fc9075b82
|
[V1] Structured Outputs + Thinking compatibility (#16577)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
|
2025-05-14 15:45:24 -07:00 |
|
Chen Zhang
|
964472b966
|
[Doc] Update prefix cache metrics to counting tokens (#18138)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-05-14 15:23:30 +00:00 |
|
Reid
|
9ccc6ded42
|
[doc] add missing import (#18133)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-14 10:57:34 +00:00 |
|
rongfu.leng
|
82e7f9bb03
|
[Misc] replace does not exist model (#18119)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-05-14 02:13:47 -07:00 |
|
wang.yuqi
|
63ad622233
|
[New Model]: support GTE NewModel (#17986)
|
2025-05-14 01:31:31 -07:00 |
|
Russell Bryant
|
0189a65a2e
|
[Docs] Expand security doc with firewall info (#18081)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-05-13 19:36:00 +00:00 |
|
Reid
|
906f0598fc
|
[doc] add download/list/delete HF model CLI usage (#17940)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-13 11:15:51 +00:00 |
|
bwshen-mi
|
acee8f48aa
|
[Model] Support MiMo-7B inference with MTP (#17433)
Signed-off-by: wp-alpha <wangpeng66@xiaomi.com>
Co-authored-by: wangpeng66 <wangpeng66@xiaomi.com>
|
2025-05-12 23:25:33 +00:00 |
|
Jonathan Berkhahn
|
98ea35601c
|
[Lora][Frontend]Add default local directory LoRA resolver plugin. (#16855)
Signed-off-by: jberkhahn <jaberkha@us.ibm.com>
|
2025-05-12 10:39:10 -07:00 |
|
Xu Wenqing
|
3a5ea75129
|
[Feature] Support DeepSeekV3 Function Call (#17784)
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
Signed-off-by: Xu Wenqing <xuwq1993@qq.com>
|
2025-05-12 00:45:21 -07:00 |
|
Isotr0py
|
021c16c7ca
|
[Model] Broadcast Ovis2 implementation to fit Ovis1.6 (#17861)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-05-11 17:56:30 -07:00 |
|
wang.yuqi
|
e4b8713380
|
[New Model]: nomic-embed-text-v2-moe (#17785)
|
2025-05-11 00:59:43 -07:00 |
|
Frieda Huang
|
9cea90eab4
|
[Frontend] Add /classify endpoint (#17032)
Signed-off-by: Frieda (Jingying) Huang <jingyingfhuang@gmail.com>
|
2025-05-11 07:57:07 +00:00 |
|
Reid
|
d1110f5b5a
|
[doc] update lora doc (#17936)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-11 15:56:21 +08:00 |
|
Reid
|
ec61ea20a8
|
[Misc] add dify integration (#17895)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-09 03:42:39 -07:00 |
|
Yan Ma
|
ff8c400502
|
[Doc] remove visible token in doc (#17884)
Signed-off-by: yan <yanma1@habana.ai>
|
2025-05-09 01:21:31 -07:00 |
|
Michael Yao
|
89a0315f4c
|
[Doc] Update several links in reasoning_outputs.md (#17846)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
|
2025-05-09 01:20:55 -07:00 |
|
Simon Mo
|
3d1e387652
|
[Docs] Add Slides from NYC Meetup (#17879)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-05-08 21:46:54 -07:00 |
|
Reid
|
53d0cb7423
|
[Misc] add chatbox integration (#17828)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-08 10:05:26 +00:00 |
|
Cyrus Leung
|
96722aa81d
|
[Frontend] Chat template fallbacks for multimodal models (#17805)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-07 23:05:54 -07:00 |
|
Chanh Nguyen
|
7ea2adb802
|
[Core] Support full cuda graph in v1 (#16072)
Signed-off-by: Chanh Nguyen <cnguyen@linkedin.com>
Co-authored-by: Chanh Nguyen <cnguyen@linkedin.com>
|
2025-05-07 22:30:15 -07:00 |
|
Harry Mellor
|
66ab3b13c9
|
Don't call the venv vllm (#17810)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-08 04:06:39 +00:00 |
|
Reid
|
7377dd0307
|
[doc] update the issue link (#17782)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-07 20:29:05 +08:00 |
|
Cyrus Leung
|
8a15c2603a
|
[Frontend] Add missing chat templates for various MLLMs (#17758)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-07 00:10:01 -07:00 |
|
Harry Mellor
|
022afbeb4e
|
Fix doc build performance (#17748)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-07 00:36:41 +00:00 |
|
Harry Mellor
|
6115b11582
|
Make right sidebar more readable in "Supported Models" (#17723)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-06 16:48:26 +00:00 |
|
Reid
|
7525d5f3d5
|
[doc] Add RAG Integration example (#17692)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-06 16:10:23 +00:00 |
|
Michael Yao
|
0d115460a7
|
[Docs] Use gh-file to add links to tool_calling.md (#17709)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
|
2025-05-06 15:27:19 +00:00 |
|
Harry Mellor
|
05e1f96419
|
Fix dockerfilegraph pre-commit hook (#17698)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-06 08:56:48 +00:00 |
|
Cyrus Leung
|
63ced7b43f
|
[Doc] Update notes for H2O-VL and Gemma3 (#17219)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-06 07:51:02 +00:00 |
|