From 8245fd1aafe4f415d891ee9ca74d9407c3950a3f Mon Sep 17 00:00:00 2001
From: rudy0053
Date: Wed, 19 Mar 2025 02:52:36 +0000
Subject: [PATCH] When I run the command:
 ```
 # We recommend using the tokenizer from base model to avoid long-time and buggy tokenizer conversion.
 CUDA_VISIBLE_DEVICES=0,1 \
 vllm serve /data/models/ollama-model/QwQ-32B-GGUF/qwq-32b-q4_k_m.gguf \
 --tensor-parallel-size 2 \
 --port 8132 \
 --max-model-len 1024 \
 --gpu-memory-utilization 0.7 \
 > /data/models/qwq32-q4.log 2>&1
 ```
 the following error appears:
 The tokenizer class you load from this checkpoint is 'LlamaTokenizer'.
 The class this function is called from is 'Qwen2TokenizerFast'.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The model checkpoint (qwq-32b-q4_k_m.gguf) has its internal tokenizer
configured as LlamaTokenizer, but the code actually uses Qwen2TokenizerFast
(the tokenizer of Qwen, i.e. Tongyi Qianwen). Why is this the case?
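
The comment quoted in the command recommends using the base model's tokenizer, yet the invocation itself passes no tokenizer. A minimal sketch of the same command with vLLM's `--tokenizer` option added is shown here. The repo ID `Qwen/QwQ-32B` is an assumption about which base model this GGUF file was converted from, and the helper `build_vllm_cmd` is hypothetical; whether this removes the message above has not been verified.

```shell
# Assumption: Hugging Face repo ID of the base model (not stated in the report).
BASE_TOKENIZER="Qwen/QwQ-32B"
# Path taken from the original command.
GGUF_PATH="/data/models/ollama-model/QwQ-32B-GGUF/qwq-32b-q4_k_m.gguf"

# Hypothetical helper: assemble the serve command as a string so it can be
# inspected before anything is launched.
build_vllm_cmd() {
  printf '%s' "vllm serve $GGUF_PATH --tokenizer $BASE_TOKENIZER --tensor-parallel-size 2 --port 8132 --max-model-len 1024 --gpu-memory-utilization 0.7"
}

# Print the command only; nothing is executed here.
build_vllm_cmd; echo

# To launch on GPUs 0 and 1 with the same log redirection as the original:
# CUDA_VISIBLE_DEVICES=0,1 sh -c "$(build_vllm_cmd)" > /data/models/qwq32-q4.log 2>&1
```

Passing a tokenizer explicitly means vLLM does not have to derive one from the GGUF metadata, which is the conversion step the quoted comment describes as slow and buggy.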