From 4ce53e3d716e0c8051915c31b32a9b02855dd401 Mon Sep 17 00:00:00 2001
From: ai-modelscope
Date: Sat, 1 Feb 2025 20:51:41 +0800
Subject: [PATCH] Update README.md

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 295c6c9..d6ff9c6 100644
--- a/README.md
+++ b/README.md
@@ -211,6 +211,9 @@ python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
 3. For mathematical problems, it is advisable to include a directive in your prompt such as: "Please reason step by step, and put your final answer within \boxed{}."
 4. When evaluating model performance, it is recommended to conduct multiple tests and average the results.
 
+Additionally, we have observed that the DeepSeek-R1 series models tend to bypass thinking pattern (i.e., outputting "\<think\>\n\n\</think\>") when responding to certain queries, which can adversely affect the model's performance.
+**To ensure that the model engages in thorough reasoning, we recommend enforcing the model to initiate its response with "\<think\>\n" at the beginning of every output.**
+
 ## 7. License
 This code repository and the model weights are licensed under the [MIT License](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/LICENSE). DeepSeek-R1 series support commercial use, allow for any modifications and derivative works, including, but not limited to, distillation for training other LLMs. Please note that: