Compare commits

8 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Dr. Artificial曾小健 | 6aab286119 | Merge d7a382f7e1746d30a9e520d7bf023b635fc19e71 into 0cf78561f1d51c84a21b2190626b21116d5c68bb | 2025-04-09 13:36:46 +08:00 |
| Xingkai Yu | 0cf78561f1 | Merge pull request #129 from peti562/patch-2 (fixing a typo) | 2025-04-09 13:36:23 +08:00 |
| Xingkai Yu | 4a6d53cac8 | Merge pull request #189 from eladb/patch-1 (chore: add syntax highlighting to citation) | 2025-04-09 13:34:09 +08:00 |
| Xingkai Yu | f1e82facf1 | Merge pull request #100 from aBurmeseDev/main (docs: fix contact email link README.md) | 2025-04-09 13:32:25 +08:00 |
| Dr. Artificial曾小健 | d7a382f7e1 | fix, Update README.md | 2025-01-31 17:04:48 +08:00 |
| Elad Ben-Israel | 6a023be7cf | chore: add syntax highlighting to citation | 2025-01-30 10:06:19 +02:00 |
| Peter Makadi | 6e59fa73e6 | fixing a typo (in licences section, Llama should be capitalized) | 2025-01-28 21:52:11 +01:00 |
| John L. | c942e96852 | chore: fix contact mailto link | 2025-01-27 18:06:01 -08:00 |

README.md

````diff
@@ -60,7 +60,7 @@ To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSe
 **Distillation: Smaller Models Can Be Powerful Too**
 - We demonstrate that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance compared to the reasoning patterns discovered through RL on small models. The open source DeepSeek-R1, as well as its API, will benefit the research community to distill better smaller models in the future.
-- Using the reasoning data generated by DeepSeek-R1, we fine-tuned several dense models that are widely used in the research community. The evaluation results demonstrate that the distilled smaller dense models perform exceptionally well on benchmarks. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
+- Using the reasoning data generated by DeepSeek-R1, we fine-tuned several dense models that are widely used in the research community. The evaluation results demonstrate that the distilled smaller dense models perform exceptionally well on benchmarks. We have open-sourced distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
 ## 3. Model Downloads
@@ -257,11 +257,11 @@ When responding, please keep the following points in mind:
 This code repository and the model weights are licensed under the [MIT License](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/LICENSE).
 DeepSeek-R1 series support commercial use, allow for any modifications and derivative works, including, but not limited to, distillation for training other LLMs. Please note that:
 - DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B and DeepSeek-R1-Distill-Qwen-32B are derived from [Qwen-2.5 series](https://github.com/QwenLM/Qwen2.5), which are originally licensed under [Apache 2.0 License](https://huggingface.co/Qwen/Qwen2.5-1.5B/blob/main/LICENSE), and now finetuned with 800k samples curated with DeepSeek-R1.
-- DeepSeek-R1-Distill-Llama-8B is derived from Llama3.1-8B-Base and is originally licensed under [llama3.1 license](https://huggingface.co/meta-llama/Llama-3.1-8B/blob/main/LICENSE).
-- DeepSeek-R1-Distill-Llama-70B is derived from Llama3.3-70B-Instruct and is originally licensed under [llama3.3 license](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct/blob/main/LICENSE).
+- DeepSeek-R1-Distill-Llama-8B is derived from Llama3.1-8B-Base and is originally licensed under [Llama3.1 license](https://huggingface.co/meta-llama/Llama-3.1-8B/blob/main/LICENSE).
+- DeepSeek-R1-Distill-Llama-70B is derived from Llama3.3-70B-Instruct and is originally licensed under [Llama3.3 license](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct/blob/main/LICENSE).
 ## 8. Citation
-```
+```bibtex
 @misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
     title={DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning},
     author={DeepSeek-AI},
@@ -274,4 +274,4 @@ DeepSeek-R1 series support commercial use, allow for any modifications and deriv
 ```
 ## 9. Contact
-If you have any questions, please raise an issue or contact us at [service@deepseek.com](service@deepseek.com).
+If you have any questions, please raise an issue or contact us at [service@deepseek.com](mailto:service@deepseek.com).
````