mirror of https://git.datalinker.icu/deepseek-ai/DeepSeek-R1.git (synced 2025-12-08 20:44:23 +08:00)

Compare commits: 4bad8a334f...6aab286119 (8 commits)
| Author | SHA1 | Date |
|---|---|---|
| | 6aab286119 | |
| | 0cf78561f1 | |
| | 4a6d53cac8 | |
| | f1e82facf1 | |
| | d7a382f7e1 | |
| | 6a023be7cf | |
| | 6e59fa73e6 | |
| | c942e96852 | |
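For readers working from a local clone of the mirror, the comparison above can be reproduced with plain git. The sketch below (the clone location and working directory are assumptions, not part of this page) shells out to git from Python to list the commits in the range and print the README.md diff between the two endpoints.

```python
# Minimal sketch, assuming this script runs inside a local clone of
# https://git.datalinker.icu/deepseek-ai/DeepSeek-R1.git
import subprocess

# The two endpoints of the compare range shown above.
BASE, HEAD = "4bad8a334f", "6aab286119"

# List the commits in BASE..HEAD, oldest first, one line each.
log = subprocess.run(
    ["git", "log", "--oneline", "--reverse", f"{BASE}..{HEAD}"],
    capture_output=True, text=True, check=True,
)
print(log.stdout)

# Show the combined README.md diff between the two commits,
# i.e. the hunks rendered on this page.
diff = subprocess.run(
    ["git", "diff", BASE, HEAD, "--", "README.md"],
    capture_output=True, text=True, check=True,
)
print(diff.stdout)
```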
Changed file: README.md (10 changed lines)
````diff
@@ -60,7 +60,7 @@ To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSe
 **Distillation: Smaller Models Can Be Powerful Too**
 
 - We demonstrate that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance compared to the reasoning patterns discovered through RL on small models. The open source DeepSeek-R1, as well as its API, will benefit the research community to distill better smaller models in the future.
-- Using the reasoning data generated by DeepSeek-R1, we fine-tuned several dense models that are widely used in the research community. The evaluation results demonstrate that the distilled smaller dense models perform exceptionally well on benchmarks. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
+- Using the reasoning data generated by DeepSeek-R1, we fine-tuned several dense models that are widely used in the research community. The evaluation results demonstrate that the distilled smaller dense models perform exceptionally well on benchmarks. We have open-sourced distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
 
 ## 3. Model Downloads
 
@@ -257,11 +257,11 @@ When responding, please keep the following points in mind:
 This code repository and the model weights are licensed under the [MIT License](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/LICENSE).
 DeepSeek-R1 series support commercial use, allow for any modifications and derivative works, including, but not limited to, distillation for training other LLMs. Please note that:
 - DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B and DeepSeek-R1-Distill-Qwen-32B are derived from [Qwen-2.5 series](https://github.com/QwenLM/Qwen2.5), which are originally licensed under [Apache 2.0 License](https://huggingface.co/Qwen/Qwen2.5-1.5B/blob/main/LICENSE), and now finetuned with 800k samples curated with DeepSeek-R1.
-- DeepSeek-R1-Distill-Llama-8B is derived from Llama3.1-8B-Base and is originally licensed under [llama3.1 license](https://huggingface.co/meta-llama/Llama-3.1-8B/blob/main/LICENSE).
-- DeepSeek-R1-Distill-Llama-70B is derived from Llama3.3-70B-Instruct and is originally licensed under [llama3.3 license](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct/blob/main/LICENSE).
+- DeepSeek-R1-Distill-Llama-8B is derived from Llama3.1-8B-Base and is originally licensed under [Llama3.1 license](https://huggingface.co/meta-llama/Llama-3.1-8B/blob/main/LICENSE).
+- DeepSeek-R1-Distill-Llama-70B is derived from Llama3.3-70B-Instruct and is originally licensed under [Llama3.3 license](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct/blob/main/LICENSE).
 
 ## 8. Citation
-```
+```bibtex
 @misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
       title={DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning},
       author={DeepSeek-AI},
@@ -274,4 +274,4 @@ DeepSeek-R1 series support commercial use, allow for any modifications and deriv
 ```
 
 ## 9. Contact
-If you have any questions, please raise an issue or contact us at [service@deepseek.com](service@deepseek.com).
+If you have any questions, please raise an issue or contact us at [service@deepseek.com](mailto:service@deepseek.com).
````
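The distillation bullet in the first hunk describes fine-tuning small dense base models on reasoning data generated by DeepSeek-R1. As a rough illustration only, here is a minimal supervised fine-tuning sketch with Hugging Face transformers; the dataset file r1_reasoning_samples.jsonl, its prompt/response fields, and all hyperparameters are hypothetical stand-ins, not the authors' actual training recipe.

```python
# Hypothetical sketch of distillation-as-SFT: fine-tune a small dense
# base model on (prompt, response) pairs sampled from a larger reasoner.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "Qwen/Qwen2.5-1.5B"  # one of the base series named in the README
tok = AutoTokenizer.from_pretrained(base)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token  # needed for batch padding
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical JSONL of teacher-generated samples:
# {"prompt": "...", "response": "...reasoning trace and answer..."}
ds = load_dataset("json", data_files="r1_reasoning_samples.jsonl")["train"]
ds = ds.map(
    lambda ex: tok(ex["prompt"] + ex["response"] + tok.eos_token,
                   truncation=True, max_length=2048),
    remove_columns=ds.column_names,
)

# mlm=False gives a plain causal-LM objective: labels are the input ids
# with padding positions masked out.
Trainer(
    model=model,
    args=TrainingArguments(output_dir="distill-out",
                           per_device_train_batch_size=1,
                           num_train_epochs=1,
                           learning_rate=2e-5),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
```

The design point the README makes is that this simple supervised recipe, driven by a strong teacher's outputs, outperformed running RL directly on the small models.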