SCRP-Chat: DeepSeek-R1 Reasoning Model
We are excited to announce the immediate availability of DeepSeek-R1-Distill-Llama-70B, a distilled variant of DeepSeek-R1, the first open-source reasoning model released by DeepSeek on January 20, 2025. Reasoning models are trained to conduct a hidden reasoning process before providing a final response, giving them the ability to search through different possible solutions and self-correct. In our internal tests, the new model was able to solve certain mathematics questions that all previous models had failed to solve.
Please refer to the SCRP-Chat documentation for usage instructions.
DeepSeek-R1-Distill-Llama-70B
DeepSeek-R1-Distill-Llama-70B is a fine-tuned version of Llama 3.3 70B, trained on samples of the reasoning process generated by the full 671-billion-parameter DeepSeek-R1 model. According to the DeepSeek-R1 research paper, the model exceeds OpenAI-o1-mini across various benchmarks.
Generation speed: up to 33 tokens/sec.
Due to the hidden reasoning process, there is a noticeable delay before the model provides a final response. You can inspect the reasoning process in real time by clicking the ‘Thinking…’ or ‘Thought for X seconds’ dropdown menu.
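When the model is called through an API rather than the chat interface, the reasoning trace typically arrives inline in the response text, wrapped in `<think>...</think>` tags (the convention used by the DeepSeek-R1 distill models). A minimal sketch for separating the hidden reasoning from the final answer, assuming that tag convention:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer).

    Assumes the reasoning trace is wrapped in <think>...</think> tags,
    as emitted by the DeepSeek-R1 distill models. If no tags are found,
    the whole response is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

raw = "<think>2 + 2 equals 4.</think>The answer is 4."
reasoning, answer = split_reasoning(raw)
# reasoning -> "2 + 2 equals 4."
# answer    -> "The answer is 4."
```

Keeping the two parts separate lets you display only the final answer while logging the reasoning trace for inspection, mirroring what the dropdown in the chat interface does.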
When Should I Use a Reasoning Model?
Because a reasoning model’s hidden reasoning process generates a significant number of tokens, it takes much longer than a normal text model to provide a final response. For tasks that require reasoning, such as mathematics, coding, and scientific reasoning, the tradeoff is often worthwhile. For tasks that do not require a reasoning process—e.g. text formatting—you will get a much faster response with a normal text model.
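The latency cost is easy to estimate from the generation speed quoted above: every hidden reasoning token must be generated before the visible answer begins. A back-of-envelope sketch (the 2,000-token trace length is an illustrative assumption, not a measurement):

```python
# Advertised generation rate for DeepSeek-R1-Distill-Llama-70B on SCRP-Chat.
GENERATION_SPEED = 33  # tokens/sec

def time_to_first_answer(reasoning_tokens: int) -> float:
    """Seconds spent generating the hidden reasoning trace
    before the final answer starts to stream."""
    return reasoning_tokens / GENERATION_SPEED

# A hypothetical 2,000-token reasoning trace delays the answer
# by roughly a minute before any visible output appears.
delay = time_to_first_answer(2000)
```

This is why a normal text model, which starts streaming its answer immediately, feels much faster on tasks that need no reasoning.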