Pass Fun Forum
Author Topic: DeepSeek's First-generation Reasoning Models  (Read 1 times)
EzequielCr
Newbie
« on: February 01, 2025, 06:17:01 AM »


DeepSeek's first-generation reasoning models, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.


Models


DeepSeek-R1


Distilled models


The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, yielding better performance than the reasoning patterns discovered through RL on small models directly.


Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. Evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.
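As a concrete illustration of the distillation recipe described above, here is a minimal sketch in Python of turning teacher-generated reasoning traces into supervised fine-tuning examples for a smaller student model. The record fields and prompt template are illustrative assumptions, not DeepSeek's published training format:

```python
# Sketch: converting teacher-generated reasoning traces into SFT examples.
# The record fields ("question", "reasoning", "answer") and the prompt
# template are illustrative assumptions, not DeepSeek's published format.

def trace_to_sft_example(record):
    """Flatten one teacher trace into a (prompt, target) pair for SFT."""
    prompt = f"Question: {record['question']}\n"
    # The student is trained to reproduce the teacher's full chain of
    # thought followed by the final answer.
    target = (
        f"<think>\n{record['reasoning']}\n</think>\n"
        f"{record['answer']}"
    )
    return {"prompt": prompt, "target": target}

# Example teacher-generated trace (fabricated for illustration).
trace = {
    "question": "What is 7 * 8?",
    "reasoning": "7 * 8 = 56.",
    "answer": "56",
}
example = trace_to_sft_example(trace)
```

A corpus of such pairs, generated by DeepSeek-R1 on math, code, and reasoning prompts, is then used for ordinary supervised fine-tuning of the smaller dense model.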


DeepSeek-R1-Distill-Qwen-1.5B


DeepSeek-R1-Distill-Qwen-7B


DeepSeek-R1-Distill-Llama-8B


DeepSeek-R1-Distill-Qwen-14B


DeepSeek-R1-Distill-Qwen-32B


DeepSeek-R1-Distill-Llama-70B
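DeepSeek-R1 and its distilled variants emit their chain of thought inside `<think>…</think>` tags before the final answer. A minimal sketch of separating the reasoning trace from the answer, assuming that output convention:

```python
import re

def split_reasoning(output: str):
    """Split R1-style model output into (reasoning, answer).

    Assumes the model wraps its chain of thought in a single
    <think>...</think> block, as DeepSeek-R1-series models do.
    """
    match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    if match is None:
        return "", output.strip()           # no reasoning block found
    reasoning = match.group(1).strip()
    answer = output[match.end():].strip()   # everything after </think>
    return reasoning, answer

sample = "<think>\n2 + 2 equals 4.\n</think>\nThe answer is 4."
reasoning, answer = split_reasoning(sample)
print(answer)  # The answer is 4.
```

This is a post-processing convenience only; the models themselves are used like any other causal LM checkpoint.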


License


The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and permits any modifications and derivative works, including, but not limited to, distillation for training other LLMs.
Powered by SMF 2.0 RC1.2 | SMF © 2006–2009, Simple Machines LLC | Theme by nesianstyles | Buttons by Andrea