DeepSeek: DeepSeek R1 Zero

deepseek/deepseek-r1-zero

DeepSeek-R1-Zero is a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step. It's 671B parameters in size, with 37B active in an inference pass.

It demonstrates remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors.

DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. See DeepSeek R1 for the SFT model.

Modalities

Context

164K

Knowledge Cutoff

Jul 31, 2024

Activity

Recent activity on DeepSeek R1 Zero

Total usage per day on OpenRouter

Not enough data to display yet.

OpenRouter

Product

Chat
Rankings
Apps
Models
Providers
Pricing
Enterprise
Labs

Company

About
Announcements
CareersHiring
Privacy
Terms of Service
Support
State of AI
Works With OR
Data

Developer

Documentation
API Reference
SDK
Status

Connect

Discord
GitHub
LinkedIn
X
YouTube

DeepSeek: DeepSeek R1 Zero