A team of researchers at the University of California, Berkeley, claims to have recreated the core functionality of DeepSeek's R1-Zero AI for a mere $30.
The effort, led by Ph.D. candidate Jiayi Pan, challenges the notion that cutting-edge AI requires billion-dollar investment, demonstrating that effective reinforcement learning can be carried out on a shoestring budget.

Pan's team used a small language model with just 3 billion parameters to replicate DeepSeek R1-Zero's reinforcement learning capabilities.
Despite its modest size, the model developed self-verification and search-based problem-solving abilities, essential features that allow it to iteratively refine its responses.
Researchers tested their model using the Countdown game, a numerical puzzle in which players combine numbers with arithmetic operations to reach a target value. Initially, the model made random guesses, but through reinforcement learning it gradually developed problem-solving strategies resembling human reasoning. Over time, it refined its approach, eventually solving problems more efficiently.
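Part of why Countdown suits this kind of training is that a proposed solution can be checked mechanically, with no human labeling. The sketch below shows one plausible rule-based reward for the game; the function names, scoring values, and expression format are illustrative assumptions, not the team's actual code.

```python
import ast
import operator

# Hypothetical rule-based reward for the Countdown game: the model's
# proposed equation earns full reward only if it uses each given number
# exactly once and evaluates to the target value.

OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def _eval(node):
    """Safely evaluate a parsed arithmetic expression (+, -, *, / only)."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](_eval(node.left), _eval(node.right))
    raise ValueError("disallowed expression")

def _numbers_used(node):
    """Collect every numeric literal appearing in the expression."""
    return sorted(n.value for n in ast.walk(node) if isinstance(n, ast.Constant))

def countdown_reward(equation: str, numbers: list[int], target: int) -> float:
    """Return 1.0 for a correct equation, a small partial credit for a
    well-formed but wrong one, and 0.0 for anything malformed."""
    try:
        tree = ast.parse(equation, mode="eval").body
        if _numbers_used(tree) != sorted(numbers):
            return 0.0  # must use each provided number exactly once
        return 1.0 if abs(_eval(tree) - target) < 1e-6 else 0.1
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0

# Example: reach 14 using the numbers 2, 3, and 4.
print(countdown_reward("(4 + 3) * 2", [2, 3, 4], 14))  # 1.0
```

Because a checker like this is cheap and deterministic, the model can be scored across many attempts automatically, which is what makes reinforcement learning feasible at this scale and cost.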
The team began the experiment with a 500-million-parameter model that could only guess an answer and stop. At 1.5 billion parameters, the model began to revise its own attempts.
By 3 billion parameters, it had improved markedly, solving problems with greater accuracy in fewer steps. Despite the project's low cost, the findings point to a potential paradigm shift in AI development, with efficiency and affordability challenging the dominance of billion-dollar AI enterprises.
Pan shared the findings on X (formerly Twitter), announcing that the team's model, TinyZero, is now available on GitHub for public experimentation.
While the team is still preparing a research paper, their results suggest that significant advances may not require the massive budgets of large AI companies such as OpenAI, Google, and Microsoft.
The implications of this research could redefine how AI is built, disrupting the status quo of large-scale, high-cost computing.