Strategy Over Size
How a radical "less is more" approach in AI is outsmarting tech's giants, proving that the future of intelligence isn't about size, but strategic process design.
There hasn’t been a shake-up in AI this big from an “outsider” since DeepSeek.
I’ve spent my career at the intersection of finance and technology, building and investing in AI systems that create tangible value. We sold a company to Facebook in 2018, and today my family office is invested across 20 tech startups. My job is to see through the hype and identify the architectural shifts that will define the next decade.
And let me tell you, I’ve just seen one.
We’re all mesmerized by the sheer scale of today’s Large Language Models, digital behemoths with trillions of parameters. Yet for all their astonishing ability to write poetry or summarize reports, they can be stumped by a simple Sudoku puzzle. This paradox reveals a fundamental weakness: their brute-force approach struggles with problems that require precise, step-by-step reasoning. A single misstep can send the entire solution off a cliff.
Today’s models lean heavily on long chains of thought. But a single foundational error throws off every step built on top of it, like a house built on a cracked foundation. You can’t build like that.
This is the central problem tackled by a brilliant new paper, “Less is More: Recursive Reasoning with Tiny Networks”.
The researchers introduced the Tiny Recursive Model (TRM), an AI that operates on a radically different principle. Instead of being a massive, one-shot prediction engine, TRM uses a surprisingly small network, with less than 0.01% of the parameters of major LLMs, that thinks recursively.
Please read that again and digest it. That is SHOCKINGLY smaller than the parameter counts of today’s AI giants and the leading open-source models.
How does it work?
It formulates an answer, then loops back on itself to check and refine that answer, progressively improving its solution in a focused, deliberate cycle. The results are staggering.
On notoriously difficult logic puzzles where giant models fail, TRM achieves breakthrough performance, boosting accuracy on extreme Sudoku puzzles from 55% to an incredible 87.4% and significantly outperforming LLMs on the abstract reasoning benchmark ARC-AGI.
The core contribution here is a powerful challenge to the “bigger is better” orthodoxy dominating AI development. TRM demonstrates that an elegant process can be vastly more effective than raw computational power. It’s a system designed not just to know, but to reason.
GPU dealers are LOVING the current approach. Power generation is being installed everywhere to fuel AI. Clearly the bet is on BIGGER.
But what if the smaller, strategic approach is superior?
By repeatedly iterating on its own work, TRM mimics a more human-like, self-correcting thought process, achieving a new level of accuracy and efficiency. This research provides a blueprint for a new class of AI: small, hyper-efficient “reasoning engines” that could soon become essential components in everything from scientific discovery to industrial optimization.
Why This Changes The Path to AGI
For the past several years, the AI landscape has been dominated by a single, powerful idea: the scaling laws. This paradigm dictates that to make models smarter, you simply make them bigger: more data, more parameters, more computation. This led to the explosion of LLMs like GPT, Gemini, and Claude, models that have redefined what’s possible in natural language processing. But this relentless pursuit of scale has created its own set of problems. These models are incredibly expensive to train and run, consume city-sized amounts of energy, and, most importantly, their reasoning capabilities have hit a wall.
Their auto-regressive nature, generating a response token by token, makes them inherently brittle for tasks demanding logical precision.
The first real attempt to break this mold came from a predecessor model called the Hierarchical Reasoning Model. HRM was a clever, biologically-inspired system that used two separate neural networks recursing at different frequencies to solve problems. It showed promise, but it was complex, relying on esoteric biological arguments and advanced mathematical justifications that didn’t quite hold up under scrutiny. The performance gains were there, but they were incremental.
This is where TRM represents a true breakthrough. The authors of “Less is More” took a deep look at HRM and made a crucial insight: the complexity was not a feature, but a bug. They stripped the system down to its absolute essentials. The innovation of TRM lies in its profound simplicity and efficiency:
A Single, Tiny Network
Instead of two medium-sized networks, TRM uses a single, incredibly small 2-layer network to handle the entire reasoning process. This not only cuts the number of parameters in half but proves that a streamlined architecture is more effective.
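To make the scale concrete, here is a minimal sketch of what a two-layer reasoning core might look like in PyTorch. The dimensions, layer choices, and names are my own illustrative assumptions, not the paper’s exact architecture, but they show how few parameters a network like this actually needs.

```python
import torch
import torch.nn as nn

class TinyReasoner(nn.Module):
    """Illustrative 2-layer core; sizes are assumptions, not the paper's exact config."""
    def __init__(self, dim: int = 512):
        super().__init__()
        # Two small layers -- this is the entire "brain" that gets reused for every update.
        self.layers = nn.Sequential(
            nn.Linear(dim, dim),
            nn.GELU(),
            nn.Linear(dim, dim),
        )

    def forward(self, *states: torch.Tensor) -> torch.Tensor:
        # The same network is reused everywhere: it combines whatever states it is
        # handed (question, answer, reasoning) and returns a refined state.
        return self.layers(sum(states))

# Roughly half a million parameters at dim=512 -- orders of magnitude below an LLM.
print(sum(p.numel() for p in TinyReasoner().parameters()))
```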
Intuitive Reasoning Loop
TRM reframes the process in a way that’s instantly understandable. It maintains two key pieces of information: the current Answer (y) and the latent Reasoning (z), aka its “chain of thought”. The model first updates its reasoning based on the problem and its last answer, then uses that new reasoning to produce a better answer.
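In code, that loop is strikingly small. Here is a minimal sketch under my own assumptions (one shared tiny network, with x, y, and z treated as same-sized embeddings that are simply summed); the real model is more careful about how these states are combined.

```python
import torch
import torch.nn as nn

# One tiny shared network does all the work (illustrative sizes).
net = nn.Sequential(nn.Linear(512, 512), nn.GELU(), nn.Linear(512, 512))

def refine(x, y, z, n: int = 6):
    """One reasoning cycle: update the latent reasoning n times, then the answer once.

    x = embedded question, y = current answer, z = latent reasoning ("chain of thought").
    """
    for _ in range(n):
        z = net(x + y + z)   # rethink, given the question, the current answer, and prior reasoning
    y = net(y + z)           # rewrite the answer using the improved reasoning
    return y, z

x, y, z = torch.randn(1, 512), torch.zeros(1, 512), torch.zeros(1, 512)
y, z = refine(x, y, z)       # one pass of "think, then answer"
```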
Full Backpropagation Through the Recursion
TRM abandons HRM’s reliance on the complex Implicit Function Theorem to approximate training gradients. Instead, it backpropagates the error through the entire recursive process. This more direct approach is a key reason for its massive performance leap, improving accuracy on a key benchmark from 56.5% to 87.4%. That is crazy progress.
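The training difference is easy to see in a toy example. Below is a hedged sketch contrasting the two styles: a shortcut that only backpropagates through the final step (in the spirit of HRM’s fixed-point approximation) versus keeping the entire recursion on the autograd tape, as TRM does. The names, sizes, and loss are illustrative, not the paper’s code.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(64, 64), nn.GELU(), nn.Linear(64, 64))
x, target = torch.randn(8, 64), torch.randn(8, 64)

def shortcut_loss(steps: int = 6):
    # Approximation in the spirit of HRM: run most refinement steps without
    # tracking gradients, then backprop only through the last one.
    z = torch.zeros_like(x)
    with torch.no_grad():
        for _ in range(steps - 1):
            z = net(x + z)
    z = net(x + z)                       # only this step receives gradient signal
    return nn.functional.mse_loss(z, target)

def full_recursion_loss(steps: int = 6):
    # TRM-style: every refinement step stays on the autograd tape,
    # so the network is trained on the whole self-correction process.
    z = torch.zeros_like(x)
    for _ in range(steps):
        z = net(x + z)
    return nn.functional.mse_loss(z, target)

full_recursion_loss().backward()         # gradients flow through all six refinements
```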
The underlying assumption of TRM is a complete inversion of the scaling philosophy. It assumes that for a vast class of problems, the most efficient path to a correct solution is not a single, massive leap of computation but a series of small, self-correcting steps. This research fits into the wider scientific conversation by offering a compelling alternative to the brute-force scaling race. It champions a future where AI systems are built with greater intention, where we design models with inherent, verifiable reasoning processes, making them not only more accurate but also more efficient and potentially more trustworthy.
How to describe how big this is…
Imagine trying to build a complex Swiss watch.
The standard Large Language Model approach is like using a giant industrial factory press. You feed it all the parts and, in one massive, powerful slam, it attempts to assemble the entire watch. The sheer force might get some parts in the right place, but it’s just as likely to crush the delicate gears, leaving you with a beautiful but broken timepiece. It’s a process of brute force.
The Tiny Recursive Model is completely different, and it reminds us of the way the real world works. It’s a master watchmaker sitting at their workbench. They pick up a single gear (the Answer) and place it carefully. Then, they look through their magnifying loupe to check its alignment and connection to the other parts (the Reasoning). Seeing a slight imperfection, they make a tiny adjustment, then look through the loupe again. This deliberate cycle of place, check, adjust, re-check continues until every component is perfectly seated. The final watch is flawless, not because of overwhelming power, but because of a precise, iterative, and self-correcting process.
That is the elegance of TRM.
Now, let’s crack this engineering marvel open and understand what makes it tick.
A Look Under the Hood
So, how does TRM achieve this deliberate, watchmaker-like precision? It’s not just a smaller model. It’s a fundamentally different, more structured way of thinking. The process is a disciplined, multi-step cycle of drafting, critiquing, and revising.
Here’s a breakdown of how it works:
Draft an Initial Answer: Unlike an LLM that writes word-by-word, TRM starts with an initial embedded answer (y) and latent reasoning state (z). This is its quick, complete “draft” of the solution, a first rough guess.
Create a “Scratchpad”: The model uses its latent reasoning feature (z) as a dedicated space for its internal thoughts, effectively a “scratchpad”. This is where the real work of refining its logic happens.
Intensely Self-Critique: The model enters a tight inner loop. For a set number of steps (n=6 in the optimal configuration), it repeatedly updates its reasoning on the scratchpad, taking into account the original question (x), its current answer (y), and its previous reasoning (z). It’s effectively asking itself, “Given my current answer, does my logic hold up? How can I improve my thinking?”
Revise the Answer: After this focused “thinking” phase, the model uses the improved logic from its scratchpad to create a brand new, better version of its answer.
Repeat Until Confident: This entire four-step process (draft, think, critique, revise) is itself a single “supervision step.” The model can repeat this entire cycle up to 16 times, with each major loop pushing the solution closer to a correct, logically sound final state.
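Putting the pieces together, here is a compact sketch of that outer loop: up to 16 supervision steps, each wrapping the n=6 inner critique loop and a single answer revision, with an assumed confidence check deciding when to stop. Everything here (the halting head, the state handling, the sizes) is an illustration of the process described above, not the paper’s actual implementation.

```python
import torch
import torch.nn as nn

DIM = 512
net = nn.Sequential(nn.Linear(DIM, DIM), nn.GELU(), nn.Linear(DIM, DIM))  # the single tiny network
halt_head = nn.Linear(DIM, 1)  # assumed confidence head: "am I done yet?"

def solve(x, n_inner: int = 6, max_supervision_steps: int = 16):
    """Draft -> think -> critique -> revise, repeated until confident (illustrative)."""
    y = torch.zeros_like(x)               # the initial "draft" answer
    z = torch.zeros_like(x)               # the latent reasoning "scratchpad"
    for _ in range(max_supervision_steps):
        for _ in range(n_inner):          # intense self-critique on the scratchpad
            z = net(x + y + z)
        y = net(y + z)                    # revise the answer with the refined reasoning
        if torch.sigmoid(halt_head(y)).mean() > 0.99:
            break                         # confident enough: stop early
        y, z = y.detach(), z.detach()     # carry the improved state into the next cycle
    return y

answer = solve(torch.randn(1, DIM))
```

During training, each of those supervision steps also gets compared against the correct solution, so every loop of the cycle provides a learning signal rather than just extra compute.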
This is the blueprint for a new kind of artificial intelligence with immediate, practical implications for my fellow investors and builders:
For Business Leaders: This is what a true algorithmic advantage looks like. While competitors are paying massive inference costs for brute-force scale, a smarter, more efficient model like TRM can deliver superior performance on reasoning tasks for a tiny fraction of the cost. The paper shows it achieves its results with less than 0.01% of the parameters of major LLMs. That’s margin. That’s new growth avenues. That’s a new ball game.
For Researchers: This work is a major validation for ideas at the intersection of neural and symbolic AI. The model’s ability to recursively “think” before “acting” demonstrates that architecture, not just scale, can be a primary driver of reasoning ability. It opens a new frontier for exploring how structured processes can lead to more robust intelligence.
For Practitioners: State-of-the-art reasoning is no longer gated behind billion-dollar GPU clusters. This paper provides a highly efficient, parameter-light blueprint for building specialized reasoners that can run almost anywhere. The experiments were performed on a handful of GPUs, not a massive data center, making this technology incredibly accessible.
The paper immediately opens up new avenues of research.
What are the “scaling laws” for recursion?
How does the optimal number of recursive steps change with problem complexity?
Can this recursive technique be adapted for creative and generative tasks, allowing a model to iteratively refine, say, a video game level?
Of course, the authors acknowledge limitations. As it stands, TRM is a supervised learning method designed for problems with a single correct answer; it’s not yet a generative model capable of handling ambiguity. That’s a big gap, but it will be cleared or engineered around.
The next steps are clear: extend this recursive framework to generative domains, explore its performance on a wider array of scientific and logical problems, and develop a theoretical understanding of why this “less is more” approach is so effective at avoiding overfitting on small datasets.
The vision unlocked by this work is not just about building smaller or faster AI. It’s about building smarter AI.
This is how we get AI in the hands of the people, not the Big Tech giants.
It’s a shift away from computational brute force and toward elegant, efficient, and verifiable logic.
This democratizes powerful AI, paving the way for reasoning tools that can run on a laptop or even a phone, not just in a handful of corporate data centers.
This is Life in the Singularity.
Friends: in addition to the 17% discount for becoming annual paid members, we are excited to announce an additional 10% discount when paying with Bitcoin. Reach out to me, these discounts stack on top of each other!
Thank you for helping us accelerate Life in the Singularity by sharing.
I started Life in the Singularity in May 2023 to track all the accelerating changes in AI/ML, robotics, quantum computing and the rest of the technologies accelerating humanity forward into the future. I’m an investor in over a dozen technology companies and I needed a canvas to unfold and examine all the acceleration and breakthroughs across science and technology.
Our brilliant audience includes engineers and executives, incredible technologists, tons of investors, Fortune-500 board members and thousands of people who want to use technology to maximize the utility in their lives.
To help us continue our growth, would you please engage with this post and share us far and wide?! 🙏