Life in the Singularity


START: Self-taught Reasoner with Tools

Matt McDonagh
Mar 08, 2025

The last few months in AI have been incredible.

The reinforcement learning renaissance is underway. Then we worked out how to give models near-perfect (and near-infinite) memory. All of this led to the breakthrough of giving models a mechanism to self-reward:

Self-Rewarding Reasoning Large Language Models (SR-LLMs)

Matt McDonagh · March 2, 2025

"There are decades where nothing happens; and there are weeks where decades happen"


All of that in the span of 30 days.

Now we are giving these reasoning models a large and powerful toolbox to use as they learn and work.


The START paper was released by Alibaba Group, the team behind the famous Qwen series of models, which are currently battling DeepSeek for the crown of Chinese AI.

Did DeepSeek Get DeepSeeked by Alibaba?

Matt McDonagh · January 30, 2025


START, which is short for Self-taught Reasoner with Tools, moves beyond simply prompting LLMs to reason; it provides a mechanism for them to ground their reasoning in verifiable computations, learn to use tools efficiently, and do so in a way that is more scalable and accessible than previous app…
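
To make "grounding reasoning in verifiable computations" concrete, here is a toy sketch of the general idea: a reasoning step makes a numeric claim, a tool recomputes it, and the claim is marked verified or not. This is not the method from the START paper. The names `calculator_tool` and `check_claim` are hypothetical, and a small arithmetic evaluator stands in for whatever tool a real system would call.

```python
import ast
import operator

# Hypothetical stand-in for an external tool: evaluates plain arithmetic only.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def calculator_tool(expression: str) -> float:
    """Safely evaluate an arithmetic expression such as '17 * 23'."""

    def _eval(node):
        # Numeric literals (booleans are rejected on purpose).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        # Binary operations from the whitelist above.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        # Unary minus, e.g. '-3'.
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


def check_claim(expression: str, claimed: float) -> dict:
    """Compare a value asserted during reasoning with the tool's result."""
    actual = calculator_tool(expression)
    return {
        "expression": expression,
        "claimed": claimed,
        "actual": actual,
        "verified": abs(actual - claimed) < 1e-9,
    }


if __name__ == "__main__":
    # A correct reasoning step and a mistaken one.
    print(check_claim("17 * 23", 391))
    print(check_claim("17 * 23", 381))
```

The point of the sketch is the loop shape: the model's claim is never trusted on its own, and the tool's output is what decides whether the step stands.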
