“There are decades in AI where nothing happens, and there are weeks where decades happen.”
This week technology took a huge leap forward thanks to AI. I suspect AI will play a larger and larger role in driving scientific discoveries, technological breakthroughs and engineering marvels.
This week will be remembered for many reasons. Mistral released Mistral Large 2. Elon Musk confirmed Grok 2’s release and floated an aggressive December 2024 target for Grok 3. But the big three acceleration events of this week came from OpenAI, Meta and Google DeepMind:
SearchGPT → OpenAI’s strategic shot at Google and Perplexity
Llama 3.1 → The Real Open AI is Meta
AlphaProof x AlphaGeometry 2 → DeepMind’s Brilliant Bicameral Mind
We’ll explore these more deeply in a moment. It’s the sheer volume of updates and accelerating model capabilities that have me most excited. The more players active in the AI space, the more surface area we have for progress, and the more avenues we can explore for efficiency gains across data engineering, model architecture, training methods and the rest of the landscape.
We are entering a period of rapid acceleration in AI development and our trajectory is taking us to the singularity.
Our velocity is increasing at increasing rates.
We’re in it now.
Forces Fueling The Acceleration
There are a number of forces that are driving this accelerating dance between science, technology and engineering.
Hardware Advancements
The evolution of specialized hardware is a cornerstone of the AI acceleration we're witnessing. Graphics Processing Units (GPUs) were initially designed for rendering graphics but proved exceptionally efficient at the parallel computations needed for AI. Tensor Processing Units (TPUs), custom-designed by Google for AI workloads, further boosted performance. Alongside this, interconnect speeds have dramatically increased, allowing for faster data transfer and larger models. But the real game-changer in recent years has been a revolution in memory technologies, which addresses the limitations of traditional memory architectures and is fueling the next wave of AI hardware innovation.
In traditional computing, the separation of processing units and memory creates a bottleneck known as the Von Neumann Bottleneck. Data constantly shuttles back and forth between these components, limiting processing speed. AI models, especially large language models and complex neural networks, are voracious consumers of data. Standard Dynamic Random Access Memory (DRAM) struggles to keep up with this demand, hindering the performance of AI hardware.
This has led to the exploration of several groundbreaking memory technologies, each aiming to overcome the Von Neumann Bottleneck and revolutionize AI hardware.
High-Bandwidth Memory (HBM) is a leading contender in the race to address AI's memory needs. By vertically stacking multiple DRAM layers, HBM creates a wider path for data flow, significantly increasing the bandwidth between memory and processor. This allows AI hardware to access data much faster, reducing the time the processor spends idling while waiting for data. This increased bandwidth directly translates to faster training times for AI models, enabling researchers to work with larger datasets and more complex architectures. HBM has already been integrated into high-end computing systems, and its continued development promises further advancements in AI hardware capabilities.
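To make the bandwidth argument concrete, here's a rough back-of-envelope sketch in Python. The numbers are illustrative assumptions, not measurements: a 70B-parameter model in FP16, roughly 100 GB/s for a conventional DDR-based system versus roughly 3,000 GB/s for an HBM-equipped accelerator.

```python
# Rough, illustrative estimate of how long it takes just to stream a model's
# weights from memory once -- a floor on per-pass latency for memory-bound
# workloads. Bandwidth figures below are assumptions for illustration only.

PARAMS = 70e9          # 70B-parameter model (assumed)
BYTES_PER_PARAM = 2    # FP16 weights
weight_bytes = PARAMS * BYTES_PER_PARAM  # ~140 GB

bandwidths_gb_s = {
    "Conventional DDR-based system (~100 GB/s)": 100,
    "HBM-equipped accelerator (~3,000 GB/s)": 3000,
}

for name, bw in bandwidths_gb_s.items():
    seconds = weight_bytes / (bw * 1e9)
    print(f"{name}: {seconds * 1000:.0f} ms to stream the weights once")

# With these assumed numbers: ~1,400 ms on DDR vs ~47 ms on HBM --
# the processor spends far less time idling while it waits for data.
```

The exact figures will vary by system, but the shape of the result is the point: widen the pipe between memory and processor and the whole machine spends more of its time computing.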
Non-Volatile Memory (NVM) technologies, such as 3D XPoint, offer an alternative to DRAM.
NVM is generally slower than DRAM on raw read/write latency, but it is dramatically faster than flash storage and far denser than DRAM, letting AI systems keep much larger working sets close to the processor.
Perhaps more importantly, NVM retains data even after power loss, a critical feature for AI training. This means that AI models don't need to be reloaded into memory every time a training session restarts, saving valuable time and resources. The combination of faster speeds and persistent storage makes NVM a promising candidate for accelerating AI training and enabling more efficient AI hardware architectures.
If all this sounds exotic, know that researchers are also experimenting with entirely novel architectures where memory and processing live together and waste no time in commute.
In-Memory Computing (IMC) represents a radical departure from traditional memory architectures. Instead of shuttling data between separate processing and memory units, IMC integrates processing elements directly within the memory itself. This eliminates the data transfer bottleneck entirely, potentially unleashing massive speedups and reducing power consumption for specific AI tasks. While still in the research phase, IMC holds immense promise for the future of AI hardware. As the technology matures, it could lead to a paradigm shift in how AI systems are designed and built, unlocking new levels of performance and efficiency.
These memory breakthroughs are already transforming the landscape of AI hardware.
Faster training times, the ability to handle more complex models, reduced power consumption, and improved overall efficiency are just some of the benefits that these new memory technologies are bringing to AI hardware. As research continues and these technologies become more refined and accessible, we can expect even more dramatic advancements in the capabilities of AI systems. The combination of specialized processors, such as TPUs, and advanced memory technologies is creating a new generation of AI hardware capable of pushing the boundaries of what AI can achieve.
Hardware isn’t the only thing speeding us along toward the Singularity. Software and data are both making quantum leaps of their own.
Democratization of AI
One of the most significant accelerants in AI is the unprecedented accessibility of tools and datasets.
Cloud computing platforms offer researchers and developers access to massive computational resources on-demand, eliminating the need for expensive infrastructure. Open-source libraries like TensorFlow and PyTorch have standardized AI development, providing a common foundation for experimentation and collaboration. The availability of vast public datasets, ranging from image repositories to text corpora, enables researchers to train and refine models without the need for extensive data collection.
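As a small illustration of how low the barrier to entry has become, here's a minimal PyTorch sketch that defines and trains a tiny network on synthetic data. Nothing here requires special hardware or paid infrastructure, just the open-source library.

```python
# Minimal PyTorch example: define, train, and evaluate a tiny model on
# synthetic data. Everything used here is freely available open-source tooling.
import torch
import torch.nn as nn

# Synthetic regression data: y = 3x + noise
x = torch.randn(256, 1)
y = 3 * x + 0.1 * torch.randn(256, 1)

model = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)   # forward pass + loss
    loss.backward()               # gradients via autograd
    optimizer.step()              # update weights

print(f"final training loss: {loss.item():.4f}")
```

Twenty lines, a laptop, and a free library: that is the starting cost of AI experimentation today.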
This democratization has significantly lowered the barriers to entry for AI research, allowing a wider range of individuals and organizations to contribute to the field, accelerating innovation and fueling progress.
Many hands make light work… and multiple minds on a single problem mean different perspectives and, ultimately, different solutions.
Software's Agility Drives Rapid AI Iteration
The inherent flexibility of software plays a crucial role in the rapid pace of AI development.
Unlike hardware, which requires physical fabrication and can take years to improve, software can be iterated upon quickly. This means that researchers can experiment with new algorithms, architectures, and optimizations with relative ease.
As new insights are gained, AI models can be updated and refined in a matter of days or weeks, rather than months or years. This rapid iteration cycle enables researchers to explore the vast landscape of AI possibilities at an unprecedented pace, accelerating the discovery of new techniques and pushing the boundaries of what AI can achieve.
On GitHub, in Discord channels, on X and other places engineers gather together to share what they are working on, what roadblocks they’ve hit, what they’ve tried… and we all engage in communal learning transfer.
There have never been this many developers, coders, programmers, architects and engineers.
There have never been AI co-pilots that write code at a level you would typically pay someone $300,000 to $600,000 per year to access.
…and all these minds are collaborating in constantly improving digital environments with a growing hoard of high-value data.
Data Explosion
Data is the lifeblood of AI.
Machine learning algorithms, the backbone of modern AI, rely on vast amounts of data to learn patterns and make predictions.
The exponential growth of data, both from real-world sources (e.g., sensors, social media) and synthetic generation, has provided AI with a seemingly endless supply of fuel. This abundance of data allows researchers to train more sophisticated and nuanced models, capable of tackling complex tasks like natural language understanding, image recognition, and decision-making.
The continuous influx of new data also enables AI models to adapt and evolve, staying up-to-date with the ever-changing world and ensuring their relevance in the face of new challenges.
It also helps roll out entirely new model designs.
This week saw three incredible examples of new models with powerful capabilities ushering us forward faster and faster.
Wild Week in Review
SearchGPT → OpenAI’s strategic shot at Google and Perplexity
Llama 3.1 → The Real Open AI is Meta
AlphaProof x AlphaGeometry 2 → DeepMind’s Brilliant Bicameral Mind
SearchGPT
“SearchGPT is designed to help users connect with publishers by prominently citing and linking to them in searches” - OpenAI Blog Post, July 25, 2024
SearchGPT is a variant of OpenAI’s language models specifically designed to handle search-related queries and tasks. Unlike general-purpose language models, SearchGPT is optimized for retrieving, summarizing, and interacting with large volumes of information typically found in search engines or databases.
SearchGPT was developed in partnership with The Wall Street Journal, The Associated Press, Vox Media and a handful of other corporate content creators and distributors. This was a very tactical move, as these partnerships stem the kind of litigation risk that Perplexity is seeing play out right now.
Strategists immediately noted:
this is an attack on Google
this is defense against Perplexity and other “we use AI, but we cultivate the experience and give citations” variants
Brilliant. With this move, and OpenAI’s growing access to devices and users by way of the Apple and Microsoft deals, the company is settling into a powerful posture.
Llama 3.1
Llama 3.1’s largest variant weighs in at 405B parameters, making it the biggest open-source AI model released to date.
It offers state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation.
These are current-generation, closed-model capabilities delivered in open-source form. Truly incredible. This breakthrough is a major catalyst for science, technology, engineering and hundreds of other disciplines and lines of effort.
Llama 3.1 boasts a 128K context length, allowing it to reason over longer sequences of text. With the release of Llama 3.1, Meta aims to usher in a new era of open-source large language models.
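To sketch what “open” means in practice, the weights can be downloaded and run with standard open-source tooling. The snippet below is illustrative only: it assumes the smaller 8B instruct variant (the 405B flagship needs a multi-GPU server), the Hugging Face checkpoint name meta-llama/Meta-Llama-3.1-8B-Instruct, and that you’ve accepted Meta’s license for the weights.

```python
# Illustrative sketch: running a Llama 3.1 variant locally with Hugging Face
# transformers. The checkpoint name and hardware assumptions are noted above;
# the 405B flagship requires far more than a single machine.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",  # assumed checkpoint name
    device_map="auto",     # spread across available GPUs, or fall back to CPU
    torch_dtype="auto",
)

prompt = "Explain in two sentences why open-weight models matter."
output = generator(prompt, max_new_tokens=128)
print(output[0]["generated_text"])
```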
Increasing open source model capabilities helps society in a number of ways. First, it allows more people around the world to have access to the benefits and opportunities of AI. Second, it ensures that power is not concentrated in the hands of a small few. Third, it allows the technology to be deployed more evenly and safely across society. Fourth, it allows developers to fully customize the models for their needs and applications.
Good one, Meta.
The state of the art is still being set by DeepMind, however.
DeepMind’s Brilliant Bicameral Mind
One of the running jokes among nerds is how bad LLMs are at math. We share screenshots of AI math mistakes that a 10-year-old reasons their way through with ease.
Large Language Model, after all.
The modern workaround is function calling, or some other method of invoking Python, to gain access to NumPy and the other libraries that bestow mathematical magic.
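Here’s a hedged sketch of that pattern: the model is offered a “calculator” tool described by a JSON schema, and when it requests the tool, the application does the actual math in NumPy and hands the result back. The schema follows the common function-calling convention, and the dispatcher below is a hypothetical local stand-in, not any vendor’s actual API.

```python
# Sketch of the function-calling pattern for math: the LLM never does the
# arithmetic itself; it requests a tool call, and real Python/NumPy does
# the work. The tool schema and dispatcher here are illustrative only.
import json
import numpy as np

# Tool description in the common JSON-schema style used for function calling.
CALCULATOR_TOOL = {
    "name": "evaluate_expression",
    "description": "Evaluate a numeric expression using NumPy.",
    "parameters": {
        "type": "object",
        "properties": {
            "expression": {"type": "string", "description": "e.g. 'np.sqrt(2) * 10'"}
        },
        "required": ["expression"],
    },
}

def run_tool_call(tool_call_json: str) -> str:
    """Hypothetical dispatcher: executes the requested calculation locally."""
    args = json.loads(tool_call_json)
    # Expose only NumPy to eval -- a real system would sandbox this properly.
    result = eval(args["expression"], {"np": np, "__builtins__": {}})
    return json.dumps({"result": float(result)})

# Pretend the model responded with a tool call asking for a calculation:
model_tool_call = '{"expression": "np.sqrt(2) * 10"}'
print(run_tool_call(model_tool_call))   # {"result": 14.142135623730951}
```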
DeepMind looked at the massive universe of math and decided they could bifurcate the problem space and use two models with different internals.
AlphaProof is a system trained to prove mathematical statements in formal language. It uses a reinforcement learning approach to learn how to prove theorems.
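The formal language in question is Lean, the proof assistant, so “proving a statement” means producing machine-checkable code. For flavor, here is a toy Lean 4 theorem and proof, vastly simpler than anything AlphaProof tackles:

```lean
-- A toy Lean 4 theorem: a statement plus a machine-checked proof.
-- This only shows what a formally stated theorem looks like;
-- AlphaProof searches for proofs of far harder statements in this language.
theorem add_comm_toy (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```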
AlphaGeometry 2 is a system designed to solve geometry problems. It uses a combination of deep learning and symbolic reasoning to solve problems.
Together, AlphaProof and AlphaGeometry 2 were able to solve four out of six problems from the 2024 International Mathematical Olympiad (IMO), achieving a score equivalent to a silver medal.
From Google DeepMind’s Press Release on July 25, 2024:
We’ve made great progress building AI systems that help mathematicians discover new insights, novel algorithms and answers to open problems. But current AI systems still struggle with solving general math problems because of limitations in reasoning skills and training data.
Today, we present AlphaProof, a new reinforcement-learning based system for formal math reasoning, and AlphaGeometry 2, an improved version of our geometry-solving system.
This is a major leap forward for AI, as it shows that AI can now solve complex mathematical problems that were previously thought to be beyond the reach of machines.
It suggests that AI systems will soon rival, and in some areas outperform, strong human solvers on formal mathematical reasoning tasks.
AlphaProof and AlphaGeometry 2 are still under development, but they have the potential to revolutionize the way we do mathematics. They could be used to solve problems that are too difficult for humans to solve on their own, and they could help us to better understand the nature of mathematics.
This triumph is the result of a collaborative effort between two distinct AI systems, AlphaProof and AlphaGeometry 2, each embodying unique strengths that together form a "bicameral mind" capable of tackling complex mathematical challenges previously reserved for human ingenuity.
AlphaProof: The Master of Formal Logic
AlphaProof is a specialist in the realm of formal logic. Trained on a vast corpus of mathematical proofs, AlphaProof has developed a deep understanding of the underlying principles and structures that govern mathematical reasoning. Its expertise lies in transforming mathematical problems stated in natural language into the precise, unambiguous syntax of formal logic. This translation process is crucial, as it enables AlphaProof to leverage its powerful reasoning capabilities to systematically explore the space of possible solutions.
AlphaProof's strength lies in its ability to construct rigorous, step-by-step proofs that adhere to the strict rules of formal logic. It can meticulously verify the validity of each step, ensuring that the final proof is sound and irrefutable. This meticulousness is essential in the realm of mathematics, where even the slightest error can invalidate an entire proof.
AlphaGeometry 2: The Intuitive Geometer
AlphaGeometry 2, the second half of the bicameral mind, complements AlphaProof's logical rigor with a more intuitive approach to problem-solving. Specializing in geometric reasoning, AlphaGeometry 2 has been trained on a massive dataset of geometric diagrams and their corresponding properties. This training has enabled it to develop a deep understanding of spatial relationships and geometric concepts, allowing it to "see" the underlying structure of geometric problems.
AlphaGeometry 2's strength lies in its ability to generate intuitive insights and conjectures. It can quickly identify promising avenues for exploration, even when the formal logic of the problem is not immediately apparent. This intuition allows AlphaGeometry 2 to guide the search for solutions, often leading to breakthroughs that would be difficult to achieve through purely logical deduction.
The Bicameral Mind in Action: Solving the IMO
The collaboration between AlphaProof and AlphaGeometry 2 is exemplified by their performance at the IMO. Faced with challenging mathematical problems that require both logical rigor and geometric intuition, the two systems seamlessly work together to find solutions.
The process begins with AlphaProof translating the problem into formal logic, providing a clear and unambiguous representation for both systems to work with. AlphaGeometry 2 then leverages its intuitive understanding of geometry to generate potential solution paths, which are then rigorously tested by AlphaProof for logical soundness. This iterative process of exploration and verification continues until a valid proof is found.
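The loop described above can be sketched abstractly. Everything in the snippet below is hypothetical pseudostructure: propose_candidates and formally_verify are placeholders standing in for whole AI systems, and the code only illustrates the generate-then-verify control flow, not DeepMind’s actual implementation.

```python
# Abstract sketch of a propose-and-verify loop: one component generates
# candidate solution paths, another checks them with formal rigor.
# All functions here are hypothetical placeholders, not DeepMind code.
from typing import Callable, Iterable, Optional

def solve(problem: str,
          propose_candidates: Callable[[str], Iterable[str]],
          formally_verify: Callable[[str, str], bool],
          max_rounds: int = 10) -> Optional[str]:
    """Return the first candidate that passes formal verification."""
    for _ in range(max_rounds):
        for candidate in propose_candidates(problem):
            if formally_verify(problem, candidate):   # rigorous check
                return candidate
        # No candidate survived verification; propose again next round.
    return None  # unsolved within the budget

# Usage sketch with trivial stand-ins:
answer = solve(
    "toy problem",
    propose_candidates=lambda p: ["guess A", "guess B"],
    formally_verify=lambda p, c: c == "guess B",
)
print(answer)  # -> "guess B"
```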
This bicameral approach has proven remarkably effective, allowing DeepMind's AI to solve four out of six problems at the IMO, a feat that demonstrates the power of combining different AI approaches to tackle complex challenges.
Multi-Minded Systems
The success of AlphaProof and AlphaGeometry 2 is a significant milestone in the field of AI. It demonstrates the potential of AI to not only solve complex mathematical problems but also to generate new knowledge and insights. This has far-reaching implications for fields such as mathematics, computer science, and engineering, where AI could be used to accelerate research and development.
Looking to the future, the development of AI systems with bicameral minds is a promising direction for research. Indeed, there is no reason to stop at two “minds,” i.e., two AI systems. In the near future we will see 10, 100 or more distinct synthetic minds working together to solve problems and optimize outcomes.
By combining different AI approaches, we can create systems that are more powerful and versatile than ever before.
These systems have the potential to revolutionize the way we solve problems and generate knowledge, ultimately leading to a better understanding of the world around us.
How Fast Can We Possibly Go?
The great thing about weeks like these… they compound.
Now we’ve got these new models in the hands of millions of hardcore developers creating infrastructure and experiences for billions of users.
Soon GPT-5 will enter the noosphere and the great acceleration will lurch forward even faster.
Again, compounding all the way.
Hardware, software and data are all contributing more and more energy to our technological progress… and now we have AI systems that are actively adding energy (and efficiency) to our engines as we race toward the singularity.
I think we are going to witness legendary feats of technology this Fall.
I predict the future will become less predictable.
👋 Thank you for reading Life in the Singularity.
I started writing this newsletter in May 2023, and AI has only accelerated since. Our audience includes Wall Street analysts, VCs, Big Tech engineers and Fortune 500 executives.
To help us continue our growth, would you please Like, Comment and Share this?
Thank you again!!!