You learn so many interesting ways to view the world by studying data science.
One of the concepts that really captured me was the “local minimum”.
In the context of AI, a “local minimum” is a point in development where further progress within a particular paradigm becomes increasingly difficult. Impressive results may have been achieved, but there is a sense that the current approach has inherent limitations that block significant breakthroughs toward true artificial intelligence.
To find AI you need to reach Mount AI.
You must avoid the hills (the local optima) that seem to take you skyward at first… but ultimately represent costly distractions from the Main Quest.
There is a large group of very smart people in the AI community who say that Large Language Models (LLMs for short) are the latest hill we are collectively stuck on.
In this piece I’m going to explore other paths to AGI. We’ll start by discussing the current (potential) hill, LLMs, and then head off into the landscape to see the other paths toward Mount AI.
By the end we will combine some of these methods to form new AI architectures and see if that takes us closer to our destination.
What’s Wrong with LLMs?
The common derogatory term for LLMs is “stochastic parrots”.
The term suggests that LLMs, despite their impressive ability to generate coherent and seemingly meaningful text, lack real understanding of what they ingest or output.
They are perceived as sophisticated mimics, adeptly parroting patterns and structures they have observed in their vast training data, without possessing true comprehension of the meaning behind the words they generate.
It is "stochastic" because the output is based on probabilistic predictions derived from statistical patterns in the training data, rather than on a deeper understanding of the world or the ability to reason logically.
This critique makes sense, of course.
LLMs primarily learn from text, divorced from real-world experiences or sensory inputs. This limits their ability to form a grounded understanding of concepts and their relationships, potentially leading to nonsensical or factually incorrect outputs in situations that require common sense or world knowledge. LLMs might excel at pattern recognition and completion, but often struggle with tasks that require reasoning, generalization, or the ability to apply knowledge to novel situations. This limitation highlights their potential lack of genuine understanding.
The counterpoint: as LLMs grow in size and complexity, they exhibit emergent capabilities that go beyond simple pattern recognition. They demonstrate impressive performance on tasks like translation, summarization, and even creative writing, suggesting a deeper grasp of language and its nuances.
Are these LLMs actually building miniature world models that they use to generate the next token, or is it just probabilities talking?
The debate over whether LLMs truly "understand" remains open, but their impact on society is undeniable.
Where Exactly Is Mount AI?
We could be standing on it (LLMs) and simply need more parameters, better token-routing algorithms, superior training methods, more clever inference, etc…
Let’s assume the real path to AI is out there in the wilderness.
Time for us to explore.
Evolutionary Algorithms (EAs)
At their core, EAs operate on a population of candidate solutions, each encoded as a "genome" (often a string of bits or numbers). This population undergoes a cycle of selection, reproduction, and mutation.
Selection: Individuals are evaluated based on a fitness function that quantifies how well they solve the target problem. The fittest individuals are more likely to be chosen for reproduction.
Reproduction: Selected individuals create offspring through crossover, where parts of their genomes are combined. This mimics sexual reproduction, potentially combining beneficial traits from different parents.
Mutation: Offspring undergo random mutations in their genomes, introducing variation. This prevents the population from stagnating and allows exploration of new solution spaces.
This cycle repeats for many generations, with the population gradually evolving toward better solutions. You can fine-tune the process with powerful levers like elitism, which preserves the best individuals from each generation so that good solutions are not lost to random genetic operations.
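To make the loop concrete, here is a minimal sketch in Python. It tackles the toy “OneMax” problem (evolve a bitstring with as many 1s as possible); the fitness function, operators, and constants are illustrative choices, not a reference implementation.

```python
import random

# A minimal sketch of the selection -> reproduction -> mutation loop described
# above, on the toy "OneMax" problem: evolve a bitstring with as many 1s as possible.
# All constants below are illustrative assumptions, not tuned values.
GENOME_LEN, POP_SIZE, GENERATIONS, MUTATION_RATE, ELITE = 32, 50, 40, 0.02, 2

def fitness(genome):
    return sum(genome)  # count of 1 bits; higher is fitter

def tournament(pop, k=3):
    return max(random.sample(pop, k), key=fitness)  # selection

def crossover(a, b):
    cut = random.randint(1, GENOME_LEN - 1)  # reproduction: single-point crossover
    return a[:cut] + b[cut:]

def mutate(genome):
    return [bit ^ (random.random() < MUTATION_RATE) for bit in genome]  # random bit flips

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    next_gen = population[:ELITE]  # elitism: carry the best forward unchanged
    while len(next_gen) < POP_SIZE:
        next_gen.append(mutate(crossover(tournament(population), tournament(population))))
    population = next_gen

print("best fitness:", fitness(max(population, key=fitness)), "out of", GENOME_LEN)
```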
EAs are inherently parallel, allowing them to explore multiple solutions simultaneously. In the context of replacing LLMs, EAs could evolve text or code by starting with a random population of sequences and iteratively refining them based on a fitness function that measures their quality.
EAs can produce unexpected and creative outputs, potentially going beyond the patterns learned by LLMs. They can also be tailored to specific tasks or domains without requiring large pre-trained models, which makes them highly adaptable and keeps inference inexpensive.
Training is a different matter, however: the evolutionary process can be computationally expensive, especially for complex tasks. Control is another issue; it can be challenging to steer an EA precisely toward the output you are seeking.
Short Version: These algorithms mimic the process of natural selection to evolve solutions to problems. They start with a population of candidate solutions and iteratively apply genetic operations like mutation and crossover to create new generations. The fittest individuals, based on a defined fitness function, are selected to survive and reproduce, leading to the evolution of increasingly better solutions.
Neural Symbolic AI
This approach marries the connectionist power of neural networks with the logical capabilities of symbolic systems. Neural networks excel at learning patterns from data, while symbolic systems can reason with explicit knowledge and rules.
Neural Component: This typically consists of deep learning models trained on large datasets. They can process raw data (e.g., text, images) and extract meaningful representations.
Symbolic Component: This involves knowledge representation formalisms like logic or ontologies. It allows for reasoning, inference, and manipulation of symbols according to defined rules.
The key challenge is bridging these two worlds. One approach is to use neural networks to learn to ground symbols in raw data, enabling the system to interpret and reason about the world. Another approach is to use symbolic knowledge to guide the learning process of neural networks, improving their interpretability and generalization.
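Here is a toy sketch of that split in Python. The “neural” perception stage is stubbed with fixed outputs standing in for a trained network, and the symbolic stage is a naive forward-chaining rule engine; all predicates and rules are invented for illustration.

```python
# A toy sketch of the two components, assuming a hypothetical pipeline:
# a "neural" perception stage emits grounded symbols (stubbed here with fixed
# outputs standing in for a trained network), and a symbolic stage applies
# explicit rules to those symbols via naive forward chaining.

def neural_perception(image_id):
    # Stand-in for a trained network that extracts predicates from raw data.
    return {("is_animal", "x"), ("has_feathers", "x"), ("can_fly", "x")}

RULES = [
    # (premises, conclusion): if all premises hold for an entity, assert the conclusion.
    ({"is_animal", "has_feathers"}, "is_bird"),
    ({"is_bird", "can_fly"}, "migratory_candidate"),
]

def symbolic_reasoner(facts):
    """Keep applying rules until no new facts can be derived."""
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            for entity in {e for _, e in facts}:
                known = {p for p, e in facts if e == entity}
                if premises <= known and (conclusion, entity) not in facts:
                    facts.add((conclusion, entity))
                    changed = True
    return facts

print(symbolic_reasoner(neural_perception("img_001")))
# The derivation ("x is a bird because it is an animal with feathers") is the
# explicit reasoning trace that makes the decision explainable.
```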
“Explainability” has become an increasingly hot topic among the AI community. The symbolic component can provide explicit reasoning steps, making the system's decisions more transparent.
Humans use symbols to better understand their world. It helps us generalize. Symbolic reasoning can enable the system to handle novel situations not explicitly seen during training, a vital step toward true AI.
This all sounds fantastic, so what’s the catch?
Combining neural and symbolic components effectively remains a challenge. And, in the same spirit as the evolutionary approach mentioned earlier, symbolic reasoning can be computationally expensive over large knowledge bases.
Short Version: This approach combines the strengths of neural networks with symbolic reasoning. Neural networks excel at pattern recognition and learning from data, while symbolic reasoning enables logical deduction and manipulation of symbols. By integrating these two paradigms, neural symbolic AI aims to achieve more explainable and generalizable AI systems.
Bayesian Networks
These graphical models capture probabilistic relationships between variables. Each node in the network represents a variable, and directed edges represent conditional dependencies.
Structure Learning: This involves determining the network's topology (i.e., which variables are connected). Algorithms like constraint-based or score-based learning can be used, often relying on data to infer dependencies.
Parameter Learning: Once the structure is known, the conditional probability distributions associated with each node are learned from data.
Inference: Given observed evidence (values of some variables), Bayesian inference calculates the posterior probabilities of other variables. This allows for reasoning under uncertainty and making predictions based on available information.
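As a worked example, here is inference on a tiny two-node Rain → GrassWet network with made-up probability tables, computed by hand with Bayes’ rule rather than a library, so the mechanics stay visible.

```python
# A worked example of inference on a tiny Rain -> GrassWet network with
# made-up conditional probability tables, computed directly with Bayes' rule.
# (In practice you'd reach for a library, but enumeration keeps the mechanics visible.)

P_RAIN = {True: 0.2, False: 0.8}              # prior P(Rain)
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.15}   # P(GrassWet = True | Rain)

def posterior_rain_given_wet():
    """P(Rain | Wet) = P(Wet | Rain) * P(Rain) / P(Wet)."""
    joint = {r: P_WET_GIVEN_RAIN[r] * P_RAIN[r] for r in (True, False)}
    p_wet = sum(joint.values())               # marginalize Rain out to get P(Wet)
    return joint[True] / p_wet

print(f"P(Rain | GrassWet) = {posterior_rain_given_wet():.2f}")
# 0.9*0.2 / (0.9*0.2 + 0.15*0.8) = 0.18 / 0.30 = 0.60
```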
In replacing LLMs, Bayesian networks could be used to model the probabilistic relationships between words, topics, or sentiments, enabling tasks like text classification and sentiment analysis with a focus on uncertainty quantification.
Short Version: These are probabilistic graphical models that represent relationships between variables using directed acyclic graphs. Nodes in the graph represent variables, and edges represent probabilistic dependencies. Bayesian networks enable reasoning under uncertainty and provide a framework for making inferences and predictions based on observed evidence.
Neuroevolution
This approach leverages evolutionary algorithms to optimize neural network architectures or their parameters. The "genomes" in this case represent the network's structure or weights.
Encoding: Neural networks can be encoded in various ways, such as direct encoding of weights or indirect encoding of connection patterns.
Evolution: The population of neural networks undergoes selection, reproduction (e.g., crossover of weights or substructures), and mutation.
Fitness Evaluation: Each network is evaluated on the target task, and its performance determines its fitness.
Neuroevolution can discover novel architectures that are well-suited for specific tasks, potentially surpassing human-designed networks. It could be used to evolve specialized neural networks for language processing tasks, offering an alternative to pre-trained LLMs.
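Here is a minimal neuroevolution sketch: the genome is the flattened weight vector of a tiny fixed-topology network, evolved with truncation selection and Gaussian mutation to solve XOR. The topology and hyperparameters are illustrative only.

```python
import numpy as np

# A minimal neuroevolution sketch, assuming a tiny fixed-topology 2-4-1 network
# whose flattened weight vector is the "genome", evolved with truncation selection
# and Gaussian mutation to solve XOR. Hyperparameters are illustrative, not tuned.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

def forward(genome, x):
    # Genome layout: 2x4 hidden weights, 4 hidden biases, 4 output weights, 1 output bias.
    w1, b1 = genome[:8].reshape(2, 4), genome[8:12]
    w2, b2 = genome[12:16], genome[16]
    h = np.tanh(x @ w1 + b1)
    return 1 / (1 + np.exp(-(h @ w2 + b2)))

def fitness(genome):
    return -np.mean((forward(genome, X) - y) ** 2)   # negative MSE: higher is better

population = [rng.normal(0, 1, 17) for _ in range(60)]
for _ in range(300):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                        # truncation selection
    offspring = [p + rng.normal(0, 0.3, 17) for p in parents for _ in range(5)]
    population = parents + offspring                 # elitist (mu + lambda) style

print(np.round(forward(max(population, key=fitness), X), 2))  # should approach [0, 1, 1, 0]
```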
Short Version: This approach combines neural networks with evolutionary algorithms. Neural networks are used to represent the behavior or structure of solutions, and evolutionary algorithms are used to optimize the parameters or topology of the neural networks. Neuroevolution has been applied to tasks like evolving neural network controllers for robots and designing artificial neural networks for specific tasks.
So Which Path Is Right?
That’s the quadrillion-dollar question.
The answer may not be knowable with our current level of development — several of the paths I explained above are still being developed and refined in laboratory environments.
Things are moving fast, though.
It's also important to recognize that the "real AI mountain" might not be a single peak, but rather a vast and complex landscape. LLMs, despite their limitations, could still contribute significantly to the journey towards AI, providing valuable tools and insights along the way.
I believe the “answer” is a blend of evolutionary approaches and symbolic systems. I’ve been thinking about this a lot and there are multiple ways you can fuse these strategies together, such as:
Evolutionary Neural Architecture Search (ENAS)
ENAS utilizes evolutionary algorithms to automatically design neural network architectures, optimizing their performance on specific tasks. By incorporating symbolic representations of network components and their connections, ENAS could evolve more interpretable and modular networks, potentially leading to improved generalization and efficiency.
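A toy sketch of the idea, assuming the genome is simply a list of hidden-layer widths. The fitness function below is a fake proxy; a real ENAS loop would briefly train each candidate and score it on validation accuracy, which is far too expensive to inline here.

```python
import random

# A toy sketch of evolutionary architecture search, assuming the genome is simply
# a list of hidden-layer widths. The fitness function here is a fake proxy; a real
# ENAS loop would briefly train each candidate and return validation accuracy.

SEARCH_SPACE = [16, 32, 64, 128, 256]

def random_architecture():
    return [random.choice(SEARCH_SPACE) for _ in range(random.randint(1, 4))]

def fitness(arch):
    # Placeholder: prefer a moderate total width and fewer layers. Swap in
    # "train briefly, score on a validation set" for the real thing.
    return -abs(sum(arch) - 192) - 10 * len(arch)

def mutate(arch):
    arch = arch.copy()
    if random.random() < 0.5 and len(arch) < 4:
        arch.insert(random.randrange(len(arch) + 1), random.choice(SEARCH_SPACE))  # add a layer
    else:
        arch[random.randrange(len(arch))] = random.choice(SEARCH_SPACE)            # resize a layer
    return arch

population = [random_architecture() for _ in range(20)]
for _ in range(50):
    population.sort(key=fitness, reverse=True)
    survivors = population[:5]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(15)]

print("best architecture found:", max(population, key=fitness))
```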
Symbolic Genetic Programming (SGP)
SGP extends traditional genetic programming by allowing the evolution of programs composed of both symbolic expressions and neural network components. This hybrid approach enables the combination of symbolic reasoning with the learning capabilities of neural networks, potentially facilitating the emergence of systems capable of both understanding abstract concepts and handling complex data.
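Here is a compact sketch of the purely symbolic half of that idea: classic genetic programming over expression trees, evolving toward the target x² + x. The primitive set and hyperparameters are illustrative, and the neural components the hybrid would add are omitted.

```python
import random

# A compact sketch of the purely symbolic half: classic genetic programming over
# expression trees (nested tuples), evolving toward the target function x**2 + x.
# The primitive set and hyperparameters are illustrative; the neural components
# the hybrid would add are omitted.

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
TERMINALS = ["x", 1.0, 2.0]

def random_tree(depth=2):
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    return (random.choice(list(OPS)), random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, float):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def fitness(tree):
    xs = [i / 2 for i in range(-6, 7)]
    return -sum((evaluate(tree, x) - (x * x + x)) ** 2 for x in xs)  # negative squared error

def mutate(tree):
    if random.random() < 0.3 or not isinstance(tree, tuple):
        return random_tree(2)                         # replace a subtree wholesale
    op, left, right = tree
    return (op, mutate(left), right) if random.random() < 0.5 else (op, left, mutate(right))

population = [random_tree(3) for _ in range(100)]
for _ in range(60):
    population.sort(key=fitness, reverse=True)
    population = population[:20] + [mutate(random.choice(population[:20])) for _ in range(80)]

print("best expression:", max(population, key=fitness))
```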
Neuro-Symbolic Reinforcement Learning
This approach combines reinforcement learning, where agents learn through trial and error, with symbolic representations of states, actions, and goals. By grounding the learning process in a symbolic framework, agents could develop more robust and generalizable strategies, facilitating transfer learning and explainable decision-making.
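A minimal sketch of the symbolic side of that idea, assuming a toy corridor world where the state is a tuple of predicates rather than raw coordinates, with plain tabular Q-learning supplying the trial and error; a fuller hybrid would replace the Q-table with a neural value function.

```python
import random

# A minimal sketch of the symbolic side of this idea, assuming a toy corridor
# world where the state is a tuple of predicates rather than raw coordinates,
# and plain tabular Q-learning supplies the trial and error. A fuller hybrid
# would replace the Q-table with a neural value function.

ACTIONS = ["left", "right"]
GOAL = 4  # agent starts at cell 0, goal at cell 4

def symbolic_state(pos):
    # Ground the raw position into human-readable predicates.
    return ("at_goal" if pos == GOAL else "not_at_goal",
            "near_goal" if GOAL - pos <= 1 else "far_from_goal")

def step(pos, action):
    pos = max(0, min(GOAL, pos + (1 if action == "right" else -1)))
    return pos, (1.0 if pos == GOAL else -0.05)  # small cost per move, reward at goal

Q = {}
alpha, gamma, epsilon = 0.5, 0.9, 0.2
for _ in range(200):
    pos = 0
    while pos != GOAL:
        s = symbolic_state(pos)
        if random.random() < epsilon:
            a = random.choice(ACTIONS)                             # explore
        else:
            a = max(ACTIONS, key=lambda act: Q.get((s, act), 0.0)) # exploit
        pos, reward = step(pos, a)
        s_next = symbolic_state(pos)
        best_next = max(Q.get((s_next, act), 0.0) for act in ACTIONS)
        old = Q.get((s, a), 0.0)
        Q[(s, a)] = old + alpha * (reward + gamma * best_next - old)

for (state, action), value in sorted(Q.items()):
    print(state, action, round(value, 2))  # learned policy: "right" dominates everywhere
```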
Evolutionary Neuro-Symbolic Integration
This innovative amalgamation involves evolving both the structure and parameters of neural networks alongside symbolic knowledge bases. Evolutionary algorithms could optimize the interaction between these two components, enabling the emergence of systems capable of both learning from data and reasoning with symbolic representations, potentially bridging the gap between connectionist and symbolic AI paradigms.
Evolutionary Developmental Neuro-Symbolic Systems
Borrowed from developmental psychology, this approach involves evolving AI systems that start with simple symbolic representations and gradually develop more complex ones through interaction with the environment. This could lead to the emergence of systems with a deeper understanding of the world, grounded in their own experiences and capable of adapting to novel situations.
These are just a few potential avenues for blending evolutionary and symbolic systems in AI research. By combining the strengths of both paradigms, we gain AI systems that are more capable and adaptable.
The integration of evolutionary algorithms with symbolic representations offers a promising path toward achieving true artificial intelligence, enabling systems to not only learn from data but also reason, generalize, and understand the world in a more human-like manner.
If we can deliver an AI system capable of generalizing and understanding the world in the way humans do… we’ll find ourselves atop Mount AI.
Ultimately, the pursuit of true AI requires a multifaceted approach, embracing both existing paradigms like LLMs and exploring new frontiers in AI research. Only through continuous experimentation and collaboration can we hope to climb Mount AI and unlock the full potential of intelligence, artificial or otherwise.
Can I ask you for a small favor?
🥺 Please follow our newly launched video channel, Hacking Wealth. We need to reach 100 followers before Rumble will unlock a named URL.
Thank you for helping us grow and thank you for reading Life in the Singularity.
I started this in May 2023 to track all the accelerating changes in AI. New players, new models, new methods… a big new future materializing in front of us.
Our audience now includes Big Tech engineers and executives, incredible technologists, Fortune 500 board members and thousands of people looking to leverage technology to maximize the utility in their lives.
To help us continue our growth, would you please Like, Comment and Share this?
Thank you again!!!