September 2024
The Emergence of SuperIntelligence
Future Thinking: The Emerging Intelligence
Natural Selection: What do cells, plants, animals, and humans all have in common? They all use future thinking to survive and to anticipate changes in their environment: cells undergo programmed death to preserve organs, plants grow roots deeper before droughts, animals foresee danger, and humans earn money to buy food. We all think about the future and try to choose the best actions based on acquired knowledge. This natural-selection process led to the emergence of intelligence in living organisms.
Biological Intelligence: The relation between prediction and intelligence has gained significant traction in cognitive neuroscience since Predictive Coding was introduced (Rao and Ballard, 1999). As Bubic et al. (2010) argue, prediction enables organisms to direct their behaviour towards the future while remaining grounded in the present. Recent theoretical work has proposed that intelligence itself can be approximated as the ability to produce accurate predictions (Tjøstheim and Stephens, 2022). This perspective aligns well with our focus on developing AI models specifically trained to predict future events in surgical contexts. Brown et al. (2022) suggest that over the lifespan, the brain becomes more effective at predicting (utilising knowledge) compared to learning. In the surgical context, this may imply that experienced surgeons develop enhanced predictive capabilities, allowing them to anticipate and respond to surgical scenarios more effectively than novices.
Complexity's Sweetspot for Developing Intelligence
Predicting Complex Systems: Complexity itself, rather than the quality of the dataset, can drive the development of sophisticated behaviours in models. This observation has broader implications for data curation, neural network training, and even our understanding of human cognition, where balancing predictability and randomness appears crucial for emergent intelligence.
Hypothesis: Intelligence can emerge from modeling simple systems as long as they exhibit complex behaviors, even when the process that generates the data lacks inherent intelligence.
Conjecture: "intelligence arises from the ability to predict complexity and that creating intelligence may require only exposure to complexity."
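A toy illustration of this hypothesis (not from the original text): the logistic map is a one-line update rule with no inherent intelligence, yet in its chaotic regime it produces sequences that are genuinely hard to predict. A model trained to forecast such a system must recover the underlying dynamics, which is the kind of "complexity without an intelligent data generator" the conjecture points at.

```python
def logistic_map(x0, r, steps):
    """Generate a trajectory of the logistic map x -> r * x * (1 - x)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

# In the chaotic regime (r = 4), two nearly identical starting points
# diverge rapidly, so long-horizon prediction demands a faithful model
# of the generating rule, not just memorisation of past values.
a = logistic_map(0.200000, r=4.0, steps=50)
b = logistic_map(0.200001, r=4.0, steps=50)
print(max(abs(x - y) for x, y in zip(a, b)))  # order-1 divergence from a 1e-6 perturbation
```

The rule itself is trivial; the difficulty lies entirely in the behaviour it generates.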
Scaling Intelligence: Learning and Search
Scaling General Learning Methods: Scaling laws show that pre-training large neural networks on more data becomes increasingly effective at constructing world models as the number of parameters grows, and models with higher parameter counts have exhibited emergent capabilities. These deep networks learn hierarchical features—from low-level to high-level representations—by distributing information across many layers, which lets them approximate complex functions efficiently and capture much of the knowledge humanity has accumulated. As Richard Sutton wrote in "The Bitter Lesson," the actual contents of minds are tremendously complex, and attempts to simplify them can be limiting. Instead of hard-coding specific knowledge or structures, we should focus on developing general learning methods that can discover and capture this arbitrary complexity on their own.
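The scaling laws mentioned above typically take a power-law form: loss falls as a constant power of parameter count, which is a straight line in log-log space. A minimal sketch, where the constants `alpha` and `n_c` are illustrative placeholders (roughly of the order reported in published LLM scaling-law fits, not measurements from this document):

```python
# Hypothetical power-law scaling curve: L(N) = (N_c / N) ** alpha.
# alpha and n_c below are illustrative constants, not fitted values.
alpha, n_c = 0.076, 8.8e13

def predicted_loss(n_params):
    """Pre-training loss predicted by the power law for n_params parameters."""
    return (n_c / n_params) ** alpha

# Doubling parameters multiplies loss by the constant factor 2 ** -alpha:
# steady improvement with compute, but with diminishing returns.
for n in (1e9, 1e10, 1e11):
    print(f"{n:.0e} params -> predicted loss {predicted_loss(n):.3f}")
```

The key property is the constant multiplicative gain per doubling, which is why capability keeps improving as models grow while each doubling buys a little less.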
Scaling Solution Search Methods: As Sutton highlighted, only two techniques scale indefinitely with compute: learning and search. Humans' ability to reason with an internal monologue (Kahneman's System 2) lets them search over ideas before responding. This cognitive process inspired Chain-of-Thought prompting, which improves language models on reasoning tasks. Interestingly, recent work has highlighted the potential of scaling inference-time compute, i.e., search, as an alternative to simply increasing the number of learnable parameters. Brown et al. (2024) and Snell et al. (2024) demonstrated that optimally allocating test-time compute can outperform much larger models on reasoning tasks. By leveraging the LLM as a simulator and exploring multiple potential futures, we may achieve more robust and accurate predictions without requiring ever-larger models.
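One simple form of scaling test-time compute is repeated sampling with majority voting (often called self-consistency). A minimal sketch: `sample_answer` below is a toy stub standing in for a stochastic LLM call (it returns the right answer 60% of the time), not a real model API.

```python
import random
from collections import Counter

def sample_answer(question, rng):
    """Toy stand-in for one stochastic LLM sample: correct 60% of the time."""
    return "42" if rng.random() < 0.6 else rng.choice(["41", "43"])

def best_of_n(question, n, seed=0):
    """Draw n independent samples and return the most common answer."""
    rng = random.Random(seed)
    votes = Counter(sample_answer(question, rng) for _ in range(n))
    return votes.most_common(1)[0][0]

# Spending more inference-time compute (more samples) makes the majority
# answer more reliable, without touching the model's parameters.
print(best_of_n("toy question", n=1))
print(best_of_n("toy question", n=101))
```

With a single sample the answer is a coin flip weighted at 60%; with many samples, the wrong answers split their votes and the majority converges on the correct one. This captures the intuition behind trading model size for search at inference time.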
General Intelligence vs. Human-Like Intelligence
General intelligence differs fundamentally from human-like intelligence; AI can exhibit many forms of general intelligence that do not resemble human cognition. Intelligence is often assessed against human standards, but AI develops according to different principles, governed by physics and technology rather than biological evolution. Key differences between human and artificial intelligence include:
- Structure: Human intelligence is bound to neural biology, while AI operates with independent hardware and software, allowing learned skills to be easily copied.
- Speed: AI systems communicate much faster, nearly at the speed of light, compared to the slower human nervous system.
- Communication: AI can directly connect and collaborate, unlike humans who rely on language with limited bandwidth.
- Updatability: AI can be rapidly updated and scaled, whereas humans cannot expand their cognitive capabilities instantly.
- Energy Consumption: The human brain is far more energy-efficient compared to AI systems, which require significant power.
These differences mean AI can process information faster, avoid miscommunication, and form highly integrated systems. For instance, autonomous vehicles can coordinate seamlessly because they operate under the same algorithms.