Tao on AlphaEvolve

Terence Tao summarizes AlphaEvolve nicely:

Very roughly speaking, this is a tool that can attempt to extremize functions F(x) with x ranging over a high dimensional parameter space Omega, that can outperform more traditional optimization algorithms when the parameter space is very high dimensional and the function F (and its extremizers) have non-obvious structural features

AlphaEvolve thrives on problems with simple objective functions, i.e. problems for which we can cheaply and automatically judge whether a solution is correct or close to correct. This property lets the system use a cheap, fast large language model to generate a large number of candidate programs, rapidly evaluate them, and iteratively improve the best candidates, driving the evolutionary process forward.
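As a rough illustration of that loop, here is a toy Python sketch. It is not AlphaEvolve's implementation: llm_propose_variant, evaluate, and the numeric "programs" are placeholder assumptions chosen only so the example runs end to end; in the real system the proposer is a fast LLM editing source code and the evaluator executes each candidate program against the objective.

```python
import random

def llm_propose_variant(parents: list[float]) -> float:
    # Placeholder for prompting a cheap, fast LLM with parent programs
    # and parsing an edited program out of its reply.
    return random.choice(parents) + random.uniform(-1.0, 1.0)

def evaluate(program: float) -> float:
    # Placeholder objective: higher is better, cheap and automatic to compute.
    return -abs(program - 42.0)

def evolve(seeds: list[float], generations: int = 500, pool_size: int = 20) -> float:
    """Keep a pool of the best-scoring candidates and repeatedly ask the
    proposer for improved variants of randomly chosen parents."""
    pool = [(evaluate(p), p) for p in seeds]
    for _ in range(generations):
        parents = [p for _, p in random.sample(pool, k=min(3, len(pool)))]
        child = llm_propose_variant(parents)
        pool.append((evaluate(child), child))
        pool.sort(reverse=True)        # best candidates first
        pool = pool[:pool_size]        # drop the weakest
    return pool[0][1]                  # best "program" found

print(evolve([0.0, 100.0]))            # converges toward 42.0
```

The essential shape is the point: propose cheaply, evaluate automatically, keep only the best.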

From Gemini:

The Role of MAP-Elites and Island Models

To effectively navigate the vast and complex search space of all possible computer programs, AlphaEvolve integrates two powerful techniques from evolutionary computation: MAP-Elites and an island-based population model. These methods work together to ensure the search is both broad and deep, preventing it from getting stuck on a single, suboptimal solution.

  • MAP-Elites (Multi-dimensional Archive of Phenotypic Elites): This algorithm is designed to find a diverse set of high-performing solutions rather than just the single best one. It maintains a “map” or a grid, where each cell represents a different type of solution based on specific characteristics, or “phenotypes.” The grid then stores the “elite”—the highest-scoring program—for each specific combination of these traits. This process encourages exploration across the entire landscape of possible algorithmic strategies, preserving interesting but not-yet-optimal solutions that a traditional search might discard.
  • Island-Based Population Model: In this model, the total population of programs is segregated into smaller, semi-isolated subpopulations, or “islands.” Evolution, driven by the LLM acting as a mutation and crossover operator, occurs primarily within each island. This allows different islands to independently explore different regions of the search space, effectively pursuing multiple distinct strategies in parallel. Periodically, high-performing programs, or “elites,” are allowed to “migrate” between islands, sharing successful discoveries and cross-pollinating ideas without homogenizing the entire population too quickly. This structured separation is key to maintaining diversity and avoiding premature convergence to a local optimum. (A toy sketch combining both mechanisms follows this list.)
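To make these two mechanisms concrete, here is a toy, self-contained Python sketch that keeps one MAP-Elites archive per island and periodically migrates elites between islands. The string "programs" and the score, describe, and mutate functions are placeholder assumptions chosen only so the loop runs end to end; mutate stands in for the LLM operator, and nothing here is the paper's actual implementation.

```python
import random
import string

ALPHABET = string.ascii_lowercase

def score(program: str) -> int:
    # Placeholder objective: reward occurrences of the substring "abc".
    return program.count("abc")

def describe(program: str) -> tuple[int, int]:
    # Phenotype descriptor: (length bucket, number of distinct characters).
    return (len(program) // 8, len(set(program)))

def mutate(program: str) -> str:
    # Stand-in for the LLM proposing an edited variant of a parent program.
    i = random.randrange(len(program))
    return program[:i] + random.choice(ALPHABET) + program[i + 1:]

NUM_ISLANDS = 4
MIGRATE_EVERY = 200

# One MAP-Elites archive per island: phenotype cell -> (score, program).
islands: list[dict] = [dict() for _ in range(NUM_ISLANDS)]

def try_insert(archive: dict, program: str) -> None:
    """Keep a candidate only if it beats the current elite of its cell."""
    cell, s = describe(program), score(program)
    if cell not in archive or s > archive[cell][0]:
        archive[cell] = (s, program)

# Seed each island with random programs, then evolve.
for archive in islands:
    for _ in range(10):
        try_insert(archive, "".join(random.choices(ALPHABET, k=24)))

for step in range(1, 5001):
    for archive in islands:
        _, parent = random.choice(list(archive.values()))
        try_insert(archive, mutate(parent))
    if step % MIGRATE_EVERY == 0:
        # Migration: each island offers its single best elite to a neighbour,
        # spreading discoveries without homogenizing the populations.
        for i, archive in enumerate(islands):
            best_score, best_program = max(archive.values())
            try_insert(islands[(i + 1) % NUM_ISLANDS], best_program)

best = max(max(archive.values()) for archive in islands)
print("best score:", best[0], "best program:", best[1])
```

The archive keeps one elite per phenotype cell (diversity), while the islands explore independently and only exchange their best discoveries at intervals (avoiding premature convergence).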

Generating Algorithms, Not Just Answers

A defining feature of AlphaEvolve is its ability to generate algorithms that solve a problem, rather than just outputting a direct solution to a single instance of that problem. This distinction is what makes the system so powerful and widely applicable.

  • Power Through Generalization: Instead of simply finding the shortest path between two points, AlphaEvolve might discover a new, more efficient algorithm for finding shortest paths in general. Its discovery of a faster sorting algorithm for sequences of 3-5 elements is a perfect example. While sorting a tiny list is trivial, discovering a novel, provably faster method for doing so is a genuine contribution to computer science. This output is generalizable, reusable, and can be applied to an entire class of problems. (A toy illustration of this distinction follows the list.)
  • Interpretability and Discovery: The output is human-readable source code. Scientists and mathematicians can analyze, understand, and formally verify the generated algorithms. This transparency turns the AI into a collaborative tool for discovery. When AlphaEvolve found a new construction for the Cap Set problem, it produced a program that not only yielded a better result but also revealed a new structure that mathematicians could study and learn from.
  • Wide Applicability: This approach is not limited to theoretical computer science or mathematics. It can be applied to any domain where success can be defined by an optimal program or procedure, including discovering new molecules, designing engineering components, or creating financial modeling strategies.
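To make the algorithm-versus-answer distinction concrete, here is a toy Python sketch. The candidate is a hand-written three-element sorting network standing in for an evolved program (it is not AlphaEvolve's actual discovery); the point is that the evaluator scores a procedure by running it on many instances, whereas a direct answer would cover only one.

```python
import random

def candidate_sort3(a, b, c):
    # A three-element sorting network: three compare-and-swap steps.
    if a > b: a, b = b, a
    if b > c: b, c = c, b
    if a > b: a, b = b, a
    return a, b, c

def evaluate_sorter(sorter, trials: int = 1_000) -> float:
    """Score a candidate algorithm by how often it sorts random inputs correctly."""
    correct = 0
    for _ in range(trials):
        xs = [random.randint(0, 100) for _ in range(3)]
        if list(sorter(*xs)) == sorted(xs):
            correct += 1
    return correct / trials

print(evaluate_sorter(candidate_sort3))  # 1.0: the program generalizes to every instance
```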

I suppose I had never thought about using large language models to navigate a high-dimensional solution space, but with the advent of very fast models such as Google’s Gemini Flash, it is becoming increasingly feasible. Letting large language models steer the search through such high-dimensional spaces could prove revolutionary: the apparent creativity they bring to the table may produce machines that look and act dangerously similar to human scientists and mathematicians.

Resources: Terence Tao Quote, AlphaEvolve Paper