AI Atlas: FunSearch: Leveraging AI Hallucinations to Make New Discoveries in Mathematics

AI breakthroughs, concepts, and techniques that are tangibly valuable, specific, and actionable. Written by Glasswing Founder and Managing Partner, Rudina Seseri

🗺️ What is FunSearch?

FunSearch is an AI system developed at Google DeepMind and released in December 2023. The system, whose name comes from the fact that it “searches the function space” of code, uses a large language model (LLM) to find creative solutions in mathematics and computer science. According to DeepMind, this is the first time an LLM has been used to make a new discovery on an open problem in mathematics. The system made headlines when it found larger solutions to the “cap set problem,” a long-standing open problem in mathematics, than any previously known, which DeepMind described as the largest improvement on the problem in 20 years.

In essence, FunSearch is an ensemble method in the broad sense of a combination of techniques: it pairs a pre-trained LLM with an automated evaluator in order to solve complex math problems. The LLM writes programs to solve a given mathematical problem, and each program is automatically executed and scored. The best programs are retained and shown to the LLM again as the starting point for the next iteration. The loop has no built-in stopping point; it runs until a user decides to stop it and retrieve the highest-scoring programs.
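The loop described above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not DeepMind's implementation: `mock_llm` is a hypothetical stand-in that randomly tweaks a number instead of calling a real language model, the "problem" (return a value close to 42) is invented for the example, and the pool of retained programs is a plain sorted list.

```python
import random
import re


def evaluate(source: str):
    """Execute a candidate program and score it. None means it failed."""
    namespace = {}
    try:
        # Toy only: real systems must sandbox untrusted generated code.
        exec(source, namespace)
        value = namespace["solve"]()
        return -abs(value - 42)  # invented objective: closer to 42 is better
    except Exception:
        return None


def mock_llm(parent: str, rng: random.Random) -> str:
    """Hypothetical stand-in for the LLM: vary the parent, sometimes wrongly."""
    if rng.random() < 0.2:
        # Simulated hallucination: a program that crashes when run.
        return "def solve(): return undefined_name"
    old = int(re.search(r"-?\d+", parent).group())
    return f"def solve(): return {old + rng.randint(-5, 5)}"


def search(iterations: int, keep: int = 5, seed: int = 0):
    """Generate, evaluate, and retain the best programs for later rounds."""
    rng = random.Random(seed)
    start = "def solve(): return 0"
    pool = [(evaluate(start), start)]
    for _ in range(iterations):
        parent = rng.choice(pool)[1]       # pick a retained program
        child = mock_llm(parent, rng)      # ask the "LLM" for a variation
        score = evaluate(child)            # run and score it automatically
        if score is None:
            continue                       # failed programs are discarded
        pool.append((score, child))
        pool.sort(key=lambda item: item[0], reverse=True)
        del pool[keep:]                    # keep only the top scorers
    return pool[0]
```

Calling `search(iterations=300)` returns the best `(score, source)` pair found so far. The user can stop at any point and read off the current best program, which mirrors the open-ended loop described above.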

This basic structure may sound familiar – it is reminiscent of a GAN, where a generator competes to fool another neural network into accepting synthetic input as real. The use of a mathematical evaluator may also sound similar in concept to Physics-Informed Neural Networks, where equations representing the natural laws of physics act as constraints to embed domain knowledge into digital models. The difference is that FunSearch is not attempting to simulate or dynamically mimic reality; rather, the integrated scoring system hones the LLM for a single predefined end goal.

🤔 What is the significance of FunSearch, and what are its limitations?

FunSearch is unique because it embraces the tendency for large language models to hallucinate, or confidently output factually incorrect information. Whereas in use cases such as content generation this behavior may be harmful, for FunSearch these hallucinations represent creative “brainstorming” on the path to an optimized solution. The LLM generates thousands of potential solutions and the evaluator weeds out the incorrect ones. Only the programs that survive are fed back to the LLM, creating a positive feedback loop. This strategy results in several important advantages:

Resistance to hallucinations: The use of an evaluator to automatically review programs represents a safeguard against incorrect responses and enables FunSearch to continuously iterate on only the best results.
Interpretability: Rather than generating a solution for a problem, FunSearch generates a program that finds the solution. This means that its outputs can be easily inspected and understood before deployment.
Scalability: FunSearch is designed to find highly compact solutions, or short programs describing very large objects. This makes the system ideal for scaling to “needle-in-a-haystack” problems such as detecting rare outliers in datasets.
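To make the idea of a short program describing a very large object concrete, here is a minimal sketch. It is not a FunSearch output: the function name `build_cap_set` is hypothetical and the construction is a simple, well-known one that is far smaller than the record-setting sets. A cap set is a set of points whose coordinates are 0, 1, or 2, in which no three distinct points sum to zero modulo 3 in every coordinate.

```python
from itertools import product


def build_cap_set(n: int):
    """Return every point with coordinates in {0, 1} in n dimensions.

    Three such points can only sum to zero modulo 3 in a coordinate if
    they agree there, so three distinct points never sum to zero in
    every coordinate. The result is a valid cap set of size 2**n.
    """
    return list(product((0, 1), repeat=n))
```

The function body is one line, yet `build_cap_set(20)` describes more than a million points. A reader can inspect the program and see why it works without examining the points themselves.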

However, while FunSearch is an exciting proof of concept that points to broader value, several limitations will need to be addressed before enterprises can make use of such a system.

Problems need to be explicitly defined: FunSearch can only solve problems once the parameters have been specifically defined by human researchers. Without this, the system has no framework for what it is working towards.
There needs to be a correct answer to the problem: In order for the evaluation component to work, the quality of an output must be measurable with a score. For example, a program that solves a given problem better and with less wasted computation should score higher. Because of this, FunSearch is well-suited for mathematical code but it could not be used for content generation, where defining a “correct response” is nuanced.
Solutions need to be verified automatically: The evaluator needs to be able to automatically execute and make judgements on outputs from the LLM. This makes it infeasible for testing hypotheses in areas such as biology, where doing so often requires lab experiments.
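The three requirements above are easier to see with a concrete evaluator. The sketch below scores a candidate answer to the cap set problem, assuming the usual definition: points have coordinates 0, 1, or 2, and no three distinct points may sum to zero modulo 3 in every coordinate. The function names `is_cap_set` and `score` are hypothetical and chosen for illustration; this is not DeepMind's evaluator.

```python
from itertools import combinations


def is_cap_set(points) -> bool:
    """Check that no three distinct points sum to zero modulo 3."""
    pts = [tuple(p) for p in points]
    unique = set(pts)
    if len(unique) != len(pts):
        return False  # duplicate points are not allowed
    if any(c not in (0, 1, 2) for p in pts for c in p):
        return False  # coordinates must be 0, 1, or 2
    for a, b in combinations(pts, 2):
        # The only point that completes a forbidden triple with a and b.
        third = tuple((-x - y) % 3 for x, y in zip(a, b))
        if third in unique:
            return False
    return True


def score(points):
    """Return the set size if valid, or None so the candidate is discarded."""
    return len(points) if is_cap_set(points) else None
```

The problem is explicitly defined (the rule inside `is_cap_set`), it has a measurable notion of better (a larger valid set), and checking takes a fraction of a second with no human in the loop. A hypothesis that needs a lab experiment offers no equivalent of `score`.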

🛠️ Applications of FunSearch

The use of FunSearch, or similarly constructed AI systems, to optimize mathematical operations has applications across a wide range of industries and enterprises.

Inventory and supply chain management: FunSearch was used to generate tailored programs for the bin-packing problem, in which items of different sizes must be fit into the smallest possible number of bins. These programs outperformed widely used heuristics.
Data storage and computation: Optimization strategies similar to the bin-packing problem can be employed to distribute computational resources across servers and data centers or to break up complex processes.
Code optimization: With a defined end goal, expanded applications of FunSearch can be used to improve code written by programmers by automatically evaluating and iterating on what they produce.
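For the bin-packing application, DeepMind reports that FunSearch evolved the heuristic that decides which bin receives each incoming item. The sketch below shows that setup in miniature, with assumptions labeled: `pack`, `best_fit`, and `worst_fit` are hypothetical names, the two priority functions are classic hand-written heuristics rather than FunSearch outputs, and the item sizes are invented for the example.

```python
from typing import Callable, Sequence


def pack(items: Sequence[int], capacity: int,
         priority: Callable[[int, int], float]) -> int:
    """Place items one at a time and return the number of bins used.

    `priority(item, remaining)` ranks each bin that can hold the item.
    This is the slot where a generated heuristic would be plugged in.
    """
    remaining = []  # free space left in each open bin
    for item in items:
        fits = [i for i, space in enumerate(remaining) if space >= item]
        if fits:
            best = max(fits, key=lambda i: priority(item, remaining[i]))
            remaining[best] -= item
        else:
            remaining.append(capacity - item)  # open a new bin
    return len(remaining)


def best_fit(item: int, remaining: int) -> float:
    """Prefer the bin that would be left with the least free space."""
    return -(remaining - item)


def worst_fit(item: int, remaining: int) -> float:
    """Prefer the bin that would be left with the most free space."""
    return remaining - item
```

On the invented input `[6, 5, 4, 5]` with capacity 10, `best_fit` uses 2 bins and `worst_fit` uses 3. An evaluator can use the bin count as the score, so fewer bins means a better heuristic. The same pattern applies to the resource-allocation case above if servers are treated as bins and workloads as items.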

Still curious about how FunSearch fits into the overall AI and machine learning landscape? As an ensemble method, it combines multiple techniques to leverage the strengths of each – in this case, an LLM and an automated evaluator. This is best illustrated visually through the AI Palette that Glasswing open-sourced last month!
