This is it folks. AI designs better AI

By AI Search


Key Concepts

  • ASI ARC (AI-driven Scientific Innovation Autonomous Research Creation): A framework for AI to autonomously design better AI models.
  • Neural Architecture Search (NAS): An earlier algorithm that searches for the best neural network structure for a specific task, limited to human-defined techniques.
  • Closed Evolutionary Loop: The core framework of ASI ARC, comprising Researcher, Engineer, Analyst, and Cognition Base components.
  • Researcher Component: The creative brain of the system, generating new AI model designs by referencing the cognition base and past experiments.
  • Engineer Component: Evaluates new AI model designs by training the models in a real code environment, utilizing a self-revision mechanism to debug and fix errors.
  • Fitness Score: A blended quantitative and qualitative measure used to evaluate the quality of AI model designs, incorporating performance metrics and expert evaluation.
  • Analyst Component: Gathers existing information from the Cognition Base and synthesizes experimental results to inform the next design cycle.
  • Cognition Base: A database of existing human knowledge extracted from scientific papers, used by the Analyst and Researcher components.
  • Two-Stage Exploration and Verification: A strategy to efficiently explore and validate AI model designs, using smaller models and less data in the exploration stage, and scaling up the top performers for rigorous testing in the verification stage.
  • Linear Attention Architectures: A special type of transformer that is generally smaller and faster, the focus of the AI model designs in the paper.
  • AlphaGo Moment: A breakthrough in AI where an AI system came up with an entirely new strategy that humans never thought of, demonstrating the ability to innovate and create something beyond human capabilities.

Problem Statement: The Bottleneck in AI Advancement

The speaker starts by highlighting the rapid progress in AI, with increasingly performant models appearing regularly. However, the pace of innovation is constrained by human cognitive capacity: human researchers are the limiting factor, creating a bottleneck in AI advancement. Existing models like ChatGPT, Grok, and Gemini are designed primarily by humans. While computational power keeps increasing through massive data centers (e.g., at xAI and Meta), human innovation cannot keep up.

  • Quote: "The pace of AI research itself remains linearly bounded by human cognitive capacity."

ASI ARC: Autonomous AI Model Design

The paper introduces ASI ARC, a system that autonomously conducts scientific research and designs next-generation AI models, removing the human bottleneck. It uses AI to find new neural architectures for AI in a fully autonomous system.

Limitations of Neural Architecture Search (NAS)

Previous attempts to automate AI design, like NAS, were limited because they only explored human-defined techniques, lacking the creativity to invent entirely new approaches.

The ASI ARC Framework: A Closed Evolutionary Loop

ASI ARC operates as a closed evolutionary loop with four main components:

1. Researcher Component

  • Functions as the creative brain, generating brand new ideas for AI model designs, not just minor tweaks.
  • Learns from the Cognition Base (past experiments and scientific papers) to conceive new architectural ideas.
  • Writes computer code for new AI designs and checks for novelty and validity, preventing redundant designs and errors.
  • Samples a parent from the top 10 best-performing architectures and incorporates ideas from the top 50 to create new designs. This process loops continuously, creating a self-evolving process.
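The Researcher's sampling step can be sketched as follows. This is a minimal illustration of the top-10 parent / top-50 idea-pool loop described above; the function and field names (`archive`, `fitness`, `mutate_fn`) are illustrative, not from the paper's code.

```python
import random

def propose_next_design(archive, mutate_fn, top_parents=10, idea_pool=50):
    """Pick a parent from the 10 best designs, then draw inspiration
    from the wider top-50 pool, mirroring the Researcher's loop."""
    ranked = sorted(archive, key=lambda d: d["fitness"], reverse=True)
    parent = random.choice(ranked[:top_parents])   # parent from the top 10
    inspirations = ranked[:idea_pool]              # idea pool: top 50
    return mutate_fn(parent, inspirations)         # produce a child design

# Toy usage: the "mutation" just records the parent's name.
archive = [{"name": f"arch-{i}", "fitness": i} for i in range(100)]
child = propose_next_design(
    archive, lambda p, pool: {"name": p["name"] + "-child", "fitness": None}
)
```

In the real system the mutation step is an LLM rewriting the parent's code; here it is stubbed out to keep the sketch self-contained.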

2. Engineer Component

  • Evaluates new AI model designs by training them in a real coding environment.
  • Implements a self-revision mechanism to track error logs and debug its own mistakes, preventing promising ideas from being discarded due to simple errors.
  • Utilizes an automated quality assurance system that analyzes training logs and terminates sessions early if a model is training too slowly or the loss is abnormal.
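The self-revision mechanism amounts to a retry loop around training: run the candidate's code, and on failure hand the error log back to the reviser (an LLM in ASI ARC) instead of discarding the design. A minimal sketch, with `run_fn` and `revise_fn` as illustrative stand-ins:

```python
def train_with_self_revision(code, run_fn, revise_fn, max_attempts=3):
    """Run a candidate's training code; on failure, feed the error
    log back to the reviser and retry, so a promising design is not
    thrown away over a simple bug."""
    for _ in range(max_attempts):
        try:
            return run_fn(code)               # training succeeded
        except Exception as err:
            code = revise_fn(code, str(err))  # debug from the error log
    return None                               # give up after max_attempts

# Toy usage: training fails until the reviser patches the code.
def run_fn(code):
    if "fixed" not in code:
        raise RuntimeError("NameError: name 'x' is not defined")
    return "loss=0.42"

result = train_with_self_revision("buggy", run_fn, lambda c, log: c + " fixed")
```

In ASI ARC, `revise_fn` is an LLM prompted with the failing code and its error log; the toy reviser above just appends a fix.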

3. Analyst Component

  • Gathers information from the Cognition Base, synthesizes experimental results (fitness score, performance metrics, training logs), and pulls existing human knowledge.
  • Compares parent and sibling designs to understand why certain changes work or don't work.
  • Updates its memory (central database) with insights, informing the next design cycle of the researcher.
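The parent-versus-child comparison can be illustrated as a simple fitness diff; everything here (field names, verdict strings) is a hypothetical sketch of the idea, not the paper's implementation:

```python
def analyze(parent, children):
    """Compare each child's fitness to its parent and summarize
    which code changes helped, feeding the next Researcher cycle."""
    insights = []
    for child in children:
        delta = child["fitness"] - parent["fitness"]
        verdict = "helped" if delta > 0 else "hurt"
        insights.append(f"{child['change']}: {verdict} ({delta:+.2f})")
    return insights

# Toy usage: two sibling designs branched from one parent.
notes = analyze(
    {"fitness": 0.50},
    [{"change": "sigmoid gate", "fitness": 0.55},
     {"change": "extra layer", "fitness": 0.48}],
)
```

In the real system the Analyst also folds in training logs and knowledge from the Cognition Base before writing its insight back to the central database.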

4. Cognition Base

  • A central database of existing human knowledge, extracting concepts, algorithms, and historical contexts from nearly a hundred scientific papers.
  • Functions as the repository of information that fuels the Researcher and is updated by the Analyst.

Fitness Score: Evaluating AI Model Designs

The system uses a fitness score to determine which ideas are good. This score blends quantitative and qualitative measures.

  • Quantitative: A sigmoid transformation emphasizes small but significant improvements while capping extreme values to prevent reward hacking.
  • Qualitative: An LLM acts as an expert evaluator, considering performance, design, complexity, speed, and novelty compared to previous models.
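The blended score can be sketched as a sigmoid-squashed quantitative term plus an LLM judge's score. The exact transformation and weighting in the paper may differ; the `scale` and `weight` parameters below are illustrative assumptions:

```python
import math

def quantitative_fitness(improvement, scale=10.0):
    """Sigmoid-squash a raw improvement so small real gains still
    move the score, while extreme outliers are capped near 1,
    discouraging reward hacking."""
    return 1.0 / (1.0 + math.exp(-scale * improvement))

def fitness(improvement, llm_judge_score, weight=0.5):
    """Blend the capped quantitative score with a qualitative score
    (0-1) from an LLM acting as expert evaluator."""
    return weight * quantitative_fitness(improvement) + (1 - weight) * llm_judge_score
```

Note how the cap works: a huge raw improvement cannot push the quantitative term above 1, so a design cannot game the score with one anomalous metric.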

Two-Stage Exploration and Verification Strategy

To manage computational cost, the system uses a two-stage approach:

  1. Exploration: Uses smaller models (20 million parameters), less data, and a limited number of validation samples for quick tests to find promising candidates.
  2. Verification: Top performers from the exploration stage are scaled up, trained extensively, and rigorously tested with larger datasets.
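The two stages form a funnel: a cheap proxy evaluation screens every candidate, and only the survivors get the expensive full evaluation. A minimal sketch, where `cheap_eval` and `full_eval` are illustrative stand-ins for the small-model and scaled-up training runs:

```python
def two_stage_search(candidates, cheap_eval, full_eval, keep=10):
    """Stage 1: screen every candidate with a cheap proxy (small
    model, little data). Stage 2: run the expensive, rigorous
    evaluation only on the best `keep` survivors."""
    screened = sorted(candidates, key=cheap_eval, reverse=True)[:keep]
    return max(screened, key=full_eval)

# Toy usage: the cheap proxy is an imperfect stand-in for the full score.
best = two_stage_search(range(100),
                        cheap_eval=lambda x: x % 50,
                        full_eval=lambda x: x)
```

The design choice is the same as in hyperparameter search with successive halving: spend almost all compute on the few candidates the cheap screen already likes.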

Results and Findings: The AlphaGo Moment

The system conducted 1,773 autonomous experiments over 20,000 GPU hours, discovering 106 innovative and state-of-the-art linear attention architectures.

  • The AI system generates new architectures by directly modifying the code of an existing one, creating a parent-child relationship.
  • The system demonstrates emergent design principles, creating designs that are genuinely new and unexpected, challenging human assumptions.
  • More GPU hours lead to more state-of-the-art ideas, showing that the limitation is now compute, not human brain power.

Examples of AI-Designed Architectures

  • Pathgate Fusion Net: Features a hierarchical two-stage router for traffic control within the model.
  • Content-Aware Sharpness Gating: Intelligently routes words based on content, using a learnable temperature parameter.
  • Parallel Sigmoid Fusion and Retention Architecture: Uses parallel independent sigmoid gates for each path, allowing simultaneous activation.
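The parallel-sigmoid idea in the last example can be shown in a few lines: each path gets its own independent sigmoid gate, so several paths can be near-fully open at once, unlike softmax routing where the gates compete for a fixed budget. A scalar sketch (the real architecture operates on tensors with learned gate logits):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def parallel_sigmoid_fusion(paths, gate_logits):
    """Gate each path with its own independent sigmoid and sum the
    results; unlike softmax routing, the gates need not sum to 1,
    so multiple paths can activate simultaneously."""
    return sum(sigmoid(g) * p for g, p in zip(gate_logits, paths))

# Two paths both near-fully open at the same time:
out = parallel_sigmoid_fusion([1.0, 2.0], [6.0, 6.0])
```

With softmax gating, the same two logits would split a single unit of weight between the paths; here each gate independently approaches 1.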

Performance Evaluation

The new AI-designed models consistently outperform existing human-designed baselines (DeltaNet and Gated DeltaNet) in both loss and benchmark test scores.

Limitations

The study focused solely on linear attention architectures, which are smaller and generally less capable than full-attention transformers. It remains uncertain whether the framework generalizes to designing more diverse architectures.

Conclusion

The ASI ARC system autonomously learns, corrects mistakes, and generates working designs for new AI architectures. It discovered 106 state-of-the-art ideas that are better than human-designed models, at least for linear attention architectures. The best ideas came from its own discoveries, not just remixes of human ideas. The code is available on GitHub. This could accelerate AI innovation significantly.
