The central idea of the thesis is: “A continually self-improving AI is one that, once created, can autonomously and continually improve itself better than its human creators can improve it.”
- (P1) Continually acquires knowledge into its weights without forgetting
- (P2) Generates its own training signal – learning from it beats human-generated signals
- (P3) Can autonomously design its own learning algorithm
Three-Part Thesis Structure
Part 1: Continual Knowledge Acquisition
Paper: Synthetic Continued Pretraining (ICLR 2025, Oral)
Problem: models can’t learn niche domain knowledge from just a few documents.
Solution – EntiGraph:
- Take a small niche corpus (e.g. 265 specialized books)
- Generate synthetic text by exploiting relationships between entities in those documents
- Continually pretrain on the synthetic data

Result: closed-book accuracy jumped from 39% → 56%, approaching open-book (60%) performance.
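The generation step can be sketched as a toy illustration. This is not the actual EntiGraph pipeline: the real method prompts an LM to extract entities and then write analyses of entity relationships, whereas here both steps are stand-ins (a regex "entity extractor" and prompt templates), and the function names `extract_entities` / `entigraph_prompts` are hypothetical.

```python
import re
from itertools import combinations

def extract_entities(doc: str) -> list[str]:
    # Toy stand-in for the entity-extraction step: EntiGraph uses an LM,
    # here we just treat capitalized tokens as "entities".
    return sorted(set(re.findall(r"\b[A-Z][a-z]+\b", doc)))

def entigraph_prompts(doc: str) -> list[str]:
    # For each entity pair, build a prompt asking a synthesizer LM to write
    # text analyzing the relationship between the two entities in the doc.
    # The pairwise structure is what multiplies a small corpus into a
    # much larger synthetic one.
    entities = extract_entities(doc)
    return [
        f"Analyze the relation between {a} and {b} in the following text:\n{doc}"
        for a, b in combinations(entities, 2)
    ]

doc = "Watson met Holmes in London before the Adler case."
prompts = entigraph_prompts(doc)
print(len(prompts))  # one prompt per entity pair
```

The key design point survives even in this sketch: n entities yield n·(n-1)/2 relationship prompts, so synthetic token count grows quadratically in the entities of the source corpus.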
Part 2: Self-Improving Pretraining Capability
Paper: Synthetic Bootstrapped Pretraining (ICLR 2026)
Problem: internet data is finite. Models saturate.
Solution – SBP:
- Pair documents by nearest-neighbor embedding similarity
- Fine-tune a synthesizer to generate new documents conditioned on a neighbor
- Pretrain a new model on the synthetic data instead of repeating real data

Key result (Slide 34-35):
- Baseline (repeat data) → saturates
- Oracle (infinite real data) → keeps scaling
- SBP → 40% improvement over baseline, tracks the oracle

No teacher-model distillation – a genuine bootstrap.
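The pairing step above can be sketched as follows. This is a minimal illustration under stated assumptions: SBP uses learned embeddings over a large corpus, while here a bag-of-words embedding stands in, and `nearest_neighbor_pairs` is a hypothetical name. The synthesizer fine-tuning that follows (generate document B conditioned on its neighbor A) is not shown.

```python
import numpy as np

def embed(doc: str, vocab: list[str]) -> np.ndarray:
    # Toy bag-of-words embedding, L2-normalized so that dot product = cosine.
    words = doc.lower().split()
    v = np.array([words.count(w) for w in vocab], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

def nearest_neighbor_pairs(docs: list[str]) -> list[tuple[int, int]]:
    # Pair every document with its most similar other document.
    vocab = sorted({w for d in docs for w in d.lower().split()})
    E = np.stack([embed(d, vocab) for d in docs])
    sim = E @ E.T
    np.fill_diagonal(sim, -1.0)  # exclude self-matches
    return [(i, int(sim[i].argmax())) for i in range(len(docs))]

docs = [
    "neural scaling laws for language models",
    "scaling laws in large language models",
    "recipes for sourdough bread baking",
]
pairs = nearest_neighbor_pairs(docs)
print(pairs[0], pairs[1])  # the two related docs pair with each other
```

Each (A, B) pair then becomes a training example for the synthesizer, which learns to produce a new related document given one real one.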
Part 3: Towards AI-Designed AI
Paper: Towards Execution-Grounded Automated AI Research
The most ambitious part. Formalizes AI as its own researcher:
    class AIResearchEnv(ResearchEnv):
        codebase: str   # e.g. nanoGPT
        resource: str   # e.g. "8xH100"

        def value(self, code_diff: str):
            sandbox.exec(f"patch -p1 < {code_diff}")   # apply the proposed change
            sandbox.exec("bash run.sh")                # train
            return sandbox.exec("bash eval.sh")        # the metric
The loop (Slide 41):
    idea = researcher.ideator(env.context)
    code_diff = researcher.executor(env.context, idea)
    results.append((idea, env.value(code_diff)))
    researcher.learn(results)   # RL / evolutionary search
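To make the shape of this loop concrete, here is a runnable toy version under heavy simplifying assumptions: the "codebase" collapses to a single hyperparameter, the ideator/executor is random perturbation, `value` is a closed-form score rather than a sandboxed train-and-eval run, and `learn` is greedy hill climbing. None of this is the thesis's actual implementation; it only shows the propose → evaluate → learn structure.

```python
import random

def value(lr: float) -> float:
    # Hypothetical eval metric, peaking at lr = 0.01; stands in for
    # sandbox.exec("bash run.sh") + sandbox.exec("bash eval.sh").
    return -(lr - 0.01) ** 2

random.seed(0)
best_lr, best_score = 0.1, value(0.1)   # initial "codebase"
results = []
for _ in range(200):
    idea = best_lr * random.uniform(0.5, 1.5)   # ideator: propose a change
    score = value(idea)                          # env.value: run the experiment
    results.append((idea, score))
    if score > best_score:                       # learn: keep what improved
        best_lr, best_score = idea, score
print(round(best_lr, 3))
```

Swapping the greedy update for RL or evolutionary search over code diffs, and the scalar for an actual patched codebase in a sandbox, recovers the AIResearchEnv design above.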
Yang ends the thesis with general relativity as an analogy. Einstein’s field equations predicted an expanding universe – a truth Einstein himself initially refused to believe
and tried to suppress with the cosmological constant. The math was smarter than its creator. Yang’s argument: AI systems can similarly discover truths beyond what their creators
anticipated. The moment the theory (or model) is created, it transcends its creator.
─────────────────────────────────────────────────
▎ “Einstein later confessed the 1917 modification is the ‘greatest blunder’ of his life. The moment the theory is created, it is above its creator.”
Final slide answer: “Absolutely yes” – AI can self-improve to be stronger than its creator.
This thesis is essentially the theoretical foundation behind both autoresearch (Part 3 – AI as researcher running experiments) and GSD (Part 1 – continual knowledge acquisition across sessions). The AIResearchEnv abstraction on Slide 39-40 is almost exactly Karpathy’s autoresearch design – locked eval harness, mutable codebase, sandbox execution.