What Is Terence Tao's AI Collaboration? How the World's Greatest Mathematician Uses ChatGPT
Fields Medal winner Terence Tao split a mathematical proof with ChatGPT, which successfully solved one half. Here's what this means for AI as a scientific co-author.
The Experiment That Changed How Mathematicians Think About AI
In 2023, Terence Tao — Fields Medal winner, UCLA mathematics professor, and by wide consensus one of the most gifted mathematicians alive — began reporting something unusual. He’d been using GPT-4 as part of his mathematical work. Not as a calculator. Not as a search engine. As a collaborator.
In one documented instance, Tao split a proof between himself and the AI: he handled portions requiring high-level creative judgment, while ChatGPT’s underlying model successfully completed the other half.
For most researchers, that sentence ends with “…but the AI made errors.” That’s not what Tao reported. The AI contributed valid mathematical work.
This matters for reasons that go beyond a single proof. Tao has more mathematical credibility than almost anyone alive, and no interest in hype. His direct, qualified assessment of AI as useful in frontier mathematics is a meaningful signal about where these systems actually stand.
This article unpacks what specifically happened, what Tao’s experience reveals about the current state of AI in mathematics, and what it implies for AI as a scientific co-author more broadly.
Who Terence Tao Is — and Why His Assessment Matters
Tao received the Fields Medal in 2006 at age 31. The Fields Medal is awarded every four years to mathematicians under 40 for outstanding contributions to the field — the closest thing mathematics has to a Nobel Prize. He has contributed to an unusually broad range of areas: number theory, harmonic analysis, partial differential equations, combinatorics, and compressed sensing.
His blog is read by mathematicians worldwide. When he writes about a proof technique, researchers engage with it seriously. When he writes about a tool, people pay attention.
Most AI enthusiasm comes from generalists, investors, or people who haven’t stress-tested these systems on hard problems. Tao is the opposite. He works at the frontier of mathematics, where errors are immediately visible and sloppy reasoning is worse than useless.
Why Skepticism Makes His Endorsement Meaningful
Tao has been consistent in noting what AI gets wrong. He’s described AI-generated mathematics as plausible-sounding but often subtly wrong — the kind of errors that look fine until you check carefully. He’s compared working with LLMs in mathematics to collaborating with an eager but imperfect assistant who needs supervision.
That skepticism is exactly why his specific positive observations carry weight. He’s not saying AI is transforming mathematics across the board. He’s saying, with precision, what it’s actually useful for.
The Proof That AI and Human Shared
The collaboration Tao described involved a structured division of labor. He identified a proof that could be broken into components — some requiring the high-level creative reasoning that remains distinctly human, and others requiring careful execution of established techniques.
The AI handled one of those components successfully. GPT-4 contributed a valid, non-trivial piece of mathematical argument to a real research problem.
How the Division Actually Worked
Tao directed the overall strategy. He determined which parts of the proof were tractable for AI delegation and which weren’t. The AI then operated within that structured framework — responding to well-specified prompts, applying known proof techniques, and producing output that Tao could verify.
The collaboration wasn’t symmetric. Tao was the architect; the AI was a capable executor within clearly defined sub-problems. But within that scope, the contribution was genuine.
The analogy Tao has used in various writings is instructive: it’s like having a very knowledgeable graduate student who works quickly and knows a great deal of mathematics, but occasionally makes mistakes that require supervision. The student can contribute real work. But you need to check it.
What the Proof Splitting Actually Demonstrates
That Tao could delegate meaningful mathematical work to an AI at all is significant. Not because it means AI is approaching human mathematical ability in general — it isn’t — but because it shows the current generation of AI systems has crossed some meaningful threshold.
They’re not doing math purely by lookup. They’re applying logical structure, tracking definitions, and executing proof strategies in ways that produce valid output on a meaningful fraction of well-specified problems.
What AI Can Do in Advanced Mathematics
Based on Tao’s documented experience and related research, here’s an honest accounting of where AI systems are actually capable in mathematics today.
Executing Known Proof Techniques
For many mathematical sub-problems, the solution approach is known and the challenge is careful execution. AI can apply techniques like induction, algebraic manipulation, and integration by parts with reasonable reliability in familiar domains. This work is time-consuming, and errors compound. Delegating it frees the mathematician’s attention for harder questions.
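To make "careful execution" concrete, here is the shape of a routine induction argument — the kind of sub-task that is well specified enough to delegate and cheap to verify (a standard textbook example, not one of Tao's problems):

```latex
% Claim: 1 + 2 + \dots + n = n(n+1)/2 for every n >= 1.
\textbf{Base case} ($n = 1$): both sides equal $1$.

\textbf{Inductive step:} assume the claim holds for $n = k$. Then
\[
  \sum_{i=1}^{k+1} i \;=\; \frac{k(k+1)}{2} + (k+1)
                    \;=\; \frac{(k+1)(k+2)}{2},
\]
which is the claim for $n = k + 1$. \qed
```

Every step uses only the inductive hypothesis and basic algebra, which is exactly why a human expert can verify delegated output of this kind quickly.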
Surveying Related Results
Knowing which existing theorems are relevant to a problem is crucial and time-consuming. AI can quickly surface connections to known results, acting as augmented search across the mathematical literature. This isn’t proof-writing, but it significantly speeds up the early stages of research.
Generating Proof Sketches
When the high-level approach to a problem is unclear, AI can generate multiple candidate strategies quickly. Most will be wrong or inapplicable, but having a set of approaches to evaluate is faster than generating them from scratch. Think of it as structured brainstorming — the AI produces candidates, the human evaluates them.
Formal Verification Assistance
The mathematics community has been moving toward formal proof systems like Lean and Coq, which verify proofs mechanically. Tao has been actively involved in Lean’s Mathlib formalization project. AI systems are becoming increasingly capable of helping translate natural-language mathematical arguments into formal proofs these systems can verify — bridging informal and formal mathematics.
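To show what "formal" means here, consider a natural-language statement — "the sum of two even numbers is even" — rendered as a machine-checkable Lean 4 theorem. This is a sketch assuming a recent Lean 4 toolchain and core Lean only (no Mathlib); the phrasing with explicit existentials is ours, chosen to keep the example self-contained:

```lean
-- "The sum of two even numbers is even," stated so that Lean's
-- kernel can verify it mechanically. Evenness is written as an
-- explicit existential rather than Mathlib's `Even` predicate.
theorem even_add {m n : Nat}
    (hm : ∃ a, m = 2 * a) (hn : ∃ b, n = 2 * b) :
    ∃ c, m + n = 2 * c :=
  match hm, hn with
  | ⟨a, ha⟩, ⟨b, hb⟩ =>
    -- Witness: c = a + b; the equality check is pure algebra.
    ⟨a + b, by rw [ha, hb, Nat.mul_add]⟩
```

Once a proof is in this form, correctness no longer depends on a human reading carefully — the proof checker either accepts it or it doesn't, which is what makes formal systems a natural complement to error-prone AI drafting.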
Where AI Hits Its Limits
Tao has been careful not to overstate what AI can do. The limitations are real and worth understanding clearly.
Novel Mathematical Insight
The most valuable mathematical contributions come from genuine creative insight: seeing that a new framework applies to an old problem, or connecting distant mathematical fields in unexpected ways. Current AI systems can remix known techniques. They can’t reliably generate genuinely new ones.
The frontier of research is, almost by definition, where training data is sparse. AI performance degrades precisely where the hardest problems live.
Consistency Across Long Arguments
A full mathematical proof can span dozens of pages and hundreds of interdependent steps. Maintaining logical consistency throughout — tracking all assumptions, applying results only within their stated conditions — requires sustained attention across the whole argument. AI systems can lose the thread in long proofs, producing locally plausible but globally inconsistent arguments.
For short, well-defined sub-proofs, this is manageable. For complex arguments across many pages, it’s a significant problem.
Confident Errors
This is the most dangerous failure mode. AI systems can produce mathematical arguments that sound convincing but are subtly wrong — applying a theorem outside its valid range, making an algebraic error, asserting something false with confidence. A working mathematician verifies everything. Someone less expert might not catch these mistakes.
This is why Tao has consistently emphasized that AI-generated mathematical work requires human verification, not just acceptance.
AI as a Scientific Co-Author: The Bigger Picture
Tao’s collaboration is one data point in a broader pattern. AI is increasingly appearing as a functional participant in scientific work across multiple fields.
The AlphaProof Milestone
In 2024, Google DeepMind announced AlphaProof, a formal reasoning system that, together with its geometry companion AlphaGeometry 2, solved four of the six International Mathematical Olympiad problems, reaching silver-medal level against top human competitors. Tao publicly noted this as a significant result.
IMO problems are hard but structured. The gap between IMO-level performance and frontier research mathematics is large. But the capability trajectory is clear.
The Pattern Across Science
The structure Tao describes — AI handling structured sub-tasks while humans direct overall strategy — appears consistently across scientific fields. In biology, AI predicts protein structures and screens drug candidates. In physics, AI assists with simulation and data analysis. In chemistry, AI proposes synthesis routes.
In each case, AI accelerates execution of well-defined tasks. Humans retain the creative, strategic direction.
What “Co-Author” Actually Means Here
Using AI as a co-author doesn’t mean AI independently generates ideas. It means AI’s output makes a direct, verifiable contribution to finished scientific work — a meaningful bar that most software tools don’t clear.
The quality of the collaboration is largely determined by the human’s expertise in framing problems, delegating appropriately, and verifying output. The tool’s ceiling is partly set by the person using it.
Building Structured AI Research Workflows Without Developer Resources
Tao’s workflow pattern — breaking complex problems into components, routing specific tasks to AI, and verifying the output — scales beyond frontier mathematics. Anyone doing research, analysis, or complex knowledge work can apply the same logic.
The infrastructure challenge is that doing this well typically requires managing API access, chaining multiple AI calls, and building some kind of review layer. That’s non-trivial without developer resources.
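For teams that do have engineering support, the core loop is simple to hand-roll. Here is a minimal Python sketch of the delegate-and-verify pattern: split a task into sub-tasks, route each to a worker, and accept results only if they pass review. All names are illustrative — `fake_model` stands in for a real model API call, and `verify` stands in for human or automated review:

```python
# Minimal sketch of the delegate-and-verify workflow pattern.
# Nothing here is a real platform API; the stubs mark where
# model calls and review steps would plug in.

from dataclasses import dataclass
from typing import Callable

@dataclass
class SubTask:
    name: str
    prompt: str

def fake_model(prompt: str) -> str:
    # Stand-in for an LLM call; a real workflow would hit a model API.
    return f"draft answer for: {prompt}"

def verify(result: str) -> bool:
    # Stand-in for the review layer (human check, proof checker, tests).
    return result.startswith("draft answer")

def run_workflow(tasks: list[SubTask],
                 model: Callable[[str], str]) -> dict[str, str]:
    accepted: dict[str, str] = {}
    for task in tasks:
        result = model(task.prompt)
        if verify(result):  # keep only output that passes review
            accepted[task.name] = result
    return accepted

tasks = [SubTask("lemma-1", "prove the base case"),
         SubTask("lemma-2", "bound the error term")]
results = run_workflow(tasks, fake_model)
print(sorted(results))
```

The design point is the `verify` gate: delegation is only safe because nothing enters the final result without passing review — the same discipline Tao applies by hand.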
MindStudio makes this accessible without code. It’s a no-code platform for building AI agents and automated workflows, with 200+ AI models — including GPT-4, Claude, and Gemini — available without separate API keys or accounts. You can build an agent that accepts a research question, routes sub-tasks to different models based on their strengths, and returns structured output for human review. The average build takes 15 minutes to an hour.
For teams building AI-powered research and analysis workflows, MindStudio’s visual builder handles the workflow logic, model routing, and integration layer. You focus on what to delegate and how to verify it — the same division of labor Tao applies at the mathematical frontier.
The platform also connects to 1,000+ tools your team already uses — Notion, Slack, Google Workspace, Airtable — so AI workflows fit into existing processes rather than requiring a separate system. If you want to explore what this looks like in practice, MindStudio’s agent builder is a good starting point for seeing how multi-step AI workflows get structured.
Try MindStudio free at mindstudio.ai.
Frequently Asked Questions
Did Terence Tao really use ChatGPT to complete part of a math proof?
Yes. Tao has documented multiple instances of using GPT-4 (the model underlying ChatGPT) in his mathematical research. In one case, he described splitting a proof between himself and an AI — he handled portions requiring high-level creative judgment, and the AI successfully completed another portion. He’s been clear that AI output requires verification, but the collaboration produced valid mathematical work.
Is AI actually good at advanced mathematics?
It depends on the task. Current AI is reliably useful for specific types of mathematical work: applying known proof techniques, surveying related results, generating candidate approaches, and assisting with formal verification. It struggles with novel mathematical insight, logical consistency across long arguments, and performance at the research frontier where training data is sparse. For structured sub-problems in familiar domains, it’s genuinely useful. For breakthrough mathematical discovery, humans remain essential.
What is Terence Tao’s overall view on AI in mathematics?
Cautiously positive, and consistently specific. Tao describes AI as a useful tool for certain tasks within mathematical research — capable but imperfect, best used with well-specified prompts and careful output verification. He’s noted it could significantly speed up parts of mathematical work. He’s equally direct that current systems can’t replace the creative insight driving frontier research, and that they produce confident errors requiring expert review.
What is AlphaProof, and how is it different from ChatGPT?
AlphaProof is a system developed by Google DeepMind that uses reinforcement learning and formal proof verification to solve mathematical competition problems. Paired with its geometry companion AlphaGeometry 2, it reached silver-medal-level performance on the 2024 International Mathematical Olympiad problems. It’s specialized for formal mathematics, whereas ChatGPT and GPT-4 are general-purpose language models. They represent different approaches to AI in mathematics — formal systems optimized for verified proof generation versus general models that draft arguments in natural language.
Can AI replace mathematicians?
Not in any near-term sense. The creative, insight-driven aspects of frontier research remain distinctly human. What’s changing is that AI can accelerate execution of structured sub-tasks, freeing mathematicians to focus on harder problems. Tao’s workflow illustrates this: he remained the architect of the mathematical approach; AI executed specific well-defined pieces. The real question isn’t whether AI replaces mathematicians, but whether mathematicians who use AI effectively will outpace those who don’t.
How is Tao’s use of AI different from using ChatGPT for math homework?
The difference is in sophistication and verification. Most people use ChatGPT for math as a tutor — getting explanations, checking calculations, understanding concepts. Tao uses it as a research collaborator for frontier mathematics: generating proof strategies, drafting arguments, executing specific sub-tasks within a larger project. Critically, Tao verifies every output against deep expertise. He’s using the same tool with precision prompting, expert-level evaluation, and clear awareness of where it fails — which is a fundamentally different mode of use.
Key Takeaways
Terence Tao’s AI collaboration is a useful lens for understanding what current AI systems can and can’t do in demanding intellectual work:
- AI can contribute meaningfully to advanced mathematics, but works best as an executor of well-defined sub-tasks, not an independent creative reasoner.
- Human direction is the key variable. Tao’s expertise is what makes the collaboration work — he frames the problems, directs the strategy, and verifies the output.
- The split-proof model generalizes. Breaking complex intellectual work into AI-tractable components while humans retain creative oversight applies across research, analysis, and knowledge work of all kinds.
- Limitations are real and matter. Hallucinated mathematics, logical inconsistency in long proofs, and degraded performance at research frontiers make uncritical reliance on AI in technical fields genuinely risky.
- The capability trajectory is clearly upward. From GPT-4’s proof contributions to AlphaProof’s IMO results, AI is getting meaningfully better at mathematical reasoning.
If you want to put these ideas to work — building structured AI workflows for research, analysis, or complex knowledge work — MindStudio is a practical starting point. It’s free to try, and most users have a working first agent within an hour.