Demis Hassabis on AGI Timeline, Scientific Breakthroughs, and DeepMind's Future

Summary: In a recent Y Combinator podcast, DeepMind CEO Demis Hassabis discussed the path to AGI, open scientific challenges, and the future of AI. He outlined remaining hurdles such as continual learning and long-term reasoning, advised founders to factor AGI timelines into decade-long deep tech plans, and hinted at a major upcoming announcement from Isomorphic Labs, the AI-driven drug discovery company spun out of DeepMind.

Organized & Compiled by Shenchao TechFlow

Guest: Demis Hassabis (Co-founder of DeepMind, CEO of Google DeepMind, and 2024 Nobel laureate in Chemistry)

Host: Gary Tan

Podcast source: Y Combinator

Demis Hassabis: Agents, AGI, and the Next Major Scientific Breakthrough

Broadcast date: April 29, 2026

Editor's Introduction

Google DeepMind CEO and Nobel Prize in Chemistry laureate Demis Hassabis appeared on Y Combinator to discuss the key advancements still needed to reach AGI, offer advice to entrepreneurs on how to maintain a competitive edge, and speculate on where the next major scientific breakthrough may occur. His most practical insight for deep tech entrepreneurs is that if you're launching a decade-long deep tech project today, you must account for the emergence of AGI in your planning. He also revealed that Isomorphic Labs—the AI-driven drug discovery company spun out from DeepMind—is about to announce major news.

Essential Quotes

AGI Roadmap and Timeline

  • Existing technical components—large-scale pretraining, RLHF, chain-of-thought—will almost certainly become part of the final AGI architecture.
  • Continual learning, long-term reasoning, and certain aspects of memory remain unsolved—AGI requires all of them to be solved.
  • If your AGI timeline is around 2030, like mine, and you're starting a deep tech project today, you must account for the fact that AGI will emerge along the way.

Memory and context window

  • The context window is roughly equivalent to working memory. Human working memory can typically hold only about seven items, yet we have context windows of millions or even tens of millions of tokens. But the problem is that we cram everything into it—including irrelevant or incorrect information—and currently, this approach is quite crude.
  • If you need to process a live video stream and store all the tokens, one million tokens would only last about 20 minutes.

Flaws in reasoning

  • I like to play chess with Gemini. Sometimes it recognizes that a move is poor, but can't find a better one, so it ends up circling back and making that bad move anyway. But a precise reasoning system shouldn't behave this way.
  • On one hand, it can solve problems at the level of International Mathematical Olympiad gold medals; on the other, it makes elementary math mistakes when asked differently. It seems to lack something in its self-reflective reasoning process.

Agent and Creativity

  • To achieve AGI, you need a system that can proactively solve problems for you. Agents are the way forward, and I believe we’ve only just begun.
  • I haven’t seen anyone create a top-ranking App Store AAA game using vibe coding. Given the current level of effort being invested, it should be possible—but it hasn’t happened yet. This suggests something is still missing in the tools or process.

Distillation and Small Models

  • Our assumption is that, six months to a year after the release of a cutting-edge Pro model, its capabilities can be compressed into very small models that can run on edge devices. We have not yet encountered any theoretical limits to information density.

Scientific Discoveries and the "Einstein Test"

  • I sometimes call it the “Einstein test”: can you train a system using only the knowledge available in 1901, and then have it independently derive the breakthroughs Einstein made in 1905, including special relativity? Once a system can do that, it’s very close to genuinely inventing something entirely new.
  • Solving one of the Millennium Prize Problems is already remarkable. But even harder is whether one can propose a new set of Millennium Prize Problems that top mathematicians consider equally profound and worthy of a lifetime of study.

Deep tech startup recommendations

  • Hard problems and easy problems are actually quite similar—they just differ in how they’re hard. Life is short; better to pour your energy into things that no one else will do if you don’t.

AGI Implementation Pathway

Gary Tan: You’ve thought about AGI longer than almost anyone else. Looking at the current paradigm, how much of the final AGI architecture do you think we already have? What is fundamentally missing now?

Demis Hassabis: Large-scale pretraining, RLHF, chain-of-thought—I’m confident these will become part of the final architecture of AGI. These technologies have already demonstrated so much by now. I can’t really imagine that in two years we’ll discover we’ve been on a dead end; that doesn’t make sense to me. But beyond what we already have, there may still be one or two missing pieces: continual learning, long-term reasoning, certain aspects of memory—some problems remain unsolved. AGI requires all of them to be solved. Perhaps existing technologies, combined with incremental innovations, could scale to that level; but there might still be one or two major breakthroughs left to achieve. I don’t think it’ll be more than one or two. Personally, I’d say there’s about a 50-50 chance that such key unsolved points remain. At Google DeepMind, we’re pursuing both paths simultaneously.

Gary Tan: What strikes me most about working with many agent systems is that, at the core, it’s always the same set of weights being passed back and forth. That’s why continual learning is particularly fascinating—because right now, we’re essentially patching things together with duct tape, with workarounds like those “nighttime dream cycles.”

Demis Hassabis: Yes, those dream cycles are pretty cool. We’ve long thought about this issue in terms of integrating episodic memory. My PhD focused on how the hippocampus elegantly integrates new knowledge into existing knowledge frameworks. The brain does this exceptionally well—it replays important experiences during sleep, particularly during REM sleep, to learn from them. One key method that enabled our earliest Atari program, DQN (DeepMind’s Deep Q-Network, published in 2013 and the first to achieve human-level performance on Atari games using deep reinforcement learning), was experience replay. This was inspired by neuroscience: repeatedly replaying successful trajectories. That was back in 2013—ancient history by AI standards today—but it was crucial at the time.
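The experience-replay mechanism Hassabis credits here is simple to sketch: keep a fixed-size buffer of past transitions and train on random mini-batches, so important experiences are "replayed" rather than seen once and discarded. A minimal illustrative sketch, not DeepMind's DQN code:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of past transitions. Sampling uniformly at random
    breaks the temporal correlation between consecutive experiences."""
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)  # old transitions fall off the end

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # A mini-batch mixing recent and old experience for the next update.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

# Toy usage: record 50 transitions, then draw a training batch.
buf = ReplayBuffer(capacity=1000)
for t in range(50):
    buf.add(state=t, action=t % 4, reward=1.0, next_state=t + 1, done=False)
batch = buf.sample(8)  # would feed a Q-learning update in a real agent
```

The same replay idea is what lets a learner revisit rare but important trajectories many times, which is the neuroscience analogy Hassabis draws to REM-sleep replay.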

I agree with you—we’re currently using duct tape to cram everything into the context window. It just doesn’t feel right. Even if we’re building machines rather than biological brains, and even if, theoretically, we could have context windows in the millions or billions with perfect memory, the costs of searching and retrieving information still exist. At this moment, when specific decisions are needed, finding truly relevant information isn’t simple—even if you can store everything. So I believe there’s still huge room for innovation in the field of memory.

Gary Tan: To be honest, a million-token context window is much larger than I expected and can do a lot.

Demis Hassabis: It’s large enough for most of its intended use cases. But consider this: the context window is roughly analogous to working memory. Human working memory holds only about seven items on average, yet we have context windows of millions or even tens of millions of tokens. The problem is that we’re dumping everything into it—including irrelevant or incorrect information—and currently, this approach is quite crude. Moreover, if you were to naively log all tokens from a live video stream, one million tokens would only cover about 20 minutes. But if you want the system to understand your life over the course of one or two months, that’s still far from sufficient.
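The arithmetic behind that 20-minute figure is worth making explicit; the tokens-per-second rate below is inferred from the quote itself, not an official specification:

```python
# Back-of-envelope arithmetic implied by the quote: a 1M-token window
# covering roughly 20 minutes of naively-logged live video.
context_tokens = 1_000_000
minutes_covered = 20

tokens_per_second = context_tokens / (minutes_covered * 60)
print(round(tokens_per_second))          # → 833 tokens per second of video

# Tokens needed to hold two months of continuous video at that rate:
two_months_seconds = 60 * 24 * 60 * 60   # 60 days
tokens_for_two_months = tokens_per_second * two_months_seconds
print(f"{tokens_for_two_months:.1e}")    # → 4.3e+09, i.e. thousands of 1M windows
```

This is why Hassabis frames raw context growth as insufficient on its own: months of lived experience implies billions of tokens, so selective storage and retrieval matter more than window size.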

Gary Tan: DeepMind has always been deeply committed to reinforcement learning and search—how deeply has this philosophy been embedded in your development of Gemini? Is reinforcement learning still underestimated?

Demis Hassabis: It may indeed have been underestimated. Attention in this area has fluctuated. Since the very first day of DeepMind’s founding, we have been working on agent systems—all of our work on Atari and AlphaGo essentially involved reinforcement learning agents: systems capable of autonomously achieving goals, making decisions, and formulating plans. Of course, we initially chose the gaming domain because its complexity was manageable, and then gradually moved on to more complex games; after AlphaGo, we developed AlphaStar, essentially tackling every game we could.

The next question is whether these models can be generalized into world models or language models, rather than just game models. Over the past few years, we have been working on exactly this. The reasoning patterns and chain-of-thought inference used in today’s leading models are essentially a return to what AlphaGo originally pioneered. I believe much of the work we did back then is highly relevant today; we are revisiting those older ideas and applying them at a larger scale and in more general ways, including methods like Monte Carlo tree search and other reinforcement learning techniques. The concepts behind AlphaGo and AlphaZero are extremely relevant to today’s foundation models, and I believe a significant portion of future progress over the coming years will stem from this.

Distillation and Small Models

Gary Tan: To be smarter now requires larger models, but at the same time, distillation techniques are advancing, allowing smaller models to become quite fast. Your Flash model is very strong, achieving about 95% of the performance of state-of-the-art models at just one-tenth the cost. Is that correct?

Demis Hassabis: I believe this is one of our core strengths. You need to first build the largest models to achieve cutting-edge capabilities. One of our greatest advantages is our ability to quickly distill and compress those capabilities into increasingly smaller models. Distillation is a method we originally invented, and we remain the world leader in it. Moreover, we have strong business incentives to do this. We are likely the world’s largest AI application platform, with AI Overviews, AI Mode, and Gemini integrated into every Google product—including Maps, YouTube, and more. This involves billions of users and over a dozen products each with billion-user scales. These products must be extremely fast, highly efficient, low-cost, and have minimal latency. This gives us tremendous motivation to push Flash and even smaller Flash-Lite models to achieve maximum efficiency, and I hope this will ultimately serve a wide range of user tasks effectively.

Gary Tan: I’m curious just how intelligent these small models can become. Is there a limit to distillation? Can a 50B or 400B model be as intelligent as today’s largest frontier models?

Demis Hassabis: I don’t believe we’ve hit any theoretical limits in information theory—at least, no one knows whether we have yet. Perhaps one day we’ll encounter a ceiling in information density, but for now, our assumption is that after a cutting-edge Pro model is released, its capabilities can be compressed into much smaller models—small enough to run on edge devices—within six months to a year. You can see this with our Gemma models; Gemma 4 performs exceptionally well at the same scale. This is achieved through extensive distillation techniques and efficiency optimizations for small models. So I really don’t see any theoretical limits—we’re still very far from reaching them.
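The distillation idea—compressing a large teacher's capabilities into a small student—can be illustrated with the classic soft-label recipe (Hinton et al.), in which the student is trained to match the teacher's temperature-softened output distribution. This is a toy sketch of that textbook loss, not Gemini's actual pipeline:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax; higher T flattens the distribution."""
    z = [x / T for x in logits]
    m = max(z)                            # subtract max for numerical stability
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions, scaled
    by T^2 as in the standard recipe. Softening exposes the teacher's relative
    confidences across *wrong* classes, which carry extra signal."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return T * T * sum(pt * (math.log(pt) - math.log(ps))
                       for pt, ps in zip(p_t, p_s))

teacher = [4.0, 1.0, 0.5]
close_student = [3.9, 1.1, 0.4]   # mimics the teacher's full distribution
far_student = [0.5, 4.0, 1.0]     # confidently wrong
loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
# The mimicking student scores a far lower loss than the confidently wrong one.
```

In practice this term is combined with ordinary cross-entropy on ground-truth labels; the T² factor keeps its gradient magnitude comparable as T grows.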

Gary Tan: There’s a bizarre phenomenon right now where engineers can accomplish 500 to 1,000 times the amount of work they could six months ago. Some people in this room are doing work equivalent to 1,000 times what a Google engineer did in the 2000s. Steve Yegge has talked about this.

Demis Hassabis: I’m really excited. Small models have many uses. One is lower cost, and speed brings its own benefits. When writing code or tackling other tasks, you can iterate faster—especially when collaborating with the system. Even if a fast system isn’t state-of-the-art, say it’s only 90% to 95% as capable, it’s still more than sufficient, and the gains in iteration speed far outweigh that 10% difference.

Another major direction is running these models on edge devices—not only for efficiency, but also for privacy and security. Consider devices that handle highly sensitive information, or robots; for a robot in your home, you’d want a powerful, efficient model running locally, delegating tasks to a large cloud model only in specific scenarios. Processing audio and video streams locally, keeping all data on-site—I can envision this as an ideal end state.

Memory and Reasoning

Gary Tan: Returning to context and memory. Since the model is currently stateless, what would the developer experience be like if it had continual learning? How would you guide such a model?

Demis Hassabis: This is a fascinating question. The lack of continual learning is a key bottleneck preventing current agents from completing full tasks end to end. Today’s agents are useful for individual components of a task—you can combine them to do some impressive things—but they struggle to adapt to your specific environment. That’s why they can’t yet operate truly “fire and forget”; they need to learn the nuances of your particular context. Solving this problem is essential to achieving full general intelligence.

Gary Tan: How far have we gotten with reasoning? The model’s chain of thought is strong now, but it still makes mistakes that even smart undergraduates wouldn’t make. What specifically needs to be fixed? What improvements do you expect in reasoning?

Demis Hassabis: There is still significant room for innovation in thinking paradigms. What we’re doing is still quite crude and brute-force. There are many potential improvements, such as monitoring the reasoning chain and intervening midway through the thought process. I often feel that, whether our systems or those of our competitors, they tend to overthink and get stuck in loops.

Sometimes I enjoy observing Gemini play chess. It’s interesting that all leading foundational models are actually quite poor at chess. Watching their reasoning trajectories is valuable because chess is a well-understood domain, allowing me to quickly determine whether their moves are off-track or their reasoning is flawed. What we see is that sometimes they consider a move, recognize it as a bad one, but fail to find a better alternative—ultimately circling back and making the same poor move. A precise reasoning system should not exhibit this behavior.

This huge discrepancy still exists, but fixing it may require only one or two adjustments. That’s why you see what’s called “jagged intelligence”—capable of solving problems at the level of an IMO gold medal, yet making elementary math mistakes when asked in a slightly different way. It seems something is still missing in its self-reflection on its own thought process.

The agent's true capabilities

Gary Tan: Agents are a huge topic. Some say it’s just hype. Personally, I think we’ve only just begun. What is DeepMind’s internal assessment of agents’ true capabilities, and how does it differ from external claims?

Demis Hassabis: I agree with you—we’ve only just begun. To achieve AGI, you need a system that can proactively solve problems for you. That has always been clear to us. Agents are the path forward, and I believe we’re only at the start. Everyone is still figuring out how to make agents work better together; many of you here have likely done similar personal experiments. How do we integrate agents into workflows so they’re not just a nice-to-have, but truly perform fundamental tasks? We’re still in the experimental phase. It’s only in the past two or three months that we’ve really started identifying highly valuable use cases. The technology has just now reached a point where it’s no longer a toy demo, but genuinely adding value to your time and efficiency.

I often see people launching dozens of agents to run for dozens of hours, but I’m still unsure if the output justifies this level of investment.

We haven’t yet seen anyone create a AAA game that tops the App Store charts using vibe coding. I’ve written some myself, and many of you here have created decent little demos. I can now build a prototype of Theme Park in half an hour—something that took my 17-year-old self six months. I have a feeling that if you dedicated an entire summer to it, you could create something truly incredible. But it still requires craftsmanship and the soul and taste of a human being—you must ensure those elements are woven into whatever you build. In fact, no kid has yet created a breakout game that sells ten million copies, yet with today’s tools and accessibility, that should be possible. So something is still missing—perhaps the process, perhaps the tools. I expect to see such a result within the next six to twelve months.

Gary Tan: To what extent will it be fully automated? I don’t think it will be fully automated from the start. More likely, the path will be for those here to first achieve 1,000 times greater efficiency, then someone will use these tools to create best-selling applications or games, after which more processes will gradually become automated.

Demis Hassabis: Yes, this is what you should see first.

Gary Tan: Part of the reason is that some people are indeed doing this, but they’re unwilling to publicly disclose how much help the agent provided.

Demis Hassabis: Possibly. But I’d like to talk about creativity. I often cite AlphaGo’s Move 37 in game two as an example. I was waiting for a moment like that—it was only after it happened that I launched scientific projects like AlphaFold. We started working on AlphaFold the day after we returned from Seoul, ten years ago. I was just in Korea to celebrate the 10th anniversary of AlphaGo.

But merely playing Move 37 is not enough. It’s impressive and useful. But can this system invent the game of Go itself? If you give it a high-level description—such as “a game whose rules can be learned in five minutes but takes a lifetime to master, aesthetically elegant, and a single match can be completed in an afternoon”—and the system returns Go as the answer, today’s systems cannot do this. The question is: why?

Gary Tan: Someone in this room might be able to do it.

Demis Hassabis: If someone has achieved it, the issue isn’t that the system is lacking—it’s that our approach to using the system is flawed. This might very well be the correct answer. Perhaps today’s systems already have this capability; they just need a sufficiently brilliant creator to drive them, to infuse the project with its soul, while being so deeply integrated with the tools that they become almost one with them. If you immerse yourself day and night in these tools and possess deep creativity, you might just create something beyond imagination.

Open Source and Multimodal Models

Gary Tan: Let’s switch topics to open source. The recent release of Gemma has made it possible to run very powerful models locally. What are your thoughts? Will AI become something users control themselves, rather than remaining primarily in the cloud? Will this change who can use these models to build products?

Demis Hassabis: We are strong supporters of open source and open science. We made AlphaFold entirely free and open. Our scientific work continues to be published in top-tier journals. With Gemma, we aimed to create a world-leading model of comparable scale. Gemma has already been downloaded approximately 40 million times since its release just two and a half weeks ago.

I also believe it's important that there is a Western technology stack in the open-source space. Chinese open-source models are excellent and currently lead in the open-source field, but we believe Gemma is highly competitive at the same scale.

We also face a resource constraint—no one has spare compute capacity to run two full-scale frontier models. So our current decision is to use edge models for Android devices, glasses, robots, and similar platforms, and to open-source them: once deployed on a device, they’re inherently exposed anyway, so it makes more sense to open them fully. We’ve unified our open-source strategy around that on-device, nano scale, which also makes strategic sense.

Gary Tan: Before coming on stage, I demonstrated the AI operating system I built—I can interact directly with Gemini using voice commands. I was quite nervous about showing you, but it actually worked. Gemini was designed from the start as a multimodal system. I’ve used many models, but none match Gemini’s depth in voice-to-model interaction, tool usage, and contextual understanding.

Demis Hassabis: Yes. One advantage of the Gemini series that hasn't been fully recognized is that we built it from the start as a multimodal system. This made the initial phase more challenging than focusing solely on text, but we believe we’ll benefit significantly in the long term—and those benefits are already materializing. For example, in the area of world models, we’ve built Genie (a generative interactive environment model developed by DeepMind) on top of Gemini. The same applies to robotics: Gemini Robotics will be built on a multimodal foundation model, and our multimodal strengths will serve as a competitive moat. We’re also increasingly using Gemini at Waymo (Alphabet’s autonomous driving company).

Imagine a digital assistant that follows you into the real world, perhaps on your phone or glasses, needing to understand your physical surroundings and environment. Our system excels in this area. We will continue investing heavily in this direction, and I believe our lead on these types of challenges is substantial.

Gary Tan: The cost of inference is falling rapidly. What becomes possible when inference is essentially free? Will your team’s optimization priorities change as a result?

Demis Hassabis: I’m not sure inference will ever be truly free—the Jevons paradox applies here. I think everyone will ultimately use all the compute they can get. Imagine millions of agents working together collaboratively, or a small group of agents thinking along multiple directions simultaneously and then integrating their results. We’re all experimenting with these approaches, and all of them will consume whatever inference capacity is available.

In terms of energy, if we solve several of the challenges—such as controlled nuclear fusion, room-temperature superconductivity, and optimal battery technology—which I believe we will achieve through materials science—energy costs could approach zero. However, physical manufacturing processes for chips and similar stages still face bottlenecks, at least for the next few decades. Therefore, inference endpoints will still have quota limitations and will require efficient usage.

The next scientific breakthrough

Gary Tan: Fortunately, smaller models are becoming increasingly intelligent. Many of you here are founders in the biology and biotechnology fields. AlphaFold 3 has moved beyond proteins to encompass a broader range of biomolecules. How far are we from modeling entire cellular systems? Is this a problem of an entirely different level of difficulty?

Demis Hassabis: Isomorphic Labs is making excellent progress. AlphaFold is just one part of the drug discovery process; we are also conducting adjacent biochemistry research, designing compounds with the right properties, and major announcements are coming soon.

Our ultimate goal is to create a complete virtual cell—a fully functional cell simulator to which you can apply perturbations, whose outputs closely match experimental results and have practical utility. You can bypass numerous search steps and generate vast amounts of synthetic data to train other models to predict the behavior of real cells.

I estimate it will take about another decade to achieve a complete virtual cell. On DeepMind’s science team, we’re starting with a virtual nucleus, because the nucleus is relatively self-contained. The key to this kind of problem is whether you can isolate a slice of appropriate complexity—one that is sufficiently self-contained, with inputs and outputs you can reasonably approximate—and then focus on that subsystem. From this perspective, the nucleus is a very suitable candidate.

Another issue is insufficient data. I’ve spoken with top scientists working in electron microscopy and other imaging techniques. It would be revolutionary to image live cells without killing them, because then we could turn it into a visual problem—and we know how to solve visual problems. But as far as I know, there is currently no technology capable of imaging live, dynamic cells at nanometer resolution without damaging them. You can capture static images at that resolution, which is already incredibly sophisticated and exciting, but it’s not enough to directly turn it into a visual problem.

So there are two paths: one is a hardware-driven, data-driven approach; the other is to build better learnable simulators to model these dynamical systems.

Gary Tan: You don't just look at biology. Materials science, drug discovery, climate modeling, mathematics—if you had to rank them, which scientific field will be transformed the most over the next five years?

Demis Hassabis: Every field is exciting, which is why this has always been my greatest passion and the reason I’ve been working in AI for over 30 years. I’ve always believed that AI will be the ultimate tool for science, advancing scientific understanding, discovery, medicine, and our comprehension of the universe.

We originally articulated our mission in two steps: first, solve intelligence—that is, build AGI; second, use it to solve all other problems. We later had to adjust the wording because people would ask, “Do you really mean solve all problems?” And yes, we do mean exactly that. Now people are beginning to understand what that means. Specifically, I’m referring to scientific fields I call “root node problems”—areas where a breakthrough unlocks entirely new branches of discovery. AlphaFold is the prototype of what we aim to do. Over three million researchers worldwide, nearly every biologist, now uses AlphaFold. I’ve heard from friends who are executives at pharmaceutical companies that nearly every new drug discovered in the future will involve AlphaFold at some point in the drug discovery process. We’re proud of this, and it’s the kind of impact we want AI to have. But I believe this is only the beginning.

I can’t think of any scientific or engineering field where AI can’t help. The areas you mentioned are roughly at the “AlphaFold 1 moment”—the results are already promising, but the major challenges in those fields haven’t yet been fully solved. Over the next two years, we’ll have plenty of progress to discuss across all these areas, from materials science to mathematics.

Gary Tan: It feels like a Prometheus-like gift, granting humanity a completely new capability.

Demis Hassabis: Yes. And as the moral of the Prometheus story suggests, we must also be cautious about how this capability is used, where it is applied, and the risks of the same tools being misused.

Lessons from success

Gary Tan: Many people in this room are trying to start companies that apply AI to science. In your view, what distinguishes truly pioneering startups from those that simply wrap an API around a foundational model and call themselves “AI for Science”?

Demis Hassabis: I was thinking, if I were sitting in your seat today at Y Combinator evaluating startups, what would I do? One thing is that you must anticipate the trajectory of AI technology—which is already difficult. But I truly believe there’s tremendous opportunity in combining AI’s trajectory with another deep tech field. This intersection—whether in materials science, medicine, or other truly challenging scientific domains, especially those involving the atomic world—won’t have any shortcuts in the foreseeable future. These fields won’t be overturned by the next update to a foundational model. But if you’re looking for resilient areas to focus on, this is what I’d recommend.

I’ve always been drawn to deep tech. Truly lasting and valuable things are never easy. When we started in 2010, AI was deep tech—investors told me, “We already know this won’t work,” and academia considered it a niche direction that had been tried and failed in the 1990s. But if you have conviction in your idea—why this time is different, what unique combination of background you bring (in the ideal case, you’re an expert in both machine learning and the application domain, or you can assemble a founding team that is)—there’s enormous impact and value to be created.

Gary Tan: That’s crucial. Things seem obvious once they’re done, but before they’re accomplished, everyone doubts you.

Demis Hassabis: Of course, you must do what you are truly passionate about. For me, regardless of what happens, I would pursue AI. I decided at a very young age that it was the most impactful thing I could imagine—and history has proven that, though it might not have; perhaps we were 50 years too early. It’s also the most interesting thing I can think of. Even if today we were still huddled in a small garage and AI hadn’t been achieved yet, I would still find a way to keep going. Maybe I’d return to academia, but I’d find some way to continue.

Gary Tan: AlphaFold is an example of you pursuing a direction and betting correctly. What makes a scientific field suitable for producing an AlphaFold-style breakthrough? Are there patterns, such as a specific objective function?

Demis Hassabis: I really should take the time to write this down. What I’ve learned from all the Alpha projects, such as AlphaGo and AlphaFold, is that our current technologies work best under three conditions. First, the problem has a massive combinatorial search space—the larger, the better—so large that no brute-force enumeration or specialized algorithm can solve it. The move space in Go and the conformation space of proteins both vastly exceed the number of atoms in the universe. Second, you can clearly define an objective function, such as minimizing the free energy of a protein or winning at Go, allowing the system to perform gradient ascent. Third, there is sufficient data, or a simulator capable of generating large amounts of synthetic data within the distribution.

If these three conditions hold, today’s methods can go far in finding the needle in the haystack that you need. The same logic applies to drug discovery: there exists some compound that can treat this disease without side effects; as long as physical laws permit its existence, the only question is how to find it efficiently and practically. I believe AlphaFold proved for the first time that such systems are capable of finding this needle within an enormous search space.
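The three conditions can be illustrated with a deliberately tiny toy: a 2^60-point search space, a clear objective to climb, and cheap evaluation standing in for a simulator. Even naive local search then finds the "needle" without enumeration. Every name below is illustrative; this is not one of DeepMind's systems:

```python
import random
random.seed(0)

# Condition 1: a huge combinatorial space -- 2**60 bitstrings, far too many
# to enumerate (standing in for Go positions or protein conformations).
N = 60
hidden_optimum = [random.randint(0, 1) for _ in range(N)]

# Condition 2: a clearly defined objective to ascend (standing in for
# winning at Go, or minimizing a protein's free energy).
def objective(x):
    return sum(a == b for a, b in zip(x, hidden_optimum))

# Condition 3: evaluation is cheap, so the searcher can try unlimited
# candidates -- the role a simulator or synthetic data plays.
def hill_climb(steps=5000):
    x = [random.randint(0, 1) for _ in range(N)]
    for _ in range(steps):
        i = random.randrange(N)
        y = x[:]
        y[i] ^= 1                    # local move: flip one bit
        if objective(y) >= objective(x):
            x = y                    # greedy ascent on the objective
    return x

best = hill_climb()
print(objective(best), "of", N)      # near-optimal without visiting 2**60 states
```

Remove any one condition—make the objective noisy and undefined, or evaluation expensive—and this style of search stalls, which is the diagnostic Hassabis applies when judging whether a field is ready for an AlphaFold-style attack.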

Gary Tan: I want to take it to the next level. We’re talking about how humans used these methods to create AlphaFold, but there’s also a meta-level: humans using AI to explore possible hypothesis spaces. How far are we from AI systems being able to perform true scientific reasoning—rather than just pattern matching on data?

Demis Hassabis: I think we're very close. We're building these kinds of general-purpose systems. We have a system called AI co-scientist and algorithms like AlphaEvolve that can do more than basic Gemini. All leading labs are exploring this direction.

But so far, I haven’t personally seen a single genuine, major scientific discovery made by these systems. I feel it’s coming soon. It may be related to creativity, as we discussed earlier—the kind that truly breaks through known boundaries. At that level, it’s no longer pattern matching, because there are no patterns to match. It’s not quite extrapolation either, but rather some form of analogical reasoning, which I believe these systems currently lack—or we haven’t yet learned how to use them correctly.

In science, I often say the standard is whether it can generate a truly interesting hypothesis, not just verify one. Verifying a hypothesis can itself be a monumental achievement—such as proving the Riemann Hypothesis or solving a Millennium Prize problem—and perhaps we’re only a few years away from achieving that.

But even harder than that is whether we can propose a new set of Millennium Prize Problems that top mathematicians consider equally profound and worthy of a lifetime of study. I think this is another order of magnitude more difficult, and we currently don’t know how to achieve it. But I don’t believe this is magic—I’m confident these systems will eventually get there, perhaps just needing one or two more breakthroughs.

The way we can test this is something I sometimes call the “Einstein test”: Can you train a system using only the knowledge available in 1901, and then have it independently derive the breakthroughs Einstein made in 1905, including special relativity and his other papers from that year? I think we should actually run this test repeatedly to see when we can achieve it. Once we can, those systems will be very close to genuinely inventing something entirely new.

Entrepreneurial Advice

Gary Tan: One final question. Many people here have deep technical backgrounds and aspire to do something on the scale of what you’ve achieved—you lead one of the world’s largest AI research organizations. Having come from the front lines of AGI research, is there something you know now that you wish you had known at age 25?

Demis Hassabis: We’ve actually touched on this already. You’ll find that solving hard problems is about as difficult as solving easy ones; they’re just hard in different ways, and different problems bring different kinds of challenges. But life is short and your energy is limited, so invest it in something that genuinely won’t happen unless you do it. Use that as your criterion for choosing what to work on.

Another point is that I believe cross-domain combinations will become more common in the coming years, and AI will make cross-domain integration easier.

The final point depends on your AGI timeline. Mine is around 2030. If you start a deep tech project today, it typically means a decade-long journey—you must plan for the possibility that AGI emerges midway. What does that mean? It’s not necessarily bad, but you need to account for it. Can your project leverage AGI? How will AGI systems interact with your project?

Going back to the relationship between AlphaFold and general AI systems, one scenario I foresee is that general systems like Gemini or Claude will use specialized systems like AlphaFold as tools. I don’t believe we’ll cram everything into one massive, single “brain”—feeding all protein data into Gemini would make no sense; Gemini doesn’t need to perform protein folding. Returning to your point about information efficiency, that protein data would inevitably hinder its language capabilities. A better approach is to have highly capable general models that can invoke and even train specialized tools, while those specialized tools remain independent systems.
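The architecture Hassabis describes—a general model that dispatches sub-tasks to independently trained specialist systems rather than absorbing their data—can be sketched as a simple tool-routing layer. All names below (`fold_protein`, `route`, the tool registry) are hypothetical stand-ins, not real Gemini or AlphaFold APIs.

```python
from typing import Callable, Dict

# Hypothetical sketch: a general model routes sub-tasks to specialized
# tools instead of cramming all domains into one "brain". Each tool is
# an independent system that can be trained and updated separately.

def fold_protein(sequence: str) -> str:
    """Stand-in for a specialized structure-prediction system."""
    return f"<predicted structure for {len(sequence)}-residue chain>"

def summarize(text: str) -> str:
    """Stand-in for a capability the general model handles natively."""
    return text[:40]

TOOLS: Dict[str, Callable[[str], str]] = {
    "fold": fold_protein,    # specialized tool, kept as a separate system
    "summarize": summarize,  # general-model capability
}

def route(task: str, payload: str) -> str:
    """The general model selects a tool for the task rather than
    attempting every domain with a single set of weights."""
    return TOOLS[task](payload)

print(route("fold", "MKTAYIAKQR"))
```

The design point is the separation of concerns: the general model only needs to know *when* to call the folding tool, not how folding works, so the specialist's training data never has to compete with (and degrade) the generalist's language capabilities.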

This idea deserves serious consideration, because it affects what you build today, including the kind of company and funding structure you create. You must take AGI timelines seriously, imagine what that world would look like, and then build something that remains valuable when that world arrives.
