Highlights
- Google DeepMind introduces Genie 3, a new AI world model for training robots and autonomous systems
- Model generates interactive, physics-based simulations from simple text prompts
- Genie 3 could support the development of artificial general intelligence (AGI)
- The tool is not yet available to the public and comes with technical limitations
- Simulations could be used to train warehouse robots and autonomous vehicles, or to offer virtual experiences
Google has revealed a new AI system called Genie 3, which it claims is a major advance towards developing artificial general intelligence (AGI). The model creates lifelike virtual environments from simple text prompts and could be used to train AI agents for real-world tasks, particularly in robotics and autonomous navigation.
Developed by Google DeepMind, Genie 3 enables AI systems to interact with realistic, physics-based simulations of the real world—such as warehouses or mountainous terrains. The company believes that these world models are a critical part of building AGI, where machines can perform a wide range of tasks at a human level.
How Genie 3 works
Genie 3 allows users to generate interactive virtual scenes by typing natural language prompts. These simulations can then be manipulated in real time—for instance, a user could ask for a herd of deer to appear on a ski slope or alter the layout of a warehouse.
The visual quality of the scenes is comparable to that of Google's Veo 3 video generation model, but the key difference is that Genie 3's simulations can last for several minutes and be interacted with in real time, whereas Veo 3 produces short, non-interactive clips.
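Genie 3 has no public API, so the interaction pattern can only be sketched. The snippet below is a minimal, hypothetical illustration of the prompt-then-interact loop described above; the HypotheticalWorldModel class, its generate, step and edit methods, and every parameter are invented for illustration and are not Google's interface.

```python
# Hypothetical sketch only: Genie 3 has no public API, and every class,
# method and parameter below is an illustrative assumption, not Google's interface.

from dataclasses import dataclass


@dataclass
class Frame:
    """One rendered frame of the simulated environment."""
    pixels: bytes        # encoded image data
    timestamp_ms: int


class HypotheticalWorldModel:
    """Stand-in for a text-to-environment world model."""

    def generate(self, prompt: str) -> "HypotheticalWorldModel":
        # In a real system this would build an interactive scene from the prompt.
        print(f"Generating scene: {prompt!r}")
        return self

    def step(self, action: str) -> Frame:
        # Advance the simulation by one tick in response to a user or agent action.
        print(f"Applying action: {action!r}")
        return Frame(pixels=b"", timestamp_ms=0)

    def edit(self, instruction: str) -> None:
        # Mid-simulation edits, e.g. asking for deer to appear on the slope.
        print(f"Editing scene: {instruction!r}")


if __name__ == "__main__":
    world = HypotheticalWorldModel().generate("a ski slope on a sunny morning")
    for action in ["turn left", "slow down", "look uphill"]:
        world.step(action)
    world.edit("add a herd of deer crossing the slope")
```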
So far, Google has demonstrated examples of skiing and warehouse environments to journalists, but has not made Genie 3 available to the public. No release date has been given, and the company acknowledged the model has a number of limitations.
Why it matters for AGI
Google says Genie 3 and other world models will be vital in developing AI agents—systems capable of acting autonomously in physical or virtual environments. While current large language models are good at tasks like planning or writing, they are not yet equipped to take action.
“World models like Genie 3 give disembodied AI a way to explore and interact with environments,” said Andrew Rogoyski of the Institute for People-Centred AI at the University of Surrey. “That capability could significantly enhance how intelligent and adaptable these systems become.”
Applications in robotics and virtual training
The real-time, physics-based nature of Genie 3’s simulations makes them ideal for training robots or autonomous vehicles. For example, a robot could be trained in a virtual warehouse—interacting with human-like figures, avoiding collisions, and handling objects—all before being deployed in a physical setting.
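As a rough illustration of that train-in-simulation-first idea, the sketch below runs a placeholder policy through a toy simulated warehouse and tallies a reward for moving without colliding. The SimulatedWarehouse environment and random_policy function are invented stand-ins, not Genie 3 or any published Google tooling.

```python
# Illustrative sim-to-real training loop. The environment and policy here are
# hypothetical placeholders, not Genie 3 or any real Google interface.

import random


class SimulatedWarehouse:
    """Toy stand-in for a physics-based warehouse simulation."""

    def reset(self) -> list[float]:
        self.steps = 0
        return [0.0, 0.0]  # e.g. the robot's starting position

    def step(self, action: int) -> tuple[list[float], float, bool]:
        # Reward movement without collisions; end the episode after 50 steps.
        self.steps += 1
        collided = random.random() < 0.05
        reward = -1.0 if collided else 0.1
        done = collided or self.steps >= 50
        return [random.random(), random.random()], reward, done


def random_policy(observation: list[float]) -> int:
    """Placeholder policy; a real system would learn from experience."""
    return random.choice([0, 1, 2, 3])  # e.g. forward, back, left, right


if __name__ == "__main__":
    env = SimulatedWarehouse()
    for episode in range(3):
        obs, total = env.reset(), 0.0
        done = False
        while not done:
            obs, reward, done = env.step(random_policy(obs))
            total += reward
        print(f"episode {episode}: return {total:.1f}")
```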
Professor Subramanian Ramamoorthy, Chair of Robot Learning and Autonomy at the University of Edinburgh, said: “To achieve flexible decision-making, robots need to anticipate the consequences of different actions. World models are extremely important in enabling that.”
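One common way to make that concrete is model-based action selection: roll each candidate action through a model of the world and pick the one whose predicted outcome looks best. The toy dynamics and cost function below are invented solely to show the pattern.

```python
# Minimal sketch of model-based action selection: score each candidate action by
# rolling it through a (here, hand-written) world model and keep the best one.
# The dynamics below are invented purely for illustration.

def predicted_next_state(state: float, action: float) -> float:
    """Toy world model: the robot's position after applying an action."""
    return state + action


def cost(state: float, goal: float = 10.0) -> float:
    """Distance to a goal position; lower is better."""
    return abs(goal - state)


def choose_action(state: float, candidates: list[float]) -> float:
    # Anticipate the consequence of each action, then pick the cheapest outcome.
    return min(candidates, key=lambda a: cost(predicted_next_state(state, a)))


if __name__ == "__main__":
    # From position 7.0, moving 2.5 lands closest to the goal at 10.0.
    print(choose_action(state=7.0, candidates=[-1.0, 0.5, 2.5, 4.0]))  # prints 2.5
```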
Broader industry competition
Google’s announcement comes as competition intensifies in the AI industry. Just days earlier, OpenAI CEO Sam Altman shared what appeared to be a teaser of GPT-5, the next major language model from the makers of ChatGPT.
While OpenAI and Google compete in developing advanced LLMs (large language models), world models like Genie 3 add a new dimension by allowing AI systems to perceive, act and learn from interactions in simulated spaces—not just process text.
What's next?
Alongside Genie 3, Google has also built a virtual agent named SIMA, which can carry out tasks within video games. Though promising, neither SIMA nor Genie 3 is available to the public at this stage.
A research note accompanying SIMA last year stated that language models are good at planning but struggle to take action, a gap that world models could help bridge. Google says it expects such models to play “a critical role” as AI agents become more embedded in the real world.