Google claims AI model Genie 2 can generate interactive worlds

The new AI model can generate worlds akin to those seen in modern videogames, according to Google.

Google DeepMind has unveiled its latest artificial intelligence (AI) model, Genie 2.

According to Google, Genie 2, which was trained on a large-scale video dataset, can create interactive worlds from just a prompt, much as a mythological ‘genie’ grants wishes.

Specifically, Google claimed that the model is prompted with a single image. In the examples given in the blogpost, the images were all generated by Imagen 3, Google DeepMind’s text-to-image model. Genie 2 then produces “action-controllable, playable 3D environments”.

“It can be played by a human or AI agent using keyboard and mouse inputs,” Google explained. “This means anyone can describe a world they want in text, select their favourite rendering of that idea, and then step into and interact with that newly created world, or have an AI agent be trained or evaluated in it.”

Examples of prompts being rendered into interactive settings include a robot walking around a futuristic city and a sailboat traversing a body of water.

Explaining the motivation behind the project, Google DeepMind emphasised its view that videogames play a key role in the world of AI research: “Their engaging nature, unique blend of challenges and measurable progress make them ideal environments to safely test and advance AI capabilities.”

The company also pointed out what it considers to be the improvements made between Genie 1 and Genie 2: “Until now, world models have largely been confined to modelling narrow domains. In Genie 1, we introduced an approach for generating a diverse array of 2D worlds. Today, we introduce Genie 2, which represents a significant leap forward in generality.”

AI: improved technology, increasing concerns

The Genie 2 model comes at a time when Big Tech is grappling with the potential consequences of AI technology for creative industries and news reporting.

Last month, two news outlets lost a copyright lawsuit against OpenAI where they alleged that the ChatGPT-maker “intentionally” removed copyright management information from their work and used it to train its AI models.

The plaintiffs, Raw Story Media and AlterNet Media, were unable to prove “concrete injury”, the judge presiding over the case said, adding that the likelihood that ChatGPT – an AI model that processes large swaths of data from across the internet – would output plagiarised content from one of the plaintiffs is “remote”.

According to Dr Sean Ren, an associate professor of computer science at the University of Southern California and the CEO of Sahara AI, the news outlets’ loss “highlights how hard it is for publishers to prove copyright infringement in the context of today’s AI”.

And in October, The Guardian reported that UK ministers faced a backlash over plans to allow AI companies to train their models on content from publishers and artists by default unless they opt out.

Earlier that month, thousands of creatives around the world signed a statement warning AI companies that the unlicensed use of their work to train generative AI models is a “major, unjust threat” to their livelihoods.
