A New AI Game Engine Creates Playable DOOM In Real-Time

Janani R September 25, 2024 | 11:20 AM Technology

So, a diffusion model, denoising data, peak signal-to-noise ratio, an RL-agent, an autoregressive model, and thermodynamics walk into a bar... and now we can play the 1993 cult classic first-person shooter DOOM, generated in real time by AI. A team from Google and Tel Aviv University has developed a new gaming engine called GameNGen, which is powered entirely by a real-time neural model.

The main idea is that they’ve trained a reinforcement learning agent (RL-agent) to play DOOM repeatedly. During each session, this RL-agent records and stores its gameplay, learning to avoid being shot, eaten, or otherwise eliminated, while also mastering how to interact with the environment effectively.
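The record-while-playing loop described above can be sketched in a few lines. This is a minimal illustration, not GameNGen's actual code: `env` and `agent` are hypothetical stand-ins for the DOOM environment and the trained policy, and only the pattern of storing (frame, action) pairs for later training matters here.

```python
def collect_gameplay(env, agent, episodes=10):
    """Roll out the RL-agent and record (frame, action) pairs for later training.

    `env` and `agent` are hypothetical placeholders for the DOOM environment
    and the trained policy; only the record-while-playing pattern is shown.
    """
    trajectories = []
    for _ in range(episodes):
        frame = env.reset()
        episode = []
        done = False
        while not done:
            action = agent.act(frame)                # policy picks a key/mouse input
            next_frame, reward, done = env.step(action)
            episode.append((frame, action))          # stored for the generative model
            frame = next_frame
        trajectories.append(episode)
    return trajectories
```

The recorded trajectories become the training corpus for the frame-generating model described next.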

Figure 1. The 1993 Classic First-Person Shooter, DOOM

Additionally, a diffusion model is used. Such a model is trained by gradually corrupting a clean image with noise over several steps and then learning to denoise the data, restoring the image to its original quality. This makes it highly effective at predicting and generating images; in this case, it enables the gaming engine to predict what the next frame of gameplay should look like based on the previous frame. Figure 1 shows the 1993 classic first-person shooter, DOOM.
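The "corrupt, then learn to denoise" idea can be seen in the forward-noising step alone. The sketch below uses a simple linear noise schedule for illustration; the schedule values and function name are assumptions, not details from the GameNGen paper, and the reverse (denoising) direction would be handled by a trained network.

```python
import numpy as np

def add_noise(frame, t, T=1000):
    """Forward diffusion step: mix a clean frame with Gaussian noise.

    At step t the frame keeps sqrt(alpha_bar) of its signal; by t = T it is
    nearly pure noise. A simple linear beta schedule is used for illustration.
    """
    betas = np.linspace(1e-4, 0.02, T)
    alpha_bar = np.prod(1.0 - betas[: t + 1])
    noise = np.random.randn(*frame.shape)
    return np.sqrt(alpha_bar) * frame + np.sqrt(1.0 - alpha_bar) * noise

# Generation runs this in reverse: start from pure noise and repeatedly ask
# the trained network to predict and remove the noise, step by step.
```

Early steps leave the frame almost untouched, while late steps bury it in noise; the model learns to walk that process backwards.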

Typically, textures, sprites, models, shaders, prefabs, and similar assets are stored locally and loaded at the start of each level, with each asset's unique physics interactions preloaded as well. Here, by leveraging the data gathered from observing the RL-agent play DOOM repeatedly, the engine instead generates all the textures, colors, models, skins, and other elements required to visualize DOOM's maps.

The diffusion model can predict and render what the next frame should look like: when a weapon is fired, what it targets, and the physical effect of a shotgun blast on whatever it hits. Whether the target is an enemy or a barrel of toxic sludge, the model can simulate the corresponding outcome, such as the enemy being killed or the barrel exploding.

By incorporating user inputs, you now have a game being generated and interacted with in real-time, without the need for preloading or caching. While DOOM is used as an example, any game could potentially be implemented, even those that don’t exist yet. In theory, GameNGen could create its own game if given the right parameters.
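The resulting interactive loop is simple to sketch. Everything below is illustrative: `model.next_frame` and the `context` object are hypothetical placeholders for GameNGen's conditional generator and its rolling history, but the shape of the loop, where each frame is generated on demand from recent context plus the latest input, is the point.

```python
def play(model, context, get_user_input, render, steps=1000):
    """Interactive loop sketch: every frame is generated on the fly from
    recent context plus the player's latest input; nothing is preloaded
    or cached. `model` and `context` are hypothetical placeholders."""
    for _ in range(steps):
        action = get_user_input()                         # key press / mouse move
        frame = model.next_frame(context.window(), action)
        context.push(frame, action)                       # newest frame joins the history
        render(frame)
```

Swap in a different trained model and, in principle, the same loop plays a different game.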

All of this was accomplished using a single tensor processing unit (TPU)—similar to a GPU but specifically designed for AI, focusing on high-volume, low-precision computational processing—and it achieved 20 fps. While this is below the typical 60 fps standard for modern gaming, it’s important to note that, as with any emerging technology, improvements are likely on the horizon. For reference, the original 1993 version of DOOM ran at a maximum of 35 fps.

With the single-TPU setup, memory becomes a limiting factor, and the AI model can only "remember" about three seconds of gameplay before "forgetting" it as the user progresses through the levels. However, the AI can infer certain data, such as the ammo count and whether a specific area on the map has been cleared. Nevertheless, with a context length of only three seconds, this can sometimes result in errors.
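That three-second memory behaves like a fixed-size sliding window over recent frames and actions. A minimal sketch, assuming 20 fps and a window sized in seconds (the class name and numbers are illustrative, not from the paper):

```python
from collections import deque

FPS = 20
CONTEXT_SECONDS = 3
CONTEXT_FRAMES = FPS * CONTEXT_SECONDS  # ~60 frames the model can "remember"

class FrameContext:
    """Rolling window of past (frame, action) pairs fed to the generator.

    Anything older than the window is dropped, which is why details the
    model has not re-observed for more than ~3 seconds can be "forgotten".
    """
    def __init__(self, maxlen=CONTEXT_FRAMES):
        self.buffer = deque(maxlen=maxlen)

    def push(self, frame, action):
        self.buffer.append((frame, action))  # oldest entry falls off automatically

    def window(self):
        return list(self.buffer)
```

Anything that fell off the back of the buffer, such as a cleared room the player left a minute ago, has to be inferred rather than remembered, which is where the occasional errors come from.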

Another important point is that relying solely on the RL-agent for training has its drawbacks. Instead of training the RL-agent to achieve maximum scores and discover all the hidden secrets typically found in FPS games, it was trained to gather data based on how average players might engage with the game. During training, the RL-agent had access to its previous 32 actions.

“Our agent, even at the end of training, still does not explore all the game locations and interactions, leading to erroneous behavior in those cases,” the GameNGen paper states.

Video games have traditionally been developed by teams of people writing millions of lines of code, but GameNGen is the first engine to run entirely on a neural model, making it a potential game changer (pun intended).

Since I started cloud gaming a few years ago, I've enjoyed the ability to play games at speeds and resolutions that my aging Xbox or old GTX 760 Ti simply can't handle. Plus, I can skip those cumbersome 160-GB downloads for games I'm unsure about.

From its original release on DOS to being generated in real-time by AI, who could have imagined such advancements over 30 years ago? Are generative AI gaming engines the future? Will we end up with a box connected to our holodeck, where we can input a few prompts to start playing entirely unique games tailored to our preferences? I know I can’t wait to experience DOOM on a volumetric display.

Source: GameNGen

Cite this article:

Janani R (2024), A New AI Game Engine Creates Playable DOOM In Real-Time, AnaTechMaz, pp. 71
