How AI agents in video games are set to transform gaming

Inworld Team, July 23, 2024

Think the only thing large language models (LLMs) can add to video games is unscripted dialogue? Then, you haven’t heard about AI agents.

Generative AI agents are systems that can perceive their environment, make decisions, and take autonomous actions towards a set of goals. Dubbed by Forbes as the “next frontier of AI,” AI agents have attracted significant buzz because their ability to reason through and complete complex tasks enables new kinds of automation and assistance in enterprise use cases that were never possible before. 

But one of the most promising use cases for AI agents is in video games. While it’s still early days for generative AI agents in video games, experiments like the Stanford AI Village and Google’s Scalable Instructable Multiworld Agent (SIMA) suggest that there’s significant potential. At Inworld, we’re also working on fully autonomous AI agents for integration into games. It’s already possible to add agentic AI-powered behavior and gameplay to games via our AI Engine, both for AI NPCs and for non-NPC uses such as shaping procedural content, orchestrating objects, managing complex physics simulations, generating actions, and adaptively adjusting gameplay.

In this post, we’ll break down the current state of AI agent research and development. Then, we’ll look at the types of agentic behavior possible now with our AI Engine – and explore the future of AI agents in video games. 

AI agents in video games: Current state

AI research and video games have long been intertwined. In 1951, the Nimrod computer was built to play the game of Nim, making it one of the first game-playing machines. In 1952, IBM developed an AI checkers program that was designed to analyze and learn from each move, allowing the computer to progressively get better.

Since then, AI models have been judged, in part, by how well they were able to play games – from Go to Atari games. Video games are seen by machine learning researchers as a way to both test an AI model’s ability to reason and prepare models for other tasks. So, it’s not surprising that researchers are turning to video games once again as a training ground for AI agents. 

But video games are also a key potential use case for the tech. Generative AI agents have the potential to radically improve NPCs and perform other agentic operations in the game world by doing things like shaping procedural content, controlling in-game environments and objects, managing complex physics simulations, and adaptively adjusting gameplay.

AI agents’ capacity for autonomous, goal-directed action, their ability to understand complex instructions, and their environmental awareness mean that they’ll be able to operate independently of rigid algorithms controlling their behavior. Rather than performing repetitive or programmed actions, AI agents will be capable of real-time emergent behavior and reactions, making their gameplay much more human.

Here are the most exciting recent experiments around AI agents in gaming.

Stanford AI Village

Researchers at Stanford University and Google released a paper titled Generative Agents: Interactive Simulacra of Human Behavior in August 2023. The paper detailed an experiment where they used OpenAI’s large language model API to create 25 agents in a Sims-like sandbox game environment. 

The AI-driven agents were designed to simulate human-like behavior in interactive environments. The architecture involved three components powered by large language models (LLMs): a memory stream where experiences are recorded, a reflection model capable of synthesizing memories into higher-level inferences, and a planning stream which translates those conclusions into high-level action plans then details behaviors for actions and reactions.
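The three components described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the class, method, and character names are hypothetical, and a keyword-overlap score and string templates stand in for the LLM calls that the real system uses for retrieval, reflection, and planning.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    text: str
    importance: int  # hypothetical 1-10 score; a real system would ask an LLM to rate this


@dataclass
class SketchAgent:
    name: str
    memory_stream: list[Memory] = field(default_factory=list)

    def observe(self, text: str, importance: int = 1) -> None:
        """Record an experience in the memory stream."""
        self.memory_stream.append(Memory(text, importance))

    def retrieve(self, query: str, k: int = 3) -> list[Memory]:
        """Rank memories by keyword overlap plus importance (stand-in for LLM retrieval)."""
        query_words = set(query.lower().split())

        def score(m: Memory) -> int:
            overlap = len(query_words & set(m.text.lower().split()))
            return overlap * 10 + m.importance

        return sorted(self.memory_stream, key=score, reverse=True)[:k]

    def reflect(self, topic: str) -> Memory:
        """Synthesize relevant memories into a higher-level inference and store it."""
        evidence = self.retrieve(topic)
        summary = f"{self.name} reflects on {topic}: " + "; ".join(m.text for m in evidence)
        insight = Memory(summary, importance=8)
        self.memory_stream.append(insight)
        return insight

    def plan(self, goal: str) -> list[str]:
        """Turn a goal plus retrieved context into a coarse plan (an LLM would write this)."""
        context = self.retrieve(goal, k=1)
        basis = context[0].text if context else "no relevant memory"
        return [f"recall: {basis}", f"prepare for: {goal}", f"act on: {goal}"]
```

Calling observe() a few times, then reflect("party") and plan("party"), shows the flow: raw observations accumulate, a reflection is written back into the same stream, and planning draws on whichever memory scores highest.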

To show how their AI agents could transform game worlds, the AI Village paper’s authors give the scenario of a player wanting to host an impromptu Valentine’s Day party. With AI agents, the player or creator would just need to tell the agents about the party. Currently, it would take months for developers to script all the gameplay mechanics and dialogue involved in setting that up.

The AI Village experiment was designed to test the kinds of machine learning architectures that could allow for generative agents to stay in character and behave in ways that resemble human behavior. While the researchers admit that the NPCs they created weren’t perfect and made a number of errors, the experiment demonstrated the potential to create generative agents capable of simulating complex human behaviors while autonomously engaging in emergent interactions.  

The Stanford AI Village experiment also highlighted the possibilities for using AI agents in complex team-based gameplay, suggesting their potential for use in multiplayer gaming.

Google DeepMind’s SIMA 

Google has long been working on AI agents and video games. In 2020, DeepMind released a paper about how it used reinforcement learning to create Agent57, an AI agent that outperformed the standard human benchmark on all 57 games in the Atari 2600 benchmark suite.

DeepMind’s SIMA (Scalable Instructable Multiworld Agent) project represents a significant milestone in AI research. The team created an AI agent capable of navigating and interacting within a broad range of gaming environments across nine games, including No Man’s Sky, Goat Simulator, and Teardown.

The team, which published its findings in March 2024, built SIMA from two models: one that’s capable of precise image-language mapping and a video model that predicts what will happen next on-screen. Researchers then taught SIMA to play through imitation learning on recordings of human gameplay, pairing the on-screen images with the players’ keyboard and mouse inputs.

The researchers gave the agents instructions to test how well they could carry them out and evaluated the AI agents on 600 video game skills. These included simple actions like ‘turn left,’ object interactions like ‘pick up the sword,’ and menu use such as ‘open the map.’ They found that the agent was able to transfer its knowledge of gameplay concepts in one game to other games with similar game mechanics.

SIMA agents also demonstrated remarkable adaptability and problem-solving capabilities across a diverse range of simulated worlds, including the ability to understand and act on higher-level language instructions to carry out tasks involving complex goals. For example, SIMA agents would be able to figure out how to light a fire in a game world even if they needed to complete multiple tasks to do so such as gathering wood, getting kindling, and finding matches. Less sophisticated AI would need to be instructed to complete each of those tasks separately. 
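To make the fire-lighting example concrete, here is a minimal sketch of what expanding a high-level goal into its prerequisite tasks looks like. It only illustrates the idea of decomposition: SIMA learns this behavior from gameplay data rather than from a hand-written table, and the PREREQUISITES table, the "find a tree" step, and the expand function are hypothetical.

```python
# Hypothetical prerequisite table: each goal maps to the subtasks it requires.
PREREQUISITES = {
    "light a fire": ["gather wood", "get kindling", "find matches"],
    "gather wood": ["find a tree"],
}


def expand(goal: str, prerequisites: dict[str, list[str]]) -> list[str]:
    """Return steps in execution order: prerequisites first, then the goal itself.

    Assumes the table has no cycles; a cyclic table would recurse forever.
    """
    steps: list[str] = []
    for sub in prerequisites.get(goal, []):
        steps.extend(expand(sub, prerequisites))
    steps.append(goal)
    return steps


plan = expand("light a fire", PREREQUISITES)
# plan == ["find a tree", "gather wood", "get kindling", "find matches", "light a fire"]
```

A less sophisticated system would need each entry in that list issued as a separate instruction, which is the contrast the paragraph above draws.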

However, the AI agents struggled with certain kinds of instructions. For example, when telling an agent to go chop down a tree in a game, the researchers were not able to specify exactly which tree it should chop down.

While Google admits that their AI agent is still in the early stages of development and can only perform at 60% of human capacity, they believe it heralds a future where AI agents can be helpful in any type of environment. 

Significantly, they created SIMA without any access to the game state, which suggests that the integration of the game state into an AI agent’s inputs could be a simple way to greatly improve an AI agent’s gameplay performance today. 

Inworld’s AI Engine

Inworld’s AI Engine showcases the potential for AI agents to enhance gameplay and create truly novel gaming experiences. While many believe that the only current application for generative AI in gaming is enabling conversations with NPCs, we’ve gone beyond just powering dialogue by enabling agentic behavior and novel game mechanics, as well as non-NPC use cases for AI agents.

Recent demos from Ubisoft and NVIDIA, as well as games from NetEase, have showcased sophisticated new game mechanics like voice commands for in-game actions, trust-based relationship progression, and the ability to respond to dynamic interrogation techniques. However, these experiences are just a hint of the kind of games Inworld’s AI Engine has the ability to power – now and in the future. 

Our research and development focus has always been on agentic behavior. But, rather than trying to create fully autonomous AI agents who act out freeform lives independent of the player or the game narrative, as the Stanford AI Village experiment attempted, our goal is instead to give game devs the capabilities and control they need to build experiences with AI agents in video games where every interaction meaningfully contributes to the game narrative and experience. 

Here are some features Inworld developed to support agentic behavior, reasoning, and action orchestration: 

Goals and Actions

Inworld’s Goals and Actions system was our first attempt at enabling agentic and emergent behavior in games. Our AI agents can be programmed with goals that are triggered by different kinds of activation events, which then lead to specific actions.

For example, if you have a goal to have a character suggest a quest to the player, your activation condition might be the player asking about available quests. When the AI agent recognizes this intent, the goal is activated and the character executes the associated actions: perhaps describing the quest, instructing the player on how to start it, or suggesting a couple of quests that are aligned with the player’s interests. In non-NPC use cases, an environmental change or action orchestration function could be triggered by gameplay actions or by a player mentioning something. For example, every time a player mentions the game's villain, the sky in the game world could darken and thunder could rumble.

Activated actions can include spoken actions, character state changes, behaviors, or physical actions. Actions can be randomized and also leverage parameters to take in dynamic information from the client at runtime.

This system balances the agency of the AI agent in deciding whether the activation condition has been met and the control that game developers and other kinds of storytellers need to ensure the characters remain on topic and the game narrative moves forward. 
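The goal, activation, and action flow described above can be sketched as follows. This is an illustrative model only, not Inworld's API: the Goal class, the action strings, the villain name "Malgrim", and the keyword checks that stand in for intent recognition are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass
class Goal:
    name: str
    is_activated: Callable[[str], bool]  # stand-in for the agent's intent recognition
    actions: list[str]                   # templates; placeholders are filled at runtime
    randomize: bool = False              # pick one action at random instead of running all

    def run(self, player_text: str, params: dict[str, str], rng: random.Random) -> list[str]:
        """Return the actions to execute, or an empty list if the goal did not activate."""
        if not self.is_activated(player_text):
            return []
        chosen = [rng.choice(self.actions)] if self.randomize else self.actions
        return [template.format(**params) for template in chosen]


# NPC use case: suggest one of several quests when the player asks about quests.
suggest_quest = Goal(
    name="suggest_quest",
    is_activated=lambda text: "quest" in text.lower(),
    actions=[
        "SAY: {player}, the miller needs help with the rats.",
        "SAY: {player}, a caravan went missing on the north road.",
    ],
    randomize=True,
)

# Non-NPC use case: change the environment when the (hypothetical) villain is mentioned.
darken_sky = Goal(
    name="darken_sky",
    is_activated=lambda text: "malgrim" in text.lower(),
    actions=["SET_SKY: dark", "PLAY_SOUND: thunder"],
)
```

The developer keeps control over what can happen (the action list), while the activation check decides when it happens, which mirrors the balance between agency and control that the system aims for.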

Customizable Reasoning

We realized that to enable truly emergent gameplay, NPCs need core reasoning capabilities that drive their thought processes and decisions. That’s why we created our Customizable Reasoning module. Customizable reasoning capabilities don’t just allow developers greater control over NPC behavior, personalities, and interactions – but they also open up more complex and sophisticated uses of AI agents in games. 

For example, conditional AI reasoning could make a guard character increasingly suspicious of players who loiter near a restricted area until the guard decides to chase them off the property. A companion character with analytical reasoning could assess the tone of the player's messages to gauge frustration levels, then offer helpful tips or join in combat to assist the player. Reasoning could also enable dynamic changes in the game state, such as having a non-NPC AI agent analyze the player's state of mind and level of frustration to responsively adjust gameplay difficulty or unlock rewards to improve retention.
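As a rough sketch of the last idea, a non-NPC agent could estimate frustration and nudge difficulty accordingly. Everything here is an assumption made for illustration: the word list, the weights, the thresholds, and the function names are hypothetical, and a keyword count stands in for the LLM-based analysis an actual reasoning step would perform.

```python
FRUSTRATION_WORDS = {"stuck", "impossible", "unfair", "hate", "again"}  # hypothetical list


def frustration_score(messages: list[str], recent_deaths: int) -> float:
    """Crude estimate in [0, 1] from player messages and recent deaths."""
    hits = sum(
        1
        for message in messages
        for word in message.lower().split()
        if word.strip(".,!?") in FRUSTRATION_WORDS
    )
    return min(1.0, 0.15 * hits + 0.1 * recent_deaths)


def adjust_difficulty(current: float, frustration: float) -> float:
    """Lower difficulty when frustration is high, raise it slowly when low; clamp to [0.1, 1.0]."""
    if frustration >= 0.6:
        current -= 0.1
    elif frustration <= 0.2:
        current += 0.05
    return round(max(0.1, min(1.0, current)), 2)
```

Keeping the scoring and the adjustment separate means the scoring function can later be swapped for a model-based one without touching the game-balance rules.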

Our Customizable Reasoning module allows developers to add a context-rich reasoning step that supports the orchestration of actions and agentic behavior in some of the following ways: 

  • Analysis-motivated behavior: By adding a customized analytical lens, AI agents are required to analyze their interactions with players and other information like the game state through a specified lens before deciding on actions or dialogue. For instance, a detective character could be made to assess whether the player is acting suspiciously before deciding how to respond or what action to take. 
  • Emotion-motivated behavior: Adding emotions-based reasoning allows developers to add nuanced emotional states and motivations to characters beyond Inworld’s standard emotions. For example, a developer might add a reasoning step to ensure that a villain analyzes interactions through the lens of jealousy or pride. The character would then interact, speak, and act on the basis of these subtle feelings.  
  • Goal-motivated behavior: While Inworld’s Goals and Actions system supports basic goal-oriented behavior, our Customizable Reasoning module allows for more nuance by supporting conditional or temporal goals based on factors like player interactions and game history. For example, a blacksmith might only offer a rare weapon if the player flatters him about the quality of his weapons. Or a healer NPC may only assist the player if they've shown kindness to other villagers earlier in the game.
  • State-of-mind motivated behavior: Our Customizable Reasoning feature allows for 'State of Mind' motivated behavior by creating an inner life for AI agents. This reasoning step both creates character thoughts and provides an additional context-rich deliberative step in which the AI agent can consider how it feels or think about how best to respond to an interaction with a player. This can help AI agents in video games respond with more context or more deliberately, improving character fidelity. State-of-mind reasoning can also expose a character’s thought process to the player in a thought bubble, or help characters consider and then conceal their motivations. For example, a spy NPC can pretend to be an ally while working against the player, or a curmudgeon character can insult everyone he encounters in thought bubbles.
  • Game state-integrated behavior: Integrate game state information into the NPCs' reasoning processes. For example, the ambiance of a scene might change based on a character's mood, with environmental elements like lighting adjusting to reflect this. Additionally, NPCs can use past player actions to inform their decisions, such as remembering if the player previously chose a peaceful or aggressive approach, thereby influencing future interactions and outcomes.
  • Relationship-motivated behavior: With Customizable Reasoning, you can also create agentic and customized behavior around other characters or players. For example, you could create a reasoning step that asks NPCs to proactively consider what actions they should take in a situation. NPCs can then walk around and interact with their environment or other characters based on their own agentic choices. If another NPC enters their field of view, their reaction could vary based on pre-defined relationships: ignoring the new character, greeting them, or preparing for a potential threat if the newcomer is an enemy.
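One way to picture a reasoning step like those listed above is as a function that looks at context from the game and returns both an inner thought and an action. The sketch below is hypothetical and is not Inworld's module: the Context and Decision types, the field names, the action labels, and the fixed rules are assumptions, and in practice the thought would be generated by a model rather than hard-coded.

```python
from dataclasses import dataclass


@dataclass
class Context:
    player_message: str
    seconds_near_restricted_area: float  # supplied by the game state
    relationship: str                    # e.g. "ally", "neutral", "enemy"


@dataclass
class Decision:
    thought: str  # inner monologue; could stay hidden or appear in a thought bubble
    action: str


def guard_reasoning_step(ctx: Context, loiter_limit: float = 30.0) -> Decision:
    """Analyze the context through a 'suspicion' lens, then choose an action."""
    if ctx.relationship == "enemy":
        return Decision("That one is trouble.", "draw_weapon")
    if ctx.seconds_near_restricted_area > loiter_limit:
        return Decision("They have been hanging around too long.", "chase_off")
    if ctx.relationship == "ally":
        return Decision("A friendly face.", "greet")
    return Decision("Nothing unusual.", "ignore")
```

Returning the thought separately from the action is what makes the concealment example possible: the game can act on one and choose whether to reveal the other.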

These are just a few ways our Customizable Reasoning can be used to create more agentic behavior that allows NPCs to act autonomously – but still ensure their actions are based in their context within the game. We believe this provides the balance between autonomous actions and control that developers prefer to have in games over the type of freeform action that the Stanford project focused on. 

AI agents in video games: The future

As research into AI agents continues, they’re expected to evolve a number of capabilities that would greatly enhance gameplay across a variety of genres. Here are some of the capacities AI agents are expected to develop that will benefit games. 

Advanced decision-making

While AI agents in video games currently demonstrate an ability to remain goal-oriented and make simple decisions, they are expected to develop more sophisticated decision-making capacities, allowing them to adapt to dynamic gameplay situations, implement complex game strategies, and respond intelligently to players’ actions by making autonomous real-time decisions. They will also have the ability to learn from their mistakes in order to adjust their future actions. 

In the future, AI agents in video games should be able to predict players’ actions based on memories of their past actions and understand their skill levels and playstyles, allowing them to adapt or shift strategy. This would result in more challenging and immersive gameplay, particularly in genres like strategy games, RPGs, and shooters.

Autonomous actions

AI agents will continue to improve their ability to take autonomous and emergent actions in response to dynamic gameplay. While they’re currently capable of breaking down complex multi-part tasks, in the future they’ll be capable of longer-term strategic reasoning. 

Rather than just knowing how to chop down a tree to make a fire, they’ll understand more about the longer-term strategic impact of making that fire in the game world. For example, in a resource-focused RPG, they might realize that chopping down the tree means they’ll need to plant more trees or wage war against the neighboring town to access more wood in the future. Non-NPC AI agents will also be able to take complex autonomous actions to do things like improve the gaming experience or orchestrate game narrative or game state changes.

This will make AI agents far more interesting to play against in multiplayer games than current AI systems, which exhibit predictable algorithmic gameplay and no strategic understanding.

Improved contextual awareness

While AI agents can currently be developed with contextual awareness, ensuring they understand all the nuances of a situation or environment is sometimes still difficult. After all, ‘context’ is a broad term that encompasses, and needs to weave together, a variety of capabilities in video games. Good conversational, social, and behavioral context, for example, relies on effective AI memory systems, AI reasoning, and the right length for context windows. Good character, world, and lore context, meanwhile, relies on controlling AI hallucinations through machine learning techniques like reasoning or retrieval. Ensuring characters have environmental context is even more complex, as it could rely on the AI agent’s access to the game state or on computer vision.

But the biggest challenge is orchestrating all those levels of context together so that AI agents in video games understand the nuances of a situation. For example, if an AI agent is in a room with an enemy for a negotiation, that should put them on edge, and you should see subtleties in how it affects conversation, cognition, and reasoning. If a door in the room is slightly ajar, the AI agent should think that means something and that they might be about to be ambushed. Currently, AI often misses the more fine-grained emotional, social, and environmental nuances but, in the future, this added context will allow AI to interact with players and the environment in a much more naturalistic and sophisticated way.
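One small part of that orchestration problem, fitting several layers of context into a limited context window, can be sketched as below. The layer names, the priority order, and the character budget are assumptions chosen for illustration; a production system would count tokens rather than characters and would likely rank facts by relevance.

```python
def assemble_context(layers: dict[str, list[str]], priority: list[str], budget_chars: int) -> str:
    """Concatenate context layers in priority order, skipping lines that would exceed the budget."""
    lines: list[str] = []
    used = 0
    for layer in priority:
        for fact in layers.get(layer, []):
            line = f"[{layer}] {fact}"
            cost = len(line) + 1  # +1 for the newline separator
            if used + cost > budget_chars:
                continue  # too long to fit; try the next, possibly shorter, fact
            lines.append(line)
            used += cost
    return "\n".join(lines)


layers = {
    "social": ["The negotiator across the table is an enemy."],
    "environment": ["The door is slightly ajar."],
    "lore": ["A very long history of the two factions. " * 20],
}
prompt_context = assemble_context(layers, ["social", "environment", "lore"], budget_chars=200)
```

Here the social and environmental facts fit within the budget and the long lore entry is dropped, so the details that should put the agent on edge are the ones it actually sees.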

Improved memory

In the future, AI agents in video games will have significantly improved memory which will greatly enhance their interactions and behavior. Advances in AI memory will allow AI agents to remember past interactions and decisions made by players which will add a deeper level of contextual understanding and personalization. For instance, an AI companion might remember a player's previous mistakes during a boss fight and later reference those mistakes in a new, similar encounter, creating a more immersive and realistic gaming experience. Or a non-NPC AI agent will be able to remember earlier gameplay interactions in order to adapt the game narrative in light of them or to better generate side quests that the player will enjoy. 

This nuanced memory function goes beyond simple knowledge retrieval – it involves the AI recognizing and appropriately recalling past events to make interactions feel more genuine and intelligent. By orchestrating multiple prompts and refining the contextual application of memory, AI behavior in the future will be more coherent and contextually relevant.
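The difference between simple retrieval and recalling the right event at the right time can be sketched with tagged memories scored against the current situation. This is a hypothetical illustration: the EventMemory type, the tag scheme, and the scoring formula are assumptions, not a description of any shipped memory system.

```python
from dataclasses import dataclass


@dataclass
class EventMemory:
    turn: int             # when it happened (game turn or timestamp)
    tags: frozenset[str]  # what the event was about
    text: str


def recall(memories: list[EventMemory], situation_tags: set[str], now: int, k: int = 2) -> list[str]:
    """Return up to k memories relevant to the current situation.

    Score is tag overlap plus a small recency bonus; memories with no overlap are dropped.
    """
    scored = []
    for m in memories:
        overlap = len(m.tags & situation_tags)
        if overlap:
            recency = 1.0 / (1 + now - m.turn)
            scored.append((overlap + recency, m.text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


history = [
    EventMemory(10, frozenset({"boss", "fire"}), "Player stood in the fire during the dragon fight"),
    EventMemory(40, frozenset({"shop"}), "Player bought a shield"),
    EventMemory(50, frozenset({"boss"}), "Player forgot to heal before the golem fight"),
]
relevant = recall(history, {"boss", "fire"}, now=60)
```

When a new fire-themed boss encounter starts, a companion built this way would surface the two boss-fight mistakes and ignore the unrelated shop visit.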

Improved conversational skills

Improved conversational abilities would allow AI agents in video games to engage in more dynamic dialogue with players while remaining in character, and even to coordinate attacks over voice chat with other players on their team.

These improvements will be driven by a reduction in the number of hallucinations that large language models (LLMs) currently exhibit and a greater ability for AI agents to move from conversation to autonomous goal-oriented action. This will translate into AI players with more cohesive identities who can respond quickly to barks or conversations in multiplayer games and coordinate complex strategic actions in a gameplay scenario.

How to add AI agents into your game today 

Currently, game devs can add AI agents to games via AI NPCs or non-NPC agents powered by AI Engines like Inworld’s. While Inworld’s main focus is on increasing AI agent autonomy in games by developing features focused on action orchestration and reasoning, our AI agents also enable a number of novel game mechanics like relationship progression triggers, dynamic Character Mutations, and voice commands.

Interested in our tech? 

Get started with Inworld. Get in touch to discuss signing up for the Inworld License.