By Harrison Froedge


A general goal of video game development is to craft an engaging experience for the player. Different game types may demand different ways of going about this; however, if part of the player experience depends on their interaction with (or possible explorations of) the game’s universe, then the design of that universe’s inhabitants (shopkeepers, enemies, animals, allies, and other NPCs) may be of key importance. To maintain the balance and progression of the game, traditional game logic (e.g., if entity 2 performs event A, trigger event B for entity 1) is often the go-to for defining the behavior of these entities – writing explicit rules allows developers to purposefully craft the experience they want the player to have. However, some adventurous developers may want NPCs that act more organically. Some may dream of a game world that evolves all on its own, with moving parts beyond the control of any agent, developer or gamer. Relatedly, the goal may be to produce a world which evolves around the player and adapts to their goals and actions – a natural world, a digital replica of our own. Writing the logic for such a project, even on a small scale, can easily become a gargantuan, bug-ridden task. Machine learning, however, may be the key to avoiding conditional-ridden code and unlocking ever more expansive game worlds.


What is Machine Learning?

As the name implies, machine learning involves the creation of algorithms which can “learn”. Machine learning algorithms (models) consider past events to make predictions about the future. These predictions then fuel decision making. The process of building such a model can be broken down as follows:

  1. Procure a dataset and/or a means of collecting data. Each data point should have various parameters (features). For example, a machine learning algorithm which predicts whether a player will enjoy a game may take variables like player age, gender, and the player’s satisfaction level into account.
  2. Convert the dataset into numerical form so that each data point can be treated as a point on a coordinate grid. Choose the feature you want to predict as your target variable (in our example, player satisfaction) and make it the y-axis.
  3. Create a line/surface of best fit to model the data against the chosen target variable. There are numerous methods of doing this. You may have heard of some of them, such as neural networks, logistic regression, and random forests.
  4. Pass new, incoming data through the model. This new incoming data will have all the features except the one you want to predict – player satisfaction. By comparing the new data point to the line of best fit for past data, a prediction for the value of the missing variable can be made.
  5. Collect the prediction as output and act on it.
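
The five steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a recommended production approach: the data points are invented, a single feature (player age) stands in for the full feature set, and the names `fit_line`, `predict`, and `should_recommend` are hypothetical helpers written for this example. The "line of best fit" here is an ordinary least-squares line, the simplest of the methods one could choose in step 3.

```python
# Steps 1-2: a tiny, made-up dataset already in numerical form.
# Each point is (player_age, satisfaction on a 0-10 scale).
past_players = [(10, 9.0), (20, 7.0), (30, 5.0), (40, 3.0)]


def fit_line(points):
    """Step 3: ordinary least squares for one feature -> (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, age):
    """Step 4: read the missing value (satisfaction) off the fitted line."""
    slope, intercept = model
    return slope * age + intercept


def should_recommend(model, age, threshold=5.0):
    """Step 5: act on the prediction (the threshold is an arbitrary choice)."""
    return predict(model, age) >= threshold


model = fit_line(past_players)
```

With a model in hand, a call such as `should_recommend(model, 15)` turns a new player's age into a decision. A real project would more likely hand steps 3 and 4 to a library and use many features at once, but the overall shape stays the same.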

Given the simple process outlined above, you might begin to wonder about all the excitement machine learning has produced in recent years. It is in fact due to this simplicity that machine learning is such a big deal. Rather than spending hours analyzing data and then outlining rules and conditions via the traditional logic approach to decide if someone will enjoy a game, with today’s machine learning developer toolkits, you can build and deploy a model in minutes that does the same thing. This is huge – not only does it take the guesswork out of constructing logic trees manually, but it allows developers to get these kinds of tasks done quickly and move on to bigger and better things.


Machine Learning in Game Development

Despite its apparent awesomeness, machine learning is rarely used extensively in game design. There are several reasons for this. The first is unpredictability. Machine learning models are inherently black boxes; rather than the well-defined, human-readable logic tree that a traditional method might produce, a machine learning model is at its best an equation and at its worst a messy, incredibly complex network of large matrices which are multiplied together to make predictions. For most applications in game development, such as AI for NPCs, the resulting model would almost certainly be much too complicated for even an AI expert to decipher. For any programming job, game development included, this can be problematic, as debugging becomes a headache, and there is no way to pre-screen code for bugs. Indeed, with machine learning models, there is no code to pre-screen, only large, incoherent data structures. Second, as a scenario grows in complexity (i.e., more variables need to be considered to make an accurate prediction), the process of selecting a functional model becomes more arduous and requires more data. That last detail is perhaps the greatest obstacle of all – data is the new oil, and without it, no machine learning can take place. (However, if you can collect data on the fly, i.e., while the game is being played, there is hope for a machine learning use-case.)

While these drawbacks do exist, machine learning that is properly integrated into games often produces great player feedback. Take Black and White, which combined elements of machine learning with manually constructed logic trees. Developed by Lionhead Studios, the game allows players to take on the role of a deity amidst tribes of ancient people. The player can choose to interact with tribes in positive or negative ways, either performing miracles or malicious acts. Players also have a “creature” which they can train to behave to their liking. If the creature interacts with villagers in a way that the player approves of (e.g., watering fields, attacking a villager), then the player can reward the creature. Otherwise, they can punish it. Each interaction with the creature is combined with the preceding creature-action to form a data point; these data points are used to build a machine learning model. The creature learns how to behave by referring to this evolving model, choosing actions which it predicts will result in positive reinforcement.
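
To make the reward-and-punish loop concrete, here is a toy sketch of a creature that learns from feedback. It is not a reconstruction of how Lionhead implemented the creature; the `Creature` class, its method names, the action names, and the `explore_rate` parameter are all assumptions made for illustration. The technique swapped in is a simple running average of feedback per action, with the occasional random action so that untried behaviors still get a chance.

```python
import random


class Creature:
    """Toy learner: tracks the average feedback received for each action."""

    def __init__(self, actions, explore_rate=0.1, seed=None):
        self.actions = list(actions)
        self.explore_rate = explore_rate  # chance of trying a random action
        self.rng = random.Random(seed)
        self.avg_reward = {action: 0.0 for action in self.actions}
        self.counts = {action: 0 for action in self.actions}

    def choose_action(self):
        # Occasionally explore, so new behaviors can still be discovered.
        if self.rng.random() < self.explore_rate:
            return self.rng.choice(self.actions)
        # Otherwise pick the action with the highest predicted reward.
        return max(self.actions, key=lambda action: self.avg_reward[action])

    def give_feedback(self, action, reward):
        # Each (action, feedback) pair is one data point; fold it into
        # the running mean for that action.
        self.counts[action] += 1
        error = reward - self.avg_reward[action]
        self.avg_reward[action] += error / self.counts[action]


# Example: the player rewards watering fields and punishes attacks.
creature = Creature(["water_fields", "attack_villager"], explore_rate=0.0)
creature.give_feedback("water_fields", 1.0)
creature.give_feedback("attack_villager", -1.0)
preferred = creature.choose_action()  # "water_fields"
```

Even this tiny model shows the appeal described above: no rule ever says "prefer watering fields"; the preference emerges from the feedback the player chose to give.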

Black and White was not a small title – it had a sizable team and plenty of capital to support its development. However, that shouldn’t be a put-off for small teams, because Black and White was released nearly twenty years ago, long before the advent of today’s machine learning toolkits and libraries like Keras, TensorFlow, and scikit-learn. These toolkits put work that once required a full development team into the hands of individual devs, who may have little to no knowledge of advanced mathematics or AI. To reinforce the idea that machine learning is not voodoo, consider the fact that gamers with some technical know-how have been using these toolkits to build bots for various games. Check out MarI/O and this neat Old School RuneScape PvP bot for some cool examples. With relatively minimal setup, bots were made to act fairly autonomously within the game world, even interacting in complicated ways with other players. If a bot can do it, so can an NPC.



Although not currently a bread-and-butter tool for game developers, machine learning has great potential in the gaming industry. With little effort, a developer now has the power to construct a complex system in minutes and deploy it for immediate use. In the game development world, this kind of power can mean big things – abilities once only within reach of deep-pocketed studios have been handed down to the individual developer, and sandbox-esque NPC experiences once deemed too complicated to attempt are within the realm of possibility. With this touch of optimism in mind, you should remember a few things when dabbling in machine learning. Above all, you should keep in mind that it is an active area of research, and there are still many facets of the field that are rough around the edges. That being said, considerable progress has been made in the two years since modern machine learning toolkits were made available to the public – progress that might serve as inspiration for your own use-cases. If a car can be made to learn how to drive in the real world, for example, perhaps an NPC can be taught to move around your game world in an intelligent manner. Until you give it a try, we can only guess.