Profiles of student success in training games may guide curriculum development.
Each scenario in Strike Group Defender is displayed on screen as above. Players select responses to threats from the icons on the right and get alerts on new threats or incoming messages from the log in the lower right. Image courtesy of the researchers.

Designers of educational materials have traditionally relied on teachers' observations and anecdotal information about students' success with a specific curriculum to develop new lessons and decide the sequence in which those lessons are presented. Increasingly, videogame-like programs are being used to teach fundamental skills and provide specialized training. The capability of these videogames to collect vast amounts of information on users' behaviors leads to an interesting question: Can those data be used to develop teaching tools custom tailored to the needs of each student or trainee?

A research team from MIT Lincoln Laboratory, seeking to answer that question, used data collected from users playing an educational videogame to see what machine learning could do to improve the users' training. The game, Strike Group Defender, was developed by the Laboratory in partnership with the Office of Naval Research and provides realistic scenarios in which sailors can learn and practice the decision making necessary for responding to missile threats to U.S. Navy ships.

"We first sought to identify the types of players who emerge during gameplay and the types of strategies they develop," said team member Sung Son, assistant leader of the Laboratory's Ballistic Missile Defense System Integration Group. To get a profile of user behaviors, the team examined data to find the correlations between numbers of games played, games quit, tutorials played, games replayed, games quit, and "checkup tests" attempted. The team also looked at data about the specific actions taken during a game; for example, an interceptor was launched to destroy an incoming missile, a decoy was deployed to lure the missile away from the target, electronic jamming was used to misguide the missile. Clustering algorithms grouped players characterized by their behaviors and tactics.

"Understanding player behavior can help game designers develop training levels to guide players toward exploring the game in a productive sequence," said Reed Jensen, a technical staff member in the Advanced Concepts and Technologies Group. "Furthermore, an instructor can use data about a player's behavior to tailor instruction so it is best suited to that player's style." A capability to individualize instruction could lead to learning that is more effective and long term than that acquired through a one-size-fits-all curriculum.

The research team used clustering algorithms to identify four player types and four types of strategic actions. After clustering players, the team correlated the types and tactics with players' success (i.e., scores) on the game. "We found that the behaviors present in each cluster mapped to players' performance," said Son. "This post hoc correlation bodes well for being able to use players' behavior to predict whether they will be high or low achievers."
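A post hoc check of that mapping can be as simple as comparing average scores across the behavioral clusters, as in the short sketch below; the cluster labels and scores here are placeholders standing in for the assignments and game scores the team actually analyzed.

```python
import numpy as np

rng = np.random.default_rng(1)
labels = rng.integers(0, 4, size=200)        # placeholder cluster assignments
scores = rng.normal(1000, 200, size=200)     # placeholder game scores

# Compare average score across the four behavioral clusters.
for cluster in range(4):
    members = scores[labels == cluster]
    print(f"cluster {cluster}: n={members.size}, mean score={members.mean():.1f}")
```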

Several trends emerged in the post hoc analysis of player behavior. The best players tended to explore the game by repeating each level before moving on to a tutorial for the next level; they did not hesitate to quit a level and retry it. Players who performed second best attempted the tutorials unsystematically rather than practicing each level repeatedly. The group ranked third explored only a few tutorials, repeated a level only a few times, and rarely paused the game to take stock of the situation. Of the four groups, the worst performers spent the most time with a game paused and quit levels infrequently.

These results suggest that, for optimal success, players should master a level (indicated by repeats of a level) before moving on and should investigate each level's tutorial before challenging that level. The team next wants to test this hypothesis to see whether having beginners follow the strategies of the best players will improve the beginners' skills.

The analysis of tactics through the prism of game success showed that the best players repeatedly employed techniques that had already achieved success in scenarios. The researchers suggested that future exploration into the effectiveness of training games could study how much influence prior instruction on successful tactics, including lessons from experts, could have on players' success in the game.

Understanding player behavior can help game designers develop training levels to guide players toward exploring the game in a productive sequence.

Reed Jensen

The researchers next turned to applying machine learning techniques, such as deep learning, to the prediction of game players' performance. Such predictions could enable instructors to guide individual players to additional practice with levels or to appropriate tutorials.

An investigation into the prediction accuracy of the machine learning model showed that it classified players into the top or bottom half of performers fairly accurately. The team also determined that the features incorporated into the model that were most significant in forecasting player success were the proportion of games quit, the number of games played at a new level, and the proportion of "tests" attempted.
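In spirit, that prediction step resembles the sketch below: a classifier is trained to separate top-half from bottom-half performers and its feature importances are then inspected. The random forest, the feature names, and the synthetic data are assumptions made for illustration, not the team's actual model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
feature_names = ["prop_games_quit", "games_at_new_level", "prop_tests_attempted",
                 "tutorials_played", "games_replayed"]
X = rng.random((300, len(feature_names)))     # hypothetical behavioral features
y = (rng.random(300) > 0.5).astype(int)       # 1 = top half by score, 0 = bottom half

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())

# Rank features by how much the fitted model relies on them.
clf.fit(X, y)
for name, importance in sorted(zip(feature_names, clf.feature_importances_),
                               key=lambda pair: -pair[1]):
    print(f"{name}: {importance:.3f}")
```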

One surprising correlation the team noted was that the players predicted to perform well were the ones who started the most games but also quit games most frequently. "This result could indicate that high achievers restart games when they sense a mistake so that they can make a new attempt at perfection rather than continuing to play a game that is going poorly," said Matthew Gombolay, a technical staff member in the Ballistic Missile Defense System Integration Group. However, Gombolay notes that "it is unclear whether this restarting behavior should be imitated by poor-performing players to improve their performance, or if it just happens to be indicative of people who are already high-performing." The team hopes to answer that question in follow-on studies.

Another experiment the team conducted was designed to predict when players would disengage from the game altogether. Using features such as the number of games paused or the variety of games attempted, the team trained machine learning models for players who quit after specific numbers of game starts. They used a deep learning algorithm to determine when players exhibiting certain features would disengage.
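A minimal version of such a disengagement predictor might look like the following sketch, with a small fully connected network standing in for the team's deep learning model and synthetic placeholder data in place of real gameplay features.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical per-player features, e.g., games paused, variety of games
# attempted, games quit, tutorials played.
X = torch.rand(500, 4)
y = (torch.rand(500) > 0.7).float()          # 1 = player later disengages

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

# Train a binary classifier: will this player quit the game altogether?
for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X).squeeze(1), y)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    predictions = (torch.sigmoid(model(X).squeeze(1)) > 0.5).float()
    print("training accuracy:", (predictions == y).float().mean().item())
```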

The models predicted with almost perfect accuracy if a player would quit after playing only one game. The models were also able to identify 80 percent of players who would disengage after playing 2 to 5, 6 to 16, and 16 to 30 games. This ability of the models to quickly flag players who are likely to quit a game could allow instructors to recognize when trainees are struggling with a game scenario so that they can intervene and direct trainees to helpful tutorials before those players give up altogether.

Further examination of the sequence, or trajectory, in which players undertook the tutorials in Strike Group Defender revealed that a model derived from data on players' trajectories could predict whether players would perform well or poorly. The implication is that the team could use the model to recommend a specific sequence of tutorials a player should complete to most improve his or her performance. Gombolay is particularly excited about this finding: "To the best of my knowledge, this is the first time that computers have had the ability to learn from data how to develop curricula and lesson plans as a human teacher would." The team hopes to test their model to see how well it can improve players' performance.
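One way to make trajectories machine-readable is to treat each player's ordered list of completed tutorials as a sequence and learn to predict performance from it, as in the hedged sketch below. The embedding-plus-LSTM model, the tutorial IDs, and the labels are illustrative assumptions rather than the team's published approach.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_tutorials, seq_len, players = 12, 8, 400
sequences = torch.randint(0, num_tutorials, (players, seq_len))  # tutorial order per player
labels = (torch.rand(players) > 0.5).float()                     # 1 = high performer

class TrajectoryClassifier(nn.Module):
    """Predict high vs. low performance from the order of tutorials completed."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(num_tutorials, 16)
        self.lstm = nn.LSTM(16, 32, batch_first=True)
        self.head = nn.Linear(32, 1)

    def forward(self, x):
        _, (hidden, _) = self.lstm(self.embed(x))
        return self.head(hidden[-1]).squeeze(1)

model = TrajectoryClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for _ in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(sequences), labels)
    loss.backward()
    optimizer.step()
```

A model of this kind could, in principle, be used to score candidate tutorial orderings and recommend the most promising sequence to a new player, which is the use the team envisions.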

The research into using data-driven methods to understand game players' behaviors raises many questions for future investigation: Can models be developed to identify weak players before they become ineffectual or disengaged? Can an optimal sequence of lessons be prescribed for players? Can insight into a trainee's gameplay style help instructors tailor a training trajectory for that individual?

Answers to these questions would inform both the development of future iterations of Strike Group Defender and the way Navy instructors present the game to trainees. Moreover, findings from this research could be extrapolated to help developers of educational games re-evaluate and redesign their products and to give teachers a better understanding of the use of games to improve student learning.