BlizzCon 2010 Panel: StarCraft II Multiplayer
Following the StarCraft II Art Panel, attendees were treated to an inside look at balancing StarCraft II’s multiplayer game with Game Director Dustin Browder, Senior Designer Josh Menke, and Associate Game Balance Designers David Kim and Matt Cooper.
The StarCraft II developers feel that it’s important to take a look at the various tools that are employed in defining balance. At first, each one of these tools looks like it could be the one answer you need -- but it becomes clear over time that no single tool provides the perfect solution to balance. Instead, it takes multiple tools and a complete understanding of what those tools tell the designers. So what tools do the developers use?
Player feedback is perhaps the best tool available to the development team, as it allows for many voices to be heard across a variety of skill levels and experiences. This method also represents the largest pool of players. While data is a great tool, raw stats don’t qualify what players are experiencing from their perspectives. By reading the forums and getting feedback from the community team, the developers can gain insight into how the community is playing the game, what units they're using, and what difficulties or successes they're having.
There are drawbacks to utilizing player feedback exclusively. Sometimes the loudest of voices aren't portraying their experiences accurately, and the many can easily drown out a single voice that has different, yet important information the development team needs to make balancing decisions.
Pro players represent another important balancing tool to the development team. These players have a high skill level and understand the minute details of the game. They are also a great resource for critical feedback. On the downside, these players are generally very focused on one particular race and represent a very small subset of the community. When taking these players into account, it’s important to note that they may not know exactly why they lost a match -- whether it was due to their own error or an actual imbalance to the race, ability, or unit they are using.
Tournaments can be a great resource for observing games played at a very high skill level. When watching these matches, however, it’s important to look at the games individually and not just the end results. A talented player like Fruit Dealer may just be so good that he was going to win no matter what race he played. However, each game can give some insight into where the holes within the balance might lie. Players in these tournaments are generally very good at finding these holes and taking advantage of them, and it’s the development team’s job to keep an eye out and determine if something needs to be changed. The weakness in looking only at tournaments is that there’s no way to be certain that matches are played on equal footing. All it really takes is a single poor performance to keep a top player from progressing.
Play the Games You Make
There’s no better way to see what players are experiencing firsthand than to play the game yourself. It’s a good way to get into the trenches, analyze gameplay, and find out what’s fun, what’s not fun, what tactics work and don’t work, and so on. However, while the development team consists of players of every skill level, the team is only so large -- and even with additional feedback from within the company, it can sometimes take time before the next new strategy gets to our team.
Spreadsheets are a great tool for looking at straight damage numbers, unit build times and frequencies, unit combinations, unit costs, and more. What spreadsheets don’t tell the developers is the how or why. While designers can look at the sizes of armies and make adjustments to building times, spreadsheets can’t really take into account pathing, terrain, micromanagement, unit size, random target acquisition, and other factors that only occur in a real game.
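To make the spreadsheet point concrete, here is a minimal sketch of that style of analysis: pure damage-per-second and cost-efficiency arithmetic with no pathing, terrain, or micro. The unit stats and function names are illustrative placeholders, not live game values or Blizzard's actual tooling.

```python
# Spreadsheet-style comparison: raw numbers only, no in-game context.
# Stats below are placeholder values for illustration.

units = {
    # name: (damage_per_hit, attack_period_seconds, mineral_cost)
    "marine":   (6, 0.86, 50),
    "marauder": (10, 1.5, 125),
}

def dps(name):
    """Damage per second from raw stats."""
    dmg, period, _ = units[name]
    return dmg / period

def dps_per_100_minerals(name):
    """Cost efficiency: damage per second bought per 100 minerals."""
    _, _, cost = units[name]
    return dps(name) * 100 / cost

for name in units:
    print(f"{name}: {dps(name):.2f} DPS, "
          f"{dps_per_100_minerals(name):.2f} DPS per 100 minerals")
```

Numbers like these are easy to rank in a spreadsheet, which is exactly why they can mislead: nothing in this arithmetic captures how marauders body-block for marines or how armies path through a choke.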
Make Combat is a great in-house simulation tool that allows the development team to run various scenarios with units to see how they stack up against each other, but running one simulation isn’t enough. Simulations need to be run multiple times before any sort of pattern begins to take shape -- if there’s even a pattern to be seen. Unlike a spreadsheet, Make Combat can take a look at unit pathing and can even allow micro to be employed if the developers want to drill down a little bit more. What the simulation doesn’t do well is take into account all the myriad combinations of units or terrain. While it’s a handy tool, it’s only one of many, and results can’t always be taken at face value.
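The "run it many times" point can be illustrated with a toy combat simulator: random target acquisition makes any single run noisy, so a pattern only emerges over many trials. Everything here is an invented stand-in -- the unit stats, the round structure, and the function names are assumptions, not Blizzard's Make Combat tool.

```python
import random

def simulate_fight(army_a, army_b, rng):
    """One fight between two armies, each a list of [hp, damage] units.
    Targets are picked at random each attack. Returns True if side A survives."""
    a = [list(u) for u in army_a]
    b = [list(u) for u in army_b]
    while a and b:
        for side, other in ((a, b), (b, a)):
            for unit in side:
                if not other:
                    break
                target = rng.choice(other)   # random target acquisition
                target[0] -= unit[1]
                if target[0] <= 0:
                    other.remove(target)
    return bool(a)

def win_rate(army_a, army_b, trials=1000, seed=42):
    """Repeat the fight many times; one run tells you almost nothing."""
    rng = random.Random(seed)
    wins = sum(simulate_fight(army_a, army_b, rng) for _ in range(trials))
    return wins / trials
```

Even this toy version shows why results can't be taken at face value: it models randomness and attrition, but nothing about terrain, pathing, or the micro a real player would apply.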
Battle.net provides information on millions of games: who’s playing, what they’re playing, how people are progressing through the ladder, and more. It also allows development to look at the win/loss ratios between the races.
Matchmaking within the system, however, intentionally does not account for win/loss and looks purely at player skill -- and any existing race imbalance gets worked into that equation. “Adjusted win percentage” simultaneously considers both player skill and race balance. After each match, estimates of player skill and adjusted race win percentages are updated relative to the expected outcome of the match. In other words, if what happened was exactly what was expected, then nothing changes. If the system is surprised, then changes may be in order.
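The update described above resembles an Elo-style scheme in which both player-skill estimates and a slow-moving race-balance term are nudged by the gap between the actual result and the expected one. The sketch below is a hypothetical illustration of that idea; the constants, formulas, and names are all assumptions, not Battle.net's actual matchmaking math.

```python
# Illustrative Elo-style update: skill and a race-balance offset both
# move in proportion to how "surprised" the system is by the result.

K_SKILL = 32.0   # learning rate for player skill (assumed value)
K_RACE = 0.5     # much smaller rate for the slow-moving race term (assumed)

def expected_score(rating_a, rating_b, race_offset):
    """Probability that player A wins, given skill ratings and a
    race-balance offset (positive = A's race is favored)."""
    diff = (rating_a - rating_b) + race_offset
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

def update(rating_a, rating_b, race_offset, a_won):
    """Update after one match. If the result matches expectation
    exactly, nothing changes -- just as the panel describes."""
    expected = expected_score(rating_a, rating_b, race_offset)
    surprise = (1.0 if a_won else 0.0) - expected
    rating_a += K_SKILL * surprise
    rating_b -= K_SKILL * surprise
    race_offset += K_RACE * surprise   # imbalance absorbed separately
    return rating_a, rating_b, race_offset
```

Keeping the race term separate is what lets matchmaking judge player skill on its own while race imbalance is tracked in parallel rather than being mistaken for skill.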
From there, the developers can see the win/loss ratio of the various races within each league. Generally these tend to be relatively even across the board, though there can be cause for concern if the percentage of win/loss between the races skews toward 60%/40%. When looking at these percentages, it’s important to note that they can shift very quickly -- in as little as 36 to 48 hours -- based on a change in the metagame.
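The per-league check described above can be sketched as a simple threshold scan: compute each matchup's win rate and flag anything outside the 40-60% band the panel mentions. The data structure and numbers here are made up for illustration.

```python
SKEW_THRESHOLD = 0.60  # the 60%/40% band mentioned by the panel

def flag_skewed_matchups(results):
    """results maps (league, matchup) -> (wins_for_first_race, total_games).
    Returns matchups whose win rate falls outside the 40-60% band."""
    flagged = []
    for (league, matchup), (wins, total) in results.items():
        rate = wins / total
        if rate > SKEW_THRESHOLD or rate < 1.0 - SKEW_THRESHOLD:
            flagged.append((league, matchup, rate))
    return flagged

# Invented sample data:
sample = {
    ("Diamond", "TvZ"): (5200, 10000),   # 52% -- within the band
    ("Diamond", "PvT"): (6300, 10000),   # 63% -- worth a closer look
}
```

Of course, a flag like this is only a prompt for investigation: as the panel notes, percentages can swing within days as the metagame shifts, so a snapshot alone doesn't justify a patch.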
While percentages between the ladders may look fairly balanced in other regions, the team also looks to Korea as a global leader in developing new strategies and setting metagame trends.
It’s important to take all of these various tools into account when looking at balance. For example: When talking with the community, a common perception is that marauders are too powerful and their Stimpacks need to be nerfed. When running scenarios in Make Combat, it appears that marauder Stim isn’t overpowered and the terrans end up nearly evenly matched with zerg. Make Combat simulations also show the developers that marine Stimpacks are very powerful; however, it may be that marauders are acting as shields for the marines behind them. So is it that marauder health is too high? Or are marine Stimpacks too powerful? We still don't know -- but we’re always looking for answers to questions like these.
Pros on PvT
When asking the pros about protoss vs. terran matchups, there are conflicting opinions and a split in whether these players think one or the other is the more powerful race. When the pros aren’t sure, the development team needs to look deeper to see if perhaps there’s a more fundamental issue than game balance to deal with. Personal experiences have shown that terrans tend to be strong at the start of a match, but the protoss are more powerful toward the end, which could point to a design issue. However, as you’ve realized by now, it’s impossible to make such a determination based on these tools alone.
The designers employ all of these tools and more on a daily basis to determine future balance changes. Currently, the focus is on making terran vs. protoss matchups more fun and analyzing Stim vs. Psi Storm balance. But today’s problems will inevitably be solved, and others will invariably pop up -- and the development team is dedicated to investigating, analyzing, and balancing for the long haul.