Articles

This portion of the site is dedicated to my writing about games and game design. Everything I write is grounded in a wide variety of experiences I’ve had in the games industry, but I also recognize that my viewpoint can hardly be universal. Games are more art than science, after all, and art is open to everyone’s own interpretation. Still, I’ve picked up a few things over the years, and I’m excited to share them with anyone who is interested!

Max Brooke

Playtest Design Lessons Part Two: Five More Lessons

Last week, I discussed overall playtest design basics, and how designers and testers alike can get the most out of each other to create awesome games. That was Lesson One, which I would sum up as “Understand and communicate about the scope of the playtest.” Today, I’m delving into five more lessons I see as key to being a designer who sets their playtesters up for success and being a playtester who delivers invaluable feedback.

Lesson Two: Identify What You Like

The microscope is a metaphor for, uh… your own sense of introspection? Listen, this was a hard one to find an image for.

Different people are going to like different things about a game. Bartle’s Taxonomy is one lens through which to look at this (as Andrew Fischer kindly described in one of his articles, saving me the trouble), and there are numerous others. Whichever model(s) you want to apply to your game’s players, individual player preferences for activities in games will certainly come out during playtesting. This is a good thing - the player base at large will also have a diverse set of opinions. It’s important that the playtesters reflect this range of experiences. But remember that a game is malleable during playtesting. This means that when two groups like the game for different reasons, the feedback will naturally pull the designer in two different directions. Designers and playtesters should both account for this in their thinking.

Lesson for Designers: If you’re the designer, then you need to know what you like about your game (or the piece of content you’re designing for an existing game). I’ve talked about identifying the heart of an experience before, and during early development (especially of a new game), that can be a process of discovery within yourself. Even if the heart of the experience evolves as you playtest, iterate, and playtest more, you should continue to check in with yourself: does what you like about the game still shine? If what you like and what the playtesters like are diverging, why is this happening? If the playtesters are pushing back on a mechanic, ask yourself: is that mechanic actually enhancing the parts of the game you like, or are you just defensive of what you want it to accomplish?

A concrete example of this comes from when I was working on the calculate mechanic for X-Wing 2nd Edition. Frank, Alex, and I knew that we wanted it to reflect the difference between synthetic and organic pilots, and Frank and I were working on specific implementations. We tried iterations where it persisted round-to-round, iterations where it wasn’t spent on use, iterations where the action gave you multiple tokens, but none of them quite clicked - they were proving too difficult to balance against focus, and the playtesters were pushing back. We were a bit stymied. But coming in with a bit of distance, Alex came up with an answer: calculate would be intrinsically different only by being weaker.

Initially, I didn’t like it - the expected values between changing one focus result and changing all focus results looked too similar on paper. But on assessing this idea more deeply, I realized that Alex’s solution did still achieve what we had liked about the prior iterations. Individual droid ships and pilots could have abilities to make them behave more differently on a case-by-case basis, but “change one focus result” does feel significantly more limited than “change all focus results” when you’re actually choosing to take a calculate action. Players tend to vividly imagine the risk of rolling all eyeballs even if it rarely happens. What I had liked about the prior implementations of the content was that they enforced the difference between the two modes of reflecting the pilot’s concentration, but Alex’s option still achieved that.
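As a rough illustration of that on-paper comparison, here is a minimal sketch in Python. It assumes an eight-sided attack die with four hit-type faces, two focus faces, and two blanks; those face counts are an assumption and are not stated above, and `expected_hits` is a hypothetical helper, not anything used in the actual design process.

```python
from fractions import Fraction
from math import comb

# Assumed face counts for an eight-sided attack die (not given in the article).
FACES = 8
HITS = 4   # hit-type faces, assumed
FOCUS = 2  # focus ("eyeball") faces, assumed


def expected_hits(dice, max_converted=None):
    """Expected hits when up to `max_converted` focus results become hits.

    max_converted=None models "change all focus results";
    max_converted=1 models "change one focus result".
    """
    p_focus = Fraction(FOCUS, FACES)
    # Chance a die that did not roll focus shows a hit-type face.
    p_hit_given_not_focus = Fraction(HITS, FACES - FOCUS)
    total = Fraction(0)
    for k in range(dice + 1):
        # Probability of rolling exactly k focus results.
        p_k = comb(dice, k) * p_focus**k * (1 - p_focus) ** (dice - k)
        converted = k if max_converted is None else min(k, max_converted)
        natural = (dice - k) * p_hit_given_not_focus
        total += p_k * (natural + converted)
    return total


change_all = expected_hits(3)       # Fraction(9, 4), i.e. 2.25
change_one = expected_hits(3, 1)    # Fraction(133, 64), about 2.08
all_eyeballs = Fraction(FOCUS, FACES) ** 3  # Fraction(1, 64)
```

Under these assumptions, three dice differ by well under a fifth of a hit on average, and rolling all focus results happens only once in 64 rolls, which fits the point that the gap feels larger at the table than it looks on paper.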

Lesson for Playtester: One thing that is important to remember is that, unless you have worked with the designer for a very long time, they probably don’t know what you like about the game. Are you a player who enjoys the mental challenge of the game? A player who comes in for the lore, or thematic considerations? A player who is very much driven by a particular fantasy the game provides? Are you a player for whom community is the biggest draw? Is there a particular playstyle that keeps you invested? In all likelihood, all of these things factor into your enjoyment of the game in some way, and providing this context along with your feedback can make it much more useful to the designer.

This information helps the designer make a better game in a couple of ways. First off, it helps the designer see which parts of the game are working for which types of player. If the alignment of the mechanics with the lore is sufficient for most people, but not landing for the people who care the most about the setting of the game, it might need a tweak even if it’s something that most people report as being acceptable. By contrast, if what you like about the game is a particular playstyle, the designer needs to weigh this alongside your opinions about competitive balance. That doesn’t mean that the designer can discount your opinion - on the contrary, if that playstyle is keeping you playing the game, and others echo this sentiment, the designer needs to consider it in their long-term thinking. And it can often help to untangle why experiences were negative for you in the moment. If something negatively affects your preferred playstyle, but wouldn’t affect it as much with a minor tweak, the designer can make that minor course-correction if they are aware of the reason behind your feedback. Simply asking for a change without explaining why it’s impacting something you like is far less useful to the designer.

Lesson Three: Delineate Events and Experiences

Subjective experiences are a critical element of what playtesters provide to a designer. I outlined the role of balance feedback in the prior article but, as the example with Darth Vader illustrates, “balance” is not something that can be measured with numerical models alone. As a designer, you need to know what you are balancing “toward.” As a playtester, you need to give the designer experiential feedback that helps them understand how the current balance “feels” so that they can decide whether and how to adjust it.

But the designer also needs to know the events that transpired in the game, to understand mechanistic issues that may have arisen or assess where an individual’s experiences may have differed from the rest of the playtesting pool.

Lesson for Designers: When designing your feedback mechanism (questionnaires, verbal interviews, or whatever method you prefer), ask the questions in such a way that they encourage the testers to report events in one question, then reflect on those events in a subsequent question. Even if you are purely looking for competitive analysis data, give your testers plenty of space in which to report their subjective experiences. And once you have the data, assess it in three ways:

  • What are the trends in the subjective feedback?

  • What are the trends in the reported events?

  • How do these trends overlap and differ?

Imagine that while comparing subjective feedback, you notice that numerous testers are claiming a particular mechanic or strategy is too weak. Look at whether it is actually winning games. Then look at how its record stacks up to the subjective feedback. In the games where it is reported as being weak, are there any other consistent variables? Sometimes you will find a lurking variable this way - perhaps a piece of content is particularly vulnerable to a given archetype or strategy, or perhaps some counterplay trick players can learn renders it far less effective. This can help you decide whether the issue is one of feel, mechanical potency, or some combination of the two.
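The comparison described above can be sketched in a few lines of Python. Everything here is hypothetical: the report rows, the strategy and archetype names, and the `summarize` helper are made up for illustration, and a real feedback form would carry far more detail.

```python
from collections import defaultdict

# Hypothetical report rows: (strategy, opponent_archetype, won, felt_weak)
reports = [
    ("Swarm", "Aces", False, True),
    ("Swarm", "Aces", False, True),
    ("Swarm", "Control", True, False),
    ("Swarm", "Control", True, False),
    ("Swarm", "Control", True, True),
    ("Swarm", "Aces", True, True),
]


def summarize(rows, strategy):
    """Compare how often a strategy wins with how often testers call it weak."""
    rows = [r for r in rows if r[0] == strategy]
    if not rows:
        raise ValueError(f"no reports for {strategy!r}")
    wins = sum(1 for r in rows if r[2])
    weak = sum(1 for r in rows if r[3])
    by_opponent = defaultdict(lambda: [0, 0])  # archetype -> [wins, games]
    for _, opponent, won, _ in rows:
        by_opponent[opponent][0] += int(won)
        by_opponent[opponent][1] += 1
    return {
        "win_rate": wins / len(rows),
        "felt_weak_rate": weak / len(rows),
        # A large gap between archetypes hints at a lurking variable.
        "win_rate_by_opponent": {o: w / g for o, (w, g) in by_opponent.items()},
    }


summary = summarize(reports, "Swarm")
```

In this made-up data, the strategy wins most of its games overall yet is often reported as weak, and the per-opponent breakdown shows the losses cluster against one archetype. That is the kind of overlap between event trends and subjective trends worth investigating.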

Lesson for Playtester: As outlined above, presenting your subjective feedback is key to your role as a playtester. Unfortunately, the truth is that for most people, losing is unenjoyable. This can make it difficult to determine whether you disliked something because of an intrinsic part of the experience, or simply because you lost. Where this differentiation will become apparent to the designer is across a wide array of data from numerous testers. If lots of people are reporting losing to something and also reporting hating it, the designer might determine that it’s a balance issue. If only a few people are reporting losing to something but lots of them are also reporting hating it, the designer might instead decide that the issue has to be solved with a fundamental rework of the content in question. You can make the designer’s job a lot easier in this regard by analyzing your subjective experiences, and presenting them (as best you can) separately from the relatively objective series of events that occurred in the playtest.

Some designers ask for this by default, such as with a series of questions about what happened in the playtest game, followed by more detailed feedback about the content that felt weak, powerful, or problematic. Others do not, in which case it is useful for you to make this division yourself, such as by first giving a brief overview of the game’s events (“On turn 1, I did this and my opponent did this. On turn 2…” etc). This doesn’t need to be exhaustive in detail, but try to focus in on the events as they transpired. It can be especially useful to take notes on these events during the game itself, so that you don’t let the details slip into the vagaries of memory.

Then present your subjective experiences. Did you really enjoy some piece of content you were playing with? What made it fun? Did a particular ability make you have less fun in the game? If so, why? Did you regret anything you chose to play? For this portion of the feedback, try to give the designer an impression of what someone encountering this content in the world might feel.

Finally, give your conclusions. What do you take away from this experience? Do you have suggestions for how the designer might proceed? Did something seem fine on a conceptual level, but too efficient? Offer details as needed. This could also be where you provide the math to underscore your conclusions (if efficiency is easily derivable from a formula in the game in question - in many, it is not), or offer comparative insights.
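If it helps to keep those three parts separate, the structure can be sketched as a small Python record. This is only one possible layout under my own assumptions; the `PlaytestReport` class, its field names, and the section titles are hypothetical and not a format any particular designer requires.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlaytestReport:
    """Keeps events, subjective experiences, and conclusions in separate lists."""

    events: List[str] = field(default_factory=list)
    experiences: List[str] = field(default_factory=list)
    conclusions: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the report with events first, then experiences, then conclusions."""
        sections = [
            ("What happened", self.events),
            ("How it felt", self.experiences),
            ("Conclusions", self.conclusions),
        ]
        lines = []
        for title, entries in sections:
            lines.append(title + ":")
            lines.extend("  - " + entry for entry in entries)
        return "\n".join(lines)


report = PlaytestReport(
    events=["Turn 1: I advanced, my opponent set up a flank."],
    experiences=["The flank felt impossible to answer."],
    conclusions=["The flanking ability may be too cheap."],
)
text = report.render()
```

Writing the event list during the game and filling in the other two sections afterward keeps the relatively objective record from being colored by how the game ended.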

Lesson Four: Don’t Catastrophize Over a Single Bad Test

The most tempting, forbidden button in all of Tabletop Simulator…

Whether you’re the designer or a playtester, after you see or experience a frustrating loss firsthand, it’s easy to feel like the content you’re looking at will ruin the game. And it’s possible that that instinct is right, but it rarely helps to make big changes based on only one data point.

Designer Lesson: Sometimes a single test game goes off the rails, and everyone has a bad time. As a designer, there are few work experiences more excruciating than observing or playing in the test game that just sucks. Reading about it happening is marginally better (for me, at least), but still can be painful. It usually makes me want to set my prototype on fire.

It’s also part of the job, and usually necessary to get to a good game. So what do you do?

First, take a breath. The point of playtesting isn’t for you or the testers to play a game that is already fun. That’s the point of the game you’re trying to create. But the actual act of playtesting sometimes has to include playing a game that is currently unfun. Otherwise, you’ll never get the information you need to make it fun.

So if you’ve had a really bad test, try it again. Try it with a different group. If there’s something egregious, tweak it, but try not to blow up the whole design. Instead, change as little as possible and see if the issue recurs. If it does, consider making more sweeping changes to address it.

Playtester Lesson: You’ve just had a terrible loss, and you really want to warn the designer about how they’re about to ruin the entire game. What do you do?

First, take a breath. Any experienced designer has assuredly heard countless doom-and-gloom prophecies about their games, and will likely be fairly jaded to this sort of feedback when it appears in isolation. This doesn’t mean that you shouldn’t report this bad experience, but it does mean that you should scale the level of alarm you raise to the amount of information you’re basing it upon. Report the data of the test, but before you ring the bells too loudly, see about dedicating another one of your allocated tests to duplicating the circumstances. Switch lists with your opponent. Play the strategy against another tester, and see what happens. Did you get the same outcome? If you have access to other testers, encourage them to try the problem content and report their results. Remember, the power of playtesting lies in data aggregation. Occasionally, a single test will be enough to suss out that something is a problem, but most of the time, more data is needed to draw a solid conclusion. While tabletop games playtesting rarely reaches the level of rigor of a scientific journal (and won’t until universities with large endowments start chucking around generous tabletop game research grants), the same fundamental principles apply to the inquiry process.

Lesson Five: Examine the Things that Seem Inefficient

Over years of watching games develop, I’ve seen one clear trend: the most overpowered things that make it to print are almost always parts of the game that failed to make a good first impression. It happens like this: something doesn’t catch the playtesters’ eyes early on, so the designer continues to drop the price or increase the efficiency to get people to try it. But the playtesters have limited time, and have already mentally categorized that piece of content as “weak,” so they don’t end up testing it. Even the designer ends up undervaluing it, but they keep dropping the price to see if this will be the right price. This goes on through the entire process. Then you come to a new phase of testing with new testers (or, in the worst case scenario, the product release) and everyone looking at it with fresh eyes immediately identifies the overpowered content.

Designer Lesson: You should be watching like a hawk to see what is going unused. Overused content will always be at the forefront of your mind, but underused content in testing is way more dangerous to the balance of your game. If a piece of content isn’t showing up in tests, don’t just drop its cost a little bit and expect the playtesters to realize it has changed. They’re trying to absorb an entire new pool of content, and some mental shortcuts are inevitable. This means that if they’ve written something off as “weak,” it will stay “weak” in their minds until such time as you radically increase its efficiency or rework it entirely. Remember the frog in the boiling pot of water: a slow increase in temperature isn’t noticeable, but a dramatic, rapid shift will jolt the frog to react. This isn’t to say that frogs make good playtesters, but I’ve never actually tried, so if you do give it a shot and it works out, let me know.

If you want to make sure this isn’t happening (and your game accommodates it), during balance testing, you can always create a spreadsheet of all of the content and assign it out for some number of games. While this doesn’t guarantee undervalued things will be identified, it at least means they won’t go entirely untested.
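The spreadsheet idea above can be sketched in Python as a simple rotation. The `assign_content` function, the tester names, and the content names are all hypothetical, and the sketch assumes `games_each` is no larger than the number of testers so that no one is handed the same piece twice.

```python
from itertools import cycle


def assign_content(content, testers, games_each=2):
    """Give every piece of content `games_each` test slots, rotating through testers."""
    if games_each > len(testers):
        raise ValueError("games_each should not exceed the number of testers")
    schedule = {tester: [] for tester in testers}
    rotation = cycle(testers)
    for piece in content:
        for _ in range(games_each):
            # Consecutive slots go to different testers because the rotation never repeats early.
            schedule[next(rotation)].append(piece)
    return schedule


schedule = assign_content(
    content=["Pilot A", "Pilot B", "Upgrade C", "Upgrade D"],
    testers=["Ana", "Ben", "Cam"],
    games_each=2,
)
```

A plain rotation like this does not balance matchups or account for tester preferences; it only guarantees that every piece lands on the table a minimum number of times.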

Playtester Lesson: This one’s really simple: you can help by intentionally dedicating some of your tests to content you find boring, weak, or otherwise unappealing. Then report in on how it fared. Remember, the more data the designer has on something, the better they can tune it to excellence. This goes for bringing overpowered content down to an appropriate level, but it applies equally to unappealing content.

This may not be necessary if the designer provides assignments to playtesters, making sure that all elements receive table time. However, taking the attitude of wanting to use content that seems unappealing can still apply in these circumstances. Instead of simply selecting it as required and then bringing other pieces you know are powerful to compensate, try to find a way to make the thing you’ve been assigned really shine. If you can’t, the designer definitely needs to know. Remember, your job as a playtester isn’t necessarily to win any given game - it is to gather crucial data for the designer. If you suffer a crushing defeat because you leaned into a strategy that you thought was inefficient, the designer should recognize that this test was just as valuable as a close-fought game.

Lesson Six: Try Not to Put the Conclusions Ahead of the Data

No, no, you just teach the horse to walk backwards, and then…

As a designer or as a playtester, it can be easy to muddle your conclusions with the data you are assessing or gathering. Worse, you may be tempted to place your conclusions first and then find data to support it. Playtesting is almost never double-blind, and rarely peer-reviewed. It is data-based, but much of the data is subjective. Games are, after all, at least as much art as science.

Designer Lesson: As a designer, you want your game to be good. And when you’re six iterations deep on a specific mechanic, you really want it to be good now. With a large enough pool of playtest data, it’s easy to cherry-pick support for the conclusion that things are fine already, without even meaning to. Make an active effort to draw your conclusions from the data, and not vice-versa.

Here are a few concrete strategies I’ve found effective for drawing conclusions from data without letting your desires leach into the process (too much, at least):

  • Use spreadsheets to tally up how many times events (positive, negative, or neutral) are occurring across the testing feedback you’ve received. This is especially helpful if you have access to full reports on the games.

  • Ask your playtesters to give numerical ratings to their feelings, especially with regard to how fair a game felt (I usually use a 1-5 scale, with 1 being “very unfair in Player 1’s favor” and 5 being “very unfair in Player 2’s favor”). This can help reduce the ambiguity you might read into more qualitatively presented feedback.

  • Get someone uninvolved with the project to read over the playtest feedback around a particular topic and tell you their conclusions from the data. They might well identify trends you missed, or see something you wanted to avoid looking at to preserve your ego.
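The first two strategies in the list above can be sketched without a spreadsheet at all. The event labels and ratings below are invented for illustration, and treating 3 as the "even" midpoint of the 1-5 fairness scale is my reading of that scale rather than something stated outright.

```python
from collections import Counter
from statistics import mean

# Hypothetical event tags pulled from written reports.
events = [
    "ability X decided the game",
    "rules question on timing",
    "ability X decided the game",
    "close finish",
]
tally = Counter(events)
most_common_event, occurrences = tally.most_common(1)[0]

# Hypothetical fairness ratings: 1 = very unfair in Player 1's favor,
# 5 = very unfair in Player 2's favor, 3 assumed to mean "even".
ratings = [2, 3, 1, 2, 3]
lean = mean(ratings) - 3  # negative leans toward Player 1, positive toward Player 2
```

Counting tags this way makes it harder to remember only the reports that agree with you, and a single signed number for the fairness lean is easy to track from one iteration to the next.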

Playtester Lesson: As a playtester, your job is first and foremost to gather and present data: subjective data on your experiences, and relatively more objective data on events that occurred in your playtest games. Your conclusions may also be very useful, but they are secondary to providing the designer with the raw material needed to improve the game.

While you can run tests based on a hypothesis (“List A is overpowered”), test that hypothesis (play List A against List B three times), and then submit the data that supports your hypothesis (“List A beat List B three times”), always ask yourself at each step of the way: “Is my data driving my conclusions, or vice-versa?” If you are concerned that your desire to see a certain outcome may be biasing your results, suggest to the designer that they run a test with unbiased participants. If your hypothesis is correct, this gives it a much better chance of being addressed than your data alone.
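To see why three games is thin evidence, here is a minimal sketch of the arithmetic in Python. It assumes the two lists are actually evenly matched and that games are independent coin flips, which real games are not; `at_least` is a hypothetical helper built on the standard binomial formula.

```python
from math import comb


def at_least(wins, games, p_win=0.5):
    """Chance of `wins` or more victories in `games` if each game is won with probability p_win."""
    return sum(
        comb(games, k) * p_win**k * (1 - p_win) ** (games - k)
        for k in range(wins, games + 1)
    )


sweep_of_three = at_least(3, 3)    # 0.125
eight_of_ten = at_least(8, 10)     # 0.0546875
```

Under those assumptions, a perfectly fair matchup still produces a 3-0 sweep for List A one time in eight, so "List A beat List B three times" is a reason to run more tests rather than proof of a problem.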

Bonus Lesson: Consider the Other Side

This is a gimme, but an important one. Think about things from the playtester’s perspective if you’re a designer, and think about things from the designer’s perspective if you’re a playtester. If you’re a designer, think about how you would see the playtest you’ve designed as a playtester. If you’re a playtester, ask yourself what the designer is trying to get from you, and why they might want that information. Both sets of lessons apply to both groups, because each can do their job better by putting themselves in the other’s shoes.

Max Brooke

Playtest Design Lessons, Part One: Scoping it Out

Last week, I took a break from my article schedule because… it just didn’t feel right to post a normal article after a violent insurrection that attempted to overthrow the legitimate result of the United States’ most recent democratic election. I tried to write something more topical, something that took on hard questions of art and social responsibility, but I couldn’t find the words to express what I wanted. Maybe someday I’ll be able to articulate myself on that topic. I’ll keep trying. But today I want to publish the article I had intended for last week: part one of a two-part article on playtesting and design.

Despite the well-known importance of playtesting, I feel like I rarely see designers writing to other designers about the specifics of their playtesting process, or writing to playtesters about how to give the most impactful feedback. As such, I’ve decided to delve into this topic over the next two weeks. The remainder of this week’s article will address some basics, and the fundamental concept of playtest scope. Next week’s article will delve into five lessons for how designers and playtesters can set each other up for success.

First Off, What is Playtesting?

For those completely new to tabletop games, I would define playtesting as follows:

Playtesting is the act of playing an unfinished game with the intent of observing and reporting on what one experiences.

Playtesting should help the designer answer certain questions: How do other people feel when they play the game? What incentives does the game create through its rewards and challenges? How is the game interpreted by players who don’t know what you were thinking? Does it feel fair? Is it fun?

The designer then takes this feedback, assesses it, integrates some parts of it into the game, and creates a new iteration, which is subjected to playtesting again. This process is repeated until the game is good (or until the deadline means that the game must be finalized).

Within the game design process, I see playtesting as where a tabletop game goes from being a concept to being a game. I’ve worked with a lot of playtesters, read more playtest feedback than I can really conceptualize, done a lot of playtesting of games I did and didn’t design, and generally seen the process from all angles, across all sorts of tabletop games. It is unquestionably one of the most vital and most labor-intensive parts of the design process. For ongoing games, the playtesters are often the most engaged, dedicated players of the game, and many put vast amounts of time and energy into playtesting. For a designer, high-quality playtest feedback is always a cornerstone of making an excellent game (or creating excellent new content for an existing game).

What Does Playtesting Look Like?

Playtesting looks like playing the game, then delivering some sort of report of the experience (verbal, written, or mental, if the designer is playtesting and/or a telepath) to the designer. I say that it “looks like” playing the game because, fundamentally, it isn’t the same activity. While playtesting can be quite fun and very fulfilling, it isn’t the same as playing the game recreationally. First, the game is unfinished, and may well not be very good (yet). Second, and perhaps more significantly, it requires the tester to have a certain degree of introspection, and to assess not just the game but also their experiences playing it. It’s not even the same activity as playing a game and reviewing it, because a review is, fundamentally, for a potential consumer, but playtest data is for the designer. And the designer’s goal is not just to understand the merits and drawbacks of the game as it is, but to assess from that what the merits and drawbacks of the game could be if it were slightly different.

It’s worth noting at this point that there aren’t really widely held standards for playtest organization or feedback across the tabletop games industry. Even within the same playtesting communities, goals and expectations often differ between individuals or playtest groups. And, as a result, a lot of the playtest feedback I’ve seen over the years has ended up being unhelpful. Although in every case the playtesters were earnestly engaged or trying to be helpful, sometimes feedback missed the mark. Playtesting is all about communication, and that can break down in a couple of ways.

Pictured: Those things that used to fall over and deprive us of telephone service. You know. Bird seats.

The first way communication can break down is if the designer fails to communicate important information to the testers about the scope of feedback needed at a particular stage (I’ve certainly been guilty of this). The second way it can break down is if the playtester gets stuck in patterns of feedback that fall outside the scope and the designer doesn’t step in to correct this. Whenever I’ve seen this happen, the mismatch between the data the designer needed at that time and the data the tester delivered was almost always avoidable in retrospect.

Know the Scope

Not all playtests are the same, even if you’re testing the same game you tested a week ago. For any given playtest, the designer on the project should have specific goals, and these objectives should be articulated to the testers. As the game (or piece of content for an existing game) begins to take form, these goals should shift to deliver the information that is most relevant to the designer for that phase of development.

For most of the games and game lines I have worked on, testing fell into one of three general categories:

  • Function Testing: Is the game fun and understandable?

  • Balance Testing: Is the game competitively balanced?

  • Patch Testing: Should a game that is already out in the world be tweaked or altered in some way?

For each category, the designer and playtesters should approach the test differently.

Function Testing

Cooked and ready to be thrown at a wall.

Function testing encompasses everything from the earliest tests of a new game to the development of new expansions for an existing game, and its goal is primarily experiential rather than mechanistic. While the work of defining a wholly new game is quite different in some ways from designing a successful expansion to an existing game, the process for playtesting starts much the same way. The designer needs to know if players are interpreting the content the way they intend and, if so, if they enjoy it on a fundamental level. This is the phase for the designer to experiment, to throw the proverbial spaghetti at the wall of playtesting and see what sticks. And this is the phase for the playtesters to throw some spaghetti back at the designer’s face.

Function Testing (Designer): Start by asking yourself what you want to know in this stage of testing, and then design your feedback systems around that. Personally, I often use short questionnaires (via Google Forms) as feedback systems, and design different questionnaires for each stage of testing. Written reports allow you to aggregate more data more quickly, which lets you look for commonalities across a larger pool of tests, and having some direct observation to supplement that larger pool of data is also very useful at this stage. Regardless of how you’re receiving feedback, make sure you’re asking the right questions to get the information you want. For function testing, you generally want to know if players are understanding and enjoying the game, so ask questions that will get you this information. Instead of asking “Did you understand X mechanic?”, ask “How would you explain X mechanic to me?” or ask players to give you a summary of their game including a sequence of play. Instead of asking the testers if they enjoyed a specific part of the game, ask them what they enjoyed and see what answers arise. And finally, give playtesters a dedicated space in your reporting system to discuss topics that are not in scope. For example, if you are working on a competitive game, playtesters will always want to raise balance concerns - even if you’re just trying to figure out if a mechanic is enough fun to be worth trying to balance. Instead of fighting their inclinations, give them a dedicated space to address these concerns. You can always set aside this information until later if it really isn’t relevant yet, and not having to extract it from the feedback you’re looking for will save you a lot of time and mental effort.

Function Testing (Playtester): During function testing, focus your feedback on two things: function, and that thing you can’t spell it without: fun. Don’t get stuck on the nitty-gritty of balance at this stage. You can still give balance feedback, but segment it off to a separate part of your report, presented after your main findings about why you understood and enjoyed the content (or not). Additionally, try to get not just into whether you enjoyed or understood something, but how and why you understood or enjoyed it. If you think you might be misunderstanding something, tell the designer how you think it works - if your interpretation isn’t lining up with their expectation, they’ll know they have a problem. If something was unfun to play against, drill down to what specifically made it unfun. Did it take away choices you expected to have during the game? Did it negate choices you made during list-building or the early stages of the game? If you had made different choices, could you have prevented the unpleasant experience from happening? Remember the problems you ran into, and see if they happen again in future games.

Balance Testing

Just try stacking yet another Millennium Falcon on top of that pile…

Balance testing is what I thought playtesting meant before I had worked in the industry. This is the part of testing where a designer wants to discover if specific strategies, builds, or paradigms that haven’t yet been released are “overpowered,” and generally if the game rules are refined in their presentation. Balance testing is in many ways simpler than function testing: the game works, the question is just whether or not it can be “solved” too easily, and whether significant uncertainties occur in play. Balance testing’s importance varies significantly compared with function testing. For competitive games, the focus of balance testing is usually on creating parity among strategic options. For other games, the focus might be more on whether the game feels fair (such as in roleplaying games and cooperative games), or if players can generally come to a shared understanding of the rules.

Balance Testing (Designer): As with function testing, a designer needs to know what they want to do with the data they acquire in balance testing. Is the goal to promote certain strategies or game pieces over others? Making all pieces equal is not necessarily the target of balance testing, depending on the game. For example, if you are making a competitive game around a famous movie (say, Star Wars), many players will want to use Darth Vader simply because he is already their favorite character. If any other piece is an equally effective choice, this means Darth Vader will be viewed as being “average,” which in turn may not lead to the most iconic game experience (this is a topic for a future article unto itself). By contrast, if the game is an abstract tile-placement game, players may not have any sentimental attachment to the individual tiles, and making one tile better than average could well be seen as a major design flaw. The designer must decide which of these outcomes they intend to deliver, then build the test to get the information needed to bring about that result.

Balance Testing (Playtester): From the perspective of the playtester, it is in some ways more straightforward to give good balance feedback than function feedback. Simply report the results of the game and your impressions of why those results occurred, so that the designer can look at the data across numerous games and determine what’s overperforming, what’s underperforming, and what’s in about the right place. However, there is an additional wrinkle to consider. As I discussed in a previous article on how player choices impact the overall game, players sometimes gravitate toward options that, while not optimally effective, optimize the fun of one player. For example, when the Nantex-class starfighter underwent pre-release points testing in X-Wing 2nd Edition, tests gravitated toward the most tournament-effective build: Sun Fac or another ace with Ensnare. However, the build that began to crop up at tournaments with the most pernicious results was a build with numerous mid-tier pilots with Ensnare. This build could control the entire field, and generally made the other player’s life miserable. It also didn’t win tournaments. By the numbers, it wasn’t overpowered, but it won enough games to satisfy the players using it because it won them in a fun (for that player) way. This is where the trickiest part of balance testing comes in: uncovering the strategies and builds that sit at the cross section of “extremely unpleasant to play against” and “effective enough to be taken, but not great.” And this is why it is important to include experiential feedback even at this stage, because sometimes an ineffective strategy can still be a balance problem the designer should address.
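To make that last point concrete, here is a minimal sketch of how a designer might tally playtest reports so that "viable but miserable to face" builds stand out alongside the obviously overpowered ones. Everything in it is an assumption for illustration: the build names, the 1-to-5 "opponent fun" rating, and the thresholds are hypothetical, not anything used in actual X-Wing testing.

```python
from collections import defaultdict

def summarize(reports):
    """Aggregate (build, won, opponent_fun) reports into per-build averages."""
    totals = defaultdict(lambda: {"games": 0, "wins": 0, "fun": 0})
    for build, won, opponent_fun in reports:
        t = totals[build]
        t["games"] += 1
        t["wins"] += int(won)
        t["fun"] += opponent_fun
    return {
        build: {
            "win_rate": t["wins"] / t["games"],
            "opponent_fun": t["fun"] / t["games"],
        }
        for build, t in totals.items()
    }

def classify(stats, strong=0.6, viable=0.4, unfun=2.5):
    """Label each build. All thresholds are illustrative assumptions."""
    labels = {}
    for build, s in stats.items():
        if s["win_rate"] >= strong:
            labels[build] = "overperforming"
        elif s["win_rate"] < viable:
            labels[build] = "underperforming"
        elif s["opponent_fun"] <= unfun:
            # Wins often enough to be taken, and the opponent hates it.
            labels[build] = "viable but miserable to face"
        else:
            labels[build] = "about right"
    return labels

# Hypothetical reports: (build, did it win?, opponent's fun rating from 1 to 5)
reports = [
    ("single ace", True, 4), ("single ace", True, 3),
    ("single ace", True, 4), ("single ace", False, 4),
    ("control swarm", True, 1), ("control swarm", False, 2),
    ("control swarm", True, 2), ("control swarm", False, 1),
    ("generic list", True, 4), ("generic list", False, 4),
    ("generic list", True, 4), ("generic list", False, 4),
]

for build, label in classify(summarize(reports)).items():
    print(f"{build}: {label}")
```

The point of the sketch is only that win rate alone would file the hypothetical "control swarm" under "about right." It takes the experiential column to surface it as a problem.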

Patch Testing

Hey, they got it working again, didn’t they?

Patch testing is any test dedicated to testing the content that has already been released, to see if it should be changed via errata or some sort of other update. Due to the difficulty of changing printed content, patch testing is a far smaller part of the tabletop game space than it is in the digital space. However, the divide between these two has grown fuzzier and fuzzier in recent years. The highly successful launch of Magic: The Gathering’s Arena online version has certainly led to more “patch testing” of Magic. While cards in Standard were once very rarely changed for balance via errata (instead being banned, if necessary), the focus on Arena has led to several changes to released cards purely for balance reasons. Patch testing is essentially the ongoing form of balance testing, and it has many similarities, but also some key differences.

Patch Testing (Designer): First and foremost, patch tests should be orchestrated to make the most of the data you have from the game being played “in the wild.” Even if your game has a modest player base, it’s likely that the feedback you receive from the wider community will be much more numerically substantial than what you can draw in an internal playtest. As such, make sure that you’re leveraging what you already know, both when you set up your proposed changes and when you consider their impact. If a piece of content is highly played and popular, consider not just whether you want to bring down its effectiveness, but by how much, and then ask questions that will get you the data to find the sweet spot you’re looking for. For instance, asking if a piece you have made weaker is still effective can often lead to misleading information. If players are forced to use something, they will often find it is reasonably effective on the table, but this doesn’t mean that they will assess it as effective when deciding whether or not to use it. Instead, ask testers to compare the piece with a now-comparable piece, preferably across two games. If they consistently report the weakened piece underperforming the now-comparable piece, you may have overshot your mark. Additionally, for patch testing, asking players to dedicate some portion of tests to using certain specific builds that currently exist in the wild (especially those you aren’t changing) against altered content is often an extremely worthwhile benchmark.

Patch Testing (Playtester): For the playtester, patch testing is quite a bit like balance testing, but there’s one extra important thing to consider: what are the second-level effects of the changes? While many groups will likely report on the direct effects, you can stand out by examining these secondary effects and reporting the results. Does making one strategy weaker suddenly create a new threat from a game element or list that it was holding in check? Does making something that is underperforming cheaper change the threshold for how many of that thing a player can purchase within a legal list? Will an errata for clarity on one issue create uncertainty on a different issue by establishing an inconsistent precedent? Try to step back and look at the game holistically with the proposed changes implemented, rather than simply focusing on the things that are changing. Oftentimes these tests will end in a negative answer to the question you have set out to answer (“Is this weird new strategy actually good? Probably not.”), but remember that a negative answer is still extremely valuable feedback to the designer.

And that brings us to the end of playtesting scope! Tune in next week for five lessons on designing for playtesters and playtesting for designers!

Max Brooke

Whose Fun Is It Anyway? Part Two: Design a Healthy Pond

Last week, I discussed the challenges of “fun eutrophication” in long-term game design: How designs received as “fun” in isolation can build up to overwhelm the system and make the game as a whole less fun. But I punted on actually discussing how to address this problem, because the article was getting a bit too long-winded. Today, we’ll dig into that topic: What are some ways a designer can prevent and mitigate fun eutrophication?

To get what is hopefully obvious out of the way at the start, the answer isn’t “whenever testers/players report enjoying a piece of content, don’t implement it or remove it.” This is a course-correction error it’s easy for designers of highly competitive games to fall into, and it leads to its own problems: long-term stagnancy of design and, ultimately, decay of enthusiasm for the game. While not all content needs to be competitively oriented, some portion of it needs to be for a competitive game to thrive. A similar phenomenon to fun eutrophication, the dreaded “power creep,” occurs in many competitive games because the alternative to making progressively stronger content is making new content weaker than old content. And that will starve the game for certain, so if you must err in one direction, err on the side of content people will actually want to use. The same is true of fun in design. A pond with no nitrogen and phosphorus is one without plants, which means the other forms of life that can thrive there are much more limited. Too much of a good thing does mean that on a certain level, the “thing” is good, and perhaps even necessary.

Instead, as with preventing eutrophication while continuing to farm (for which some amount of fertilizer use is usually necessary or at least helpful), what is needed are management techniques. Fertilizing efficiently, planting buffer fields, and even artificially cycling the water to spread oxygen can mitigate or prevent eutrophication. To keep testing the elasticity of this metaphor, I’ll use these as three models for managing content in a game environment.

Using Nutrients Efficiently

A little push goes a long way.

In the early days of most games, there are things that would never have gotten printed later. Magic’s “Power Nine” are a classic example, but even much more mundane cards like Dark Ritual (get three mana for one) from the original set had enough of an unbalancing effect on the overall game to be pushed out over time. Dark Ritual is fun (for the player using it) - you get more mana, so you get to play more cards, and people love playing their cards. It’s less fun for the person sitting opposite, who gets to watch turn one Hypnotic Specter get played five games in a row. I got well acquainted with this back in high school (spoilers: I was not the person casting Dark Ritual into Hypnotic Specter, but I was the person just about ready to throw my deck at my best friend if he did it again).

In X-Wing First Edition, one place where fun eutrophication occurred was dice modification (as discussed last week), but another was action economy (cards that gave a ship extra actions). The card Push the Limit made action economy widely available to all pilots and, at the time it was printed, seemed like a really good idea. It was well-received in development for giving fragile ace ships that much-needed bit of extra maneuverability while still being useful to less maneuverable ships by granting access to double modifications. And it was fun! Doing a boost into a barrel roll with your A-wing or TIE interceptor made you feel like you were behind the stick of a zippy, maneuverable starship. In many ways, it was a great piece of design.

In other ways, Push the Limit was a huge problem. Its ubiquity made the Elite Pilot Talent slot nearly a requirement for a pilot to be considered for competitive use (outside of some niche cases). It so dramatically increased action efficiency that going without it was a very unusual choice, but what’s more, it didn’t improve the performance of all ships or pilots equally. High Pilot Skill pilots got vastly more use out of it, as did any pilot who could stack it with other forms of action efficiency. Because extra actions often translated into extra dice modification (through the Focus, Target Lock, and Evade actions), it also stacked well with other forms of dice modification. It became hard to design pilots without the Elite Pilot Talent slot because lacking access to Push the Limit so dramatically curtailed their effectiveness. Meanwhile, it became challenging to design anything for one of the most ubiquitous slots in the game because this one card was so desirable. Push the Limit was the equivalent of dumping fertilizer all over your field and letting it run into the pond: it was way too much of a good thing.

When Frank, Alex, and I sat down to discuss X-Wing 2nd Edition, we knew that Push the Limit was going to be one of the most complex cards to handle in the process of translation. On the one hand, it was beloved. On the other hand, it singlehandedly reduced the variety of viable lists more than any other card and made numerous already-pronounced issues even worse. We knew that we needed to preserve some ships being able to do multiple actions in a turn, but we also knew that letting nearly any pilot do any two actions was unbalancing in the extreme. And this is where we came up with the idea of linked actions: actions that directly lead into another specific action.

So, like a farmer using an irrigation system to deliver nutrients to the roots of plants directly, we built a system for delivering specific action combinations to specific ships, while giving a couple of ships innate chassis abilities that granted even greater freedom of actions. This meant less design “runoff,” significantly reducing the unintended consequences of players stacking multiple actions with other beneficial effects. Action economy is still an important part of 2nd Edition, but it’s distributed by the system itself rather than being delivered in an unregulated manner via a generic upgrade card.

To generalize this idea: when you find something that is both fun and warping to the ecosystem, figure out how to deliver the experience of its fun in the smallest dose that accomplishes what you want. You don’t need to maximize the quantity or value of a good thing for it to be enjoyed by players. With the benefit of hindsight, one can say that if First Edition Push the Limit had allowed a ship to perform a specific action on its action bar as a second action (such as a boost or barrel roll) or if Dark Ritual had some additional requirement that made it less likely to be useful on the first turn (like many of its successor cards), it would have been strong enough, and still been a fun tool for players that achieved its purpose without having such a dramatic effect on the overall game.

Planting Buffer Fields

The color pie helps keep effects from eroding into one big, colorless slurry. Kind of like the planeswalkers stopped the Eldrazi from turning the multiverse into one big colorless slurry. Actually I have no idea if that’s accurate or not. My Magic lore knowledge is mostly from Invasion. Is Gerrard still around?

Another means to keep fertilizer out of waterways is planting buffer fields (also called riparian fields). These are grasses and similar plants that both absorb runoff nutrients and slow the process of erosion, which prevents soil from flowing into the waterways. In game design, buffer fields are the internal limits you place on your design space and the external limits you place on the choices players are allowed to make. Both of these are means of “siloing,” which keeps certain pieces of content from interacting.

A simple example of an “internal limit” buffer field is Magic’s famous Color Pie, the thematic constraints that define certain mechanics as “belonging” to certain colors. The color pie isn’t coded into the rules of the game in any way; in theory, a designer can make a blue card that deals direct damage. However, the internal convention says that such cards should be infrequent, and less efficient or more limited than their red counterparts. This has important thematic ramifications, helping to give each color a personality and feel. But importantly, it also prevents players from doing everything with a single color. To combine certain effects, a player must accept some amount of inefficiency - either building a multicolor deck, or choosing less efficient cards. Because this paradigm exists within the design of Magic, the content of the game naturally pushes back against efforts to simply put all of the best cards into a deck. If the best creatures are green but the best direct damage is red and the best card draw is blue, one must naturally accept a tradeoff: either the deck becomes less efficient (because it becomes more likely that the cards you draw won’t match the lands you need to play them) or the player gives up some types of effects (or at least selects less-efficient cards to produce those effects).

An example of an “external limit” buffer field can be seen in Warhammer 40,000 8th Edition’s detachment system. For many years, Warhammer 40,000 had hard limits on combining armies (with a few thematic exceptions, it was impossible). But combining armies was thematic to the lore of the game and had obvious sales ramifications, so eventually the designers relented and added a thematically driven alliance chart. Unfortunately, much as tearing out a buffer field causes erosion that exacerbates eutrophication, when this system was opened up, the game was quickly plagued by power combinations created by taking the best force-multiplying passive abilities from several armies and using them together to get dramatically more value out of all of the army’s units. With 8th Edition, the option to combine armies remained (as it had proven quite fun), but much of the buffer field was replanted with a simple change: the passive aura abilities were no longer shared. Each army could only improve its own units. And since each army is already siloed to a large degree, this meant that the number of ways a unit could be improved was naturally limited. Combined force armies remained popular (indeed, too popular due to some unrelated issues), but this issue was resolved.

X-Wing 2nd Edition had a number of these buffer fields, but one of the clearest examples was created for the Tactical Relay upgrade slot introduced with the Separatist faction. When I was designing the first cards for this slot, I knew that the Tactical and Super Tactical droids needed to be an important part of this faction thematically and area buffs were a fun way to reflect their different personalities and strategies. But I also knew that we would quickly have stacking problems if multiple Tactical Relays could be used in the same list. While one answer would be to simply make Tactical Relays really expensive to make taking more than one unappealing, the practical implementation of this solution would likely have made taking one unappealing, as players reasonably shy away from extremely expensive upgrade cards when they can be destroyed along with a single ship. As such, I created the Solitary restriction for Tactical Relays, preventing multiple Tactical Relay cards being taken in a single list. From a design perspective, this solution is a bit direct (some might argue hamfisted), but it worked. Tactical Relays provide playstyle-defining effects at a competitive price, something that players enjoy, but you always have to choose which one you want.

Cycling Content

Not that kind of cycling!

While researching this article, I encountered one unconventional solution to eutrophication I hadn’t heard of before: cycling oxygen into the water source via a bubbler. Doing this keeps the water oxygenated, preventing excessive buildup of plants and allowing animals and other things that need oxygen to thrive. It’s not a perfect solution - it requires external energy to run, which isn’t ideal, and it works better in some environments than others. It also has an excellent analogue in gaming: cycling of content.

Cycling is quite simple. Every X amount of time, you kick some of the content out of the game. An MMO like World of Warcraft does this very organically by introducing a new dungeon or raid which has gear that completely outperforms whatever you acquired in the previous set of dungeons or raids. While the game doesn’t take away your old gear, you don’t need it any more and will likely stop using it. Magic: The Gathering and other card games do this by cycling out various sets from their competitive formats. X-Wing 2nd Edition does this through Hyperspace rotations, and (infrequently) by recosting troublesome upgrade cards to a price that is higher than their perceived value so that design space for similar cards is created underneath them. For instance, Precognitive Reflexes was brought in to essentially replace Supernatural Reflexes, which had proven too potent over the first year of the game. It didn’t technically replace the older card, but it was much more competitively costed for most pilots than Supernatural Reflexes, which had been increased to a price point well above where most players considered taking it.

Cycling has an obvious advantage: the designer gets a partial or total refresh, a chance to take what they’ve learned and iterate in new ways. The new content will almost always be better than its predecessor because it was made with vastly more information about the way it will be used. Further, old ideas can be iterated upon in new ways, including by creating more controlled doses of fun effects and by siloing from each other effects that were previously too efficient in concert. From the perspective of the players, it gives them the chance to re-explore familiar concepts within the context of a new environment instead of having to learn entirely new paradigms of play.

It also has an obvious drawback: it renders the content people acquired less valuable, or potentially erases its value entirely. This means that cycling works better in some games than others, and some player bases will accept it while others won’t. For instance, a game like Warhammer 40,000, where a player puts tons of time and energy into each miniature, can’t really cycle content on a timescale that matters to the current player base. While it can shuffle up the metagame with new codex releases and price updates and periodically phase out really outdated units, there are limits to how much the player base will tolerate.

Finally, cycling can also have a subtler drawback. Like the bubbler in the lake, it requires energy from outside the system to run and maintain. The designer (or someone else) has to manage the process of the cycling itself, planning out the course of the game over time in a highly regimented way, and then communicate the changes as they are implemented. And the player base has to spend energy keeping up with the changes. All of this takes time and energy over and above what proactive efforts like judicious design and siloing require. Once the player base is accustomed to a rhythm of cycling, it may take less energy to implement, but will also take more energy should the designer want to change it in the future.

Closing Thoughts

Of course, these three methods are not the only ways of dealing with the problem of player-selected fun making a less fun game. Some games heavily limit player choice about certain things; there is no army building in most versions of chess, after all. Other games, especially small, tightly designed games, can have baked-in limits on player interaction that prevent player choices from fundamentally altering the feel of most games played. In the base game of Carcassonne, the player’s choices are so simple that there isn’t much they can do that is extremely unfun for the rest of the table (except perhaps spend too long deciding where to place their piece). On the other hand, most games of base Carcassonne are very similar to every other game of Carcassonne, because the players’ aggregate choices have so little impact on the experience of the game. This can be a positive or a negative, depending on your perspective (personally, I find Carcassonne very meditative, and an excellent palate cleanser after a more cutthroat game).

For large, ongoing games, however, the challenge of fun eutrophication is a very real issue, and not easily avoided, because it rarely occurs due to a single decision. Instead, like water eutrophication, it accumulates over time, and can be quite difficult to undo once it has set in. There is no silver bullet method to avoiding it or undoing it. Proactive management methods like judicious design and content siloing generally have fewer direct drawbacks than reactive ones like cycling, but for most long-running games, a combination of all of these and other methods is required to maintain a healthy pond.

Max Brooke

Whose Fun Is It Anyway? Part One: Too Much of a “Fun” Thing

Today we’re going to talk about ecological problems as a metaphor for a challenge one will often face in designing a long-running competitive game. Let’s start with the metaphor and work out from there.

When is a Game Like a Pond?

(Credit: Nara Souza, Florida Fish and Wildlife Commission. Public domain.)

Have you ever seen a body of water that looks like this?

What’s often going on in this picture is pretty straightforward. Fertilizer runoff from fields or lawns gets into the pond (or other body of water), carrying with it lots of nitrogen and phosphorus. Much like the plants that the fertilizer was intended for, the algae in the pond grows explosively when given this excess of nutrients. At first consideration, this might seem harmless - it’s just algae, after all. But the algal blooms that form cover the surface of the pond, preventing light from reaching the underwater plants. Suddenly, the plants that were creating oxygen necessary to sustain the ecosystem of the pond begin to die off. Further, so many plants are dying and being broken down in the environment that their decay eats up even more of the oxygen. The environment becomes eutrophic (literally “well-nourished”). Too much of a good thing changes the entire environment and makes it uninhabitable to fish and other creatures that previously called it home. There are things that can be done to mitigate eutrophication, such as planting barrier strips of plants to absorb nutrients between fields and water sources, actively managing erosion with plants that hold the soil in place, and using fertilizer more judiciously by changing delivery methods (or avoiding it entirely, for lawns and other decorative plantings). This is by far the most serious issue this article will address, and is a problem that affects many bodies of water around the world.

A healthy competitive game can often be compared to a healthy ecosystem. And just like a body of water can become eutrophic if it receives too much of a good thing, a game can also become oversaturated in the mechanics that seem fun in isolation but turn the entire environment toxic after crossing a certain threshold. Without meaning to, a designer who is simply trying to give the local organisms “what they want” can easily turn the entire game into a sludge pit.

You Can Always Get What You Want (But You May Not Like It)

They don’t really look like box cars to me. Some spiders have six eyes - feels like a missed opportunity for a theme match with “snake eyes.”

As I discussed in my article on identifying the heart of a game, when you look at player behavior in aggregate, it converges toward the strategies that are effective rather than towards what players find the most fun to play. On the other hand, if multiple options are reasonably close in efficiency, players will generally pick the one they enjoy more.

But there is a second, even trickier corollary to that point: if multiple efficient options exist, most players will choose what is fun for them to play, not what makes the game as a whole the most fun activity to participate in. And this makes sense from the perspective of the player: in a game where you and your opponent both create a list beforehand without the other’s knowledge, it’s hard to conceptualize where their fun will be. Maybe you avoid some of the most egregiously unfun things that you’ve observed in the past, or that the community agrees are unfair, but fundamentally, you’re the one who’s going to have to use the list, not your opponent. It should be fun for you to play.

In games that rely on randomness, dice modification effects can be an excellent illustration of this point. The randomness of dice is one of the most widely reported “unfun” parts of miniatures games like Warhammer 40,000 and X-Wing: The Miniatures Game. While some people love the thrill of clattering dice (be they Christmas-colored octahedrons or other polyhedrals), to many others, they’re seen as a necessary evil. As such, players (especially competitive players) tend to report effects that mitigate this randomness through rerolls, modifiers, or setting a die to a specific face as “fun.” If you offer players an effect that mitigates their randomness in a vacuum, they’ll usually accept it and tell you it’s making the game more fun. From the perspective of the designer, it can look like an easy win to add tons of these effects, because players almost always love them.

And this remains true as long as they are the ones who get to mitigate the randomness.

When the opponent gets to mitigate the randomness, and especially when the opponent gets to significantly mitigate the randomness (usually by stacking lots of these effects), players often report growing frustrated by feelings of “helplessness.” And this makes sense too; if you have to roll dice but your opponent doesn’t, you’re at a huge disadvantage. Dice in these sorts of games may be a necessary evil in the eyes of many players, but when the dice are allowed to fall where they may, they are at least viewed as unbiased in their unfairness. If the opponent’s mitigation of randomness is significantly more efficient than a player’s own, that apparent impartiality of dice vanishes.

But perhaps more significantly, when both players have substantial amounts of randomness mitigation, the entire game can begin to feel like it is sliding into a series of pointless inevitabilities worthy less of an epic battle directed by George Lucas and more of a novel by Albert Camus. The whole game becomes less fun for both players despite the removal of an element that each player would have reported as impeding their personal fun. And interestingly, this loss of fun with an increase in predictability does not depend on one player having an advantage. Even if no player has a clear advantage in their list, games where both players have substantially mitigated the randomness of their army are generally reported as less fun. So this is not merely a question of balance, but of how players experience the game. This was one of the core problems of X-Wing 1st Edition in its waning days. Randomness might be unpleasant in the moment, but certainty will really squash fun at the larger scale. This is the eutrophic pond.
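For readers who like to see the arithmetic, here is a rough sketch of how even one layer of mitigation squeezes the uncertainty out of a roll. The numbers are assumptions chosen for illustration (three dice, a 50% hit chance per die, and an effect that rerolls each miss once), not the actual dice or effects of any particular game.

```python
from math import sqrt

def with_reroll(p):
    """Chance a die ends on a hit if a miss may be rerolled once."""
    return p + (1 - p) * p

def hit_spread(dice, p):
    """Mean and standard deviation of hits across independent dice."""
    mean = dice * p
    std = sqrt(dice * p * (1 - p))
    return mean, std

BASE_HIT = 0.5  # assumed hit chance per die, purely illustrative
DICE = 3        # assumed number of dice in one attack

scenarios = [
    ("no mitigation", BASE_HIT),
    ("one reroll per die", with_reroll(BASE_HIT)),
]

for label, p in scenarios:
    mean, std = hit_spread(DICE, p)
    print(f"{label}: mean {mean:.2f} hits, spread {std:.2f}, "
          f"all-hit chance {p ** DICE:.1%}")
```

Under these assumptions the average result climbs while the spread around it shrinks, and the "perfect" roll goes from a rare treat to something that happens in a large share of attacks. Stack a second or third effect on top and the roll starts to look less like a roll and more like a foregone conclusion.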

The Big Picture

The Nazca Lines (The Tree), Photographed by Diego Delso, 2015

Obviously, not all games use randomness, and even among those that do, the place that randomness occupies in the game loop has a substantial impact on how it feels in the game (as explained in this excellent article by Andrew Fischer). But in my experience, the idea that there can be “too much of a fun thing” can be applied more broadly. In X-Wing, it applies to post-maneuver movement effects, as well. At the start of 2nd Edition, these seemed even safer than dice modification - after all, people are moving their ship (the game’s core fantasy), and it relates to positionality and bluffing, two of the game’s pillars. Yet even so, when a ship hit a certain point of saturation of post-movement options - say, Kylo Ren with Supernatural Reflexes - the mechanic that an individual player generally finds fun to use started to become something that sours the overall experience. The algae starts building up.

And interestingly, “too much of a fun thing” can even be relevant when a mechanic is only present in a few pieces of content. Companions in Magic: The Gathering are exactly the sort of mechanic players probably reported loving during testing. And the appeal is really obvious: people love Commander (in large part because of the reliability with which they get to play and use their Commander, as I mentioned last week). What if you could have a Commander in standard games, with a drawback that sets certain restrictions on your deck? But if you and your opponent can both always play your favorite creature on the same turn, games become very repetitive. Because the probability of the event occurring (you getting this creature) is 100%, even a few cards with this mechanic can cause it to become oversaturated in the experience of the players. So it’s no surprise that Companions were quickly changed via errata to have additional costs, making them more subject to the natural rhythm of the game like other cards.
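To put a rough number on that contrast, the sketch below compares a guaranteed card with an ordinary one. It assumes a 60-card deck, four copies of the card, and a seven-card opening hand with no mulligans. Those are my illustrative assumptions, and the function name is made up for this example.

```python
from math import comb

def chance_in_opening_hand(deck_size=60, copies=4, hand_size=7):
    """Probability of at least one copy of a card in the opening hand.

    Uses the hypergeometric complement: one minus the chance that
    every card drawn comes from the non-matching part of the deck.
    """
    misses = comb(deck_size - copies, hand_size)
    all_hands = comb(deck_size, hand_size)
    return 1 - misses / all_hands

ordinary = chance_in_opening_hand()
guaranteed = 1.0  # a card that is always available, per the paragraph above

print(f"ordinary four-of in opening hand: {ordinary:.1%}")
print(f"always-available card: {guaranteed:.0%}")
```

With these assumptions an ordinary four-of shows up in the opening hand in roughly two games out of five, so a card that is available every single game is present far more often than even the most heavily played normal card.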

As an interesting sidenote, why doesn’t Commander suffer from this problem? Well, it can, but in my observation, as long as you’re not playing the same decks against the same people over and over again, it won’t. So much of winning and losing Commander is tied up in the specific politics of the table situation. Barring lockout combos that your friends will probably start refusing to play with, the same situation rarely recurs. Commander tends to get stale when you play only with the same people for too long. So ultimately, it’s about the saturation of the mechanic within a player’s experience, not the saturation of the mechanic as a percentage of the content.

When a designer approaches new content, they need to be aware that “fun” experienced and reported at the micro level can create a toxic environment if allowed to build up too much. In this way, actions that seem sensible when considered alone can lead to adverse outcomes when viewed as part of the wider picture. It’s part of the game designer’s job to see the big picture - to recognize that nitrogen and phosphorus are already in the pond, and indeed, are necessary for life - but in excess quantities, they’ll create an environment uninhabitable to fish.

So, what’s a designer to do? Well, much as with eutrophication, there are mitigating steps you can take. But you’ll have to tune in again to see how far I can stretch that already fraying metaphor. Come back next week for Part Two: Designing a Healthy Pond.

Max Brooke

Rediscovering the Magic: Fandom, Creation, and Games

About a year ago, I fell down the Skyrim rabbit hole again. How does this relate to X-Wing and other tabletop games, you ask? Give me a second, I’m getting there. But first I have to talk about how my Skyrim character kept freezing to death.

Special Modifications

Fun fact: This is also the weather forecast for Minnesota next week…

It happened like this: I hadn’t thought about the game with no end in years, and suddenly the YouTube algorithm or some other news source pointed me to a video on how the game had gotten official downloadable content (DLC) based on earlier fan-made modifications (mods) that supported survival mechanics. Instant traveling is disabled, your character needs to eat and sleep - the sorts of things you’d expect in any post-Minecraft survival game. I was curious, so I reinstalled Skyrim, played for a few hours, and was hooked. I’d “finished” the game twice before (which is to say: sunk in around a hundred hours before getting bored and stopping - whether Alduin the World Eater was actually stopped or not in his quest to eat the world, I can’t recall).

But upon starting this new playthrough, I was immediately struck by how differently I was playing the game this time around. The need to keep my character supplied, rested, fed, and warm while traveling everywhere on foot created a set of challenges that drew me into the world of Tamriel in a way I had never really experienced before. I was paying attention to things I had never bothered to acknowledge before: the food scattered meticulously throughout the world. The weather, and sheltered locations between cities to wait out a cold night. The location of beds; there are so many beds in that game, but really I’d never cared before. Needing to use these resources made me care that the developers had taken the time to put those details into the world, and my overall experience with the game was made much richer for it. This in turn led me deeper, into fan-designed mods that enhanced the survival experience and helped other existing content in Skyrim better interact with this new playstyle.

X-Wing has also had a vibrant rules modding community for much of its lifespan (it also has a flourishing hobby scene that is worthy of discussion, but is also slightly tangential to my point today). From people running custom scenarios at cons, to the designers of complex AI systems like Heroes of the Aturi Cluster and Fly Casual, to folks creating rules and minis for even the most obscure of EU vessels, a lot of creative energy comes out of the X-Wing community. I’ve run some of those custom scenarios at cons myself, and I even got to dip my own toes into the more “official DLC” side of things when designing the rules and scenarios of Epic Battles, the expansion for multiplayer game modes in X-Wing 2nd Edition. So I’ve spent quite a bit of time observing which game modes and fan rules seem to have a big impact on the community, and which ones drop off the radar.

Everything Old Made New Again

The Cloisters - Playing Cards ca. 1475–80, because the more things change, the more they become the same again…

Obviously, quality of execution counts for a lot. Yes, “homebrew” can get a bit of a bad rap for its notoriously inconsistent quality in tabletop spaces (especially if you’re playing D&D 3.5 in 2005 and your GM has a folder full of rules that only exist when it’s inconvenient for you, personally). But for X-Wing, I can confidently say that there’s a lot of content out there that is very high-quality, and some things still stick more than others. A well-executed custom ship or unique narrative scenario will be played and enjoyed, but I can’t think of any that have caught on widely. There tends to be a certain consumability to that sort of content - most people play it once or twice, but don’t make a habit of using it. As a result, they don’t proliferate it by sharing it with lots of other players over numerous games - which means fewer people are exposed to it, reducing its impact.

In an interesting parallel, we even observed this phenomenon for the official narrative scenarios included in many 1st Edition X-Wing products; while many people liked the idea of them in theory, we found that few people played them more than once, and only a small number of people played them at all. And as a result, their impact on the community was minimal even though they were official, well-made, and widely distributed. And this was because they offered a very narrow experience - a specific encounter, set up and played with specific pieces. That meant that people didn’t want to replay them, generally, and didn’t encourage their friends to try them. When I set about designing the scenarios for Epic Battles, I aimed to make them less like these highly specific narrative scenarios and more like “encounter archetypes” that players could approach with lots of different builds and strategies, with the goal of replayability and longevity in mind, and from what I have seen, this has paid off at least to some degree.

So I don’t think execution is the whole picture for the enduring appeal of a tabletop game mod, official or fan-made. Generally, the tabletop game offshoot projects that seem to enjoy the longest attention are the ones that break players out of their preconceptions about how to play the game, and as a result, help them find new fun in parts of the game that were already there. Like the survival mechanics that made me look down at the details of Tamriel I had ignored in previous playthroughs, the X-Wing mods that have been the most impactful are the ones that serve to make a large portion of the game’s content fresh again.

Heroes of the Aturi Cluster, for example, pits the player against an AI opponent, alone or with human allies, across a campaign’s worth of games. Further, losses are persistent across multiple encounters. While in a standard competitive game of X-Wing, sacrificing a pilot might be expected, losing your pilot, who has advanced across numerous games and has a name and perhaps even a backstory, is a much more concerning prospect. By bringing a player’s loss aversion into the equation, defensive strategies and upgrades suddenly become more appealing than they would be in the standard game. Dials that can break away from combat more easily might also look more useful than before. And most importantly, the player’s thinking begins to align with the fictional pilot they’re controlling, who presumably wants to stay alive to fight another day. Even for someone who has played countless games of standard X-Wing, the emotional engagement here is weightier. The core of the game from the player’s perspective is much the same (plan maneuver, see if your plan pays off when your enemy moves, attack and defend), as is the content of the game (the various pilots and their abilities, along with upgrades), but the meaning of that experience has changed. You’re not the chess player sacrificing the pawn any more; now you’re the pawn, trying to survive, and many choices look very different when analyzed through that lens.

Perspective Shifts

They’re a lot more intimidating from this angle.

The weight of this shift in perspective ties back into my Skyrim experience. Playing a game a lot can cause you to home in on one specific part of the experience, even to the point that you start to miss the forest for the trees. For me, when I played Skyrim the first time, it was combat and progression. I had fun solving the combat puzzle, seeing new enemies and locations, and optimizing my equipment, but eventually those things became the only parts of the game I saw. When I revisited Skyrim in a way that incentivized (read: forced) me to interact with many of the details I had started glossing over a few hours into my last playthroughs, I found a totally new source of fun in a pool of content that was, by volume, 99% the same as before.

Shifting your audience’s perspective to remind them of the parts of the game they had begun to overlook can rekindle their interest in the entire experience, and this works for tabletop games, too. It was amazing watching X-Wing players light up about Aces High at Worlds 2019, and all the new builds they would never have considered previously but now wanted to try. And it should not come as a surprise to anyone that Aces High was inspired by a format run at conventions.

Maybe the most famous example is Magic: The Gathering’s Commander format (Elder Dragon Highlander to us fogeys), which transforms Magic into a diplomatic strategy game. But perhaps more importantly to its success, I think, Commander puts cool creatures front-and-center and reworks the whole game to let them shine. And cool creatures are a huge part of what inspires people to play Magic, but they are also frequently rendered less than awesome by a plethora of removal, tempo effects, and other realities that keep them from being dominant in the competitive metagame most of the time. Focusing too much on how you play a game and losing sight of why you play it isn’t just a pitfall for designers; it also affects players. Sometimes the best way to rediscover what you love about a game is by changing the how completely and letting the why fall back into place.

So, where does someone go with this? I’m sure about one thing: I don’t think the widespread appeal of a fan project is a marker of how worthwhile it is. If you had fun making it, it was worthwhile. Fandom spaces are big enough to accommodate everything from tournament-friendly alt-art cards to full custom factions to Mario Kart-inspired racing game modes, and all of them make the game richer and more interesting. Honestly, I love seeing that fans are dedicated enough to make rules and minis for the Yorik-et Coralskipper! And for some people, realizing that they have that freedom to create is itself the shift in perspective they need to find joy in the other 95% of the game’s content again. The barrier to entry for homebrewing content for tabletop games is very low, and that’s a good thing, whether you use it to do something as approachable as adding your own Free Parking rule to Monopoly or something as comprehensive as a custom Magic game mode.

Max Brooke

Swimming with Sharks: Reflections on Competitive Players

As a player of one-on-one games, I have never been particularly successful in the competitive sphere. In high school, I had vague aspirations of playing Magic competitively enough to go to the occasional Standard event, but even that crashed upon the rocks of needing to maintain academics. I played Warhammer 40,000 through high school and college, and lost most games I played. I had a solid run of success in a Dust Warfare league circa 2012, but that was mostly because I stumbled onto a rather overpowered jet pack Allies list and gatling gunned my way to victory. I’m a fair hand at X-Wing and Armada these days through sheer exposure and the amount of time I’ve spent studying their competitive play, but I wouldn’t expect to win any given event I attended (especially if any of my former coworkers show up).

(As a sidenote, I am pretty good at the Commander format of Magic and other games of a political bent, but those are more about reading the room than head-to-head skills.)

So I have established that I am an outsider to the world of competitive success. However, having attended many gaming conventions, judged numerous top-tier events, and interacted with lots of competitive players formally and informally over the last decade, I have now spent a lot of time with very successful competitive miniatures games players. And I think my being an outsider to these waters gives me some unique perspectives on their denizens.

There are certain stereotypes of successful competitive players in most game circles, whether it’s the would-be poker player hiding beneath a hat, hoodie, and sunglasses, the power-gamer with the net-list exactly like that of half the other players at the tournament, or the player who is sure they have a rule for this somewhere in their codex, but can’t actually find the page in question. While you will meet such people, I have found that the majority of players I’ve encountered at competitive events don’t fit the stereotypes. And perhaps more interestingly, the people who repeatedly end up at the top of minis tournament brackets tend to display a very different set of traits.

Throughout this article, I will be using pictures of sharks to break up the text. These pictures are mostly there because of the title, and also because I like sharks.

Trait One: Flexibility of Perspective

The sawfish looked at teeth and said “Grabbing and tearing? How about I create a slashing meta instead!” (Also technically it is a ray, but they’re closely related so I’m leaving it.)

The players you see time and time again at top table of minis events tend to be folks who work to buck the expectations of their opponents and are hard to catch off-guard themselves. Successfully reading the metagame is difficult, but sticking to the same strategies for too long makes a player predictable. Unless there is a specific game element or list that is just dramatically too efficient, it’s hard for a player to ride a single list to victory time and time again. And in the same way, it’s usually hard to win with the most popular list at the event - after all, everyone is prepared for it. Even in environments when a single element or list is overperforming, the players who actually win usually have some minor tweak to their list not seen in the more widely-used version that gives them an advantage in the mirror match. There are people out there who make single-list mastery work for them, but in my observation, this is the harder path to success. This is also one of the places I struggle personally; as a player, I really like list mastery, the feeling of progress that comes with improvement using a single set of tools, and even the aesthetics of using the same game pieces consistently. But flexibility will often take you further in a competitive environment.

Interestingly, this flexibility of perspective often makes these top-table players more open to new mechanics than the general population, too. Their willingness to try new strategies and adapt how they play the game can also mean that they are not as fixed in seeing what the game “must be.” This is something to keep in mind when working with them as playtesters or soliciting the opinions of highly successful competitors, and why it is important to make sure to populate playtest pools with players of all skill levels. What a highly successful competitive player finds to be a fun challenge can often frustrate a more casual or novice player.

Trait Two: Motivated by Community

Not to be confused with whales, which have a slightly different - but also very significant - impact on games.

A lot of the most successful players are part of a tight-knit circle. This isn’t surprising - for a tabletop game, you need people to play with regularly to build your skills. But being part of a community group that either meets locally or discusses the game online has other advantages. For starters, it is easier to maintain a flexible perspective if you are constantly being exposed to new ideas. Additionally, not all great list-builders are great players - in many competitive board and card games, pairs of a “mechanic” who maintains the “car” and a “driver” who actually “races” are quite common. You might even have whole teams of people putting their heads together to make the best list for a single person to take to victory.

However, I think there is more to the community connection’s role in success than just getting new ideas or getting help from your friends to perfect your list. In-person tournaments are long, draining affairs, and while they are highly rewarding, they are exhausting. Rarely have I ever been more tired than after a day of attending GenCon events, and I usually wasn’t even the one playing! A community cheering you on can dramatically raise a player’s morale in critical moments, preventing a player from becoming distracted or frustrated by a small setback. Being friends with your opponent can have a similar effect, defusing some stress of elevated stakes. I rarely see people discuss the emotional component of tournaments, but I believe it to be at least as important to success as the intellectual. Emotional endurance is far easier to maintain with good friends at your side and across the table.

So it shouldn’t come as a surprise that a lot of top-table players I have met are very socially engaged, very outgoing, and even very friendly to their opponents. When running events, I’ve answered plenty of judge calls at top table, of course, but I’ve also seen a lot more of these issues resolved amicably without my interference than I would have expected, and I believe this was in part because the players already knew and trusted each other, either directly or by reputation. One would expect that the higher the stakes (such as they are when), the higher the stress would be, but this generally isn’t what I’ve observed. Even in the final game of major events, I’ve seen players wave off strategically inconsequential but procedurally damaging mistakes by their opponent so that the two can get back to playing the game.

Trait Three: A Desire to Play the Game

Like the people I describe in this section, this picture is really just here for fun.

Just as not everyone plays the game the same way (as I’ve discussed in a past article), players also attend tournaments for a variety of reasons, and playing the game isn’t always the top of the list. Some people go because it’s a group social event - a chance to see their friends from around the world. Some people go because they want to win prizes, or even just receive door rewards. Some people want to be part of an experience bigger than themselves, and see great games played. Back when I regularly attended Magic: The Gathering pre-releases, playing the game was something I was willing to do, but I was really excited to see new cards I could add to my decks, interact with artists who attended the event, and spend a day out with my friends. I usually played out my games (because I’d paid to be there), but lots of folks I observed would drop once they had optimized their rewards from the event, and I didn’t have any trouble seeing why.

Any competitive game is likely to have a community conversation about “playing the tournament” versus “playing the game” - such as whether byes provide an unfair advantage (or disadvantage), and whether players in the round prior to the cut should be required to play a game that cannot change their own outcome in the event (but might affect players at a lower ranking if either player suffers a catastrophic defeat). Card gamers tend to take a very pragmatic attitude toward this. By contrast, many minis gamers have more fixed attitudes that each game should be approached as its own undertaking, to be given full attention. However, in my observation, the players who win events do really tend to be there for a love of the game, not just a love of the accolades winning a tournament brings. And I think there’s a reason for this. A player who approaches every game as a chance to enjoy the experience of competition will, in the long run, learn more than a player who is just looking to get through the day. This perhaps goes back to the openness of perspective, and a willingness to see opportunities where others might see only challenges. Further, playing for the love of it can be contagious, and help to strengthen bonds with other players.

Obviously, these observations and experiences aren’t going to be universal - different games are, by their nature, different, and while I’ve met a great many top-tier competitive players across different games, I haven’t spent much time with actual professionals who make their living at it. Professionalized games have a different atmosphere because they have different incentives; like any job, people will figure out how to maximize the benefit they receive from it while minimizing the effort required. Games without list-building also have different considerations. And then there’s the fascinating phenomenon of the competitive mindset as it applies to roleplaying games. But that is truly a topic for its own article.

Max Brooke

The Heart of the Matter: Understanding What Makes the Game Tick

Whenever I approach a new game, whether as a player, as a designer, or simply as an outside observer, I split my inquiry into two questions:

-How do people play the game?

-Why do people play the game?

To clarify what I mean by each of these:

How people play the game is the knowledge of how players choose to interact with the content of the game. Which cards do they add to their deck? Which units do they choose to put into their army? What moves or tactics do they put into play at the table? What strategies are dominant in the metagame - and what strategies are perceived to be dominant, whether or not the data supports that conclusion?

Why people play the game is the knowledge of what parts of the game experience bring and keep the player to the table. What do they get excited about having the chance to do in the game? Do they enjoy putting some skill into practice, such as remembering trivia, or judging probabilities, distances, or angles? Do they love bluffing, mind games, and feints? Do they get a visceral thrill out of rolling dice? Do they enjoy seeing a narrative unfold at the table?

Both of these questions are important to understanding a game from a developmental perspective, especially if it’s your job to expand upon that game. And if you’re not careful, it can be easy to conflate the two. In this article, I’ll go over why it’s important to understand the how and the why individually, a few ways this can be used to identify the heart of the game, and what you can do with that knowledge as a designer.

Part One

Scrabbling for Answers

To discuss a concrete example of the divide between why and how games are played, let’s talk about good old Scrabble. First, a little context on how Scrabble is played.

Scrabble, as it is understood by most people, is a game of playing tiles to create words that score points. Play the highest-value words, win the game. However, if you’ve ever had the experience of playing Scrabble with someone who has dipped their toe into the waters of the competitive scene, you see quite a different game emerge. Most turns, instead of going for a large play, a skilled player will play small words, or even pass to churn tiles through their hand. The result is conservative, defensive play that forces the other players to engage with a very different sort of creativity than “what words can you think of with these letters?”

That’s because any large play has to be extremely worthwhile from a points perspective, as large plays “open” the board, creating more opportunities for opponents to score their own points. And the way the board is set up, being the first player to “open” the board is often disadvantageous, as another player can follow up to hit an even higher-value bonus tile. In casual play, this serves as a catch-up mechanic in Scrabble - large plays are worth a lot of points, but they give your opponents the chance to make their own large plays. But when you turn to competitive games of Scrabble, it becomes apparent that it is more reliable to play defensively, denying your opponents the opportunity to score points, than it is to go for big plays that create more big plays. “Closing” the board is a key tactic to winning, and if by doing so you can force your opponent to be the one to “open” the board, all the better.

Thus, a competitive player will seek to use small words to fill in blocks, which serves the dual purpose of scoring in multiple directions and preventing any other player from having their choice of spaces on which to hang a new word. The ideal board state is the one where your opponents can’t play anything without giving you access to the high-value resources on the board. And that is when a competitive player will go for a large play, generally with a word that exhausts their entire hand and hits a double or triple word score, or even scores in multiple directions. The core activity of the game, competitively, becomes less “what words can I play?” and more “how do I limit my opponent’s options to control the spaces on the board with the greatest value?”

None of this is to say that one means of play is better than the other - they just reflect two different expectations about the conventions of gameplay. In extreme cases, one means of play (usually the competitive one) might come to be seen as antithetical to fun overall, but I don’t believe this to be the case with competitive Scrabble, which is well-liked by its adherents. Of course, if your expectation is casual Scrabble and you wander into a competitively driven environment, you might not have a good time.

So, we have two different answers for the how.

  • Casual players (who comprise most of the player base) try to play the biggest word they can every turn.

  • Competitive players try to control board position to limit their opponents’ options or set up game-winning plays.

Part Two

Word Play

Now let’s dig into the why.

When you observe how the game is played by the best players of the game, you see a game of tight map control and strategic awareness, not unlike go or chess. An experienced player will have memorized a list of key words for specific situations from the player’s dictionary, so word knowledge and retrieval is a much smaller part of the game.

When you observe why the game is played across the spectrum (from casual to competitive), though, you’re likely to see something quite different. Most people play Scrabble to show off a knowledge of large or obscure words, to demonstrate creativity on the board by recontextualizing current words or letters for even bigger plays, and as a family bonding activity that cleverly disguises education in an enjoyable game.

And, in my observation of competitive games, even most of the people who play Scrabble competitively probably sought out that game in particular because they liked the experience of stretching their vocabulary. After all, if they wanted a game of pure spatial awareness and strategy, there is no shortage of options for great games that don’t include memorizing long lists of how best to dump a “Z.” The way they play the game might not reflect that original motive at first glance, but it was still core to the original appeal - the heart of the experience. There are probably some players who jumped directly into the competitive mindset of an area control game, but most people probably didn’t end up playing Scrabble that way. They first showed up for the wordplay. And the reason they first showed up remains important to them, even if it’s not reflective of how they’re playing now.

Part Three

Score on Several Axes

So let’s imagine you were working on an expansion to Scrabble. Imagine further that the expansion was intended to appeal to both the competitive and general audiences (and let’s handwave whether or not this would in reality be the most commercially viable product to make).

In setting out to design an expansion to Scrabble, even one intended for competitive players, a designer who looked only at how Scrabble is played competitively would miss a key part of the picture. New mechanics that interacted with spacing and scoring mechanics would likely be the obvious place to expand in that case - after all, that is what competitive players do with the game. But is that what players, even competitive players, want to be doing? In my observation of games, mechanics that pushed Scrabble further toward an area control game would not be received well by a majority of the casual or competitive base. By contrast, even competitive players would generally respond positively to mechanics that made words or spelling important, provided those mechanics did not invalidate the emergent behaviors like “opening” and “closing” the board that they have accepted into their repertoire or make their existing pool of knowledge useless. Even mechanics that seem pretty wild on paper might well land with the competitive base while seemingly “safer” mechanics that interacted more with the demonstrated emergent behaviors might be rejected because they diminish the focus on the core of the game.

The importance of emphasizing the why in design is visible in the Scrabble successors that have shown staying power. Upwords allows stacking of letters, freeing up the board to a much greater degree than Scrabble, while the popular “Scribble” fan-game (later sold under the title Bananagrams) removes the shared board entirely, having each player play solely in their own sphere without interference from others. These are, at first glance, radical changes to the gameplay (indeed, these are enough to justify these being separate games). But the core appeal of constructing words creatively on a grid, building structures with future moves in mind, and flexing one’s vocabulary while doing so is preserved, so these games generally appeal to large chunks of the Scrabble crowd.

Part Four

Into the Seat

To jump to a concrete example of this phenomenon: during my years working on X-Wing, I experimented with a lot of different mechanics that built upon existing parts of the game. One thing that I quickly learned was that players strongly disliked any new mechanic that relied on randomness. And I found this interesting, because in addition to being a convention of miniatures games at large, randomness is already enshrined as an important part of X-Wing’s core experience, through the dice and the damage deck. Probability calculation and risk assessment are key skills for any top-tier player. Yet any new mechanic that added randomness in places it previously didn’t exist got a very chilly reaction. Even if the effect had a high enough expected value in its outcome that players chose to use it, few would report that they enjoyed using it. When unsuccessful, it felt like a waste; when successful, it felt “unearned.” By contrast, some of my strangest ideas from the first waves I worked on, like blowing up asteroids (Seismic Torpedo) or dropping new debris clouds (Rigged Cargo Chute), were quite well-received despite seeming at first blush to be unprecedented. These ideas seemed risky on paper - like stacking letters in Scrabble, they did things outside the usual conventions players thought of when they considered how they play the game. But these concepts pushed people back toward thinking about the fact that they were flying their ships, rather than simply calculating the odds of dice outcomes. And it turns out that one of these activities is tied directly to the heart of the game, its core fantasy, and the other is something that players had accepted they would need in order to live out that core fantasy in a competitive environment.

As with Scrabble, playing the game competitively requires certain skills (here, probability assessment and gambling on outcomes). But neither the calculation of probabilities nor the thrill of rolling dice is an activity that engages the fantasy of a majority of players, even among competitive players who do these things often. The inclusion of calculated risks is also important for the game itself, as predictability breeds stagnation - the game needs uncertainty to be fun. And rolling dice and seeing the roll pay off (or not) can be fun at times by providing a wild swing that takes the game’s narrative in an unexpected direction. But the mechanical importance of uncertainty does not actually indicate an experiential centrality of randomness to X-Wing; the need for uncertainty could be filled in other ways without damaging the core of the game. Nor does the amount of player energy spent on understanding and reducing randomness actually indicate a widespread enjoyment of these activities in this particular game. Most players aren’t interested in seeing these mechanics dramatically expanded beyond the level at which they are necessary, and prefer mechanics that reward them for engaging with the heart of the game. Effects that change the setup of the board directly affect a player’s decision-making in an interesting way. And that’s the heart of X-Wing: the fantasy of flying a starship through dangerous debris and navigational hazards while facing down foes. When a new game element helps a player feel more like they’re a pilot in a cockpit making daring maneuvers and less like they’re standing at a table calculating odds on dice, most players like it.

Part Five

An easy trap to fall into, as a designer approaching a game, is to conflate the means by which a game is played with the reason it is played. This risk is especially significant when approaching a long-lived game that has grown alongside its player base. The player base, especially the competitive player base, will spend most of its energy discussing elements of the game that may or may not actually factor into why they are playing the game in the first place.

Of course, this isn’t to diminish the importance of knowing how a game is played at various levels. If anything, understanding why a game is played makes the knowledge of how it is played more useful and important - equipped with both, a designer can create content that both scaffolds the core experience and is actually used by the competitive portion of the base. Knowing why the game is played without leveraging how it is played in your design won’t actually get the players to engage with the parts of the game they enjoy; it will simply lead to a glut of content nobody really uses. That’s probably a topic worthy of its own future article. And while there is certainly a place for polarizing elements like “bad cards” (or other niche or suboptimal content), a significant portion of content still needs to land with a majority of people for a game to thrive.

So the lesson can be put simply: don’t just look at what players are doing; look at what they want to be doing. And beyond observing, ask them what they want to be doing. Play the game yourself and ask “What did I want to be doing?” Use this information to identify the heart of the game and the core fantasy of the majority of players, then design to enable that experience. This experience will vary game-to-game, and may not be what the original designer envisioned, especially when dealing with a game with a long lifespan.

As a final note, this lesson is a bit less neat to study when designing a wholly new game rather than expanding upon an existing framework, but is no less important. The how will be extremely slippery - you’ll constantly be changing the how through the iterative decisions you make. This makes a clear vision of the why all the more important throughout the entire design process.
