Channel: Sample Size – MeasuringU

What You Get with Specific Sample Sizes in UX Problem Discovery Studies

What sample size should you use for a problem discovery (formative) usability study?

In practice, the answer is based on both statistics AND logistics.

A statistical formula will tell you an optimal number to select. But the real-world logistical constraints of budgets, recruiting challenges, and time will often dictate the maximum number of participants you can test with.

In our earlier article, we described the sample size formula for problem discovery studies and how two parameters (likelihood of a problem and problem occurrence) impact the sample size.

In our experience, these logistical constraints lead research teams to set aside a specific number of days and a specific budget to run studies. The problem discovery formula may suggest testing 18 participants, but if you have only two days to collect data, your maximum sample size may be only ten. What do you get with ten? And then what if two participants don’t show up, and you have to toss out the data from another because of prototype issues, leaving you with 7?

A practical approach to handling sample size discussions is to flip the common question of “What sample size do I need?” to “What will I be able to detect given a specific sample size?”

In this article, we present a table (a kind of size chart for sample sizes for discovery studies) and walk through how to use it and the associated graphs to see what you can expect to get with different sample sizes for problem discovery studies.

Sample Size Table for Problem Discovery Studies

To help UX researchers plan for a variety of discovery percentages and problem probabilities in formative usability studies, we created Table 1. There is a row in the table for each sample size from 1 to 25 and columns for different possible problem probabilities from 1% to 75%. The values in the table cells are discovery rates (the likelihood of observing the problem at least once) for each combination of sample size (n) and problem probability (p) computed using the formula 1 – (1 – p)^n. For easier lookups, all values are shown as percentages. Because 100% discovery is, strictly speaking, not possible, the discovery rates of 100% in the table mean that the expected percentage of discovery is at least 99.5% (and for high problem probabilities and large sample sizes, more like 99.99999%).
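As a quick check on the formula, here is a minimal Python sketch. The function name discovery_rate is our own illustrative choice, and the sketch assumes each participant independently encounters the problem with probability p.

```python
def discovery_rate(p: float, n: int) -> float:
    """Chance of observing a problem at least once among n participants.

    Assumes each participant independently hits the problem with probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # 1 - (1 - p)^n: one minus the chance that all n participants miss it.
    return 1.0 - (1.0 - p) ** n


# Example: a problem affecting 15% of users, tested with 10 participants.
print(f"{discovery_rate(0.15, 10):.0%}")  # prints 80%
```

The printed value matches the cell in the n = 10 row under the 15% column.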

Probability of Problem in Studied Population
 n    1%    5%   10%   15%   25%   30%   50%   75%
 1    1%    5%   10%   15%   25%   30%   50%   75%
 2    2%   10%   19%   28%   44%   51%   75%   94%
 3    3%   14%   27%   39%   58%   66%   88%   98%
 4    4%   19%   34%   48%   68%   76%   94%  100%
 5    5%   23%   41%   56%   76%   83%   97%  100%
 6    6%   26%   47%   62%   82%   88%   98%  100%
 7    7%   30%   52%   68%   87%   92%   99%  100%
 8    8%   34%   57%   73%   90%   94%  100%  100%
 9    9%   37%   61%   77%   92%   96%  100%  100%
10   10%   40%   65%   80%   94%   97%  100%  100%
11   10%   43%   69%   83%   96%   98%  100%  100%
12   11%   46%   72%   86%   97%   99%  100%  100%
13   12%   49%   75%   88%   98%   99%  100%  100%
14   13%   51%   77%   90%   98%   99%  100%  100%
15   14%   54%   79%   91%   99%  100%  100%  100%
16   15%   56%   81%   93%   99%  100%  100%  100%
17   16%   58%   83%   94%   99%  100%  100%  100%
18   17%   60%   85%   95%   99%  100%  100%  100%
19   17%   62%   86%   95%  100%  100%  100%  100%
20   18%   64%   88%   96%  100%  100%  100%  100%
21   19%   66%   89%   97%  100%  100%  100%  100%
22   20%   68%   90%   97%  100%  100%  100%  100%
23   21%   69%   91%   98%  100%  100%  100%  100%
24   21%   71%   92%   98%  100%  100%  100%  100%
25   22%   72%   93%   98%  100%  100%  100%  100%

Table 1: Problem discovery rates for sample sizes from 1 to 25 with probabilities of problem occurrence in the studied population from 1% to 75%.
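If you want to extend the table to other sample sizes or problem probabilities, the following Python sketch regenerates it from the same formula. The names discovery_table, SAMPLE_SIZES, and PROBLEM_PROBABILITIES are hypothetical and chosen only for illustration. It is one possible way to produce the values, not necessarily how the published table was built.

```python
SAMPLE_SIZES = range(1, 26)
PROBLEM_PROBABILITIES = [0.01, 0.05, 0.10, 0.15, 0.25, 0.30, 0.50, 0.75]


def discovery_table(sizes, probabilities):
    """Map each sample size n to a list of discovery rates (whole percentages)."""
    return {
        n: [round(100 * (1 - (1 - p) ** n)) for p in probabilities]
        for n in sizes
    }


table = discovery_table(SAMPLE_SIZES, PROBLEM_PROBABILITIES)

# Print a header row of problem probabilities, then one row per sample size.
print("n".rjust(2) + "".join(f"{p:.0%}".rjust(6) for p in PROBLEM_PROBABILITIES))
for n, row in table.items():
    print(str(n).rjust(2) + "".join(f"{value}%".rjust(6) for value in row))
```

Changing SAMPLE_SIZES to, say, range(1, 101) shows how slowly the 1% column climbs.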

How to Use the Table

There are two ways to use Table 1:

  1. Start with a sample size to see the chance of uncovering problems.
  2. Start with a problem occurrence and see the sample sizes needed to uncover them.

Option 1: Start With a Given Sample Size

Returning to the question posed at the beginning of the article, what do you expect to get with a sample size of ten or a sample size of seven?

We start in the “n” column and go down till we find 10. That’s the given sample size. We then go across the columns, inspecting the expected discovery rates from the very low problem probability of 1% to the very high probability of 75%. The first discovery rate in the “10” row is 10%. This is low and means we’ll have only a 10% chance of seeing problems that affect 1% of the population at a sample size of 10 (in other words, with ten participants we can expect to discover about 10% of problems that have a 1% probability of occurrence). It’s not a 0% chance (yes, we’re saying there’s a chance), but those are low numbers to count on. Moving to the right, we get to 97% discovery for problems that affect 30% or more of the user population. That means at a sample size of ten, we’ll have a great chance (almost 100%) that we’ll see the relatively common problems (something that affects about a third or more of all users).

Repeating this now with seven participants shows that our chance of detecting problems (if they exist) that affect 30% of the population drops to 92%. That is still a good chance of seeing these and any more likely problems. Where we had a good chance of seeing problems that affect 15% of the population with ten participants (80% likelihood of discovery), with only seven participants, the discovery likelihood drops to 68%.

Another interesting pattern the table shows is that for rare problems affecting only 1% of the population, the sample size and chance of detecting these 1%ers track closely. With one participant, you'll have a 1% chance of detecting them; with ten participants, about a 10% chance; and with 25 participants, about a 22% chance. Not shown in the table, at 100 participants you'd expect to discover about 63% of them. It's hard to uncover uncommon problems with small sample sizes.
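The Option 1 lookups above can also be reproduced directly from the formula. The sketch below is illustrative only. The helper names discovery_rate and what_you_get are our own, and the independence assumption behind 1 – (1 – p)^n still applies.

```python
def discovery_rate(p, n):
    """Chance of seeing a problem of probability p at least once in n sessions."""
    return 1 - (1 - p) ** n


PROBABILITIES = [0.01, 0.05, 0.10, 0.15, 0.25, 0.30, 0.50, 0.75]


def what_you_get(n):
    """Return {problem probability: discovery rate} for a given sample size."""
    return {p: discovery_rate(p, n) for p in PROBABILITIES}


# Planned sample of ten versus the seven usable sessions that remain.
for n in (10, 7):
    row = what_you_get(n)
    print(f"n = {n}: " + ", ".join(f"{p:.0%} -> {rate:.0%}" for p, rate in row.items()))

# Rare (1%) problems: discovery tracks the sample size closely at first.
for n in (1, 10, 25, 100):
    print(f"1% problem, n = {n}: {discovery_rate(0.01, n):.0%}")
```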

Option 2: Start With a Problem Occurrence

To get started, you need to make two decisions:

  • The lowest problem detection frequency of interest (the columns under “Probability of Problem in Studied Population”)
  • The desired goal for the likelihood of discovery of those problems (the cells in the table)

For example, suppose you’ve decided to focus on discovering problems that will happen to at least 15% of the population of interest (the problem detection probability), and the desired likelihood of discovery is at least 80%.

The smallest sample size that meets those criteria is n = 10. From Table 1, what you can expect with ten participants is:

  • For p = 1%, the likelihood of discovery is: 10%
  • For p = 5%, the likelihood of discovery is: 40%
  • For p = 10%, the likelihood of discovery is: 65%
  • For p = 15%, the likelihood of discovery is: 80%
  • For p = 25%, the likelihood of discovery is: 94%
  • For p = 30%, the likelihood of discovery is: 97%
  • For p = 50%, the likelihood of discovery is: ~100%
  • For p = 75%, the likelihood of discovery is: ~100%

This means that with ten participants, you can be reasonably confident that the study, within the limits of its tasks and population of participants (which establish what problems are available for discovery), is almost certain (> 90% likely) to reveal problems for which the problem detection frequency is at least 25%. As planned, the likelihood of discovery of problems with a detection probability of 15% is 80%.

For problems with a detection probability of less than 15%, the rate of discovery will be lower but will not be 0 when n = 10. For example, the expectation is that you will find about 65% of problems for which the detection probability is 10%, and you’ll find about 40% (almost half) of the problems available for discovery whose detection probability is 5%. You would even expect to detect 10% of the problems with a detection probability of just 1%. That’s not a bad haul for a small-sample qualitative study.
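Option 2 amounts to finding the smallest n for which 1 – (1 – p)^n reaches the discovery goal. Here is a minimal Python sketch of that search. The function name required_sample_size is hypothetical, and the code works from unrounded values, so it can occasionally differ by one participant from a lookup in the rounded table.

```python
def required_sample_size(p: float, goal: float) -> int:
    """Smallest n such that 1 - (1 - p)^n >= goal.

    Equivalent to rounding up ln(1 - goal) / ln(1 - p), but a simple search
    avoids floating-point surprises at exact boundaries.
    """
    if not (0 < p <= 1 and 0 < goal < 1):
        raise ValueError("need 0 < p <= 1 and 0 < goal < 1")
    n = 1
    while 1 - (1 - p) ** n < goal:
        n += 1
    return n


# The example from the text: problems affecting at least 15% of users,
# with at least an 80% likelihood of discovery.
print(required_sample_size(0.15, 0.80))  # prints 10
```

For instance, a 90% goal for 25% problems returns 9 rather than 8, because the unrounded rate at n = 8 is just under 90% even though the table shows it rounded to 90%.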

Another way to use the table is to scan from left to right to see whether the patterns of discovery rates for any of the sample sizes are acceptable, but scanning is easier when the data in the table are presented in graphs.

Sample Size Graphs

Figures 1 and 2 show different ways to depict the information in Table 1.

Discovery Rates by Problem Probabilities

Figure 1 shows the trajectory of problem discovery for different problem probabilities for each sample size from 1 to 25. The trajectories are almost linear for the lowest probabilities (1% and 5%) and dramatically nonlinear for the highest probabilities (50% and 75%). The speed with which high-probability problems approach discovery over 95% is why we didn’t include any higher than 75% in the table. Any problem with a probability of occurrence over 80% has at least a 96% chance of being discovered with just two participants (1 – (1 – 0.8)^2 = 0.96).

Figure 1: Problem discovery rates by sample size for different problem probabilities.

The box in Figure 1 shows the expected discovery rates for each of the different problem probabilities when n = 10, matching the n = 10 row of Table 1 used in the extended example.

Discovery Rates by Sample Size

Figure 2 shows the trajectory of problem discovery for different sample sizes (5, 10, 15, 20, and 25) for various problem probabilities (1%, 5%, 10%, 15%, 25%, 30%, 50%, and 75%).

The trajectories are similar to those shown in Figure 1, becoming less linear and smoother as the sample size increases from 5 to 25. The distance between the lines illustrates the diminishing returns in discovery rates associated with increasing sample sizes. For example, increasing the sample size from 5 to 10 shows significant benefits in the middle of the range of problem probabilities, but the benefit achieved from increasing the sample size from 20 to 25 is much smaller.
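To put numbers on the diminishing returns, the sketch below compares the gain in discovery rate from adding five participants early (5 to 10) with the gain from adding five late (20 to 25). The helper names discovery_rate and gain are our own, and the gains are differences in discovery rate expressed in percentage points.

```python
def discovery_rate(p, n):
    """Chance of seeing a problem of probability p at least once in n sessions."""
    return 1 - (1 - p) ** n


def gain(p, n_from, n_to):
    """Increase in discovery rate when the sample grows from n_from to n_to."""
    return discovery_rate(p, n_to) - discovery_rate(p, n_from)


PROBABILITIES = [0.01, 0.05, 0.10, 0.15, 0.25, 0.30, 0.50, 0.75]

for p in PROBABILITIES:
    early = 100 * gain(p, 5, 10)   # percentage points gained going from 5 to 10
    late = 100 * gain(p, 20, 25)   # percentage points gained going from 20 to 25
    print(f"p = {p:.0%}: 5 -> 10 adds {early:.1f} points, 20 -> 25 adds {late:.1f} points")
```

The early gain is largest in the middle of the range of problem probabilities, and the late gain is smaller at every probability.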

Figure 2: Problem discovery rates by problem probability for different sample sizes.

Summary and Discussion

One sample size doesn’t fit all research needs for problem discovery studies like formative usability studies. Fortunately, tabular and graphic aids can help UX researchers determine and justify sample sizes for these types of studies.

What do you get with a specific sample size in problem discovery studies? For each possible sample size, you are likely to observe (at least once) some of the problems that will happen to only a small percentage of the population of interest, more of the problems that affect a moderate percentage, and most of the problems that affect most of the population. The likelihood of discovery increases as the sample size increases but with diminishing returns.

What drives sample size decisions for problem discovery studies? The appropriate sample size for problem discovery studies depends on two factors: the smallest problem probability you wish to detect and the desired discovery rate. In other words, how rare an event do you need to be able to detect at least once, and what percentage of those events do you need to discover in the study?

What decision aids are available to guide sample size decisions for problem discovery studies? You can use the table and graphs presented in this article to understand what you can expect to get with different sample sizes for problem discovery studies. This can be useful for initial sample size planning and understanding the consequences of events that lead to the reduction of the initially planned sample size.

Technical note: Some early approaches to sample size decisions for formative usability studies relied on the average observed value of p across a group of discovered problems. This approach does not compute the variability of the mean of p. Also, estimates of mean p from samples consistently overestimate the actual likely value of p. There are some complex mathematical approaches to deal with these issues, but the method we describe in this article avoids the issues because it does not require estimating an average value of p.

