USER RESEARCH
Understanding sampling bias
I saw this on my way to work today. It’s quite a neat idea: people can answer a poll question with their cigarette butt. It’s obviously just a bit of fun, but this particular poll illustrates survey sampling bias quite well.
The way you find your research respondents affects which questions you can realistically ask them, and whether you can interpret the answers as representative of your population.
In this example, Edinburgh City Council have asked people who take the time to dispose of their cigarette butt responsibly whether they believe dropping it on the ground should be considered littering. There are five votes for Yes and one for No.
But that’s not surprising. To understand why is to understand sampling bias.
The people who consider it acceptable to drop their butt on the street are less likely to be included in the sample. Their butts are more often being dropped on the street than being disposed of responsibly, so they aren’t able to vote.
Site surveys and frequent flyers
I most often find myself talking about sampling biases when I’m advising colleagues about site surveys. Let me invent a scenario to explain how this might play out.
The site survey software used at Skyscanner serves users a cookie to record that they have been invited to do a survey. It’s quite common practice: it prevents the same users from being repeatedly bugged by survey invites.
Let’s say we’ve been running site survey invitations at a percentage of total traffic for several years. If we set up a new survey aimed at finding out what percentage of our users fly more than 10 times per year, we will run into a problem with sampling bias.
People who use Skyscanner and fly more than 10 times per year are likely to have visited the website more often than those who fly a lot less. They have been playing ‘site survey roulette’ on each visit, so many of them will already have been taken out of the sampling pool, while new users who have never used our site before all remain in it. This skews the sample toward less frequent flyers and makes it an inappropriate way of sizing the audience group.
To remove this particular sampling bias, you would need to have the survey ignore the cookie and sample a percentage of your total user base, rather than just those who haven’t yet been invited to do a survey.
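If you’d like to see how much difference the cookie makes, here’s a minimal simulation sketch in Python. The numbers are entirely made up (a true 20% share of frequent flyers, a 2% invite rate, one invite opportunity per user in the new survey) and it isn’t based on Skyscanner’s actual tooling or data; it just shows how excluding previously invited users drags the measured share of frequent flyers well below the true one.

```python
import random

random.seed(1)

# Made-up numbers for illustration only: 20% of users are 'frequent flyers'
# (more than 10 flights per year) and visit the site far more often than the rest.
N_USERS = 100_000
INVITE_RATE = 0.02          # chance of a survey invite on any eligible visit
TRUE_FREQUENT_SHARE = 0.20

users = []
for _ in range(N_USERS):
    frequent = random.random() < TRUE_FREQUENT_SHARE
    past_visits = random.randint(20, 60) if frequent else random.randint(1, 5)
    # Did this user pick up the 'already invited' cookie during earlier surveys?
    has_cookie = any(random.random() < INVITE_RATE for _ in range(past_visits))
    users.append((frequent, has_cookie))

def new_survey_share(respect_cookie: bool) -> float:
    """Share of frequent flyers among respondents to the new survey."""
    respondents = []
    for frequent, has_cookie in users:
        if respect_cookie and has_cookie:
            continue  # cookie says 'already invited', so never sampled again
        if random.random() < INVITE_RATE:
            respondents.append(frequent)
    return sum(respondents) / len(respondents)

print(f"True share of frequent flyers: {TRUE_FREQUENT_SHARE:.0%}")
print(f"New survey, cookie respected:  {new_survey_share(True):.1%}")
print(f"New survey, cookie ignored:    {new_survey_share(False):.1%}")
```

Run it a few times and the cookie-respecting survey consistently undercounts frequent flyers, simply because most of them picked up the cookie long ago.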
Who is and isn’t being included?
In both of these examples you can hopefully see that the method being used to invite people into the survey is biased toward one group over another, and those groups are likely to respond differently to the question being asked. They are biased samples for these particular research questions.
Comments appreciated
Did I make sampling bias an interesting subject to read about? What else do you want me to write about? Let me know in the comments.