Sampling/Selection Bias

In any research, there is a population of interest - the largest group that you want to understand. For example, if you are trying to improve the anti-smoking program in your high school district, your population of interest is high school students in your district. If you are trying to improve the water quality of streams and rivers in Illinois, your population of interest is all rivers and streams in Illinois.

In any population, there is a certain reality that you want to understand. For example, the reality of your high school population may be that "trusting the tobacco industry dramatically increases your likelihood of smoking." Similarly, there is a reality for streams and rivers in Illinois. Perhaps the presence of a farm field tile drain increases the pesticide concentration in the receiving stream.

However, there is rarely the time or resources to study the entire population. Instead, a sample of the population must be chosen and studied. If this sample is representative of the entire population, then the reality of the sample will be the same as the reality of the population, and we will come to the correct conclusion in our research (assuming we have no other sources of error). However, if the sample is biased, then the reality of the sample will not be the same as the reality of the population, and what we learn from the sample will not help us understand the population. In fact, it could mislead us about the reality of the population, causing us to waste time and money on interventions that will never work, or to miss opportunities to solve the problem.

Convenience samples

Convenience samples are made up of the subjects that were easiest for the researcher to access. For example, in the anti-smoking research, a convenience sample would be the wellness class that the researcher is currently teaching. In stream research, a convenience sample might be the rivers and streams within a one-hour drive of the researcher's home. It is not hard to imagine ways in which a convenience sample may not be representative of the population. For example, the wellness class might be taken only by freshmen, or the streams within a one-hour drive may flow through farmland that has a higher clay content than the rest of the state. This does not necessarily bias results. A convenience sample will bias results only if a factor that makes the sample non-representative is also related to the relationship being studied. For example, if the relationship between smoking and attitudes toward the tobacco companies varies by age, then using the wellness class taken by freshmen as a sample of the entire high school district population will produce a biased result. Similarly, if clay retains pesticides better, then the relationship between field tile drains and stream water quality will be different in this sample than in the rest of the state.

Volunteer (or self-selection) bias

Volunteer (or self-selection) bias is another form of selection bias. This occurs when subjects can choose whether or not to participate.

If those more likely to participate exhibit a different relationship between the variables under study, then the sample is biased. For example, if students in one high school were asked to volunteer to take the survey about tobacco companies and smoking, it is likely that only a minority of students would volunteer. Who are they likely to be? Perhaps those who feel most strongly about the issue: non-smokers who hate the tobacco companies and smokers who feel the tobacco companies have gotten a bad rap. This sample would tend to produce a much stronger relationship between these variables than is true for the population as a whole. In the case of stream research, we would not have volunteer bias, since streams can't "volunteer" for the study. However, if a key step in sampling a stream was to get permission from a farmer to use his or her land, then volunteer bias might apply.

Volunteer bias is extremely common in studies of humans, since it is usually not possible to force people to participate. In each case one must judge how much bias may have been introduced. In surveys, one should ideally obtain better than an 80% response rate. Response rates under 50% should be highly suspect, since a majority of people have reasons why they are not participating. The likelihood that one or more of these reasons is related to the relationship under study is fairly high.

Classic Examples of Biased Samples

A classic example of a biased sample, and the misleading results it produced, occurred in 1936. In the early days of opinion polling, the American magazine The Literary Digest collected over two million postal surveys and predicted that the Republican candidate in the U.S. presidential election, Alf Landon, would beat the incumbent president, Franklin Roosevelt, by a large margin. The result was the exact opposite. The Literary Digest sample was drawn from readers of the magazine, supplemented by records of registered automobile owners and telephone users. This sample over-represented wealthy individuals, who, as a group, were more likely to vote for the Republican candidate. In contrast, a poll of only 50,000 citizens selected by George Gallup's organization successfully predicted the result, leading to the popularity of the Gallup poll.

Another classic example occurred in the 1948 presidential election. On election night, the Chicago Tribune printed the headline DEWEY DEFEATS TRUMAN, which turned out to be mistaken. In the morning, the grinning president-elect, Harry S. Truman, was photographed holding a newspaper bearing this headline. The reason the Tribune was mistaken is that its editor trusted the results of a phone survey. Survey research was then in its infancy, and few academics realized that a sample of telephone users was not representative of the general population. Telephones were not yet widespread, and those who had them tended to be prosperous and have stable addresses. (In many cities, the Bell System telephone directory contained the same names as the Social Register.) In addition, the Gallup poll that the Tribune based its headline on was over two weeks old at the time of printing.

Response Rates

Once a sample is selected, an attempt is made to collect data (e.g., through interviews or questionnaires) from all of its members. In practice, researchers never obtain responses from 100% of the sample. Some sample members inevitably are traveling, hospitalized, incarcerated, away at school, or in the military. Others cannot be contacted because of their work schedule, community involvement, or social life. Others simply refuse to participate in the study, even after the best efforts of the researcher to persuade them otherwise.

Each type of nonparticipation biases the final sample, usually in unknown ways. In the 1980 General Social Survey (GSS), for example, those who refused to be interviewed were later found to be more likely than others to be married, middle-income, and over 30 years of age, whereas those who were excluded from the survey because they were never at home were less likely to be married and more likely to live alone (Smith, 1983). The importance of intensive efforts to recontact sample members who are difficult to reach (e.g., because they are rarely at home) was also apparent: GSS respondents who required multiple contact attempts before an interview was completed (the "hard-to-gets") differed significantly from other respondents in their labor force participation, socioeconomic status, age, marital status, number of children, health, and sex (Smith, 1983).

The response rate describes the extent to which the final data set includes all sample members. It is calculated as the number of people with whom interviews are completed ("completes") divided by the total number of people or households in the entire sample, including those who refused to participate and those who were not at home.
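The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text; the function name and the example counts are invented for the sake of the demonstration.

```python
# Hypothetical sketch: computing a survey response rate.
# Completed interviews are divided by the ENTIRE selected sample,
# with refusals and not-at-homes counted in the denominator.
def response_rate(completes, refusals, not_at_home, other_nonresponse=0):
    total_sample = completes + refusals + not_at_home + other_nonresponse
    return completes / total_sample

# Example (invented numbers): 820 completed interviews
# out of a selected sample of 1,000 people.
rate = response_rate(completes=820, refusals=120, not_at_home=60)
print(f"{rate:.0%}")  # 82%
```

Note that leaving refusals or not-at-homes out of the denominator would overstate the response rate, which is why every sampled person or household is counted.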

Whether data are collected through face-to-face interviews, telephone interviews, or mail-in surveys, a high response rate is extremely important when results will be generalized to a larger population. The lower the response rate, the greater the sample bias. In general, data from mail-in surveys with return rates of 20 or 30 percent, which are not uncommon for mail surveys that are not followed up effectively, usually look nothing at all like the sampled populations. This is because people who have a particular interest in the subject matter or the research itself are more likely to return mail questionnaires than those who are less interested.

One occasionally will see reports of mail surveys in which 5 to 20 percent of the sample responded. In such instances, the final sample has little relationship to the original sampling process. Those responding are essentially self-selected. It is very unlikely that such procedures will provide any credible statistics about the characteristics of the population as a whole.

Sample size

The use of appropriate sampling methods and an adequate response rate are necessary for a representative sample, but not sufficient. In addition, the sample size must be evaluated.

All other things being equal, smaller samples (e.g., those with fewer than 1,000 respondents) have greater sampling error than larger samples. To better understand the notion of sampling error, it is helpful to recall that data from a sample provide merely an estimate of the true proportion of the population that has a particular characteristic. If 100 different samples are drawn from the same sampling frame, they could potentially result in 100 different patterns of responses to the same question. These patterns, however, would converge around the true pattern in the population.

The sampling error is a number that describes the precision of an estimate from any one of those samples. It is usually expressed as a margin of error associated with a statistical level of confidence. For example, a presidential preference poll may report that the incumbent is favored by 51% of the voters, with a margin of error of plus-or-minus 3 points at a confidence level of 95%. This means that if the same survey were conducted with 100 different samples of voters, 95 of them would be expected to show the incumbent favored by between 48% and 54% of the voters (51% ± 3%).
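The standard formula behind such a margin of error (the normal approximation for a sample proportion) can be sketched as follows. The z-value of 1.96 corresponds to the 95% confidence level; the inputs reproduce the 51% ± 3 points example above.

```python
# Margin of error for a sample proportion (normal approximation):
#   moe = z * sqrt(p * (1 - p) / n)
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the confidence interval for a sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# 51% favor the incumbent, sample of 1,000 voters, 95% confidence.
moe = margin_of_error(p=0.51, n=1000)
print(f"+/- {moe:.1%}")  # roughly +/- 3 points
```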

The margin of error due to sampling decreases as sample size increases, to a point. For most purposes, samples of between 1,000 and 2,000 respondents have a sufficiently small margin of error that larger samples are not cost-effective. We will revisit the concept of margin of error in Lesson 9 - Chance.
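The diminishing returns of larger samples can be made concrete with a short calculation (an illustration, assuming p = 0.50, the worst case for a proportion): quadrupling the sample size only halves the margin of error.

```python
# Margin of error shrinks with sample size, but with diminishing
# returns: note how little is gained beyond n = 1,000-2,000.
import math

moes = {}
for n in (100, 500, 1000, 2000, 4000):
    moes[n] = 1.96 * math.sqrt(0.5 * 0.5 / n)
    print(f"n = {n:5d}   margin of error = +/- {moes[n]:.1%}")
```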

Sampling Bias Evaluation Worksheet

The following is a list of questions that help us work through potential sampling bias in a study:

Sampling Bias - are the subjects representative of the target population?

  1. What is the target population? How might it differ from the broader population to which the authors wish to generalize? (This is typically a minor form of bias, but it is important to note.)
  2. How did the authors obtain their subjects?
  3. What was the final sample size?
  4. What is the response rate?
  5. Is there anything about the sampling method that could produce a non-representative sample? (explain)

Overall assessment of sampling bias: _____ Low     _____ Moderate     _____ High

Explain: