In 1936, Literary Digest polled its readers and concluded that Kansas governor Alfred Landon would beat Franklin Roosevelt in the presidential election 57 percent to 43 percent. In reality, Roosevelt won the election in a landslide. Literary Digest's problem was sampling bias. The magazine polled only its own readers, who were much less likely to vote for Roosevelt than the rest of the public. An accurate poll would have used a type of random sampling, giving every voter, including left-leaning ones, an equal chance of being selected.
Simple random sampling is the gold standard for statistical research in theory, but it is difficult to implement in practice. In a simple random sample, every member of the population being studied has an equal chance of being selected by the researchers. For example, a political poll might use all of a state's registered voters as the population. By randomly dialing phone numbers, the pollsters could approximate a truly random sample. The advantage of this technique is that, in its pure form, a simple random sample provides the truest approximation of the population as a whole. The disadvantage is that small problems, like some voters not having telephones, can creep in and distort the data.
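As a minimal sketch of the idea, the Python snippet below draws a simple random sample from a complete list of the population. The voter roll here is hypothetical (plain integer IDs standing in for registered voters), and the function name simple_random_sample is an illustrative choice, not part of any polling library.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement.

    random.Random.sample gives every member of `population`
    the same chance of ending up in the result.
    """
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)


# Hypothetical voter roll: IDs 0..9999 stand in for registered voters.
voters = list(range(10_000))
sample = simple_random_sample(voters, 500, seed=42)
print(len(sample), "voters selected")
```

The sketch assumes the researchers hold a full list of the population. In practice that list is often the hard part, which is why pollsters fall back on approximations such as random phone dialing.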
A problem with simple random sampling is that odd quirks called "artifacts" often crop up and distort findings. For example, women tend to be more likely to complete surveys than men. In a simple random sample, the researchers would select roughly equal numbers of men and women to respond, but might end up with many more completed responses from women. Stratified random sampling seeks to rectify this problem by dividing the population into predefined groups, or strata, and randomly selecting participants from within each one. Surveyors might specify that they will randomly select 200 men and 200 women, for example. Alternatively, they might target a specific number of minority participants to ensure adequate representation in the results.
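The 200 men and 200 women example can be sketched in Python as follows. The respondent records, the stratum_of callback, and the quotas dictionary are all hypothetical names chosen for illustration. The sketch assumes each group contains at least as many people as its quota.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, quotas, seed=None):
    """Randomly pick quotas[s] members from each stratum s.

    stratum_of maps a member to its stratum label;
    quotas maps each stratum label to the number to select.
    """
    rng = random.Random(seed)

    # Sort the population into pools, one per stratum.
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)

    # Draw a simple random sample inside each pool.
    sample = []
    for stratum, quota in quotas.items():
        sample.extend(rng.sample(strata[stratum], quota))
    return sample


# Hypothetical respondents: even IDs are labeled "M", odd IDs "F".
people = [{"id": i, "sex": "M" if i % 2 == 0 else "F"} for i in range(5000)]
picked = stratified_sample(
    people, lambda p: p["sex"], {"M": 200, "F": 200}, seed=1
)
```

Selection inside each group is still random. Only the group sizes are fixed in advance, which is what keeps one group from crowding out another.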
Cluster sampling is a technique that allows researchers to collect data over a large geographic area where simple random sampling is impractical. Researchers randomly select several small geographic areas, like city blocks, and then include everyone living in those areas in their study. Cluster sampling is much more practical in many situations than simple or stratified random sampling because it allows researchers to expand their sample size without expanding the territory they have to cover. For example, researchers published a paper in The Lancet in 2004 estimating casualties from the Iraq war using cluster sampling. The downside of the technique is that it tends to produce wider margins of error than a simple random sample of the same size, because people in the same cluster tend to resemble one another. For example, a single city block leveled by an American bomb might radically distort the results.
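A rough Python sketch of the procedure appears below: pick whole clusters at random, then take everyone inside them. The city layout (100 blocks of 20 households each) and the name cluster_sample are invented for illustration and do not reflect the design of any published study.

```python
import random


def cluster_sample(clusters, k, seed=None):
    """Pick k clusters at random and include every member of each.

    clusters maps a cluster ID (e.g. a city block) to its list of members.
    Returns the chosen cluster IDs and the combined member list.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k)  # randomness applies to clusters only
    members = [m for c in chosen for m in clusters[c]]
    return chosen, members


# Hypothetical city: 100 blocks with 20 households each.
blocks = {
    b: [f"block{b}-house{h}" for h in range(20)]
    for b in range(100)
}
chosen_blocks, households = cluster_sample(blocks, 5, seed=7)
print(len(chosen_blocks), "blocks,", len(households), "households")
```

Because each chosen block contributes all of its households at once, one unusual block can shift the overall result far more than one unusual household would in a simple random sample.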
Convenience samples are groups of participants selected because they're easy for the researchers to access. For example, many social science experiments rely on undergraduate students because the researchers are professors and thus have easy access to them. Convenience sampling is unguided, in that anyone could hypothetically volunteer, but it isn't truly random because not every member of the population has an equal chance of being selected. Still, convenience sampling can be useful for studying homogeneous populations where most people are expected to respond similarly to similar stimuli. The addition of control groups can help to offset the problems of non-random selection.