How to Avoid Sample Size Error in Statistical Analysis
The idea that a small sample of subjects can support conclusions about a much larger population seems hard to believe. Nevertheless, sampling is done all the time, in public opinion surveys and in many other statistical analyses. To do it well, the analyst has to use sound sampling procedures. Determining the sample size is one of the most important, and most difficult, steps in planning a statistical study. A sample that is too small may produce imprecise results that cannot be generalized to the larger population. A sample that is too large wastes time and valuable study resources.
Instructions
Getting the Right Sample Size
1
Ask yourself how much sampling error you can live with. Bear in mind that any time you use a subset of subjects to draw conclusions about a larger population, the issue of sampling error is present. Sampling error refers to the difference between the sample and the larger population on a variable of interest. Sampling error can be reduced, of course, by taking a larger sample or studying the entire population, but this increases the cost of your study. Because of the resources required, studying an entire population is often not feasible.
2
Select a confidence interval for your study, which corresponds to the level of sampling error you can accept. The margin of error reported with many polls and surveys on television news programs is a good example: it is the plus-or-minus half-width of the confidence interval around the reported figure. Suppose we are planning a survey that will ask a sample of registered voters whether they approve of the job the president is doing while in office. For this example, let's say we want a confidence interval of plus or minus 3 percentage points. That means we would like the presidential approval level in our sample to be within 3 percentage points of the approval level for the whole population.
3
Select a confidence level, which measures how confident we can be that the sample's result on the variable of interest (for our example, presidential approval rating) falls within the confidence interval of the result for the entire population. Most statisticians use a 95 percent confidence level, but some studies use confidence levels of 90 percent or even 99 percent. Using a confidence level of 95 percent means that if we repeated the survey many times, about 95 percent of the samples would show an approval percentage within 3 percentage points (the confidence interval) of the population's approval rating. Once you have a confidence level, look up the corresponding Z value in your statistics book or guide; for 95 percent, the Z value is 1.96. Tables of Z values are usually found in the appendices of most good statistics books.
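If no Z table is at hand, the lookup can be reproduced in a few lines of code. The following is a minimal sketch, assuming Python 3.8 or later and a two-sided interval; the function name z_value is made up for illustration and does not come from any statistics text.

```python
from statistics import NormalDist


def z_value(confidence_level: float) -> float:
    """Two-sided critical Z value for a confidence level given as a fraction (e.g. 0.95)."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be between 0 and 1")
    # Split the leftover probability evenly between the two tails.
    upper_tail = (1.0 - confidence_level) / 2.0
    # Inverse CDF of the standard normal distribution (mean 0, standard deviation 1).
    return NormalDist().inv_cdf(1.0 - upper_tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> Z = {z_value(level):.3f}")
```

For the 90, 95 and 99 percent levels this prints Z values of 1.645, 1.960 and 2.576, which should match the table in your statistics book.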
4
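Consider the size of the population, the next number to have on hand when calculating your sample size. For our example, let's suppose that we know there are 100 million registered voters across the country. With a population this large, the exact size has almost no effect on the sample size you need; population size matters mainly when the population is small relative to the sample.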
5
Calculate your sample size. Square the Z value (in this example, 1.96, which squared is about 3.84) and multiply it by 0.5 times 1 minus 0.5, which is 0.25. The value 0.5 is the worst-case proportion of your sample that picks a particular answer: it makes that product as large as it can be, so it yields the largest sample size you could need. Use the worst-case proportion whenever you have no reliable prior estimate of the true one. Take this result and divide it by the squared value of the confidence interval expressed as a decimal (in this example, 0.03, which squared is 0.0009). For this example, the formula gives about 1,067.1, or a sample size of 1,067.
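The arithmetic in this step can be sketched in code as a check on the hand calculation. This is an illustrative sketch only: the function name raw_sample_size and its parameter names are made up here, and the defaults are the example values used above (Z of 1.96, worst-case proportion of 0.5, confidence interval of 0.03).

```python
import math


def raw_sample_size(z: float = 1.96,
                    margin: float = 0.03,
                    proportion: float = 0.5) -> float:
    """Unrounded sample size: Z squared times p(1 - p), divided by the margin squared."""
    if not 0.0 < margin < 1.0:
        raise ValueError("margin must be a decimal between 0 and 1, e.g. 0.03")
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    return (z ** 2) * proportion * (1.0 - proportion) / (margin ** 2)


n = raw_sample_size()
print(f"unrounded:  {n:.1f}")
print(f"nearest:    {round(n)}")      # the figure quoted in this article
print(f"rounded up: {math.ceil(n)}")  # guarantees the margin is not exceeded
```

The unrounded result is about 1,067.1. Rounding to the nearest whole number gives the 1,067 quoted above; some analysts prefer to round up to 1,068 so that the margin is never slightly exceeded.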
Tips & Warnings
Many websites have sample size calculators that allow you to enter the values of your confidence level, confidence interval and population size to get a sample size.
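The formula in the steps above does not use the population size, yet calculators ask for it. One common adjustment they may apply is the finite population correction; whether a given calculator uses exactly this form is an assumption, so treat the following as a sketch. The function and variable names are made up for illustration, and the starting figure is the unrounded result from the worked example.

```python
def adjust_for_population(n0: float, population: int) -> float:
    """Finite population correction: n = n0 / (1 + (n0 - 1) / N)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return n0 / (1.0 + (n0 - 1.0) / population)


# Unrounded sample size from the worked example (Z = 1.96, p = 0.5, margin = 0.03).
n0 = (1.96 ** 2) * 0.5 * (1 - 0.5) / (0.03 ** 2)

# 100 million is the article's example; 5,000 is a hypothetical small population.
for population in (100_000_000, 5_000):
    adjusted = adjust_for_population(n0, population)
    print(f"population {population:>11,}: sample size about {adjusted:.1f}")
```

With 100 million registered voters the adjustment changes the result by only a tiny fraction of one respondent, while for a population of 5,000 the required sample drops to roughly 880.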
Double-check your work. A wrong decimal or a simple multiplication error can result in an incorrect sample size, which could undermine the validity and reliability of your study.
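One way to double-check a sample size is to run the calculation in reverse: plug the sample size back in and see whether the resulting margin matches the confidence interval you chose. The sketch below assumes the same example values as the steps above; the function name margin_for_sample is hypothetical.

```python
import math


def margin_for_sample(n: int, z: float = 1.96, proportion: float = 0.5) -> float:
    """Margin (as a decimal) implied by a sample of size n: Z * sqrt(p(1 - p) / n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(proportion * (1.0 - proportion) / n)


for n in (1067, 1068, 107):  # 107 mimics a misplaced decimal point
    print(f"n = {n:>4}: margin = {margin_for_sample(n) * 100:.2f} percentage points")
```

A sample of 1,067 or 1,068 comes back at about 3 percentage points, as intended. A slip such as 107 comes back at more than 9 points, which makes the error easy to spot.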