The Search for Significance: A Crash Course in Statistical Significance Using ACS 2007
Suppose we told you that the American Community Survey (ACS) found that 26 percent of Hoosier women between the ages of 35 and 44 had a bachelor's degree or more, compared to just 23 percent of men. How would you know whether that is a real difference in educational attainment (that is, a statistically significant finding) or just the result of random sampling error? This article provides a brief tutorial on calculating statistical significance for those who want to use ACS data accurately without becoming statisticians.^{1}
Margins of Error
As with any survey, margins of error are critical—particularly as the size of the population in question decreases (because that typically increases the margin of error). A large margin of error makes the survey estimate less reliable, which can negatively affect your analysis and comparisons. The ACS reports the margin of error for the 90 percent confidence level. Therefore, if we look at the first row in Table 1, we can say that we're 90 percent confident that the number of Hoosier men between the ages of 25 and 34 is between 422,281 and 427,919 (that range—which is the estimate plus or minus the margin of error—is known as the confidence interval). In other words, there's only a 10 percent chance that the actual number of men in that age group falls outside of that range.
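The confidence interval arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name `confidence_interval` is our own, not something defined by the Census Bureau, and the inputs are the first-row figures quoted from Table 1.

```python
def confidence_interval(estimate, margin_of_error):
    """Return (lower, upper): the estimate minus and plus its margin of error."""
    return estimate - margin_of_error, estimate + margin_of_error


# Indiana men aged 25 to 34 (first row of Table 1), 90 percent margin of error.
lower, upper = confidence_interval(425_100, 2_819)
print(f"{lower:,} to {upper:,}")  # 422,281 to 427,919
```

Because the ACS publishes margins of error at the 90 percent level, the interval produced this way is a 90 percent confidence interval.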
Table 1: Educational Attainment and Confidence Intervals for Indiana Men and Women, 2007
| Subject | Male Estimate | Male Margin of Error | Male Confidence Interval | Female Estimate | Female Margin of Error | Female Confidence Interval |
| --- | --- | --- | --- | --- | --- | --- |
| Population 25 to 34 years | 425,100 | +/-2,819 | 422,281-427,919 | 412,591 | +/-2,879 | 409,712-415,470 |
| Percent high school graduate or higher | 86.7 | +/-0.8 | 85.9-87.5 | 89.3 | +/-0.7 | 88.6-90 |
| Percent bachelor's degree or higher | 23.4 | +/-1.0 | 22.4-24.4 | 28.1 | +/-1.0 | 27.1-29.1 |
| Population 35 to 44 years | 447,489 | +/-2,440 | 445,049-449,929 | 444,091 | +/-2,585 | 441,506-446,676 |
| Percent high school graduate or higher | 87.1 | +/-0.9 | 86.2-88 | 90.5 | +/-0.8 | 89.7-91.3 |
| Percent bachelor's degree or higher | 22.8 | +/-0.9 | 21.9-23.7 | 25.7 | +/-0.9 | 24.8-26.6 |
| Population 45 to 64 years | 796,162 | +/-2,157 | 794,005-798,319 | 824,930 | +/-2,596 | 822,334-827,526 |
| Percent high school graduate or higher | 88.1 | +/-0.5 | 87.6-88.6 | 89 | +/-0.5 | 88.5-89.5 |
| Percent bachelor's degree or higher | 24.2 | +/-0.6 | 23.6-24.8 | 21.5 | +/-0.7 | 20.8-22.2 |
| Population 65 years and over | 328,860 | +/-1,151 | 327,709-330,011 | 464,296 | +/-1,431 | 462,865-465,727 |
| Percent high school graduate or higher | 75.3 | +/-1.2 | 74.1-76.5 | 73.7 | +/-0.9 | 72.8-74.6 |
| Percent bachelor's degree or higher | 18.9 | +/-0.9 | 18-19.8 | 11.1 | +/-0.6 | 10.5-11.7 |
One might think that this is all the information we need to determine statistical significance: As long as the confidence intervals of two numbers don't overlap, we're good to go, right? Unfortunately, it is a bit more complex than that, and the Census Bureau discourages the use of confidence intervals alone to determine a value's statistical significance. Instead, we should calculate z-scores, which are standardized figures that allow us to make comparisons.
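To see why the overlap check can mislead, consider the share of Hoosiers 65 and over with at least a high school diploma in Table 1. The sketch below is a hedged illustration: the helper names are our own, and the z-score is computed with the usual two-estimate form (the difference divided by the square root of the summed squared standard errors), which we assume is the calculation the steps in the next section describe.

```python
import math

Z_90 = 1.645  # critical value behind the ACS 90 percent margins of error


def intervals_overlap(est_a, moe_a, est_b, moe_b):
    """True if the two intervals (estimate +/- margin of error) share any values."""
    return (est_a - moe_a) <= (est_b + moe_b) and (est_b - moe_b) <= (est_a + moe_a)


def z_score_from_moe(est_a, moe_a, est_b, moe_b):
    """Z-score for the difference of two estimates given their 90 percent MOEs."""
    se_a = moe_a / Z_90
    se_b = moe_b / Z_90
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Percent high school graduate or higher, 65 years and over (Table 1).
male = (75.3, 1.2)
female = (73.7, 0.9)

overlap = intervals_overlap(*male, *female)
z = z_score_from_moe(*male, *female)

print(overlap)        # True: the intervals 74.1-76.5 and 72.8-74.6 overlap
print(abs(z) > Z_90)  # True: the difference is still significant at 90 percent
```

In this row the two intervals overlap, yet the z-score clears the 90 percent threshold, so relying on overlap alone would have missed a significant difference.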
Three Steps to Determining Significance
The first step in determining statistical significance is to convert the margin of error into a standard error. This calculation varies depending on whether we are using numbers directly from published ACS tables or have done some intermediate calculations of our own, such as calculating a percentage. Since our data do not contain any derived estimates, all we need to do for this step is divide the margin of error by 1.645.^{2}
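As a rough sketch of this first step, the conversion might look like the following in Python. The function names and the `acs_year` parameter are hypothetical conveniences of ours; the two divisors come from the article's second note (1.645 for 2006 and later, 1.65 for 2005 and earlier). This applies only to published estimates, not derived ones.

```python
def z_critical_for_year(acs_year):
    """Divisor for turning a 90 percent margin of error into a standard error."""
    return 1.645 if acs_year >= 2006 else 1.65


def standard_error(margin_of_error, acs_year=2007):
    """Standard error of a published (non-derived) ACS estimate."""
    return margin_of_error / z_critical_for_year(acs_year)


# Margins of error taken from Table 1.
print(round(standard_error(2_819)))   # 1714
print(round(standard_error(0.8), 3))  # 0.486
```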
The second step is to calculate the z-score itself (see Table 2). If we let A represent the male estimates, use B for the female estimates, and use SE(A) and SE(B) for the standard errors of those respective estimates, the formula is as follows:
$$z = \frac{A - B}{\sqrt{SE(A)^2 + SE(B)^2}}$$
Table 2: Comparing Male and Female Educational Attainment Z-Scores for Indiana, 2007
| Subject | Male (A) Estimate | Male (A) Margin of Error | Male (A) Standard Error | Female (B) Estimate | Female (B) Margin of Error | Female (B) Standard Error | Z-Score Comparing Male and Female Populations* |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Population 25 to 34 years | 425,100 | 2,819 | 1,714 | 412,591 | 2,879 | 1,750 | 5.11 |
| Percent high school graduate or higher | 86.7 | 0.8 | 0.486 | 89.3 | 0.7 | 0.426 | -4.02 |
| Percent bachelor's degree or higher | 23.4 | 1 | 0.608 | 28.1 | 1 | 0.608 | -5.47 |
| Population 35 to 44 years | 447,489 | 2,440 | 1,483 | 444,091 | 2,585 | 1,571 | 1.57 |
| Percent high school graduate or higher | 87.1 | 0.9 | 0.547 | 90.5 | 0.8 | 0.486 | -4.64 |
| Percent bachelor's degree or higher | 22.8 | 0.9 | 0.547 | 25.7 | 0.9 | 0.547 | -3.75 |
| Population 45 to 64 years | 796,162 | 2,157 | 1,311 | 824,930 | 2,596 | 1,578 | -14.02 |
| Percent high school graduate or higher | 88.1 | 0.5 | 0.304 | 89 | 0.5 | 0.304 | (-2.09) |
| Percent bachelor's degree or higher | 24.2 | 0.6 | 0.365 | 21.5 | 0.7 | 0.426 | 4.82 |
| Population 65 years and over | 328,860 | 1,151 | 700 | 464,296 | 1,431 | 870 | -121.32 |
| Percent high school graduate or higher | 75.3 | 1.2 | 0.729 | 73.7 | 0.9 | 0.547 | 1.75 |
| Percent bachelor's degree or higher | 18.9 | 0.9 | 0.547 | 11.1 | 0.6 | 0.365 | 11.86 |
Source: IBRC, using data from the U.S. Census Bureau American Community Survey
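The second step can be reproduced with a short Python sketch. It assumes the z-score is the difference between the two estimates divided by the square root of the sum of their squared standard errors, which matches the magnitudes reported in Table 2. The function name `z_score` is ours, and the standard errors are computed from the published margins of error rather than from the rounded values shown in the table.

```python
import math


def z_score(est_a, se_a, est_b, se_b):
    """Z-score for the difference between estimates A and B."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Population 25 to 34 years: male (A) versus female (B).
z_population = z_score(425_100, 2_819 / 1.645, 412_591, 2_879 / 1.645)
print(round(z_population, 2))  # 5.11

# Percent bachelor's degree or higher, 35 to 44 years.
z_bachelors = z_score(22.8, 0.9 / 1.645, 25.7, 0.9 / 1.645)
print(round(z_bachelors, 2))  # -3.75
```

A negative z-score simply means the female estimate (B) is larger than the male estimate (A); only the absolute value matters when judging significance.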
Here's an important note for Excel users: When you download percentage data from American FactFinder, the values are formatted as percents (22.8%), which Excel stores in decimal form (0.228). The margins of error, however, are stored as regular numbers (0.9). As one can imagine, mixing those two formats yields utterly meaningless z-scores. Therefore, always make sure to convert any percentages to numeric format (22.8) so they are in the same units as the margin of error before calculating the z-score.
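The effect of mixing units is easy to demonstrate outside Excel as well. The following Python sketch is illustrative only: it assumes the estimates arrive as decimal fractions (0.228) while the margins of error are in percentage points (0.9), and the variable names are our own.

```python
import math


def z_score(est_a, se_a, est_b, se_b):
    """Z-score for the difference between estimates A and B."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Percent bachelor's degree or higher, 35 to 44 years.
male_fraction, female_fraction = 0.228, 0.257  # how Excel stores 22.8% and 25.7%
se = 0.9 / 1.645                               # margin of error in percentage points

# Wrong: fractions combined with percentage-point standard errors.
z_mixed_units = z_score(male_fraction, se, female_fraction, se)

# Right: convert the fractions to percentage points first.
z_same_units = z_score(male_fraction * 100, se, female_fraction * 100, se)

print(round(z_mixed_units, 2))  # -0.04
print(round(z_same_units, 2))   # -3.75
```

With mismatched units the z-score shrinks by a factor of 100, so a difference that is significant at the 99 percent level looks like noise.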
The third step is to use the z-score to determine whether the difference between the genders is significant or whether random chance can explain it. Table 3 provides the z-score thresholds with their corresponding confidence levels. Essentially, the larger the absolute value of the z-score, the more confident we are that a real difference in the estimates exists. Looking back at Table 2, we find that most of the values (nine of the twelve) are significant at the 99 percent level, which means that we're 99 percent sure that the difference is not due to random chance.
Table 3: Z-Scores and Levels of Significance
| If … | Then the difference between A and B is … |
| --- | --- |
| z < -1.645 or z > 1.645 | Significant at the 90 percent confidence level |
| z < -1.96 or z > 1.96 | Significant at the 95 percent confidence level |
| z < -2.576 or z > 2.576 | Significant at the 99 percent confidence level |
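The thresholds in Table 3 translate directly into a small lookup. The Python sketch below is one possible way to write it; the function name `significance_level` and the choice to return `None` for a non-significant result are our own assumptions, and the sample z-scores are values of the kind found in Table 2.

```python
def significance_level(z):
    """Return the highest confidence level (in percent) at which z is
    significant according to Table 3, or None if it is below 90 percent."""
    magnitude = abs(z)
    if magnitude > 2.576:
        return 99
    if magnitude > 1.96:
        return 95
    if magnitude > 1.645:
        return 90
    return None


for z in (5.11, -2.09, 1.75, 1.57):
    print(z, significance_level(z))
# 5.11 99
# -2.09 95
# 1.75 90
# 1.57 None
```

Checking the strictest threshold first means each z-score is reported at the highest level it satisfies.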
For more information, download the Census Bureau's instructions on statistical testing and ACS 2007 available at www.census.gov/acs/www/guidance_for_data_users/handbooks/.
Notes
1. Data in this article are extracted from Table S1501 in the 2007 American Community Survey dataset, available via American FactFinder at http://factfinder.census.gov/.
2. The denominator is 1.645 for ACS data from 2006 and later; for ACS data from 2005 or earlier, 1.65 should be used. For the Census Bureau's recommended calculations for derived estimates, visit www.census.gov/acs/www/guidance_for_data_users/handbooks/.
Rachel Justis, Geodemographic Analyst
Indiana Business Research Center, Kelley School of Business, Indiana University