The effect of measuring procedure on Gini ratio estimates

The Census Bureau publishes the Gini ratio for households based on the Current Population Survey (CPS) conducted every March. Unfortunately, the CPS data are not compatible over time. (The CB does mention this in footnotes, but a footnote is not the best place for the general public, or even for experts.) Therefore, the estimates of the Gini ratio are biased and cannot be used to characterize the evolution of income inequality in the US. At the same time, each estimate is accurate to the extent the data and calculation procedures allow. Here we present the case of the change in data granularity in 2009, which affected the estimates of the Gini ratio for various household sizes.

It is well known that total income increases with time due to growth in nominal GDP and population. From 1994, the Census Bureau measured the household income distribution in $2,500 bins with an upper limit of $100,000. All households with income above $100,000 were counted in the open-ended "$100,000 and above" bin. In 1994, there were 6,581,000 households in this bin and their portion of total income was only 26%. Even that is not good for Gini ratio estimation, since a single bin covers a quarter of total income. In 2008, this bin accommodated 51% of total income. Such bin counting is too crude and makes the Gini ratio calculations almost worthless. Since higher incomes are distributed according to the Pareto law, i.e. a power law, the CB can and does calculate the Gini ratio analytically for higher incomes.
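For reference, the standard textbook results for a Pareto distribution show why an analytical treatment of the upper tail is possible. The symbols below (the Pareto index \alpha, the population share p, and the Lorenz curve L(p)) are our notation and are not taken from the CB documentation; the exact procedure the CB applies to the top incomes is not described here.

```latex
% Lorenz curve and Gini ratio of a Pareto (power-law) distribution
% with index \alpha > 1 (standard results; notation is ours)
L(p) = 1 - (1 - p)^{\,1 - 1/\alpha}, \qquad 0 \le p \le 1,
\qquad
G = 1 - 2\int_0^1 L(p)\,dp = \frac{1}{2\alpha - 1}.
```

The smaller the index \alpha, the heavier the tail and the larger the Gini ratio of the tail itself.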

In any case, the Census Bureau had to widen the bins to $5,000 and raise the upper limit to $200,000, with the Gini ratio calculated by bin counting up to $250,000 (the readings in the bins above $200,000 are not published!). For the convenience of the CB, this change is appropriate. But it induced a step in the Gini ratio time series. Figure 1 displays the jumps for households of various sizes, from one person to seven or more people. Since households with more people have higher incomes, one can expect the portion of households with $100,000+ income to increase with household size. The change in bin granularity and in the upper limit from 2008 to 2009 changes the portion of households that fall into the open-ended bin and thus induces a step in the Gini ratio series. Table 1 lists these portions for 2008 and 2009 as well as their ratios.

Table 1. The portion of households with income above $100,000 in 2008 and $200,000 in 2009.
One person
Two people
Three people
Four people
Five people
Six people
Seven people or more

We illustrate the step in the Gini ratio using the overall income distribution. The overall Gini ratio published by the CB was calculated using the Pareto law approximation for the higher incomes and thus is not biased in the way the estimates for individual household sizes are. Figure 2 depicts three Lorenz curves based on the relevant CB estimates of the household income distribution in 1994, 2008, and 2009. One can see a dramatic difference between the Lorenz curves for 2008 and 2009. The high-income bin holding half of total income makes the 2008 Gini ratio highly underestimated compared to the 2009 estimate. The two curves are identical for 85% of the population, however. The 1994 curve also coincides with the 2009 one up to the last bin. Table 2 compares our estimates of the Gini ratio with those reported by the CB. One can see that the 2008 CB estimate is corrected, but the step of 0.023 reproduces well the step observed for the individual household sizes in Figure 1.
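The underestimation caused by a wide open-ended bin can be reproduced with a minimal sketch. It assumes the Gini ratio is computed from binned data by the trapezoid rule on the Lorenz curve, with no Pareto correction for the top bin. The function names `gini_from_bins` and `merge_top` and all the numbers are hypothetical illustrations, not CB data and not the CB procedure.

```python
def gini_from_bins(households, mean_incomes):
    """Gini ratio from binned data via the trapezoid rule on the Lorenz curve.

    Bins must be ordered by increasing mean income.
    """
    total_h = sum(households)
    total_y = sum(h * m for h, m in zip(households, mean_incomes))
    cum_h = cum_y = area = 0.0
    for h, m in zip(households, mean_incomes):
        next_h = cum_h + h / total_h        # cumulative share of households
        next_y = cum_y + h * m / total_y    # cumulative share of income
        area += (next_h - cum_h) * (cum_y + next_y) / 2.0
        cum_h, cum_y = next_h, next_y
    return 1.0 - 2.0 * area


def merge_top(households, mean_incomes, k):
    """Collapse the top k bins into one open-ended bin, keeping total income."""
    top_h = sum(households[-k:])
    top_y = sum(h * m for h, m in zip(households[-k:], mean_incomes[-k:]))
    return households[:-k] + [top_h], mean_incomes[:-k] + [top_y / top_h]


# Synthetic distribution (hypothetical numbers, NOT Census Bureau data):
# household counts and mean income per bin, in thousands of dollars.
households = [30, 30, 20, 10, 6, 3, 1]
mean_incomes = [10, 30, 60, 100, 150, 250, 600]

fine = gini_from_bins(households, mean_incomes)
coarse = gini_from_bins(*merge_top(households, mean_incomes, 4))
print(f"fine bins: {fine:.3f}, single open-ended top bin: {coarse:.3f}")
```

Because the Lorenz curve is convex, replacing several top bins by a single straight segment can only raise the area under the curve, so the coarse estimate is never larger than the fine one. This is the same mechanism that separates the 2008 and 2009 curves in Figure 2.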

Table 2. The estimates of Gini ratio in this post and those reported by the CB.
Gini ratio
CB Gini ratio


Figure 1. The evolution of Gini ratio for individual household sizes. Notice the step between 2008 and 2009.

Figure 2. The Lorenz curves for the household income distribution in 1994, 2008, and 2009, as constructed from the CB income distributions without approximation of the higher incomes by the Pareto law.
