The Census Bureau publishes the Gini ratio for households based on the Current
Population Survey (CPS) conducted every March. Unfortunately, the CPS data are
not compatible over time. (The CB does mention this in footnotes, but that is
not the best place for the general public, or even for experts.) As a result,
the estimates of the Gini ratio are biased and cannot be used to characterize
the evolution of income inequality in the US. At the same time, each individual
estimate is as accurate as the data and calculation procedures allow. Here we
present the case of the change in data granularity in 2009, which affected the
estimates of the Gini ratio for various household sizes.
It is well known that total income increases with time due to growth in
nominal GDP and population. From 1994, the Census Bureau measured the
household income distribution in $2,500 bins with an upper limit of $100,000.
All households with income above $100,000 were counted in the open-ended
"$100,000 and above" bin. In 1994, there were 6,581,000 households in this bin,
and its portion of total income was only 26%. Even this is not good for Gini
ratio estimation, since one bin covers a quarter of total income. In 2008, this
bin accommodated 51% of total income. Such bin counting is too crude and makes
the Gini ratio calculations almost worthless. Since higher incomes are
distributed according to the Pareto law, i.e. a power law, the CB can and does
calculate the Gini ratio analytically for higher incomes.
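To see why a wide open-ended top bin pulls the estimate down, here is a minimal Python sketch that computes the Gini ratio from binned data by the trapezoid rule under the Lorenz curve, first with separate high-income bins and then with those bins merged into one. The bin counts and mean incomes are invented for illustration and are not CPS figures; the function names (`lorenz_points`, `gini_from_bins`, `merge_top`) are hypothetical, and this is not necessarily the procedure the CB uses.

```python
# Illustrative only: the bin data below are made up, not CPS figures.

def lorenz_points(counts, incomes):
    """Cumulative population and income shares, starting at (0, 0).

    counts[i]  -- number of households in bin i (bins ordered by income)
    incomes[i] -- total income received by households in bin i
    """
    total_n = sum(counts)
    total_y = sum(incomes)
    pop, inc = [0.0], [0.0]
    cum_n = cum_y = 0.0
    for n, y in zip(counts, incomes):
        cum_n += n
        cum_y += y
        pop.append(cum_n / total_n)
        inc.append(cum_y / total_y)
    return pop, inc


def gini_from_bins(counts, incomes):
    """Gini ratio = 1 - 2 * (area under the piecewise-linear Lorenz curve)."""
    pop, inc = lorenz_points(counts, incomes)
    area = sum(
        (pop[i + 1] - pop[i]) * (inc[i + 1] + inc[i]) / 2.0
        for i in range(len(pop) - 1)
    )
    return 1.0 - 2.0 * area


def merge_top(counts, incomes, k):
    """Collapse the last k bins into a single open-ended bin."""
    return (
        counts[:-k] + [sum(counts[-k:])],
        incomes[:-k] + [sum(incomes[-k:])],
    )


# Hypothetical distribution: households per bin and mean income per bin.
counts = [30, 30, 20, 12, 5, 3]
mean_income = [10.0, 30.0, 60.0, 100.0, 160.0, 300.0]  # arbitrary units
incomes = [n * m for n, m in zip(counts, mean_income)]

g_fine = gini_from_bins(counts, incomes)

# Merge the three highest bins into one open-ended bin.
coarse_counts, coarse_incomes = merge_top(counts, incomes, 3)
g_coarse = gini_from_bins(coarse_counts, coarse_incomes)
top_share = coarse_incomes[-1] / sum(coarse_incomes)

print(f"income share of merged top bin: {top_share:.2f}")
print(f"Gini with separate top bins:    {g_fine:.3f}")
print(f"Gini with merged top bin:       {g_coarse:.3f}")
```

Because the Lorenz curve is convex, a straight segment drawn across the whole top bin lies above the true curve, so the merged-bin estimate cannot exceed the finer one. The more income the open-ended bin holds, the larger the understatement.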
In any case, the Census Bureau had to widen the bins to $5,000 and raise the
upper limit to $200,000, while calculating the Gini ratio with bin counting up
to $250,000 (the readings in the bins above $200,000 are not published!). For
the CB's convenience, this change is appropriate, but it induced a step in the
Gini ratio time series. Figure 1 displays the jumps for households of various
sizes, from one person to seven or more people. Since households with more
people have higher incomes, one can expect the portion of households with
$100,000+ income to increase with household size. The change in bin granularity
and in the upper limit from 2008 to 2009 necessarily changes the portion of
households in the open-ended bin and induces a step in the Gini ratio series.
Table 1 lists these portions for 2008 and 2009, as well as their ratios.
Table 1. The portion of households with income above $100,000 in 2008 and
above $200,000 in 2009, and the ratio of the two.
Household size | 2008 | 2009 | ratio
One person | 0.054 | 0.008 | 6.52
Two people | 0.213 | 0.037 | 5.69
Three people | 0.270 | 0.048 | 5.61
Four people | 0.339 | 0.068 | 4.99
Five people | 0.311 | 0.071 | 4.36
Six people | 0.282 | 0.057 | 4.90
Seven people or more | 0.278 | 0.053 | 5.20
We illustrate the step in the Gini ratio using the overall income distribution.
The overall Gini was calculated using the Pareto law approximation for the
higher incomes and thus, unlike the estimates for individual household sizes,
is not biased. Figure 2 depicts three Lorenz curves based on the relevant CB
estimates of the household income distribution in 1994, 2008, and 2009. One can
see a dramatic difference between the Lorenz curves for 2008 and 2009. The
high-income bin holding half of total income makes the 2008 Gini ratio highly
underestimated compared to the 2009 estimate. Both curves are identical for 85%
of the population, however. The 1994 curve also coincides with the 2009 one up
to the last bin. Table 2 compares our estimates of the Gini ratio with those
reported by the CB. One can see that the 2008 CB estimate is corrected, but the
step of 0.023 in our estimates reproduces well the step observed for the
individual household sizes in Figure 1.
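For reference, the textbook results for an idealized Pareto distribution show why a power-law tail can be handled analytically. The tail index \alpha and the threshold x_m are generic symbols introduced here, not values taken from the CB, and the post does not describe the CB's exact procedure, so this is only a sketch of the underlying idea.

```latex
% Pareto (type I) distribution with threshold x_m > 0 and tail index \alpha > 1
\Pr(X > x) = \left(\frac{x_m}{x}\right)^{\alpha}, \qquad x \ge x_m

% Lorenz curve: income share held by the lowest fraction F of the population
L(F) = 1 - (1 - F)^{\,1 - 1/\alpha}, \qquad 0 \le F \le 1

% Gini ratio follows in closed form from G = 1 - 2\int_0^1 L(F)\,dF
G = 1 - 2\int_0^1 L(F)\,dF = \frac{1}{2\alpha - 1}
```

Under this assumption the shape of the Lorenz curve inside the open-ended bin is known once \alpha is estimated, so no straight-line interpolation across that bin is needed.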
Table 2. The estimates of the Gini ratio in this post and those reported by
the CB.
Year | Gini ratio | CB Gini ratio
2009 | 0.466 | 0.465
2008 | 0.443 | 0.466
1994 | 0.457 | 0.456
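The step quoted in the text can be checked directly from the Table 2 values. The short Python sketch below only reproduces that arithmetic; the variable names are ours.

```python
# Gini ratios copied from Table 2.
ours = {1994: 0.457, 2008: 0.443, 2009: 0.466}  # bin counting, this post
cb = {1994: 0.456, 2008: 0.466, 2009: 0.465}    # reported by the CB

# Change from 2008 to 2009 in each series.
step_ours = round(ours[2009] - ours[2008], 3)
step_cb = round(cb[2009] - cb[2008], 3)

# Difference between the CB figure and the bin-counting figure in 2008.
gap_2008 = round(cb[2008] - ours[2008], 3)

print("2008 -> 2009 step, bin counting:", step_ours)
print("2008 -> 2009 step, CB series:   ", step_cb)
print("CB minus bin counting in 2008:  ", gap_2008)
```

The bin-counting series jumps by 0.023 between 2008 and 2009, while the CB series changes by only -0.001. In 1994 and 2009 the two series differ by 0.001.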
Figure 2. The Lorenz curves for the household income distribution in 1994,
2008, and 2009, as constructed from the CB income distributions without
approximation of the higher incomes by the Pareto law.