We are going to revisit our model for personal income distribution (PID). It was first formalized in 2003 and used income distributions through 2001. We had to convert all reports published by the Census Bureau between 1947 and 1993, which are available as PDF files, into Excel tables. It took a month of manual work, including proofreading. These reports have not yet been converted into digital format.
In 2006, we used new data (through 2005) and re-estimated the model. In 2010, we published a book on personal income distribution using data through 2008. It is a good time to refresh the model and evaluate its performance since 2001 with ten more years of data. All major results will be presented in this blog.
We start by presenting the original data. The distribution of personal incomes since 1994 is characterized by a higher resolution: income bins are only $2,500 wide. Our model assumes that the overall income distribution depends on the age pyramid and the level of real GDP per capita. However, the evolution of PID is slow, and over a twenty-year horizon one effectively sees a frozen PID. The frozen PID results in an almost constant Gini ratio over time, which is what the Census Bureau actually reports.
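To make the Gini ratio concrete, here is a minimal sketch of how it can be estimated from binned income counts through a piecewise-linear Lorenz curve. The function name `gini_from_bins` and its arguments are hypothetical, and the mean income in each bin is assumed to be known or approximated (for example by the bin midpoint). This is an illustration only, not the procedure used by the Census Bureau.

```python
import numpy as np

def gini_from_bins(counts, mean_incomes):
    """Estimate a Gini ratio from binned data (illustrative sketch).

    counts       -- number of people in each income bin
    mean_incomes -- assumed mean income of the people in each bin
    """
    counts = np.asarray(counts, dtype=float)
    incomes = np.asarray(mean_incomes, dtype=float)

    # Sort bins from the poorest to the richest.
    order = np.argsort(incomes)
    counts, incomes = counts[order], incomes[order]

    # Cumulative share of population and of total income (Lorenz curve),
    # both starting at the origin.
    pop_share = np.concatenate(([0.0], np.cumsum(counts) / counts.sum()))
    bin_income = counts * incomes
    inc_share = np.concatenate(([0.0], np.cumsum(bin_income) / bin_income.sum()))

    # Area under the Lorenz curve by the trapezoid rule.
    area = np.sum(np.diff(pop_share) * (inc_share[:-1] + inc_share[1:]) / 2.0)

    # Gini ratio: twice the area between the diagonal and the Lorenz curve.
    return 1.0 - 2.0 * area
```

Because the Lorenz curve is treated as linear inside each bin, this estimate ignores inequality within bins and so tends to sit slightly below the true value. Narrow bins, such as the $2,500 bins available since 1994, reduce that bias.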
We illustrate PID in a few figures below. Figure 1 presents all PIDs published since 1994 for incomes between $0 and $100,000, as they are. We have included all people without income in the bin between $0 and $2,500. One can observe that the number of people in higher income bins increases with time, as does the number of people with incomes above $100,000, shown in Figure 2. The portion of people with incomes above $100,000 has been increasing by 0.3% per year since 1994. Figure 3 shows the number of people with income above $100,000 as a function of work experience. The fastest growth is observed for the groups between 30 and 40 years of work experience, i.e. between 45 and 55 years of age.
Figure 4 depicts the population density functions (PDFs) for the years between 1994 and 2010. First, the estimates presented in Figure 1 were normalized to the total population for a given year. Then we reduced the income scale for each individual year from 1995 to 2010 by the total growth of real GDP since 1994. This normalizes the curves to the total income, i.e. it reduces all scales to that of 1994. Finally, we divided the portions of population in given bins by the bin widths for individual years and obtained the population density functions. Figure 4 shows that the distribution of personal incomes has not been changing over time in relative terms, i.e. a given portion of the population always has a given portion of total income. From the PIDs one can always build the relevant Lorenz curves and estimate Gini ratios. For higher incomes, the distribution has to be described by the Pareto distribution. Figure 5 shows that the PDFs at higher incomes do follow a common power law with an exponent of -3.9.
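The three normalization steps and the power-law check can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The names `population_density`, `tail_exponent` and `gdp_growth_factor` are hypothetical. The growth factor is taken as the ratio of real GDP in a given year to that in 1994. The exponent is estimated here by a simple least-squares fit in log-log coordinates, which is not necessarily how the value of -3.9 was obtained.

```python
import numpy as np

def population_density(counts, bin_edges, total_population, gdp_growth_factor):
    """Return (bin centers, density) on the 1994 income scale.

    counts            -- number of people in each income bin for one year
    bin_edges         -- bin boundaries in current dollars, len(counts) + 1
    total_population  -- total population for that year
    gdp_growth_factor -- assumed ratio of real GDP to its 1994 level
                         (1.0 for 1994 itself)
    """
    counts = np.asarray(counts, dtype=float)

    # Step 1: normalize the counts to the total population.
    share = counts / total_population

    # Step 2: reduce the income scale to that of 1994.
    edges = np.asarray(bin_edges, dtype=float) / gdp_growth_factor

    # Step 3: divide by the (rescaled) bin widths to get a density.
    density = share / np.diff(edges)

    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, density

def tail_exponent(centers, density, income_min):
    """Slope of log(density) against log(income) above income_min."""
    mask = (centers >= income_min) & (density > 0)
    slope, _intercept = np.polyfit(np.log(centers[mask]),
                                   np.log(density[mask]), 1)
    return slope
```

If the curves for different years collapse onto each other after this rescaling, the densities returned for any two years should agree bin by bin, and `tail_exponent` should give roughly the same slope for each year.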
Our first assessment of the income data obtained after 2001 is that they follow the previously obtained relationships. We therefore expect our model for personal income distribution to perform well.
Figure 1. Personal income distributions from 1994 to 2010.
Figure 2. Portion of people with income above $100,000. The portion increases by 0.3% per year.
Figure 3. The number of people with income above $100,000 as a function of work experience. The fastest growth is observed for the groups between 30 and 40 years of work experience, i.e. between 45 and 55 years of age.
Figure 4. Population density function, i.e. the number of people in a given bin normalized to the total number of people and the width of income bin, as a function of income reduced by the overall GDP growth.
Figure 5. The Pareto distribution at higher incomes.