## 9/23/15

### The Pareto distribution of top incomes

Our model implies that the income of persons with the highest S and L can exceed that of persons with the smallest S and L by a factor of only 225. The exponential term in (11) includes the size of earning means, which grows as the square root of real GDP per capita. As a result, it takes longer and longer for persons with the maximum relative values S29 and L29 to reach the maximum income rate (see Figure 4), while persons with S1 and L1 reach their peak income in a few years and then retain it at the level of GDP growth. The actual ratio of the highest to the lowest income is in the tens of millions, if one takes the smallest reported income to be \$1. All in all, our microeconomic model fails to describe the highest incomes.

Fortunately, it is not necessary to quantitatively predict the distribution of the highest incomes. Physics helps us to formulate an approach based on the transition between two different states of one system through a bifurcation point. The dynamics of the system before the bifurcation point (sub-critical state) and beyond it (super-critical state) are described by quite different equations. For example, the hydrostatic equation cannot describe convective motion in a liquid. Hence, it would be inappropriate to expect the equation of income growth in the sub-critical (“laminar”) regime to describe the distribution of incomes in the super-critical (“turbulent”) regime. It is a favorable circumstance for our approach, which is based on a physical understanding of the economy, that the sub-critical dynamics can exactly predict the portion of the system in the critical state near the bifurcation point and the time of transition. For personal incomes, the point of transition is equivalent to some threshold, which separates the sub- and super-critical regimes of income distribution.

So, in order to account for top incomes, which are distributed according to a power law, we assume that there exists some critical level of income that separates the two income regimes: the exponential (sub-critical) one and the Pareto (super-critical) one. We call this level “the Pareto threshold”, MP(τ). Below this threshold, in the sub-critical income zone, the personal income distribution (PID) is accurately predicted by our model for the evolution of individual incomes. Above the Pareto threshold, in the super-critical income zone, the observed PID is best approximated by a power law. Any person reaching the Pareto threshold can obtain any income in the distribution, with a rapidly decreasing probability governed by a power law. To completely define the Pareto distribution, the model for the sub-critical zone has to predict the number (or portion) of people above the Pareto threshold, which must lie in the range described by the model. The predictive power of the model is determined by its ability to accurately describe the dependence of the portion of people above MP on age, as well as the evolution of this dependence over time. If the portion of people above the Pareto threshold fits observations, then the contribution of the PID in the super-critical zone to any aggregate or disaggregate measure of personal income is completely defined by the empirically estimated power law exponent.

The mechanisms driving the power law distribution and defining the threshold are not well understood, either in economics or, for similar transitions, in physics. The absence of an explicit description of the driving mechanisms does not prohibit using well-established empirical properties of the Pareto distribution in the U.S.: the constancy of the measured power law index over time and the evolution of the threshold in sync with the cumulative value of real GDP per capita [Piketty and Saez, 2003; Yakovenko, 2003; Kitov, 2005b, 2006]. Therefore, we include the Pareto distribution with empirically determined parameters in our model for the description of the PID above the Pareto threshold. The stability and accuracy of the observed power law distribution of incomes imply that we do not need to follow each and every individual income as we did in the sub-critical income zone.

The initial dimensionless Pareto threshold is found to be MP(τ0)=0.43 [Kitov, 2005a], which is within the range described by the model. Without loss of generality, we can define the initial real GDP per capita as 1. In this case, MP(τ0)=0.43 for any starting year, where Y(τ0)=1. The Pareto threshold then evolves in time in proportion to the growth in real output per capita:

$$M_P(\tau) = M_P(\tau_0)\,\frac{Y(\tau)}{Y(\tau_0)} = M_P(\tau_0)\,Y(\tau) \tag{20}$$

This keeps the portion of people above the threshold almost constant over time, as shown in Figure 7. In the model, the Pareto threshold does not depend on age.
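
The scaling in (20) can be illustrated with a minimal Python sketch. The value 0.43 is taken from the text; the GDP series below is purely hypothetical and the function name `pareto_threshold` is introduced here for illustration only.

```python
MP0 = 0.43  # initial dimensionless Pareto threshold, MP(tau0), from the text


def pareto_threshold(y_ratio, mp0=MP0):
    """Eq. (20): MP(tau) = MP(tau0) * Y(tau) / Y(tau0)."""
    if y_ratio <= 0:
        raise ValueError("the GDP per capita ratio must be positive")
    return mp0 * y_ratio


# Hypothetical real GDP per capita, normalized so that Y(tau0) = 1.
gdp_ratio = [1.0, 1.5, 2.0]

thresholds = [pareto_threshold(y) for y in gdp_ratio]
# Threshold measured in units of current real GDP per capita.
relative = [mp / y for mp, y in zip(thresholds, gdp_ratio)]

print(thresholds)
print(relative)
```

Under these assumptions the threshold grows with output, while its value relative to current real GDP per capita stays at 0.43, which is consistent with an almost constant portion of people above it.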

Figure 7. The portion of people above the dimensionless Pareto threshold MP=0.43 between 1930 and 2011. The portion drops during WWII and has hovered around 10% ever since.

As we discuss in the next Section, the Pareto threshold was different for males and females in the 1960s and 1970s. This difference is one of the principal features of the general gender disparity in the U.S. and deserves deeper analysis and special modelling. It is easier to incorporate the observed lower Pareto threshold for women in our model than to understand the forces behind such a difference. In this Section, we discuss the overall model and illustrate it with income features of the total population.

Theoretically, the cumulative distribution function, CDF, for the Pareto distribution is defined by the following relationship:

$$\mathrm{CDF}(x) = 1 - \left(\frac{x_m}{x}\right)^{k} \tag{21}$$

for all x>xm, where k is the Pareto index. Then, the probability density function (PDF) is defined as:

$$\mathrm{PDF}(x) = \frac{k\,x_m^{k}}{x^{k+1}} \tag{22}$$

The functional dependence of the probability density function on income allows for the exact calculation of the total population in any income bin, the total and average income in this bin, and the contribution of the bin to the corresponding Gini ratio, because the PDF defines the Lorenz curve.
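
The bin calculations mentioned above can be sketched in Python as follows. This is an illustration, not the authors' code. All function names are hypothetical, the density is the derivative of the CDF in (21), and the closed forms are standard results for a pure Pareto law with k>1. The Gini ratio returned here refers to the Pareto zone alone, not to the whole PID.

```python
import math


def pareto_cdf(x, xm, k):
    """Eq. (21): share of the Pareto-zone population with income below x."""
    return 0.0 if x <= xm else 1.0 - (xm / x) ** k


def pareto_pdf(x, xm, k):
    """Derivative of Eq. (21): k * xm**k / x**(k + 1) for x >= xm."""
    return 0.0 if x < xm else k * xm ** k / x ** (k + 1)


def bin_population_share(a, b, xm, k):
    """Share of the Pareto-zone population with income in [a, b]."""
    return pareto_cdf(b, xm, k) - pareto_cdf(a, xm, k)


def bin_income_per_capita(a, b, xm, k):
    """Integral of x * PDF(x) over [a, b]; needs k > 1 and xm <= a < b."""
    return k * xm ** k * (a ** (1 - k) - b ** (1 - k)) / (k - 1)


def bin_mean_income(a, b, xm, k):
    """Average income of the people whose income falls in [a, b]."""
    return bin_income_per_capita(a, b, xm, k) / bin_population_share(a, b, xm, k)


def pareto_gini(k):
    """Gini ratio of a pure Pareto distribution, valid for k > 1."""
    return 1.0 / (2.0 * k - 1.0)


k, xm = 3.35, 0.43  # values quoted in the text
print(bin_population_share(xm, 2 * xm, xm, k))  # share between MP and 2*MP
print(bin_mean_income(xm, math.inf, xm, k))     # equals k * xm / (k - 1)
print(pareto_gini(k))
```

Multiplying the per-capita quantities by the number of people above the threshold, which the sub-critical model provides, gives the absolute population and income in each bin.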

The actual estimates of index k reveal a clear age dependence [Kitov, 2008a]. The Pareto law index was estimated from the slope of a linear regression line in log-log coordinates. Using the CPS PIDs in various age groups aggregated over several years, we obtained: k=3.91 for the age group between 25 and 34 years; k=3.48 between 35 and 44; k=3.38 between 45 and 54; k=3.14 in the age group between 55 and 64. It is clear that index k declines with age. Obviously, a smaller index k corresponds to an elevated PID density at higher incomes. The observed decrease in k with age deserves a special examination and should be inherently linked to some age-dependent dynamic processes above the Pareto threshold. The declining k is a specific feature of the age-dependent PIDs, which is not yet incorporated in our model.
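
A log-log regression of this kind can be sketched in Python as below. The sketch assumes that the regression is run on the complementary CDF (the share of people above each income level), whose log-log slope is -k. The text does not specify the exact procedure, so this choice, the function name and the synthetic data are all assumptions. If the binned density were regressed instead, the slope would be -(k+1).

```python
import math


def fit_pareto_index(incomes, survival):
    """Least-squares slope of log(survival) against log(income); returns k = -slope."""
    if len(incomes) != len(survival) or len(incomes) < 2:
        raise ValueError("need at least two matching (income, survival) points")
    xs = [math.log(x) for x in incomes]
    ys = [math.log(s) for s in survival]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic, noise-free points generated from a Pareto law with k = 3.35
# above a threshold of 0.43 (illustrative values only, not CPS data).
xm, k_true = 0.43, 3.35
incomes = [xm * 1.2 ** i for i in range(10)]
survival = [(xm / x) ** k_true for x in incomes]

print(fit_pareto_index(incomes, survival))
```

With real CPS bins the points scatter around the line, and the fitted k depends on the income range chosen for the regression.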

For the entire population aged 15 years and over, Kitov [2008b] estimates k=3.35. It is close to the estimate in the age group between 45 and 54 years. This is not a coincidence, since the number of people in the Pareto distribution is also a function of age, and the portion of population with the highest incomes is the largest between 45 and 54 years of age, as discussed later in this Section. As a result, this age group makes the largest contribution to the entire population in the Pareto range. Thus, the power law index for the entire population is practically the same as in this age group. For numerical calculations, we fix k=3.35 as estimated from the overall PIDs. The bias introduced by this choice into various income estimates for other age groups diminishes with their representation in the highest income range. As shown below, the portion of rich people in the youngest and eldest age groups is negligible.

One can also expect that the age-dependent and overall k undergo some changes over time. The overall index may vary because of the changing age pyramid and the time needed to reach the peak income, Tc. In other words, the overall index may change because the contribution of various ages varies with time. Here, we study the gender-related difference in k, which can also be age-dependent. The observed income disparity between men and women may also be expressed in their presence among the richest share of the U.S. population.

We have modelled the number of people above MP=0.43 from 1930 to 2011 [Kitov and Kitov, 2013]. The left panel of Figure 8 displays the predicted and observed numbers of people above the Pareto threshold in 1962 and 2011. We have measured these numbers from the annual PIDs borrowed from the IPUMS. Here, we have to stress that we used the entire working age population as the model input and calculated the whole period between 1930 and 2011 using only real GDP per capita as the defining parameter. All other constants and initial values were fixed in 1930 and their evolution was defined by GDP growth only. The microeconomic model covers more than 80 years and gives correct predictions for two randomly selected points.

The fit between the measured and predicted numbers is excellent in various aspects. First of all, the two curves for 2011 are close over the entire age range, except perhaps at the youngest ages. The theoretical curve starts from 20 years and the observed one from 18 years of age, but the latter curve is close to zero there anyway. The measured 1962 curve is slightly higher than the predicted one above the peak age. Overall, the model accurately predicts the age-dependent number of people with the highest incomes in two different years. At the same time, the predictions for 1962 and 2011 are coherent in the sense that they are calculated in one run with the same defining parameters and with exogenous parameters (GDP and age pyramid) borrowed from official sources. This means that the model accurately predicts the evolution of each and every individual income together with the Pareto threshold. Taking into account the successful prediction of the past values, one may use our model for projection of income distribution in the future. The microeconomic model describes all important aspects of income dynamics.

Figure 8. Left panel: The measured and predicted number of people with income above the thresholds \$7,000 in 1962 and \$73,000 in 2011. Right panel: The curves in the left panel are normalized to their respective peaks. The age of peak portion shifts from 41 years to 51 years.

All curves in the left panel of Figure 8 have clear peaks, and then the number of people falls to zero at ages above 75; no older people can be found in the Pareto income zone. In order to highlight the relative dynamics above the Pareto threshold, we have calculated the portion of people above MP for all ages and then normalized the obtained portion curves to their peak values. In the right panel of Figure 8 we present the normalized portion of people who have reached the Pareto threshold as a function of age. This is the best illustration of the change in Tc; at least, the peaks are sharper than in the mean income curves (see Figure 1). The latter contain two ingredients: low-middle (sub-critical) incomes and higher (super-critical) incomes. From Figure 8, we estimate Tc=27 years in 1962 and Tc=38 years in 2011. The difference between 1962 and 2011 is 11 years. Considering the accuracy of measurements, these estimates are in good agreement with those obtained in Section 1.5 for Tc as a function of real GDP per capita. Such a big change has not yet been discussed in the income-related economic literature.
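
The normalization used for the right panel of Figure 8 can be sketched in Python as below. The age-to-portion values are invented for illustration and are not the IPUMS measurements. The function name is hypothetical.

```python
def normalize_to_peak(portion_by_age):
    """Return (peak_age, curve scaled so that its maximum equals 1)."""
    if not portion_by_age:
        raise ValueError("empty curve")
    peak_age = max(portion_by_age, key=portion_by_age.get)
    peak = portion_by_age[peak_age]
    if peak <= 0:
        raise ValueError("the curve has no positive peak")
    normalized = {age: p / peak for age, p in portion_by_age.items()}
    return peak_age, normalized


# Hypothetical portions of people above the Pareto threshold, by age.
portion = {20: 0.01, 30: 0.05, 40: 0.12, 50: 0.15, 60: 0.09, 70: 0.02}

peak_age, curve = normalize_to_peak(portion)
print(peak_age)
print(curve)
```

Normalizing to the peak removes the difference in absolute levels between years, so that only the position and sharpness of the peak are compared.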

There is an ongoing problem with the accuracy of the highest income measurements associated with confidentiality. Since the population with top incomes is represented in the CPS universe by a few people, their actual incomes are “topcoded”, i.e. reduced to income bin boundaries [e.g., Larrimore et al., 2008]. According to IPUMS [2015], topcoding is defined as “a determination by the CPS that some high values were too sparse and specific to be recorded as they were reported to the CPS without the possibility of identifying the respondents.” The bias introduced by topcoding into the mean income estimates is not the only problem. It is very unfortunate for quantitative analysis that the topcodes are prone to severe revisions by income source and by year. Moreover, the personal income estimates above the topcode were processed in different ways over time, i.e. they were changed according to different rules. In any case, all these procedures result in the CPS reporting lower incomes than the actual ones. The artificial difficulties related to topcoding deserve detailed investigation [e.g., Burkhauser et al., 2011].

The richest people account for a significant share of the total personal income, and topcoding may introduce a measurable bias in some aggregate estimates like average income. For our model, however, the distortion of top incomes is not relevant. First of all, there are age groups where the effect of topcoding is marginal. For the youngest people, the portion of people in the Pareto distribution range is negligible, while the dynamics of income growth in the initial segment of work experience is the most prominent. Figure 1 demonstrates that young people raise their incomes from zero to 60% of the peak mean income in the first five to ten years. As one can see in Figure 8, the observed portion of rich population in 2011 is less than 1% for ages between 15 and 25 and then starts to grow at a high rate. The curves in Figure 5 show that the mean income of the 22-year-olds is 30% of the peak mean income measured for the 50-year-olds. As a benefit of real economic growth for quantitative modelling, the period needed to enter the top income range increases with real GDP per capita, and thus the effect of topcoding starts at larger work experience. By good fortune, the key parameters of our model, the dissipation factor, α, and the minimum size of work instrument, Λmin(τ0), can be most accurately estimated using the initial segment of the growth trajectory.

Secondly, the deviation from a power law distribution and the errors in income estimates related to the CPS income topcoding do not affect the portion of people above the Pareto threshold. As we discussed in the beginning of this Section, there is no physical link between these two processes in the long-term observations and in the model. Effectively, whatever process disturbs the distribution of top incomes, it is not driven by and does not drive the processes of income growth below the Pareto threshold. There exists only one connection between the people in the low/middle income range and the rich with the top incomes: the portion of people above the Pareto threshold. Figure 8 proves that our model exactly predicts this portion for the entire period with measurements.
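
The point that topcoding distorts mean income but not the portion above the threshold can be illustrated with a small Python sketch. It assumes the simplest form of topcoding, where every value above a cap is replaced by the cap, and a cap that lies above the Pareto threshold. The incomes, threshold and cap are invented numbers, not CPS values.

```python
def topcode(incomes, cap):
    """Replace every income above the cap by the cap itself (assumed rule)."""
    return [min(x, cap) for x in incomes]


def portion_above(incomes, threshold):
    """Fraction of people with income strictly above the threshold."""
    return sum(1 for x in incomes if x > threshold) / len(incomes)


def mean(values):
    return sum(values) / len(values)


# Hypothetical incomes, Pareto threshold and topcode (arbitrary units).
incomes = [10, 20, 50, 80, 200, 1000]
threshold, cap = 60, 150

coded = topcode(incomes, cap)
print(portion_above(incomes, threshold), portion_above(coded, threshold))
print(mean(incomes), mean(coded))
```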

The CPS observations show that the processes controlling the top incomes and those in the low-middle income range are not linked. The rich and the not-rich are not competing for the same personal incomes, at least for incomes from the sources included in the CPS questionnaire. The causes of the accelerated income growth in the top percentiles are in the focus of political, social, and economic discussions [Atkinson and Piketty, 2007; Atkinson et al., 2009; Burkhauser et al., 2012]. They are beyond the scope of our model, since the measures of income inequality are likely biased and/or misinterpreted in these discussions, and they are related to the change in the formal assignment of income sources to personal incomes rather than to real economic processes [Kitov, 2014].

Thirdly, the age of peak mean income, Tc, does not depend on the absolute value of the portion of people in the Pareto zone or on the exponent, k, of the corresponding power law. It is defined by the sub-critical processes only. As Figure 1 proves, the peak age is the same for the CPS and the IRS. This is because the sources of top incomes do not draw money away from the sources of low-middle incomes.

Our model includes all necessary parameters to describe the distribution of top incomes, whatever their sources and changes over time. We use the CPS estimates because they provide the longest and most consistent time series. As discussed above, the contribution of top incomes can vary with time, but such variations are fully accounted for by the changes in the CPS estimates. For a quantitative model, the measured portion of true personal income should be constant over time. The CPS data are the closest to this requirement.