**1.1. The model flexibility**

Our microeconomic model has many degrees of freedom even though it is driven
by one exogenous variable, real GDP per capita. One can change the distribution
of the capabilities to earn money and the sizes of work capital, or scale them
in a different way. The dissipation factor, **α**, has to be estimated from
data, like all other constants and variables in the model, but one can also
change it according to one’s own understanding of income discounting. The
critical age, *T*_{c}, can be changed, as can its functional dependence on GDP.
The exponent index describing the fall beyond the critical age is difficult to
estimate because the data for the youngest and eldest population are sparse and
of low accuracy. All parameters are adjustable and do change the model outcome.
This change is not random, however, and the model cannot fit an artificially
designed personal income distribution. The changes associated with the
adjustment of model parameters have to fit actual observations in the U.S., and
they do fit these observations.

For the overall population, we have estimated the best-fit set of
constants and variables. Any change to this set degrades the fit to observations. At
the same time, the male and female PIDs, their derivatives and aggregates are
so different that some changes in defining parameters of the original model are
inevitable. Moreover, there are a few new features associated with the female
income distribution, which we did not observe in the overall PID. These
features have to be included in the model, but they should not disturb the
results of the original model. In this Subsection, we demonstrate how the
original model reacts to the change of some defining parameters in view of the
new features discussed in Section 2.

In this study, we consider males and females separately. They differ in
the age pyramid, which is an important exogenous parameter of the model.
Crudely, women make up approximately 51% of the total working-age population
in the U.S., and thus the males’ share is 49%. For the sake of simplicity, we
multiplied the overall population pyramid by factors of 0.51 and 0.49 and
obtained the female and male age pyramids, respectively. This approach may
introduce some minor errors in the gender ratio for some ages, but these errors
are smaller than those related to income measurements and GDP estimates.
Moreover, the population pyramid does not affect the mean income predictions,
which are based on the proportional distribution of people over the capability
to earn money and the sizes of work instruments.
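The split described above can be sketched in a few lines of Python. This is only an illustration: the population counts and the function name `split_pyramid` are hypothetical, and only the 0.51 and 0.49 shares come from the text.

```python
import numpy as np

# Hypothetical single-year-of-age counts for the total population
# (illustrative numbers only, not taken from any data set).
total_pyramid = np.array([4_000_000, 4_100_000, 3_900_000], dtype=float)

FEMALE_SHARE = 0.51  # approximate share of women quoted in the text


def split_pyramid(total, female_share=FEMALE_SHARE):
    """Split a total age pyramid using one age-independent female share."""
    total = np.asarray(total, dtype=float)
    female = female_share * total
    male = total - female  # the male share is the complement, 0.49 here
    return female, male


female_pyramid, male_pyramid = split_pyramid(total_pyramid)
```

Because one share is applied to every age, any age dependence of the actual gender ratio is ignored, which is the minor error mentioned in the text.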

The initial value of the critical age, *T*_{c}, is borrowed from the
overall model and is fixed to 19.07 years in 1930 in all models [Kitov and
Kitov, 2013]. In some versions of the gender-dependent model we move the start
year to 1960. The initial value of the critical age is then changed according
to (19), i.e. as the square root of the cumulative change in the corrected real
GDP per capita, which gives 26.08 years in 1960. This value is also fixed in
the model. The index of the Pareto law is fixed to *k*=3.35 for both sexes and
does not depend on calendar year or age. As we found in the previous Section,
the largest deviations in *k* are observed for ages having extremely low
representation in top incomes. Females’ participation in the Pareto zone is low
throughout the entire 20^{th} century. As a result, a constant *k* does not
affect the accuracy of model predictions.
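The square-root scaling of the critical age can be sketched as follows. The functional form used here, *T*_{c}(*t*) = *T*_{c}(*t*_{0})·sqrt(*Y*(*t*)/*Y*(*t*_{0})), is an assumption based on the verbal description of (19) above; the function names are hypothetical.

```python
import math


def critical_age(t_c0, gdp_ratio):
    """Critical age after real GDP per capita has grown by `gdp_ratio`.

    Assumes the square-root dependence described in the text.
    """
    if gdp_ratio <= 0:
        raise ValueError("gdp_ratio must be positive")
    return t_c0 * math.sqrt(gdp_ratio)


def implied_gdp_ratio(t_c0, t_c1):
    """Cumulative GDP growth implied by a change in critical age."""
    return (t_c1 / t_c0) ** 2


# Values quoted in the text: 19.07 years in 1930 and 26.08 years in 1960.
ratio_1930_1960 = implied_gdp_ratio(19.07, 26.08)
```

Under this assumption, the two quoted critical ages correspond to a cumulative growth in the corrected real GDP per capita of roughly 1.87 between 1930 and 1960.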

In Section 2, we discussed the possibility that the size of instruments
used to earn money may be smaller for women than for men. The instrument size
affects the time needed to reach a given threshold as well as the level of
income one can obtain. In order to change the instrument size we introduce a
scaling factor, FL, and multiply all standard sizes by this factor. Figure 23
presents the evolution of predicted mean income in 1930 and 2011 for three
different cases: FL=0.5, 1.0, and 1.5. In order to keep the portion of the
population above the Pareto threshold at a constant level, we scale the
standard M^{P}=0.43 by the same factors as the instruments. As a result, M^{P}
in Figure 23 has values 0.215, 0.43, and 0.645 for FL=0.5, 1.0, and 1.5, respectively.
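The synchronized scaling of instrument sizes and the Pareto threshold amounts to one multiplication. In this minimal sketch the list of standard sizes is hypothetical (the actual *L*_{j} values are defined in Section 1), while the threshold 0.43 and the three FL values come from the text.

```python
STANDARD_PARETO_THRESHOLD = 0.43  # standard M^P quoted in the text


def scale_model(sizes, fl, threshold=STANDARD_PARETO_THRESHOLD):
    """Scale all instrument sizes and the Pareto threshold by the factor FL."""
    scaled_sizes = [fl * size for size in sizes]
    scaled_threshold = fl * threshold
    return scaled_sizes, scaled_threshold


# Hypothetical relative instrument sizes, for illustration only.
standard_sizes = [2.0, 10.0, 30.0]

for fl in (0.5, 1.0, 1.5):
    _, threshold = scale_model(standard_sizes, fl)
    print(f"FL={fl}: M^P={threshold:.3f}")
```

Scaling both quantities by the same factor leaves every ratio of size to threshold unchanged, which is why the portion of people above the threshold is retained.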

The increase of all standard sizes *L*_{j} by a factor of 1.5 results in
a larger relative income, which just scales the mean income curve: the ratio of
the peak incomes in 1930 and 2011 is retained in all three cases. The critical
age *T*_{c} also does not change because it depends only on GDP. The slopes of
the two FL=1.5 curves in Figure 23 decrease relative to those in the standard
model with FL=1.0, however. This is the same effect as observed with increasing
real GDP per capita in the overall mean income curves (Figure 1). All in all,
the increased sizes of work capital do not create new effects.

Figure 23. Mean income as a function of age in 1930
and 2011. The size of work instrument is multiplied by a factor FL=0.5, 1.0,
and 1.5. The Pareto thresholds M^{P} are also multiplied by the same
factor in order to retain the portion of people with the highest incomes. They
are 0.215, 0.43 and 0.645, respectively. Notice a shelf in the mean income
curve for 1930 with FL=0.5.

Figure 24. The number (left panel) and portion (right
panel) of people above the Pareto threshold as a function of age in 1930 and
2011. The size of work instrument is multiplied by a factor FL=0.5, 1.0, and
1.5. The Pareto thresholds M^{P} are also multiplied by the same factor
in order to retain the portion of people with the highest incomes. They are
0.215, 0.43 and 0.645, respectively.

For smaller sizes, the mean income curves become different. They contain
periods when mean income does not change: between 5 and 18 years of work
experience in 1930 and from 20 to 37 years in 2011. This is the effect we have
found in the females’ mean income in the 1960s and 1970s (see Figure 15). As we
know from the discussion in Section 1, smaller instruments imply faster growth
of all incomes. The initial segments of mean income demonstrate steeper growth
to maximum values for all earning capabilities *S*_{i} from 2 to 30. The
lowered Pareto threshold suggests that everyone who could reach it in normal
conditions (FL=1.0) does achieve it, but much faster. When all people reach
their maximum incomes, including those in the Pareto zone, the model suggests
no further changes before the critical age. With growing GDP, both the time
when all incomes reach their maximum and *T*_{c} increase. Hence, both the
start point and the duration of the two shelves in Figure 23 increase.

The portion of people above the Pareto threshold is presented in Figure
24. Both curves for FL=0.5 are characterized by early and steep growth. The
number of people is displayed in the left panel. Here, we use the males’ age
pyramid, and all observed fluctuations are associated with the varying size of
the population rather than with the model parameters. It is instructive that
the total number of people increases with falling FL: faster income growth
brings a younger population into the Pareto zone. The number above M^{P} at a
given age does not affect the mean income since the portion does not depend on
the total number. That is why the mean income curves in Figure 23 are smooth.

When the Pareto threshold is retained at the same level for all FL, the
number of people is much lower for FL=0.5, as Figure 25 demonstrates. This
might be the reason for women’s very low representation in the top incomes in
the 20^{th} century. For FL=0.5, the number of people is just
marginally above zero in 1930 and 2011, as was observed in the females’
distribution in Figure 18. So, the lowered sizes of work instruments available
for women in the U.S. result in a very low number of women with top incomes.

Figure 25. The number of people above the Pareto
threshold as a function of age in 1930 and 2011. The size of work instrument is
multiplied by a factor FL=0.5, 1.0, and 1.5. The Pareto thresholds M^{P}=0.43
for all cases.

In Section 3.2, we demonstrate that, by changing the size of the instruments
people use to earn money together with a synchronized change (or no change) in
the Pareto threshold, our model is able to explain qualitatively some striking
differences in income distribution observed for men and women. This is a good
basis for accurate quantitative prediction of the principal features observed
in the males’ and females’ PIDs and their derivatives. The age dependence of
the mean income and of the portion above the Pareto threshold are likely the
most prominent features demonstrating secular evolution coherent with the
growth in real GDP per capita. At first glance, the male population in the U.S.
demonstrates simpler behaviour. We begin gender-dependent modelling with the
prediction of men’s personal incomes.

**1.2. Male model**

There are some
new features we can predict using the original setting of our model. Figure 14
shows that the male mean income curves have the same critical age and rate of
growth as the total population. In 2013, the critical age is slightly larger
for men. One may suggest that men in the U.S. economy use larger work capitals
than women. In essence, their work instruments create the size distribution
defined in Section 1. Within our framework, men’s incomes can be modeled with
standard instruments.

In Figure 26, we present the predicted and observed mean income curves for
1962, 1977, 1987, and 2011. All defining parameters are the same as in the
original model, including 1930 as the start year. There are no income microdata
before 1962, so we compare our predictions with actual measurements only from
that year on. The fit between the curves depends on age and year. For the
model, the most important part of income evolution is before the critical age.
This segment is the most sensitive to the defining parameters, including the
critical age. Therefore, the almost perfect match observed before the critical
age during the period from 1962 to 2014 demonstrates that our model predicts
the personal incomes of the male population in the U.S. with very high accuracy.

The change in the slope and shape of the initial segments is described
precisely as a function of GDP. Real economic growth leads to slower relative
income growth, as was already found in the overall model. The critical age,
i.e. the age of peak mean income, has grown from 43 years of age in 1962 and is
currently above 56 years of age. We expect a further increase in the age of
peak income.

There is a common feature observed in all measured curves above the critical
age, which is best highlighted when compared to the predicted curves following
the exponential fall defined by *T*_{A}=60 years of work experience and
*A*=0.65 (see equation (18) for details). At an age rising from 64 in 1962 to
68 in 2011, the measured curves experience a sharp drop to a level between 0.4
and 0.5. The period between *T*_{c}(*t*) and this specific age decreases with
time, from 20 years in 1962 to 12 years in 2011. In Figure 13, this effect was
ironed out by smoothing with MA(7). As we discussed for the female population,
the reason behind this drop is likely associated with retirement. Retirees have
a constant income defined as a share of their work incomes. The time and
amplitude of the drop can then be explained by the performance of the social
security system.

Figure 26. Comparison of measured and predicted mean
income as a function of age. Selected years between 1962 and 2011 are
presented. The measured and predicted curves start to diverge above 64 in 1962
and above 68 years of age in 2011.

The overall fit
between the predicted and observed mean income suggests that the portion of
people above the Pareto threshold should also be accurately estimated. The overall
model exactly predicts the number of people of a given age above the Pareto
threshold. One can directly calculate their total sub-critical income as a sum
of all *M*_{ij}(*τ,t*)>*M*^{P}, as if they did not move to the super-critical regime
of income distribution. The net gain obtained by these people when they move to
the super-critical power law distribution can be calculated since we have the
number of people for each age and the overall index *k*.

Obviously, the
net gain is constant for a given *M*^{P}
since the sub- and super-critical total incomes are fixed. For the original
model, the ratio of the super-critical to sub-critical total incomes is 1.33 for *M*^{P}=0.43 [Kitov, 2009]. As a
result, we do not need to calculate individual incomes in the super-critical
power law distribution to get an estimate of the total contribution of rich
people to the mean income. We just need to multiply all *M*_{ij}(*τ,t*)>*M*^{P} by a factor of 1.33.
Hence, when the mean income dependence on age is accurately predicted we
believe that the number of people above the Pareto threshold is also accurately
described as a function of age.
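The shortcut described above, multiplying every sub-critical income above the threshold by 1.33 instead of sampling the power law, can be sketched as follows. The income values and the function name are hypothetical; the threshold 0.43 and the factor 1.33 are the ones quoted in the text.

```python
import numpy as np

PARETO_THRESHOLD = 0.43    # M^P of the original model
SUPER_TO_SUB_RATIO = 1.33  # ratio quoted in the text for M^P = 0.43


def mean_income_with_pareto_gain(incomes,
                                 threshold=PARETO_THRESHOLD,
                                 ratio=SUPER_TO_SUB_RATIO):
    """Mean income after scaling all incomes above the threshold by `ratio`."""
    incomes = np.asarray(incomes, dtype=float)
    adjusted = np.where(incomes > threshold, ratio * incomes, incomes)
    return adjusted.mean()


# Hypothetical sub-critical relative incomes M_ij for one age group.
sub_critical = np.array([0.10, 0.25, 0.40, 0.50, 0.60])
mean_income = mean_income_with_pareto_gain(sub_critical)
```

Only the two incomes above 0.43 are rescaled in this example, so the mean rises without any individual super-critical income being calculated.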

Figure 27. Comparison of measured and predicted number
of people above the Pareto threshold. Actual thresholds are $11,000 in 1962 and
$87,000 in 2010.

Figure 27 illustrates this statement. In the left panel, the measured
and predicted number of males is presented as a function of age for 1962 and
2010. The overall fit is excellent considering that only one parameter (real
GDP) describes the whole variety of changes (*e.g.*, inflationary periods,
recessions, high and low oil prices, changes in fiscal and budgetary rules,
varying accuracy of all involved measurements, and revisions to all involved
variables, among many others) during the past half-century. Moreover, to
predict the number of older males the model involves their almost 50-year
history of work experience, which includes the real GDP per capita time series
since 1910. Figure 4 illustrates the effect of work experience history on
income in a given year. In the right panel of Figure 27, we demonstrate the
difference in the critical ages. Unfortunately, the age estimates are subject
to bias because of high-amplitude fluctuations, which are related to data
scarcity at higher incomes and are partially induced by topcoding.

The drop at the age of retirement is not seen in Figure 27. This
observation suggests that retirement does not affect the processes in the
Pareto distribution. At the same time, the fall in the number of people above
the threshold beyond the retirement age is accurately predicted by our model,
which uses the capability to earn money and the size of work capital as
defining parameters evolving as the square root of the real GDP per capita.
Therefore, the effect of retirement is likely an artificial feature related to
sub-critical incomes, which should be incorporated into the model as it is.

**1.3. Female model**

The difference between the male and female income distributions in the U.S.
is a well-established fact. We do not consider the reasons behind the
catastrophic disparity, but have to stress that the work capabilities used in
the model to predict personal incomes are likely the same for men and women.
From a technical point of view, this difference allows us to improve the model
and introduce new features which were not observed in the overall PID and its
aggregates and derivatives. The updated model successfully meets a number of
challenges.

There are features in the females’ curves in Section 2 which likely reflect
changes in parameters that the original setting treated as constant. The
convergence trend of the male and female PIDs observed since the early 1960s
suggests that the size of work instruments available to women has been growing.
It has not reached the level of the males’ instruments, however. So far, we
have used fixed relative sizes of work instruments. Our model therefore has to
include a new option allowing the size to change according to some predefined
function of time or real GDP.

It is reasonable to start with an approximate estimate of FL which best
fits the observed features for different years. We have calculated a number of
models with FL changing from 0.2 to 1.0 in steps of 0.05 and all other
parameters as in the original model. As we know from Section 2, the most
sensitive part of the mean income dependence on age is the initial segment.
Figure 28 depicts the measured and predicted mean income curves for 1962 and
1977 as obtained with FL=0.45. This FL value best describes the dynamics of
mean income growth in the youngest population and also reproduces the feature
of constant mean income before the critical age. The predicted curves start to
fall long before the observed critical age, however. But *T*_{c} is controlled
by a different dependence on real GDP per capita and does not influence the
initial growth. So, the estimate of FL=0.45 is not compromised by the deviation
between the measured and predicted critical ages, which we address next.
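The scan over FL described above is a one-dimensional grid search. The sketch below shows only the search procedure: `toy_mean_income` is a placeholder curve invented for this illustration and is not the model of this paper, and the "measured" values are synthetic.

```python
import numpy as np


def toy_mean_income(age, fl):
    """Placeholder curve, NOT the paper's model.

    It only mimics the qualitative behaviour described in the text:
    a smaller FL gives faster growth towards a lower maximum.
    """
    return fl * (1.0 - np.exp(-(age - 15.0) / (20.0 * fl)))


def best_fl(ages, measured, predict, grid):
    """Return the grid value of FL with the smallest sum of squared errors."""
    errors = [np.sum((predict(ages, fl) - measured) ** 2) for fl in grid]
    return grid[int(np.argmin(errors))]


fl_grid = np.linspace(0.2, 1.0, 17)      # 0.2, 0.25, ..., 1.0
ages = np.arange(15, 31)                 # initial segment only
measured = toy_mean_income(ages, 0.45)   # synthetic "observations"
estimate = best_fl(ages, measured, toy_mean_income, fl_grid)
```

Restricting the fit to the initial segment mirrors the argument in the text that this part of the curve is the most sensitive to FL and does not depend on the critical age.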

In Figure 29 we present similar curves for 1996 and 2011, but for FL=0.65.
The measured curves are accurately approximated by the model between the start
point and the critical age. The fall is
also well predicted but there are some deviations in 1996, which could have
some connection to retirement, as discussed in the previous Subsection. Comparing
the curves in Figures 28 and 29 one can conclude that FL has to grow from 0.45
in 1962 to 0.65 in 2011 in order to fit the rate and duration of the initial
growth.

Figure 28. The observed and predicted mean income for
1962 (left panel) and 1977 (right panel). In both models FL=0.45.

The simplest way to match the critical age observed between 1962 and 1977 is
to raise the initial value *T*_{c}(1929). The predicted values in 1962 and 1977
then change accordingly, as the square root of real GDP per capita. Figure 30
displays the modified mean income curves (green dotted lines) for the same
years as in Figures 28 and 29, which now fit observations in 1962 and 1977
before and beyond the critical age. For comparison, we have also drawn the
curves from the original model (red dotted lines). The change in *T*_{c} is not
justified by any reasonable relationship, however. In addition, the curves
predicted for 1996 and 2011 with the higher initial value of *T*_{c} fit
observations neither in the initial segment nor at the critical age. The shelf
exists in all models, but its length is not well predicted. Considering the
large number of contradictions we met when modelling the female mean income, it
is necessary to extend our original model to match all observed changes in a
consistent way.

Figure 29. The observed and predicted mean income for
1996 (left panel) and 2011 (right panel). In both models FL=0.65. Notice the
fit between the observed and predicted critical ages.

Figure 30. The observed and predicted mean income as a function of age for selected
years between 1962 and 2011. Two models are shown: standard model with FL=0.45 and *T*_{c}=19.07 years in 1929 (red
dotted line); standard model with FL=0.45 and *T*_{c}=26.0 years in 1929 (green dotted line).

A lower FL in the earlier years provides an extended “no-change” period
before the critical age. The difference between the observed and predicted
critical ages in Figure 28 suggests that the concept of critical age belongs to
the super-critical distribution rather than to the low-middle incomes predicted
by the model. Moreover, the critical age for women observed between 1962 and
the early 1980s is almost constant and close to the retirement age. It is
reasonable to suggest that there are two critical ages: one for the
super-critical distribution, *T*_{c},
and another for the sub-critical distribution, *T*_{S}. The former
variable depends on GDP. The latter is practically constant and may
correlate with the retirement age. It is easy to model the retirement effect by
an exponential fall in the portion of people in the labour force [Munell,
2011]. Using the basic concept of (18) one can write the following equation:

$$\eta = -\frac{\ln B}{T_{B} - T_{S}} \qquad (23)$$

where *B* is the constant relative level of income rate at age *T*_{B}>*T*_{S}.
Both constants in (23) have to be estimated from data. Then the evolution of
sub-critical incomes above the retirement age is described by (23) and the
evolution of super-critical incomes is defined by (18). When a super-critical
income falls below the Pareto threshold it starts to follow (23). As we know,
the portion of people in the Pareto zone falls exponentially and

The effect of *T*_{S} in
the overall distribution was masked by income dominance of male population,
data roughness, and by the fact that the most recent period was modeled. As a
result, *T*_{S} is missing from
the original model. The use of the detailed IPUMS income microdata available
from 1962 has revealed this effect for both genders. Figure 26 indicates that
this effect cannot be neglected even in the males’ PIDs, and we consider it later
on.

Having the new concept of critical age, *T*_{S}, for sub-critical incomes,
we can now model the lengthy shelf observed in the mean income curves. Three
parameters in (23) have to be fixed in the model: *T*_{S}, *T*_{B}, and *B*.
The age of retirement may vary by a few years over the whole modelled period
between 1962 and 2011, as we see in the measured mean income curves. Based on
the lengthy period of income measurements, the estimated age of women’s
retirement, and the results of extensive modelling, we fix *T*_{S} at 43 years
of work experience, or 58 years of age. This is the start of the retirement
process, which is described by an exponential fall in the portion of people in
the labour force. The estimated age of retirement is defined as the age at
which the portion in the labour force falls to 50% [Munell, 2011]. To match
observations we vary *A* and *B* in (18) and (23) and obtain the best-fit
estimate by a simple regression procedure in the age range from 15 to 70 years.
Larger deviations observed beyond 70 years of age are likely related to poor
measurements and should be neglected as noise.
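A minimal sketch of the exponential fall defined by (23) is given below. The value *T*_{S}=43 years of work experience is taken from the text, while the values of *T*_{B} and *B* are hypothetical placeholders, since they have to be estimated from data. The function names are also hypothetical.

```python
import math


def fall_index(level, t_ref, t_start):
    """Index of exponential fall as in (23): eta = -ln(B) / (T_B - T_S)."""
    if not 0.0 < level < 1.0 or t_ref <= t_start:
        raise ValueError("need 0 < level < 1 and t_ref > t_start")
    return -math.log(level) / (t_ref - t_start)


def relative_level(t, t_start, index):
    """Relative level at work experience t: 1 up to t_start, then exponential fall."""
    if t <= t_start:
        return 1.0
    return math.exp(-index * (t - t_start))


def half_level_time(t_start, index):
    """Work experience at which the relative level falls to 50%."""
    return t_start + math.log(2.0) / index


T_S = 43.0  # start of the retirement process, from the text
T_B = 55.0  # hypothetical reference work experience
B = 0.5     # hypothetical relative level reached at T_B

eta = fall_index(B, T_B, T_S)
```

By construction the relative level equals *B* at *T*_{B}. With the placeholder *B*=0.5, the 50% level used to define the estimated age of retirement is reached exactly at *T*_{B}.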

From Figure 30, we have estimated the overall increase in FL from 1962
to 2011. From the IPUMS data, one cannot estimate the exact form of functional
dependence of women’s *L*_{j}
on time or real GDP per capita before 1962. It is reasonable to choose 1960 as
the start year and to bring all initial values of the original model, which
starts in 1930, forward to 1960. They are as follows: *τ*_{0}=1960, *T*_{c}=25.92
years, **α**=0.0795, and *Y*(1960)=1. For the dependence of FL on
time, we suggest the simplest linear growth between 1962 and 2011. It
makes a 0.004 annual step in FL starting from 0.45 in 1960. We understand that
the actual FL dependence on time may deviate from the simple linear
approximation, but the accuracy of the IPUMS data is not good enough to
estimate exact FL for each year between 1962 and 2011.

There is one model parameter pending adjustment – the Pareto threshold
for women, which likely increases over time. As we discussed in Section 3.1,
the Pareto threshold may have a direct link to the size of work instrument.
Interestingly, the reduction factor FL for women can be best estimated from the
rate of growth at the initial stage before 30 years of age. The Pareto distribution
is most prominent for the ages above 30 and below 60. But they are connected by
some immanent bond, which manifests itself in a slightly different way for
women and men.

Observations in Figures 9 and 22 suggest that the power law
approximation works from different incomes for males and females. For example,
in 1962 the Pareto threshold is $9,000 for males and $6,000 for females. If we
consider the males’ threshold as related to normal sizes of work instruments,
then the females’ threshold should be reduced by a factor of 2/3. This gives
*M*^{P}=0.29 in 1960, and the threshold should then increase to a level close
to 0.43 in 2011. In order to match the linear growth in FL, we suggest a
similar time function for *M*^{P} for women. The initial value is
*M*^{P}(1960)=0.29, and it then rises at a rate of 0.002 per year, reaching
0.39 in 2010. The growth in *M*^{P} and FL is thus synchronized.
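The two linear time functions for women can be written down directly from the numbers in the text: FL starts at 0.45 in 1960 and grows by 0.004 per year, and *M*^{P} starts at 0.29 and grows by 0.002 per year. The function names below are hypothetical.

```python
def linear_schedule(year, start_year, start_value, annual_step):
    """Value of a parameter that grows linearly with calendar year."""
    return start_value + annual_step * (year - start_year)


def female_fl(year):
    """Instrument size factor FL for women."""
    return linear_schedule(year, 1960, 0.45, 0.004)


def female_pareto_threshold(year):
    """Pareto threshold M^P for women."""
    return linear_schedule(year, 1960, 0.29, 0.002)


# Initial threshold from the 1962 dollar thresholds: 0.43 * 6000 / 9000.
initial_threshold = 0.43 * 6000 / 9000

for year in (1960, 1985, 2010):
    print(year, round(female_fl(year), 3), round(female_pareto_threshold(year), 3))
```

The ratio of the 1962 dollar thresholds reproduces the initial value 0.29 after rounding, and the schedule gives FL=0.65 and *M*^{P}=0.39 in 2010, in line with the values used in the text.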

There are several novelties proposed in Section 3 to accommodate a few specific
income distribution features observed for females. First of all, we have found
that the processes in the sub-critical and super-critical income zones have
different critical ages, *T*_{c} and *T*_{S}, and different indexes of
exponential fall, as given by relationships (18) and (23). Essentially, the high-income
dynamic processes are decoupled from those in the low-middle income zone. The
only connection between them is the exchange of people reaching the Pareto
threshold from below and above. The joint distribution of incomes driven by different
processes in two income zones is able to describe all features observed in the
mean income curves for females.

The multivariable matching process results in the best fit model over
all involved parameters. Figure 31 depicts the observed and predicted mean income as
a function of age for four years between 1962 and 2011. The fit of the initial
segments is the same as in Figures 28 and 29 because the reduction factor FL for
the size of instruments to earn money changes from 0.45 in 1960 to 0.65 in
2011. The shelf modelled in Figure 30 by different critical times *T*_{c} now consists of two
segments as related to the critical ages in the sub-critical (*T*_{S}) and super-critical (*T*_{c}) zones. Since *T*_{S}
is constant, the duration of the second leg, i.e. *T*_{S}-*T*_{c},
has been decreasing with time while the duration of the first leg has been
increasing. The second leg of the shelf has a length of a few years in 2011.
Following this trend, the second leg will disappear in the future when *T*_{c}>*T*_{S}. The U.S. political and economic authorities may
increase the age of retirement, however.

Figure 31. The observed and predicted mean income as a function of age for selected
years between 1962 and 2011.

The overall fit between the observed and predicted mean income curves in
Figure 31 is much better than that obtained by the original model. In 1993, we
notice a slightly lower agreement between the observed and predicted mean
income during the second leg of the shelf. This is likely the result of the
linear approximation of FL and *M*^{P}. In reality, these parameters differ
from those predicted by a linear time function. It is worth noting that this
discrepancy disappears in 2011 because of the overall convergence between the
male and female characteristics. In a few decades, the male and female PIDs
should be identical. Meanwhile, it is instructive to apply the upgraded model
to males.

**1.4. Male model upgraded**

The relative sizes of work instruments *L*_{j} and the Pareto threshold
*M*^{P} for males do not change with time, and neither does the age of
retirement *T*_{S}. This fact facilitates the use of the upgraded model. We
have to adjust only the levels *A* and *B* in (18) and (23) in order to obtain
the best-fit model. Figure 32 depicts the measured and predicted mean income
curves for six years between 1962 and 2011. The overall fit is better than that
achieved in Section 3.2 (see Figure 26). The fit below the critical age *T*_{c}
is the same, however, since it is controlled by parameters that were not
changed in the upgraded model.

There are two distinct segments beyond the critical age for the
super-critical incomes, i.e. those distributed according to the Pareto law. The
first segment spans the ages between *T*_{c} and *T*_{S} and thus shortens with
time. Because of the larger portion of males above the Pareto threshold, the
first segment demonstrates a clear fall, which is not well seen in Figure 31.
The slope of this segment also increases with time since the exponent index
**γ̃** in (18) depends on *T*_{c}(*τ*). The second segment of the mean income
curves is driven by two processes of exponential fall. The highest incomes
continue to fall as in the first segment, and the sub-critical incomes start
their fall beyond the age of retirement. The joint effect of the two processes
is expressed by an accelerated drop beyond *T*_{S}.

The first segment related to the super-critical incomes is predicted with
excellent accuracy for all presented years, as are the trajectories before
*T*_{c}. This is an important verification of the model’s predictive power. The
slope and length of the pre-critical stage (*t*<*T*_{c}) and the same features
of the first segment all change over time as defined by their dependence on one
external variable, real GDP per capita. The predicted curves not only fit the
measured ones for the selected years; they also demonstrate synchronized change
with time (real GDP), which is the ultimate requirement of any dynamic model in
physics. Hence, we conclude that our microeconomic model is a physical one
rather than an economic one.

The prediction of the second segment is complicated by a few shortcomings.
Firstly, the age of retirement changes with time depending not only on
legislation but also on the economic situation. After the last financial and
economic crisis, many discouraged people had to leave the labour force. The
rate of participation in the labour force fell from 66.4% in 2007 to 62.4% in
September 2015. (During the same period the rate of unemployment changed from
4.6% to 5.1%, with an extremely high rate in between, however.) As a result,
the age of retirement may drop. Secondly, the accuracy of income measurements
is significantly lower for the eldest population because of its overall
under-representation. Thirdly, the population pyramid for the elderly
fluctuates as a consequence of WWII and the post-war baby boom. Our model can
be adjusted for all these aspects and predict the second segment better than
its current version does. These would be non-physical amendments, however,
which are not the priority of quantitative modelling.

Figure 32. The observed and predicted mean income curves
for male population as a function of age.