10/17/15

How universal is the law of income distribution? Cross country comparison (post #777)

This paper could be absolutely amazing for physicists. It shows that income distribution in four (English Speaking) countries follows a universal law. Incomes are driven by only one(!) external variable - real GDP per capita. All differences in income distribution are defined by the gap in GDP: Canada, New Zealand and the UK exactly follow steps of the USA with a time delay of 15 to 25 years! This is definitely a fundamental result for physics of income evolution as described by our model in the previous post.

Luckily, this is post #777 in this blog.

How universal is the law of income distribution? Cross country comparison

Ivan O. Kitov and Oleg I. Kitov (link to full text on arxiv.org via IDEAS)

The evolution of personal income distribution (PID) in four countries: Canada, New Zealand, the UK, and the USA follows a unique trajectory. We have revealed a precise match in the shape of two age-dependent features of the PID: mean income and the portion of people with the highest incomes (2 to 5% of the working age population). Because of the U.S. economic superiority, as expressed by real GDP per head, the curves of mean income and the portion of rich people currently observed in the three chasing countries reproduce one-to-one the curves measured in the USA 15 to 25 years before. This result of cross country comparison implies that the driving force behind the PID evolution is the same in the four studied countries. Our parsimonious microeconomic model, which links the change in PID with only one exogenous parameter - real GDP per capita, accurately predicts all studied features for the U.S. This study proves that our quantitative model, based on one first-order differential equation, is universal. For example, new observations in Canada, New Zealand, and the UK confirm our previous finding that the age of maximum mean income is defined by the square-root dependence on real GDP per capita.

Gender income disparity in the USA: analysis and dynamic modelling

We have published the principal results of this study as a working paper on arxiv.org. Our model provides an accurate quantitative description of how the difference between men and women has evolved since 1962.

Gender income disparity in the USA: analysis and dynamic modelling

Ivan O. Kitov and Oleg I. Kitov
(link to full text on arxiv.org via IDEAS)

We analyze and develop a quantitative model describing the evolution of personal income distribution, PID, for males and females in the U.S. between 1930 and 2014. The overall microeconomic model, which we introduced ten years ago, accurately predicts the change in mean income as a function of age as well as the dependence on age of the portion of people distributed according to the Pareto law. As a result, we have precisely described the change in Gini ratio since the start of income measurements in 1947. The overall population consists of two genders, however, which have different income distributions. The difference between incomes earned by the male and female populations has been experiencing dramatic changes over time. Here, we model the internal dynamics of men’s and women’s PIDs separately and then describe their relative contribution to the overall PID. Our original model is refined to match all principal gender-dependent observations. We found that women in the U.S. are deprived of higher job positions. This is the cause of the long term income inequality between males and females in the U.S. It is unjust to women and has a negative effect on real economic growth. Women have been catching up since the 1960s and that improves the performance of the U.S. economy. It will take decades, however, to reach full income equality between genders. There are no new defining parameters included in the model, except that the critical age, when people start to lose their incomes, was split into two critical ages: one for low-middle incomes and one for the highest incomes, which obey a power law distribution. Such an extension becomes necessary in order to match the observation that the female population in the early 1960s was practically not represented in the highest incomes.

10/5/15

Gender income inequality: final discussion of modelling results

Our model stems from extensive physical intuition and is supported by direct comparison of income observations with closed-form solutions of simple differential equations describing fundamental physical processes. The set of equations describing the growth and fall of incomes is fully borrowed from physics, with the empirically estimated constant of dissipation and the distribution of sizes of personal capabilities and instruments. The transition to the power law distribution of the highest incomes is also a physical process, which can be qualitatively described by the concept of self-organized criticality. In that sense, the dynamics of the highest incomes is not governed by simple physical relationships – the power law distribution is a purely statistical description rather than a solution of a system of differential equations. Nevertheless, all properties in the super-critical regime are defined by two parameters – the number of people above the Pareto threshold and the power law index. The former parameter is exactly predicted by our model as a function of time and age for males and females. The index has to be empirically estimated, as in other physical cases like for the slopes of earthquake recurrence curves in various seismic regions [Kitov et al., 2011]. Overall, the system of personal income distribution is fully and accurately described by physical equations. In that sense, it is a physical system.

We introduced the microeconomic model of personal income distribution a decade ago and used the CPS historical data to calibrate all defining parameters. The March CPS reports aggregate incomes in five-year age cells since 1993. Ten-year cells are used between 1947 and 1992, with sporadic appearance of shorter cells for the youngest population. The income data and the U.S. age pyramids between 1947 and 2011 published by the U.S. Census Bureau were used as they are without gender separation. The IPUMS income microdata not only make it possible to distinguish between males and females but also provide various estimates in one-year cells. In this study, we use the advantage of income microdata and model two specific age-dependent features of personal income distribution: mean income and portion of people in the Pareto distribution. These two features are most sensitive to the influence of time and age. A correct income distribution model must accurately describe the dynamics of secular and age-dependent changes observed in actual data. Any model not predicting the dynamics of actual changes should be disregarded. Our model successfully predicts all principal changes in both features observed between 1962 and 2014 for males and females separately.

The difference in income dynamics demonstrated by the two genders represents an enormous challenge for quantitative modelling. A model unifying (at first glance) incompatible results for two genders has to be parsimonious and include only parameters common for both cases. The dynamic discrepancy between male and female incomes has to be explained only by values of defining parameters: constants and variables controlled by exogenous measurable forces represented by continuous time series. The evolution of gender-dependent income features together with all changes in the difference between them should be driven by the same forces. In our model, the only force moving personal income distribution along predefined trajectories is real economic growth as expressed by GDP per capita calculated for the working age population.

Dynamic behavior of the difference in income distribution between males and females requires a special approach in quantitative models of income distribution. The original version of KKM made no difference between men and women. Here, we extend the KKM by introducing two independent populations with different features of income distribution as reported by the CPS and IPUMS. Since gender divides the U.S. population in approximately equal proportions over time and age, the gender-related income effects do improve the KKM predictive power upon the original version. In other words, females make a sizeable contribution to the total income. The next step to a more precise model might be the introduction of race differences in income distribution. The income difference between white males and black females is much more dramatic than the income difference between the two sexes considered in this study. This is a real challenge to our income distribution model.

All in all, we have demonstrated in this paper that the refined KKM accurately explains a number of common and gender-specific features. The principal finding of this study is that female population in the U.S. has the same distribution of the capability to earn money (notation similar but not equivalent to human capital) and consistently lower sizes of work instruments (work capital) compared to those for men. The income gap between women and men has been closing since 1960 and currently an average female has work capital making 65% of that available for an average man. It was only 45% in the 1960s. Considering the same capability to earn money for females, one can conclude that the relatively lower work capitals (e.g., job positions, assets, …) are controlled by external force. A fair distribution has not been achieved yet. It will likely take decades.

The relatively lower instrument sizes available for females make the proportion of females above the Pareto threshold lower. In turn, this effect lowers the mean income for the same age since a relatively lower number of rich females occurs in all age groups. However, the lack of rich women is partially compensated by the effect of the lowered Pareto threshold for females, which is most prominent in the 1960s and 1970s. The coherent increase in the instrument size and Pareto threshold for women has been incorporated into our model. As a result, the model accurately predicts the early growth trajectory, which is most sensitive to the size of work instrument, and the number of females above their own Pareto threshold. As in the original model, both parameters increase with time as the square root of real GDP per capita. For women, we have introduced a specific option as revealed from observations - the relative instrument size and the Pareto threshold both follow linear time trends with different slopes.

The female mean income shows a very specific feature – it is practically constant during an extended period spanning the ages between ~30 and ~60. In our model, this feature results from the fast growth of all personal incomes to their peak values, which are then retained at the same level. The rapid rise in all incomes is induced by the lowered sizes of work instruments available for women. In turn, the lower instruments do not allow personal incomes to reach the Pareto threshold and there are almost no rich women by male standards in the 1960s and 1970s. Therefore, the disparity in work capitals affects the low-middle incomes and higher incomes together. Such a shelf is absent in the overall mean income curve because of larger instrument sizes available for males.

The shelf in the females’ mean income curves has also revealed the difference between critical times for the low-middle (in physical notation - sub-critical) and high (super-critical) incomes, the latter governed by the Pareto distribution. Equation (13) describing the sub–critical regime is valid from the start of work experience to the age of retirement. Then incomes fall along an exponential trajectory described by equation (17). The actual age of retirement varies in a narrow band between ~60 and ~65 years and is embedded into the model as constant. The fall is described by an exponential function with a negative index. This is a new feature of the upgraded model. In the original model, the critical age, T_c, was the same for low-middle and high incomes. The input of rich men in the overall PID masked the presence of the low-middle income critical age. Instructively, the mean income measured for males supports the existence of two critical ages.

The refined model includes several new features not compromising the underlying physical concept of saturation growth and the transition from the sub-critical to the super-critical regime of income distribution. The extended version of the original model accurately predicts the PID evolution for males and females in the U.S. from 1962 to 2014, i.e. where the IPUMS data are available. Since the GDP estimates are available from the U.S. Bureau of Economic Analysis from 1929, we start our model in 1930 for males. Actually, the model spans the period since 1870, i.e. the year when people who reached the age of 75 in 1930 started their work. For females, the start year is shifted to 1960 because of the changing relative size of work instrument and Pareto threshold.

Forced deprivation of higher job positions (work capital) is the cause of the observed long term income inequality between males and females in the U.S. It is not only unjust to women but has a negative effect on real economic growth. The replacement of highly capable women with less capable men results in lower total income, which is equivalent to real GDP. Women have been catching up since the 1960s and that improves the performance of the U.S. economy. It will take decades, however, to reach full income equality between genders. The problem of race income disparity will take even longer to resolve fully.

10/3/15

Modelling gender income inequality

1.1. The model flexibility

Our microeconomic model has a large degree of freedom even though it is driven by one exogenous variable – real GDP per capita. One can change the distribution of the capabilities to earn money and the sizes of work capital or scale them in a different way. The dissipation factor, α, has to be estimated from data as all other constants and variables in the model, but one can also change it according to one's own understanding of income discounting. The critical age, T_c, can be changed as well as its functional dependence on GDP. The index of the exponent describing the fall beyond the critical age is difficult to estimate because of data sparsity, scarcity, and low accuracy as observed for the youngest and eldest population. All parameters are adjustable and do change the model outcome. But this change is not random and cannot fit an artificially designed personal income distribution. The changes associated with the adjustment of model parameters have to fit actual observations in the U.S. and do fit these observations.

For the overall population, we have estimated the best-fit set of constants and variables. Any change reduces the best fit with observations. At the same time, the male and female PIDs, their derivatives and aggregates are so different that some changes in defining parameters of the original model are inevitable. Moreover, there are a few new features associated with the female income distribution, which we did not observe in the overall PID. These features have to be included in the model, but they should not disturb the results of the original model. In this Subsection, we demonstrate how the original model reacts to the change of some defining parameters in view of the new features discussed in Section 2.

In this study, we consider males and females separately. They are different in the age pyramid, which is an important exogenous parameter of the model. Crudely, women make approximately 51% of the total working age population in the U.S., and thus, the males’ share is 49%. For the sake of simplicity we multiplied the overall population pyramid by factor 0.51 and 0.49 and obtained the female and male age pyramid, respectively. This approach may introduce some minor errors in the gender ratio for some ages, but these errors are smaller than those related to income measurements and GDP estimates. And the population pyramid does not affect the mean income predictions, which are based on the proportional distribution of people over the capability to earn money and the sizes of work instruments.
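The pyramid split described above can be sketched in a few lines. This is only an illustration under stated assumptions: the age cells and population counts in `overall_pyramid` are invented for the example, and only the 0.51 and 0.49 factors come from the text.

```python
# Shares of the working age population quoted in the text.
FEMALE_SHARE = 0.51
MALE_SHARE = 0.49

# Hypothetical overall age pyramid: persons (thousands) by single year of age.
# These counts are made up for illustration only.
overall_pyramid = {15: 4000.0, 16: 4100.0, 17: 4050.0}


def split_pyramid(pyramid, share):
    """Scale every age cell of the overall pyramid by a constant share."""
    return {age: count * share for age, count in pyramid.items()}


female_pyramid = split_pyramid(overall_pyramid, FEMALE_SHARE)
male_pyramid = split_pyramid(overall_pyramid, MALE_SHARE)
```

Because the same factor is applied to every age, the gender ratio is constant across ages by construction, which is the simplification the paragraph acknowledges.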

The initial value of the critical age, T_c, is borrowed from the overall model and is fixed to 19.07 years in 1930 in all models [Kitov and Kitov, 2013]. In some versions of the gender-dependent model we move the start year to 1960. Then the initial value of critical age is changed according to (19), i.e. as the square root of the cumulative change in the corrected real GDP per capita. It makes 26.08 years in 1960. This value is also fixed in the model. The index of the Pareto law is fixed to k=3.35 for both sexes and does not depend on calendar years and age. As we found in the previous Section, the largest deviations in k are observed for ages having extremely low representation in top incomes. Females’ participation in the Pareto zone is low throughout the entire 20th century. As a result, constant k does not affect the accuracy of model predictions.
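As a rough check of the numbers above, the square-root rule can be written out. Equation (19) is not reproduced in this post, so the form T_c(t) = T_c(t0)·sqrt(GDP(t)/GDP(t0)) used here is an assumption based on the verbal description; the function and variable names are illustrative.

```python
import math

T_C_1930 = 19.07  # years, from the text
T_C_1960 = 26.08  # years, from the text


def critical_age(t_c_start, gdp_ratio):
    """Assumed square-root scaling of the critical age with the cumulative
    change in real GDP per capita (gdp_ratio = GDP(t) / GDP(t0))."""
    return t_c_start * math.sqrt(gdp_ratio)


# GDP per capita growth factor between 1930 and 1960 implied by the two
# quoted critical ages, under the assumed square-root rule.
implied_gdp_ratio = (T_C_1960 / T_C_1930) ** 2
```

Under this assumption the two quoted critical ages imply that the corrected real GDP per capita grew by a factor of roughly 1.87 between 1930 and 1960.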

In Section 2, we discussed a possibility that the size of instruments used to earn money may be smaller for women than for men. The instrument size affects the time needed to reach a given threshold as well as the level of income one can obtain. In order to change the instrument size we introduce a scaling factor, FL, and multiply all standard sizes by this factor. Figure 23 presents the evolution of predicted mean income in 1962 and 2011 for three different cases: FL=0.5, 1.0, and 1.5. In order to retain portion of population above the Pareto threshold at constant level we scale standard M^P=0.43 by the same factors as the instruments. As a result, M^P in Figure 23 has values 0.215, 0.43, and 0.645 for FL=0.5, 1.0, and 1.5, respectively.
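The joint scaling of instrument sizes and Pareto threshold is simple arithmetic, sketched here. The FL values and M^P=0.43 are from the text; the names are illustrative.

```python
STANDARD_PARETO_THRESHOLD = 0.43  # M^P in the standard model, from the text
SCALING_FACTORS = (0.5, 1.0, 1.5)  # FL values considered in Figure 23

# Scale the Pareto threshold by the same factor as the instrument sizes,
# so that the portion of people above the threshold stays the same.
scaled_thresholds = {
    fl: round(fl * STANDARD_PARETO_THRESHOLD, 3) for fl in SCALING_FACTORS
}
```

The resulting thresholds are 0.215, 0.43, and 0.645, matching the values quoted for Figure 23.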

The increase of all standard sizes L_j by factor 1.5 results in a larger relative income, which is just scaling the mean income curve: the ratio of the peak incomes in 1930 and 2014 is retained in all three cases. The critical age T_c also does not change because it depends only on GDP. The slopes of two FL=1.5 curves in Figure 23 decrease relative to those in the standard model with FL=1.0, however. This is the same effect as observed with increasing real GDP per capita in the overall mean income curves (Figure 1). All in all, the increased sizes of work capital do not create new effects.

Figure 23. Mean income as a function of age in 1930 and 2011. The size of work instrument is multiplied by a factor FL=0.5, 1.0, and 1.5. The Pareto thresholds M^P are also multiplied by the same factor in order to retain the portion of people with the highest incomes. They are 0.215, 0.43 and 0.645, respectively. Notice a shelf in the mean income curve for 1930 with FL=0.5.

Figure 24. The number (left panel) and portion (right panel) of people above the Pareto threshold as a function of age in 1930 and 2011. The size of work instrument is multiplied by a factor FL=0.5, 1.0, and 1.5. The Pareto thresholds M^P are also multiplied by the same factor in order to retain the portion of people with the highest incomes. They are 0.215, 0.43 and 0.645, respectively.

For smaller sizes, the mean income curves become different. They contain periods when mean income does not change: between 5 and 18 years of work experience in 1930 and from 20 to 37 years in 2011. This is the effect we have found in the females’ mean income in the 1960s and 1970s (see Figure 15). As we know from the discussion in Section 1, smaller instruments imply faster growth of all incomes. The initial segments of mean income demonstrate steeper growth to maximum values for all earning capabilities S_i from 2 to 30. The lowered Pareto threshold suggests that everyone who could reach it in normal conditions (FL=1.0) does achieve it, but much faster. When all people reach their maximum incomes, including those in the Pareto zone, the model suggests no further changes before the critical age. With growing GDP, the time when all incomes reach maximum and T_c both increase. The start point and duration of two shelves in Figure 23 both increase.

The portion of people above the Pareto threshold is presented in Figure 24. Both curves for FL=0.5 are characterized by early and steep growth. The number of people is displayed in the left panel. Here, we use the males’ age pyramid and all observed fluctuations are associated with the varying size of the population rather than with the model parameters. It is instructive that the total number of people increases with falling FL – faster income growth brings younger people into the Pareto zone. The number above M^P in a given age does not affect the mean income since the portion does not depend on the total number. That is why the mean income curves in Figure 23 are smooth.

When the Pareto threshold is retained at the same level for all FL, the number of people is much lower for FL=0.5, as Figure 25 demonstrates. This might be the reason for the very low representation of women in the top incomes in the 20th century. For FL=0.5, the number of people is just marginally above zero in 1930 and 2011, as was observed in the females’ distribution in Figure 18. So, the lowered sizes of work instruments available for women in the U.S. result in a very low number of women with top incomes.

Figure 25. The number of people above the Pareto threshold as a function of age in 1930 and 2011. The size of work instrument is multiplied by a factor FL=0.5, 1.0, and 1.5. The Pareto thresholds M^P=0.43 for all cases.

In Section 3.2, we demonstrate that, by changing the size of the instruments people use to earn money together with a synchronized change (or no change) in the Pareto threshold, our model is able to qualitatively explain some striking differences in income distribution as observed for men and women. This is a good basis for accurate quantitative prediction of principal features observed in the males’ and females’ PID and their derivatives. The age dependence of the mean income and the portion above the Pareto threshold are likely the most prominent features which demonstrate secular evolution coherent with the growth in real GDP per capita. At first glance, the male population in the U.S. demonstrates simpler behaviour. We begin gender-dependent modelling with the prediction of men’s personal incomes.

1.2. Male model

There are some new features we can predict using the original setting of our model. Figure 14 shows that the male mean income curves have the same critical age and rate of growth as the total population. In 2013, the critical age is slightly larger for men. One may suggest that men in the U.S. economy use larger work capitals than women. In essence, their work instruments create the size distribution defined in Section 1. Within our framework, men’s incomes can be modeled with standard instruments.

In Figure 26, we present the predicted and observed mean income curves for 1962, 1977, 1987, and 2011. All defining parameters are the same as in the original model together with 1930 as the start year. There are no income microdata before 1962 and we compare our predictions with actual measurements. The fit between the curves depends on age and year. For the model, the most important part of income evolution is before the critical age. This segment is the most sensitive to defining parameters including the critical age. Therefore, the almost perfect match observed before the critical age during the period from 1962 to 2014 proves that our model predicts personal incomes of male population in the U.S. with incredible accuracy.

The change in the slope and shape of the initial segments is described precisely as a function of GDP. Real economic growth leads to slower relative income growth as was already found in the overall model. The critical age, i.e. the age of peak mean income, has been growing from 43 years of age in 1962 and is currently above 56 years of age. We expect a further increase in the age of peak income.

There is a common feature observed in all measured curves above the critical age, which is best highlighted when compared to the predicted curves following the exponential fall defined by T_A=60 years of work experience and A=0.65 (see equation (18) for details). At the age from 64 in 1962 to 68 in 2011, the measured curves experience a sharp drop to the level between 0.4 and 0.5. The period between T_c(t) and this specific age decreases with time – from 20 years in 1962 to 12 years in 2011. In Figure 13, this effect was ironed out by smoothing with MA(7). As we discussed for female population, the reason behind this drop is likely associated with retirement. The retirees have constant income as a share of their work incomes. Then the time and amplitude of the drop can be explained by the performance of social security system.

Figure 26. Comparison of measured and predicted mean income as a function of age. Selected years between 1962 and 2011 are presented. The measured and predicted curves start to diverge above 64 in 1962 and above 68 years of age in 2011.

The overall fit between the predicted and observed mean income suggests that the portion of people above the Pareto threshold should also be accurately estimated. The overall model exactly predicts the number of people of a given age above the Pareto threshold. One can directly calculate their total sub-critical income as a sum of all M_ij(τ,t)>M^P, as if they did not move to the super-critical regime of income distribution. The net gain obtained by these people when they move to the super-critical power law distribution can be calculated since we have the number of people for each age and the overall index k.

Obviously, the net gain is constant for a given M^P since the sub- and super-critical total incomes are fixed. For the original model the ratio of the super and subcritical total incomes is 1.33 for M^P=0.43 [Kitov, 2009]. As a result, we do not need to calculate individual incomes in the super-critical power law distribution to get an estimate of the total contribution of rich people to the mean income. We just need to multiply all M_ij(τ,t)>M^P by a factor of 1.33. Hence, when the mean income dependence on age is accurately predicted we believe that the number of people above the Pareto threshold is also accurately described as a function of age.
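The shortcut described above can be sketched as follows. This is a minimal illustration, not the authors' code: the list of sub-critical incomes is invented, and only the factor 1.33 and the threshold M^P=0.43 come from the text.

```python
SUPER_TO_SUB_RATIO = 1.33  # ratio of super- to sub-critical total income, from the text
PARETO_THRESHOLD = 0.43    # M^P in the original model, from the text


def mean_income_with_pareto_gain(incomes,
                                 threshold=PARETO_THRESHOLD,
                                 ratio=SUPER_TO_SUB_RATIO):
    """Mean income for one age cell, where every sub-critical income above
    the Pareto threshold is multiplied by a constant gain factor instead of
    being redistributed individually along the power law."""
    if not incomes:
        raise ValueError("incomes must not be empty")
    adjusted = [m * ratio if m > threshold else m for m in incomes]
    return sum(adjusted) / len(adjusted)


# Hypothetical sub-critical incomes M_ij for people of one age (illustrative).
sample_incomes = [0.10, 0.25, 0.40, 0.50, 0.60]
sample_mean = mean_income_with_pareto_gain(sample_incomes)
```

In this toy cell only the last two incomes exceed the threshold, so only they receive the 1.33 gain; the mean rises from 0.37 to about 0.44.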

Figure 27. Comparison of measured and predicted number of people above the Pareto threshold. Actual thresholds are $11,000 in 1962 and $87,000 in 2010.

Figure 27 illustrates this statement. In the left panel, the measured and predicted number of males is presented as a function of age for 1962 and 2010. The overall fit is more than excellent considering only one parameter (real GDP) describing the whole variety of changes (e.g., inflationary periods, recessions, high and low oil prices, changes in fiscal and budgetary rules, varying accuracy of all involved measurements, revisions to all involved variables among many others) during the past half-century. Moreover, to predict the number of the elder males the model involves their almost 50 year history of work experience, which includes the real GDP per capita time series since 1910. Figure 4 illustrates the effect of work experience history on income in a given year. In the right panel of Figure 27, we demonstrated the difference in the critical ages. Unfortunately, the age estimates are subject to bias because of high-amplitude fluctuations, which are related to the data scarcity at higher incomes and, partially, are induced by topcoding.

The drop at the age of retirement is not seen in Figure 27. This observation suggests that retirement does not affect the processes in the Pareto distribution. At the same time, the fall in the number of people above the threshold beyond the retirement age is accurately predicted by our model, which uses the capability to earn money and the size of work capital as defining parameters evolving as the square root of the real GDP per capita. Therefore, the effect of retirement is likely an artificial feature related to sub-critical incomes, which should be incorporated into the model as it is.

1.3. Female model

The difference between the male and female income distribution in the U.S. is a well-established fact. We do not consider the reasons behind the catastrophic disparity, but have to stress that the work capabilities used in the model to predict personal incomes are likely the same for men and women. From a technical point of view, this difference allows us to improve the model and introduce new features which were not observed in the overall PID and its aggregates and derivatives. The updated model successfully meets a number of challenges.

There are features in the females’ curves in Section 2 which likely manifest some changes in the parameters considered constant in the original setting. The convergence trend of the male and female PIDs observed since the early 1960s suggests that the size of work instruments available for women has been growing. It does not reach the level of the males’ instruments, however. So far, we have used fixed relative sizes of work instruments. So, our model has to include a new option allowing the size to change according to some predefined function of time or real GDP.

It is reasonable to start with an approximate estimate of FL which fits best the observed features for different years. We have calculated a number of models with FL changing from 0.2 to 1.0 with a 0.05 step and all other parameters as in the original model. As we know from Section 2, the most sensitive part in the mean income dependence on age is the initial segment. Figure 28 depicts the measured and predicted mean income curves for 1962 and 1977 as obtained with FL=0.45. This FL value is the best to describe the dynamics of mean income growth in the youngest population and also demonstrates the feature of constant mean income before the critical age. The predicted curves start to fall long before the critical age, however. But T_c is controlled by a different dependence on real GDP per capita and does not influence the initial growth. So, the estimate of FL=0.45 is not compromised by the deviation of the measured and predicted critical ages, which we address next.
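The scan over FL can be sketched as a plain grid search. Everything below is a hedged illustration: the text does not say which misfit measure was used, so the RMS misfit is an assumption, and `toy_predict` is a hypothetical saturating curve standing in for the real model, which is not reproduced here.

```python
import math


def fl_grid(start=0.2, stop=1.0, step=0.05):
    """FL values from start to stop inclusive, rounded to avoid float drift."""
    n_steps = round((stop - start) / step)
    return [round(start + i * step, 2) for i in range(n_steps + 1)]


def rms_misfit(predicted, observed):
    """Root-mean-square difference between two equally long curves (assumed measure)."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


def best_fl(predict, ages, observed, grid=None):
    """Return the grid value of FL whose predicted curve is closest to the data."""
    grid = grid if grid is not None else fl_grid()
    return min(grid, key=lambda fl: rms_misfit([predict(a, fl) for a in ages], observed))


def toy_predict(age, fl):
    """Hypothetical stand-in for the model: smaller FL saturates faster and lower."""
    experience = max(age - 15, 0)
    return fl * (1.0 - math.exp(-experience / (10.0 * fl)))


# Fit only the initial segment, which the text calls the most sensitive part.
ages = list(range(16, 40))
synthetic_observed = [toy_predict(a, 0.45) for a in ages]
estimate = best_fl(toy_predict, ages, synthetic_observed)
```

With synthetic data generated at FL=0.45 the search recovers that value; with real curves one would pass the actual model prediction and the measured mean incomes instead of the toy function.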

In Figure 29 we present similar curves for 1996 and 2011, but for FL=0.65. The measured curves are accurately approximated by the model between the start point and the critical age. The fall is also well predicted but there are some deviations in 1996, which could have some connection to retirement, as discussed in the previous Subsection. Comparing the curves in Figures 28 and 29 one can conclude that FL has to grow from 0.45 in 1962 to 0.65 in 2011 in order to fit the rate and duration of the initial growth.
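One simple way to express the growth of FL between the two end points is a linear interpolation in time. The end values 0.45 (1962) and 0.65 (2011) come from the text; the linear form between them and the function name are assumptions made for this sketch.

```python
def fl_linear(year, year_start=1962, fl_start=0.45, year_end=2011, fl_end=0.65):
    """Assumed linear time trend of the relative instrument size FL."""
    slope = (fl_end - fl_start) / (year_end - year_start)
    return fl_start + slope * (year - year_start)


# Implied growth per year under the linear assumption (about 0.004 per year).
annual_slope = fl_linear(1963) - fl_linear(1962)
```

Under this assumption FL for an intermediate year such as 1977 falls between the two fitted values, at about 0.51.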

Figure 28. The observed and predicted mean income for 1962 (left panel) and 1977 (right panel). In both models FL=0.45.

The simplest way to match the critical age observed between 1962 and 1977 is to raise the initial value T_c(1929). Then the predicted value in 1962 and 1977 has to change accordingly as the square root of real GDP per capita. Figure 30 displays the modified mean income curves (green dotted lines) for the same years as in Figures 28 and 29, which now fit observations in 1962 and 1977 before and beyond the critical age. For comparison, we have also drawn the curves from the original model (red dotted lines). The change in T_c is not justified by any reasonable relationship, however. In addition, the curves predicted for 1996 and 2011 with the higher initial value of T_c fit observations neither in the initial segment nor at the critical age. The shelf exists in all models, but its length is not well predicted. Considering the large number of contradictions we met when modelling the female mean income, it is necessary to extend our original model to match all observed changes in a consistent way.

Figure 29. The observed and predicted mean income for 1996 (left panel) and 2011 (right panel). In both models FL=0.65. Notice the fit between the observed and predicted critical ages.

Figure 30. The observed and predicted mean income as a function of age for selected years between 1962 and 2011. Two models are shown: standard model with FL=0.45 and T_c=19.07 years in 1929 (red dotted line); standard model with FL=0.45 and T_c=26.0 years in 1929 (green dotted line).

A lower FL in the earlier years provides an extended “no-change” period before the critical age. The difference between the observed and predicted critical ages in Figure 28 suggests that the concept of critical age belongs to the super-critical distribution rather than to the low-middle incomes predicted by the model. Moreover, the critical age for women observed between 1962 and the early 1980s is almost constant and close to the retirement age. It is reasonable to suggest that there are two critical ages – one for the super-critical distribution, T_c, and another for the sub-critical distribution, T_S. The former depends on GDP. The latter is practically constant and may correlate with the retirement age. It is easy to model the retirement effect by an exponential fall in the portion of people in the labour force [Munell, 2011]. Using the basic concept of (18), one can write the following equation:

η = −lnB / (T_B – T_S) (23)

where B is the constant relative level of income rate at age T_B>T_S. Both constants in (23) have to be estimated from data. Then the evolution of sub-critical incomes above the retirement age is described by (23) and the evolution of super-critical incomes is defined by (18). When a super-critical income falls below the Pareto threshold it starts to follow (23). As we know, the portion of people in the Pareto zone falls exponentially and

The effect of T_S in the overall distribution was masked by the income dominance of the male population, by data roughness, and by the fact that only the most recent period was modelled. As a result, T_S is missing from the original model. The use of the detailed IPUMS income microdata available from 1962 has revealed this effect for both genders. Figure 26 indicates that this effect cannot be neglected even in the male PIDs, and we consider it later on.

Having the new concept of critical age, T_S, for sub-critical incomes, we can now model the lengthy shelf observed in the mean income curves. Three parameters in (23) have to be fixed in the model: T_S, T_B, and B. The age of retirement may vary by a few years over the whole modelled period between 1962 and 2011, as we see in the measured mean income curves. Without loss of generality, and based on the lengthy period of income measurements, the estimated age of women’s retirement, and the results of extensive modelling, we fix T_S at 43 years of work experience, or 58 years of age. This is the start of the retirement process, which is described by an exponential fall in the portion of people in the labour force. The estimated age of retirement is defined by a 50% portion of people in the labour force [Munell, 2011]. To match observations, we vary A and B in (18) and (23) and obtain the best-fit estimate by a simple regression procedure over ages from 15 to 70 years. Larger deviations observed beyond 70 years of age are likely related to poor measurements and should be neglected as noise.
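Relation (23) can be read as the decay index that brings an exponential fall starting at T_S down to the relative level B at T_B. The sketch below assumes the fall has the form exp(−η(t − T_S)) beyond T_S and a constant level of 1 before it; that functional form is our reading of the text, not a formula quoted from it. T_S=43 years of work experience is the value fixed above, while the values of B and T_B in the example are hypothetical.

```python
import math


def decay_index(b: float, t_b: float, t_s: float) -> float:
    """Relation (23): eta = -ln(B) / (T_B - T_S)."""
    if not 0.0 < b < 1.0:
        raise ValueError("B must lie strictly between 0 and 1")
    if t_b <= t_s:
        raise ValueError("T_B must be larger than T_S")
    return -math.log(b) / (t_b - t_s)


def relative_level(t: float, t_s: float, eta: float) -> float:
    """Assumed form: constant level before T_S, exponential fall after it."""
    if t <= t_s:
        return 1.0
    return math.exp(-eta * (t - t_s))


T_S = 43.0   # years of work experience, as fixed in the text
T_B = 50.0   # hypothetical reference point beyond T_S
B = 0.5      # hypothetical relative level reached at T_B

eta = decay_index(B, T_B, T_S)
print(eta, relative_level(T_B, T_S, eta))
```

By construction the level at T_B equals B, so fitting B for a fixed pair (T_S, T_B) is equivalent to fitting the decay index η.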

From Figure 30, we have estimated the overall increase in FL from 1962 to 2011. From the IPUMS data, one cannot estimate the exact form of the functional dependence of women’s L_j on time or real GDP per capita before 1962. It is reasonable to choose 1960 as the start year and to bring all initial values used in the original model, which starts in 1930, to 1960. They are as follows: τ₀=1960, T_c=25.92 years, α=0.0795, and Y(1960)=1. For the dependence of FL on time, we suggest the simplest linear growth between 1962 and 2011. This gives an annual step of 0.004 in FL, starting from 0.45 in 1960. We understand that the actual FL dependence on time may deviate from the simple linear approximation, but the accuracy of the IPUMS data is not good enough to estimate the exact FL for each year between 1962 and 2011.

There is one model parameter pending adjustment – the Pareto threshold for women, which likely increases over time. As we discussed in Section 3.1, the Pareto threshold may have a direct link to the size of the work instrument. Interestingly, the reduction factor FL for women can be best estimated from the rate of growth at the initial stage, before 30 years of age. The Pareto distribution is most prominent for ages above 30 and below 60. But the two parameters are connected by some immanent bond, which manifests itself in a slightly different way for women and men.

Observations in Figures 9 and 22 suggest that the power law approximation works from different incomes for males and females. For example, in 1962 the Pareto threshold is $9,000 for males and $6,000 for females. If we consider the males’ threshold as related to normal sizes of work instruments, then the females’ threshold should be reduced by a factor of 2/3. This makes M^P=0.29 in 1960, and then this threshold should increase to a level close to 0.43 in 2011. In order to match the linear growth in FL, we suggest a similar time function for M^P for women. The initial value is M^P(1960)=0.29, and then it rises at a rate of 0.002 per year, making 0.39 in 2010. Then the growth in M^P and the growth in FL are synchronized.
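The two linear time functions suggested for women, FL and M^P, can be written down directly from the numbers in the text. The function names below are ours and serve only as an illustration.

```python
def linear_schedule(year: int, start_year: int, start_value: float, annual_step: float) -> float:
    """Value of a parameter that grows linearly from start_value at start_year."""
    return start_value + annual_step * (year - start_year)


def fl_women(year: int) -> float:
    """FL for women: 0.45 in 1960, growing by 0.004 per year."""
    return linear_schedule(year, 1960, 0.45, 0.004)


def pareto_threshold_women(year: int) -> float:
    """M^P for women: 0.29 in 1960, growing by 0.002 per year."""
    return linear_schedule(year, 1960, 0.29, 0.002)


for year in (1960, 1977, 1996, 2010, 2011):
    print(year, round(fl_women(year), 3), round(pareto_threshold_women(year), 3))
```

With these steps FL reaches 0.654 in 2011, consistent with the value of 0.65 used for Figure 29, and M^P reaches 0.39 in 2010, as stated above.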

There are several novelties proposed in Section 3 to accommodate a few specific income distribution features observed for females. First of all, we have found that the processes in the sub-critical and super-critical income zones have different critical ages, T_c and T_S, and indexes of exponential fall - relationships (18) and (23). Essentially, the high-income dynamic processes are decoupled from those in the low-middle income zone. The only connection between them is the exchange of people reaching the Pareto threshold from below and above. The joint distribution of incomes driven by different processes in two income zones is able to describe all features observed in the mean income curves for females.

The multivariable matching process results in the best-fit model over all involved parameters. Figure 31 depicts the observed and predicted mean income as a function of age for four years between 1962 and 2011. The fit of the initial segments is the same as in Figures 28 and 29 because the reduction factor FL for the size of instruments to earn money changes from 0.45 in 1960 to 0.65 in 2011. The shelf modelled in Figure 30 by different critical times T_c now consists of two segments, related to the critical ages in the sub-critical (T_S) and super-critical (T_c) zones. Since T_S is constant, the duration of the second leg, i.e. T_S−T_c, has been decreasing with time while the duration of the first leg has been increasing. The second leg of the shelf has a length of a few years in 2011. Following this trend, the second leg will disappear in the future when T_c>T_S. The U.S. political and economic authorities may increase the age of retirement, however.

Figure 31. The observed and predicted mean income as a function of age for selected years between 1962 and 2011.

The overall fit between the observed and predicted mean income curves in Figure 31 is much better than that obtained by the original model. In 1993, we notice a slightly lower agreement between the observed and predicted mean income during the second leg of the shelf. This is likely the result of the linear approximation of FL and M^P. In reality, these parameters differ from those predicted by a linear time function. It is worth noting that this discrepancy disappears in 2011 because of the overall convergence between the male and female characteristics. In a few decades, the male and female PIDs should be identical. Meanwhile, it is instructive to apply the upgraded model to males.

1.4. Male model upgraded

The relative sizes of work instruments L_j and the Pareto threshold M^P for males do not change with time, and neither does the age of retirement T_S. This fact facilitates the use of the upgraded model. We have to adjust only the levels A and B in (18) and (23) in order to obtain the best-fit model. Figure 32 depicts the measured and predicted mean income curves for six years between 1962 and 2011. The overall fit is better than that achieved in Section 3.2 (see Figure 26). The fit below the critical age T_c is the same, however, since it is controlled by parameters that were not changed in the upgraded model.

There are two distinct segments beyond the critical age for the super-critical incomes, i.e. those distributed according to the Pareto law. The first segment spans the ages between T_c and T_S and thus shortens with time. Because of the larger portion of males above the Pareto threshold, the first segment demonstrates a clear fall, which is not well seen in Figure 31. The slope of this segment also increases with time since the exponent index γ̃ in (18) depends on T_c(τ). The second segment of the mean income curves is driven by two processes of exponential fall. The highest incomes continue to fall as in the first segment, and the sub-critical incomes start their fall beyond the age of retirement. The joint effect of the two processes is expressed by an accelerated drop beyond T_S.

The first segment, related to the super-critical incomes, is predicted with excellent accuracy for all presented years, as are the trajectories before T_c. This is an important verification of the model's predictive power. The slope and length of the pre-critical stage (t<T_c) and the same features of the first segment all change over time as defined by their dependence on one external variable – real GDP per capita. The predicted curves not only fit the measured ones for the selected years; they also demonstrate synchronized change with time (real GDP), the ultimate demand of any dynamic model in physics. Hence, we conclude that our microeconomic model is a physical one rather than an economic one.

The prediction of the second segment is complicated by a few shortcomings. Firstly, the age of retirement changes with time depending not only on legislation but also on the economic situation. After the last financial and economic crisis, many discouraged people had to leave the labour force. The rate of participation in the labour force fell from 66.4% in 2007 to 62.4% in September 2015. (During the same period the rate of unemployment changed from 4.6% to 5.1%, with an extremely high rate in between, however.) As a result, the age of retirement may drop. Secondly, the accuracy of income measurements is significantly lower for the eldest population because of the overall under-representation. Thirdly, the population pyramid for the elderly fluctuates as a consequence of WWII and the post-war baby boom. Our model can be adjusted for all these aspects and could then predict the second segment better than its current version does. These would be non-physical amendments, however, which are not the priority of quantitative modelling.