Personal income distribution in the US

We are going to revisit our model for personal income distribution (PID). It was first formalized in 2003 and used income distributions through 2001. We had to convert all reports published by the Census Bureau in PDF format between 1947 and 1993 into Excel tables. It took a month of manual work, including proofreading. These reports have still not been converted into a digital format.

In 2006, we used new data (through 2005) and re-estimated the model. In 2010, we published a book on personal income distribution using data through 2008. It is a good time to refresh the model and evaluate its performance since 2001 with ten more years of data. All major results will be presented in this blog. 
We start with presenting original data. The distribution of personal incomes since 1994 is characterized by a higher resolution – income bins are only $2500 wide. Our model assumes that the overall income distribution depends on the age pyramid and the level of real GDP per capita. However, the evolution of PID is slow and at a twenty year horizon one actually sees a frozen PID. The frozen PID results in an almost constant Gini ratio over time, which is actually reported by the Census Bureau.
We illustrate PID in a few figures below. Figure 1 presents all PIDs published since 1994 between $0 and $100,000 as reported. We have included all people without income in the bin between $0 and $2500. One can observe that the number of people in higher income bins increases with time, as does the number of people with incomes above $100,000 shown in Figure 2. The portion of people with incomes above $100,000 has been increasing by 0.3% per year since 1994. Figure 3 shows the number of people with income above $100,000 as a function of work experience. The fastest growth is observed for the groups between 30 and 40 years of work experience, i.e. between 45 and 55 years of age.
Figure 4 depicts the population density functions, PDFs, for the years between 1994 and 2010. First, the estimates presented in Figure 1 were normalized to the total population for a given year. Then we reduced the income scale for individual years, i.e. from 1995 to 2010, by the total growth of real GDP. This allows normalizing the curves to the total income, i.e. we reduce all scales to that of 1994. Finally, we normalize the portions of populations in given bins to their widths for individual years and obtain the population density functions. Figure 4 proves that the distribution of personal incomes has not been changing over time in relative terms, i.e. a given portion of population always has a given portion of total income. From the PIDs one can always build the relevant Lorenz curves and estimate Gini ratios. For higher incomes, the distribution has to be described by the Pareto distribution. Figure 5 shows that the PDFs at higher incomes do follow a common power law with an exponent of -3.9.  
Our first assessment of the income data obtained after 2001 is that they follow the previously obtained relationships. We expect that our model for personal income distribution should perform well.
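The three normalization steps behind Figure 4, and the power-law slope in Figure 5, can be sketched as follows. This is a minimal illustration rather than the code used for the figures: the function names, the `growth_factor` argument (the ratio of real GDP in a given year to that in 1994) and the use of a log-log straight-line fit are assumptions.

```python
import numpy as np


def population_density(counts, bin_edges, growth_factor=1.0):
    """Turn one year's income table into a population density function.

    counts        -- number of people in each income bin
    bin_edges     -- bin boundaries in dollars, len(counts) + 1 values
    growth_factor -- assumed ratio of real GDP in this year to the base year
    """
    counts = np.asarray(counts, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)
    # Step 1: normalize the counts to the total population of the year.
    share = counts / counts.sum()
    # Step 2: reduce the income scale to that of the base year.
    scaled_edges = edges / growth_factor
    # Step 3: divide the population shares by the (rescaled) bin widths.
    widths = np.diff(scaled_edges)
    density = share / widths
    centres = 0.5 * (scaled_edges[:-1] + scaled_edges[1:])
    return centres, density


def loglog_slope(income, density):
    """Slope of log(density) vs. log(income), i.e. the power-law exponent."""
    slope, _intercept = np.polyfit(np.log(income), np.log(density), 1)
    return slope
```

By construction, the density integrates to one over the rescaled income axis, which is what makes curves for different years directly comparable.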

Figure 1.  Personal income distributions from 1994 to 2010.
Figure 2. Portion of people with income above $100,000. The portion increases by 0.3% per year. 
Figure 3. The number of people with income above $100,000 as a function of work experience. The fastest growth is observed for the groups between 30 and 40 years of work experience, i.e. between 45 and 55 years of age.

Figure 4. Population density function, i.e. the number of people in a given bin normalized to the total number of people and the width of income bin, as a function of income reduced by the overall GDP growth. 
Figure 5. The Pareto distribution at higher incomes.


Krugman on the current slump

Paul Krugman has presented a graph with real GDP for the UK. It illustrates that the current crisis is worse than it was in 1929. I’ve also downloaded data from the Maddison historical data and the Conference Board total economic database, which has inherited the Groningen (read Maddison) database.
The idea was to compare GDP per capita estimates for the same periods in order to remove the effect of population growth. Surprisingly, I’ve got a different result. Figure 1 below demonstrates that the current evolution of real GDP in the UK is much better than after 1929. Moreover, the estimates of per capita GDP after 1929 show a deeper recession than in 2007.  Unlike Krugman, I do not add GDP projections for 2012 to 2014. 
In Italy, real GDP has a deeper fall after 2007 than after 1929 (Figure 2), but the estimates of real GDP per capita show that the quick recovery of real GDP was definitely driven by increasing population. The level of the per capita curve in 2010 was very close to that in 1932, i.e. three years after the start of the crisis.
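The comparison above amounts to dividing GDP by population and rebasing both series to the year the crisis started. A minimal sketch is given below; the function names are hypothetical and the numbers in the usage note are purely illustrative, not Maddison or Conference Board data.

```python
def per_capita(gdp, population):
    """GDP per head for every year present in the gdp dictionary."""
    return {year: gdp[year] / population[year] for year in gdp}


def index_to_base(series, base_year):
    """Rebase a {year: value} series so that the base year equals 100."""
    base = series[base_year]
    return {year: 100.0 * value / base for year, value in series.items()}
```

With illustrative inputs `gdp = {2007: 100.0, 2010: 98.0}` and `population = {2007: 60.0, 2010: 61.5}`, the GDP index for 2010 is 98 while the per capita index is lower, because the population has grown in the meantime.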

Figure 1. The evolution of real GDP and GDP per capita in the UK after the 1929 and 2007 crises.

Figure 2. The evolution of real GDP and GDP per capita in Italy after the 1929 and 2007 crises.

On wise monetary policy and the absence of liquidity trap in Japan

We have been following inflation in Japan since 2005 when our first paper on the Japanese economy was published. We have revisited inflation in Japan in 2010 and confirmed the predictions of deflation as expressed by the negative GDP deflator. In this blog, we also reported on deflation (both CPI and GDP deflator) several times. Here we validate our predictions of the rate of consumer price inflation (CPI) by the estimate for 2011. The Japan Bureau of Statistics has estimated the rate of CPI inflation as -0.3%.

The case of Japan is the best illustration of our concept linking inflation to the change in labour force. We assume that there was neither liquidity trap in Japan nor mistakes in monetary policy. The evolution of inflation is completely driven by the change in labour force. This is an unfortunate situation for Japan since the level of labour force can only fall in the long run due to the quickly decreasing working age population.   

Previously, we carried out an estimation of empirical relationship between the change rate of labour force, dLF(t)/LF(t), and inflation, p(t).  First, we test the existence of a link between inflation and labour force. Because of the structural (measurement related?) break in the 1980s, we have chosen the period after 1982 for linear regression. By varying the lag between the labour force and inflation one can obtain the best-fit coefficients for the prediction of CPI inflation, p(t),  according to the following relationship: 

p(t) = 1.43dLF(t-t0)/LF(t-t0) + 0.000         (1) 

where the time lag t0=0 years; standard errors for both coefficients are shown in brackets. Figure 1 (upper panel) depicts this best-fit case. There is no time lag between the inflation series and the labour force change series in Japan. The free term in (1), defining the level of price inflation in the absence of labour force change, is practically indistinguishable from zero.
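The lag scan described above can be illustrated with a short sketch. It assumes annual series of equal length, integer lags, and an ordinary least-squares line fitted for each lag; the function name and the choice of the RMS residual as the selection criterion are assumptions, not necessarily the procedure behind relationship (1).

```python
import numpy as np


def best_lag_fit(inflation, lf_change, max_lag=5):
    """Fit p(t) = a * l(t - lag) + b for each lag and keep the best one."""
    p = np.asarray(inflation, dtype=float)
    l = np.asarray(lf_change, dtype=float)
    best = None
    for lag in range(max_lag + 1):
        x = l[: len(l) - lag]  # labour force change, l(t - lag)
        y = p[lag:]            # inflation, p(t)
        slope, intercept = np.polyfit(x, y, 1)
        rms = float(np.sqrt(np.mean((y - (slope * x + intercept)) ** 2)))
        if best is None or rms < best["rms"]:
            best = {"lag": lag, "slope": slope,
                    "intercept": intercept, "rms": rms}
    return best
```

Note that longer lags leave fewer overlapping observations, so with short series the comparison between lags is only approximate.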

A more precise and reliable method to compare observed and predicted inflation consists in the comparison of cumulative curves. Short-term oscillations and uncorrelated noise in data as induced by inaccurate measurements and the inevitable bias in all definitions should be smoothed out in cumulative curves. Any actual deviation between two cumulative curves persists in time if measured values are not matched by the defining relationship.

The predicted cumulative values shown in the lower panel of Figure 1 are very sensitive to the free term in (1). For Japan, the cumulative curves are characterized by complex shapes. There are periods of intensive inflation and a deflationary period. The labour force change, defining the predicted inflation curve, follows all the turns in the measured cumulative inflation.
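A cumulative curve is simply the running sum of the annual rates, and the agreement between two curves can be summarized by a coefficient of determination. The sketch below assumes the usual definition R² = 1 - SS_res/SS_tot; the posts do not state which definition of Rsq was used, so this is an illustration only.

```python
import numpy as np


def cumulative(rates):
    """Running sum of annual rates, i.e. the cumulative curve."""
    return np.cumsum(np.asarray(rates, dtype=float))


def r_squared(observed, predicted):
    """Coefficient of determination between two curves of equal length."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

Calling `r_squared(cumulative(p_measured), cumulative(p_predicted))` gives the statistic for the cumulative curves; a constant bias in the free term accumulates linearly in such curves, which is why they are sensitive to it.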

One can conclude that relationship (1) is valid and the labour force change is the driving force of inflation. Statistically, the evolution of the overall level of consumer prices in Japan is fully defined by the change in labour force. Even the annual curves have Rsq=0.73 with all fluctuations induced by the change in labor force. The cumulative curves are characterized by Rsq=0.99. Hence, no other variable or process can affect the change in price. Otherwise, the statistically reliable link would not exist. 
Effectively, this means that the Japanese monetary authorities cannot create conditions for positive inflation and thus there is no liquidity trap. The problem of deflation can be resolved only in the framework of increasing population, and Figure 3 shows that the next forty years will be characterized by price deflation (both CPI and GDP deflator) when population projections are used to extrapolate the labour force.

Figure 1. Measured inflation (CPI) and that predicted from the change rate of labour force. Upper panel:  Annual curves. Lower panel: Cumulative curves between 1982 and 2009. A good agreement between the cumulative curves illustrates the predictive power of our model.

Figure 2. Scatter plot: predicted vs. measured rate of CPI inflation.

Figure 3. Inflation projection  for Japan: CPI and the GDP deflator

Why we insist that personal income inequality does not change

We have already reported that the personal income distribution in the USA does not change with time when normalized to the total population and total income. In other words, the relative distribution of personal income in the United States has not been changing since the start of income measurements in 1947. The accuracy of early measurements is not good enough, however, and we have to rely on the most recent results.
The US Census Bureau routinely reports income estimates obtained during the Annual Social and Economic Supplement of the Current Population Surveys. We have retrieved the population distribution over mean income in the range from $0 to $250,000, which is available only from 2000. The relevant measurements of the number of people in a given income range were carried out in $2500 bins between $0 and $100,000 and $50,000 bins between $100,000 and $250,000. In order to suppress the influence of the bin width we have calculated the population density, i.e. the ratio of the number of people in a given bin to its width. Personal income is measured in current dollars and thus we have to reduce all incomes by the total change of the GDP deflator (one may also use CPI, which gives a 20 per cent higher inflation rate) from 2000 to a given year. Figure 1 shows the result of normalization for 2000, 2005, and 2010. In relative terms, the income distribution has not been changing since 2000. At higher incomes, all three curves are practically identical. This observation is validated by the estimates of Gini ratio provided by the Census Bureau.
Figure 1. The population density function, PDF, as a function of mean income as normalized to the total personal income for a given year. At higher incomes,  the curves are practically identical.
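For readers who want to check the Gini ratio against binned data of this kind, here is a minimal sketch. It assumes that each bin is described by its head count and its mean income, and it builds the Lorenz curve by the trapezoid rule; because inequality inside each bin is ignored, the result is a lower bound rather than the Census Bureau estimate. The function name is hypothetical.

```python
import numpy as np


def gini_from_bins(counts, mean_incomes):
    """Approximate Gini ratio from per-bin head counts and mean incomes."""
    counts = np.asarray(counts, dtype=float)
    means = np.asarray(mean_incomes, dtype=float)
    order = np.argsort(means)  # Lorenz curve needs ascending incomes
    counts, means = counts[order], means[order]
    income = counts * means
    pop_share = np.concatenate(([0.0], np.cumsum(counts) / counts.sum()))
    inc_share = np.concatenate(([0.0], np.cumsum(income) / income.sum()))
    # Area under the Lorenz curve by the trapezoid rule.
    area = np.sum(np.diff(pop_share) * (inc_share[:-1] + inc_share[1:]) / 2.0)
    return 1.0 - 2.0 * area
```

Since the Gini ratio depends only on shares, rescaling all incomes by a common deflator leaves it unchanged.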


Two more graphs on GDP in the USA

Two more graphs on real GDP in the USA. In the fourth quarter of 2011, the level of real GDP was higher ($13,422.4 billion) than that in the fourth quarter of 2007 ($13,326 billion), as Figure 1 shows. Figure 2 demonstrates that the increasing population did not allow real GDP per capita to reach the level of 2007: $42,727 vs. $43,791. This seems to be the task for 2012 to 2014 if no recession occurs.

Figure 1. Real GDP

Figure 2. Real GDP per capita
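The arithmetic behind the two figures can be checked in a few lines. The sketch uses only the four numbers quoted in the post; the helper name `pct_change` and the derived "implied population change" are illustrative additions.

```python
def pct_change(new, old):
    """Relative change from old to new, in per cent."""
    return 100.0 * (new / old - 1.0)


# Real GDP, billions of dollars: 2011Q4 vs. 2007Q4.
gdp_change = pct_change(13422.4, 13326.0)
# Real GDP per capita, dollars: 2011 level vs. 2007 level.
gdp_pc_change = pct_change(42727.0, 43791.0)
# Population change implied by the two ratios above.
implied_population_change = 100.0 * (
    (1.0 + gdp_change / 100.0) / (1.0 + gdp_pc_change / 100.0) - 1.0
)
```

Real GDP is about 0.7% above its 2007 level while GDP per capita is about 2.4% below it, which implies a population roughly 3.2% larger over the four years.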

Real GDP and GDP deflator in 2011

Here are some quick notes on the new estimates of real GDP and GDP deflator for 2011. Figure 1 shows the rate of growth of real GDP, dlnGDP/dt, on an annual and quarterly basis. In 2011, the rate is 0.017 1/y, i.e. 1.7% per year, despite the growth rate of 2.7% (SAAR) in the fourth quarter. Previously, we predicted a small recession in 2012 and 2013. This prediction will be updated soon when the 2010 census results are published and incorporated into the so-called postcensal population estimates.
Figure 2 shows the growth of total population for the purpose of per head calculations. Please notice a large step in population between 1999 and 2000 associated with the error of closure, i.e. the difference between the intercensal estimate for 2000 and the number enumerated in the 2000 census. Figure 3 presents the rate of growth of real GDP per capita, dlnG/dt. In 2011, the rate of growth was 0.0098 1/y. This means that the total increase in population was 0.7%.
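The population figure follows from the fact that the growth rate of a ratio is the difference of the growth rates. Writing G = GDP/N, where the symbol N for total population is introduced here only for illustration:

```latex
\frac{d\ln G}{dt} = \frac{d\ln GDP}{dt} - \frac{d\ln N}{dt}
\quad\Longrightarrow\quad
\frac{d\ln N}{dt} = 0.017 - 0.0098 = 0.0072\ \mathrm{y^{-1}} \approx 0.7\%\ \text{per year}
```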
Figure 4 shows the GDP deflator, or price inflation associated with the economy as a whole. This is the most comprehensive measure of inflation, as we have discussed many times in this blog. For 2011, the GDP deflator is 2.1%. This is larger than we predicted, but the last quarter signals upcoming deflation, as we foresaw six years ago. Currently, the FRB also foresees very low inflation rates through 2014.
According to our concept of GDP growth, real GDP per capita has an inertial component which is expressed in a constant annual increment, dG=const. Figure 5 updates the graph showing the evolution of real GDP per capita in the USA. One can observe a gradual return to the constant level of annual increment. This works as inertial movement in physics. However, one can expect some more years of dG less than average, dG<$490 (2011 US dollars). It is worth noting that there is no output gap when real GDP per capita is considered. The evolution of dG exactly follows its long term trend, and the years after 2007 serve to return dG to the trend from its highs in the late 1990s.
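The consequence of a constant annual increment for the growth rate can be written out explicitly. The symbol A for the constant increment is introduced here for illustration only:

```latex
G(t) = G(t_0) + A\,(t - t_0), \qquad
\frac{d\ln G}{dt} = \frac{1}{G}\frac{dG}{dt} = \frac{A}{G(t)}
```

With A fixed, the relative growth rate of real GDP per capita falls in inverse proportion to the attained level G(t), so a slowly declining growth rate is the expected outcome of purely inertial growth and does not by itself indicate an output gap.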
We have to note that the estimates of the nominal GDP and GDP deflator are subject to revision, which may be as high as several per cent (+2.1% for 2001). However, the long term trends in all presented variables fit our concept and predictions.

Figure 1. The growth rate of real GDP: annual and quarterly (annualized). MA(4) for the quarterly time series. For 2011, dlnGDP/dt=0.017  1/y.

Figure 2. The evolution of total resident population. Notice the jump between 1999 and 2000 – the closure error.

Figure 3. The growth rate of real GDP per capita. For 2011, the rate is (dlnG/dt=) 0.0098 1/y. 
Figure 4. Annual and quarterly (annualized) price deflator of GDP. In the last quarter of 2011 the GDP deflator dropped to 0.004 1/y. This is likely a turn to deflation.

Figure 5. The increment of real GDP per capita since 1950. As predicted, the trend returns to a zero slope. There is no output gap.

Unemployment in Spain will be increasing further

Here we revisit the rate of unemployment, ut, in Spain using its dependence on the change in labor force, lt=dLF/LFdt. There is a new estimate of 22.8% for the unemployment rate in 2011. In May 2011, we quantitatively predicted that this rate should only be growing. It may reach 29% if the link between the rate of unemployment and the rate of labor force change is correct, as has been observed since 1980.

Previously, it was found that Spain is characterized by the same relationship between unemployment and labor force as other developed countries. For Spain, we used data provided by the OECD. Figure 1 depicts unemployment and the change rate of labor force between 1960 and 2011. In line with the OECD description of the breaks in the labor force series:

Series breaks: In 2005, changes in the questionnaire and the implementation of CATI system in the field work affected the estimates. The 2005 questionnaire produced an additional increase of employment (132 000) and a decrease of unemployment (78 000). From 2001, the new unemployment definition established by the European Commission in 2000 has been introduced. From 1994, persons employed in the “Guardia Civil” are not included in the armed forces. As an indication, this category represented 59 600 people in 1994. In 1976, the lower age limit for inclusion in the Labour Force Survey was raised from 14 to 16, at the same time other modifications to the survey were introduced.

there are two spikes in the dLF/LF series near 1976 and 2001 related to step revisions to the level. The spike around 1988 has no explanation in terms of the revisions to labor force, but is of the same amplitude. One cannot exclude the possibility that this spike is related to the process of joining the EU in 1986.

As expected, the same functional form of dependence is valid for Spain. The estimation method is based on a trial-and-error approach and seeks the best fit between annual curves. The final model is as follows:

ut = -7.0lt + 0.31; t>1986

Figure 2 depicts observed and predicted curves. Before 1986, the curves diverge and a different model likely holds. Because of high-amplitude oscillations in the original time series for the rate of labor force change, lt, we have to smooth it by MA(3). For the period after 1986, R2=0.7. Thus, the change in labor force has been driving the rate of unemployment in Spain. The negative coefficient implies that unemployment in Spain goes down when labor force starts to increase.
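The prediction step can be sketched as follows. The sketch assumes that both ut and lt are expressed as fractions (so 0.31 corresponds to 31%) and that MA(3) is a plain three-point average over consecutive years; the post does not say whether the window is centered or trailing, and the function names are hypothetical.

```python
import numpy as np


def moving_average(x, window=3):
    """Plain moving average; returns len(x) - window + 1 values."""
    x = np.asarray(x, dtype=float)
    return np.convolve(x, np.ones(window) / window, mode="valid")


def predict_unemployment(lf_change, slope=-7.0, intercept=0.31, window=3):
    """Apply ut = slope * MA(lt) + intercept to a labor force change series."""
    return slope * moving_average(lf_change, window) + intercept
```

Under these assumptions, a smoothed labor force growth of 2% per year maps to an unemployment rate of 17%, while zero growth maps to 31%, which illustrates why a shrinking labor force implies rising unemployment in this model.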

As predicted by our model, the rate of unemployment increased in 2011. This is not the end of the sad story of unemployment in Spain. Figure 2 suggests that it will likely grow further as the labor force decreases.

Figure 1. Unemployment rate, u, and the rate of labor force change, l, in Spain according to the definition introduced by the OECD.

Figure 2. Prediction of unemployment by labor force. Due to high variation in the estimates of labor force we have smoothed it with MA(3). For the observed and predicted curves, R2=0.7 for the period between 1986 and 2011.


Wal-Mart share in 2012

We estimated our price model for Wal-Mart Stores (NYSE: WMT) nine months ago. The model is based on the decomposition of a share price into a sum of two selected consumer price indices. This is a new model defined by the (seasonally not adjusted) index of hospital and related services (HOSP) and the price index of miscellaneous personal services (MISS), as reported by the US BLS. The former CPI component leads the share price by 10 months and the latter one evolves in sync with the price. Figure 1 depicts the overall evolution of both involved indices through December 2011. A very specific feature of both indices is their linearity over time: they are close to straight lines. 
In this post, we re-estimate the WMT share price using new data through December 2011. This allows validating the initial model and demonstrating its reliability. The previously obtained defining components are the same and provide the best fit model between June 2010 and March 2010 with only a one-month change in the lag for the HOSP index. All coefficients in (1) are only slightly different for the new model (see below). The slope of the time trend is negative. The best-fit 2-C model for WMT(t) is as follows:
WMT(t) = 0.50HOSP(t-10) + 1.42MISS(t) - 28.39(t-1990) - 158.12 (January 2011)  (1)
WMT(t) = 0.46HOSP(t-9) + 1.49MISS(t) - 28.03(t-1990) - 165.50 (March 2011)
WMT(t) = 0.46HOSP(t-9) + 1.30MISS(t) - 26.06(t-1990) - 141.92 (December 2011)
where t is calendar time. The predicted curve in Figure 2 evolves in sync with the observed price. The residual error is $2.13 for the period between June 2003 and December 2011. With both indices growing along their respective trends one can expect a slight increase to the level of $60 to $65 per share in 2012Q1. Figure 3 presents the residual model error.  
Figure 1. Evolution of the price of HOSP and MISS. 
Figure 2. Observed and predicted WMT share prices.
Figure 3. Residual error of the model.
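To make the structure of the two-component model concrete, the sketch below evaluates a model of this form for one month. It assumes the CPI components are stored as monthly lists, that lags are whole months with the CPI leading the price, and that calendar time is given in fractional years; the function name and argument layout are hypothetical. The default coefficients are those of the December 2011 WMT model above.

```python
def two_component_price(cpi1, cpi2, k, t_years,
                        b1=0.46, lag1=9, b2=1.30, lag2=0,
                        c=-26.06, d=-141.92):
    """Evaluate price = b1*CPI1(t-lag1) + b2*CPI2(t-lag2) + c*(t-1990) + d.

    cpi1, cpi2 -- monthly index values (lists or arrays of equal length)
    k          -- position of the month of interest in those lists
    t_years    -- calendar time of that month in years, e.g. 2011.92
    """
    if k - max(lag1, lag2) < 0:
        raise ValueError("not enough history for the requested lags")
    return (b1 * cpi1[k - lag1] + b2 * cpi2[k - lag2]
            + c * (t_years - 1990.0) + d)
```

Because the first index enters with a nine-month lag, the CPI values already published fix that part of the predicted price several months ahead; the contemporaneous second index still has to be extrapolated.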

Quarterly report: Loews share price model

This is a quarterly report on the performance of our share price model for Loews Corporation (NYSE: L). The model is based on the decomposition into a weighted sum of two consumer price indices (selected from a larger number of CPIs), linear trend and constant, all coefficients and time lags to be estimated by a LSQ procedure. Here we test the previous model and make a regular update using new data. All in all, the original model is valid since October 2008 and does not show any sign of future changes. This is a reliable model valid during the past 50 months!  
A preliminary model for Loews Corp. was obtained in September 2009 and covered the period from October 2008. This old model included the index of food without beverages (FB) and the index of transportation service (TS). The most recent model also used the monthly closing prices as of April 2011 and the CPI estimates published on April 14, 2011. The defining indices were almost the same: the index of food (F) and the TS index. Figure 1 depicts the evolution of the indices which provide the best fit model, i.e. the lowermost RMS residual error, between July 2003 and December 2011.  The F index leads by 5 months and the TS index by 4 months.  When new data through December 2011 are used, the model does not show any tangible change - only coefficients have been slightly drifting:  
L(t) = -2.03F(t-5) - 2.12TS(t-4) + 28.23(t-1990) + 448.98, March 2011
L(t) = -2.01F(t-5) - 2.09TS(t-4) + 27.96(t-1990) + 440.65, September 2011
L(t) = -2.03F(t-5) - 2.02TS(t-4) + 27.65(t-1990) + 431.99, December 2011
where L(t) is the share price in US dollars, t is calendar time. The new model is depicted in Figure 2 together with high and low monthly prices as a proxy to the uncertainty bound of the share price. The predicted curve leads the observed one by 4 months. The residual error is $2.42 for the period between July 2003 and December 2011. In the first quarter of 2012, the model foresees essentially no change. It is worth noting that the model obtained in March 2011 accurately predicted the small fall observed in the second and third quarters of 2011.
Figure 1. Evolution of the price indices F and TS.
Figure 2. Observed and predicted share prices.

Why the Economic Projections of Federal Reserve Board are inconsistent

The FRB members have recently projected the evolution of key macroeconomic variables including real GDP and the rate of unemployment. In our blog, we have developed a very accurate model linking the rate of unemployment in the US to the rate of real GDP (per capita) growth. (A series of posts has resulted in a working paper.) The following relationship was estimated:

du = -0.62dlnG + 1.09,  (1)

When integrated between t0 and t, equation (1) can be rewritten in the following form:

u(t) = u(t0) + bln[G/G0] +a(t-t0) + c  (2)

Without loss of generality, we assume t0=0. The intercept c≡0, as is clear for t=t0. Instead of integrating (1), we calculate cumulative sums of the annual estimates of du and dlnG with appropriate initial conditions. The cumulative sum of du’s is the time series of the unemployment rate. Figure 1 depicts the measured and predicted curves for the period between 1958 and 2011. The agreement is excellent and has been obtained by a formal statistical method (LSQR).

From (1) it follows that higher rates of GDP growth decrease the rate of unemployment. The FRB has projected real GDP with the highest rates of 2.7% in 2012, 3.2% in 2013, and 4% in 2014. We reduce these rates by 1% per year to estimate the per capita rate of growth, i.e. the growth in population is 1% per year. Using (2) we calculate the rate of unemployment which will correspond to the projected real GDP.
Figure 1 also depicts these predicted rates for 2012 to 2014 by open circles. The rates of unemployment projected by the FRB are shown by red circles. There is a significant deviation between the predicted and projected rates, which likely manifests the inconsistency in the FRB members' models of unemployment.
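The cumulative-sum calculation for 2012 to 2014 can be sketched as follows. The sketch assumes that dlnG in (1) is expressed in per cent per year and du in percentage points, and that population grows by 1% per year as stated above. The starting rate `u0` is not given in the post and must be supplied by the reader; the function name is hypothetical.

```python
import numpy as np


def unemployment_path(u0, gdp_growth_pct, population_growth_pct=1.0,
                      b=-0.62, a=1.09):
    """Cumulate du = b * dlnG + a over successive years, starting from u0."""
    per_capita_growth = (np.asarray(gdp_growth_pct, dtype=float)
                         - population_growth_pct)
    du = b * per_capita_growth + a  # annual change, percentage points
    return u0 + np.cumsum(du)


# Change relative to the 2011 rate for the FRB's highest GDP projections.
change_from_2011 = unemployment_path(0.0, [2.7, 3.2, 4.0])
```

Under these assumptions the cumulative change in the unemployment rate by 2014 is about one percentage point, with almost no change in 2012.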

One may check these projections in 2015. 

Figure 1. The observed and predicted rate of unemployment in the USA between 1958 and 2010. The projected rate of unemployment (middle point of the projections) is shown by red circles.

Who is responsible for income inequality? Blame old men.

I've plotted a series of mean personal income estimates borrowed from the Census Bureau. There are 10-year age bins with data from 1967 for men and women separately. The first plot shows that the male mean income is much higher than that of female. The gap between them has been slightly decreasing since 1974 but not spectacularly. Therefore, a higher rate of income growth for women results in decreasing income inequality, i.e. convergence of mean incomes.
In order to highlight the age and sex groups growing at the highest rates, all mean values in given age groups are normalized to 1967 (except the youngest group, normalized to 1974). These plots show the following empirical results:
1. Young women (especially between 25 and 44) have been effectively closing the income gap with men.
2. Younger male groups (from 15 to 34) have suffered absolute decrease in mean income since 1974!
3. For men, the highest rates of income growth belong to the eldest group. Hence, one has to blame them for increasing inequality. Some of them are blogging on inequality.


Income inequality paradox - family vs. personal median income

One can always find a good graph to illustrate increasing income inequality in the US. Lane Kenworthy and then Paul Krugman have demonstrated a dramatic deviation between the median family income and GDP per capita. We have already posted on the problems behind the definition of income and GDP. Here we illustrate another paradox of the data published by the US Census Bureau. From the same source we retrieved the following time series: median personal and family income (chained $), GDP per capita (chained $) and Gini ratios for family and personal income distribution. Figure 1 shows the median personal and family income as normalized to 1974, and the GDP per capita time series also normalized to 1974. This is practically the plot from Lane Kenworthy except for the personal income median. Instructively, both median curves are very similar, with small deviations more likely associated with the estimation procedure than with actual changes.

Figure 1. Median incomes and GDP per capita in the US.

Now we plot other values also published by the Census Bureau. Figure 2 depicts Gini ratios for family and personal income distribution as calculated by the Bureau.

Figure 2. Gini ratios for personal and family income distributions measured by the Census Bureau.

A big surprise: the Gini ratio for personal income has been decreasing since 1994 (no estimates before this point). So, the deviation between the GDP per capita and median income does not support increasing inequality. More likely, there is something wrong with the Census Bureau.


A model for Harley-Davidson share price: three years of success

This is a quarterly report on the performance of our pricing model. Harley-Davidson (HOG) is one of the best illustrations of our concept (see a brief description of the concept in Appendix) linking stock prices to CPI components. For HOG, the model has been stable for many years. The first model was obtained in September 2009 and covered the period from October 2008. Here we revisit the HOG model using the monthly closing price for December 2011 and the CPI estimates published in January 2012.

For HOG, the defining indices are as follows: the index of rent of primary residence (RPR) and the index of owners' equivalent rent of residence (ORPR). Both CPI components are leading the share price. Figure 1 depicts the evolution of the indices which provide the best fit model, i.e. the lowermost RMS residual error, between July 2008 and December 2011.  The models are as follows: 

HOG(t) = -13.82RPR(t-3) + 12.77ORPR(t-4) + 17.82(t-1990) - 163.94, before September 2009
HOG(t) = -11.30RPR(t-3) + 9.83ORPR(t-3) + 17.53(t-1990) - 36.34, after September 2009
HOG(t) = -11.27RPR(t-3) + 9.55ORPR(t-3) + 19.35(t-1990) - 8.57, September 2011
HOG(t) = -11.09RPR(t-4) + 9.40ORPR(t-4) + 18.93(t-1990) - 7.53, December 2011

where HOG(t) is the share price in US dollars, t is calendar time. The model is characterised by standard deviation of $4.28 for the period between July 2003 and December 2011.   

Two recent models are depicted in Figure 2. The curve predicted in December 2011 leads the observed one by 4 months. We do foresee a further fall in the stock price to $26 per share in the first quarter of 2012. Figure 3 displays the residual error.

Figure 1. Evolution of the price indices ORPR and RPR.

Figure 2. Observed and predicted HOG share prices. Model for March, September, and December  2011.

Figure 3. The model residual error.  

In its general form, our pricing model is as follows: 

sp(tj) = Σbi∙CPIi(tj-ti) + c∙(tj-1990 ) + d + ej                                                              (1) 

where sp(tj) is the share price at discrete (calendar) times tj, j=1,…,J; CPIi(tj-ti) is the i-th component of the CPI with the time lag ti, i=1,..,I; bi, c and d are empirical coefficients of the CPI components, the linear term and the constant term, respectively; ej is the residual error, whose statistical properties have to be scrutinized. By definition, the best-fit model minimizes the RMS residual error. The time lags are expected because of the delay between the change in one price (stock or goods and services) and the reaction of related prices. It is a fundamental feature of the model that the lags in (1) may be both negative and positive. In this study, we limit the largest lag to eleven months. Apparently, this is an artificial limitation and might be changed in a more elaborate model.

System (1) contains J equations for I+2 coefficients. For POM we use a time series from July 2003 to March 2011, i.e. 94 monthly readings.  Due to the negative effects of a larger set of defining CPI components their number for all models is (I=) 2. To resolve the system, we use standard methods of matrix inversion. As a rule, solutions of (1) are stable with all coefficients far from zero. In the POM model, we use 92 CPI components. They are not seasonally adjusted indices and were retrieved from the database provided by the Bureau of Labor Statistics.
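The estimation of system (1) for one pair of CPI components can be sketched as follows. This is a simplified illustration, not the original procedure: it uses NumPy's least-squares solver rather than explicit matrix inversion, restricts the lags to non-negative whole months (CPI leading the price), and compares all lag pairs on a common sample window. The function names are hypothetical. In the full procedure the same scan would be repeated over all pairs of CPI components.

```python
import itertools

import numpy as np


def fit_two_component(price, cpi1, cpi2, t_years, lag1, lag2, start):
    """Least-squares fit of b1, b2, c, d for fixed lags; returns (coef, rms)."""
    price, cpi1, cpi2, t_years = (np.asarray(v, dtype=float)
                                  for v in (price, cpi1, cpi2, t_years))
    rows = np.arange(start, len(price))
    design = np.column_stack([
        cpi1[rows - lag1],        # CPI_1(t - lag1)
        cpi2[rows - lag2],        # CPI_2(t - lag2)
        t_years[rows] - 1990.0,   # linear time trend
        np.ones(len(rows)),       # constant term
    ])
    coef, *_ = np.linalg.lstsq(design, price[rows], rcond=None)
    resid = price[rows] - design @ coef
    return coef, float(np.sqrt(np.mean(resid ** 2)))


def best_lags(price, cpi1, cpi2, t_years, max_lag=11):
    """Scan all lag pairs up to max_lag and keep the smallest RMS residual."""
    best = None
    for lag1, lag2 in itertools.product(range(max_lag + 1), repeat=2):
        coef, rms = fit_two_component(price, cpi1, cpi2, t_years,
                                      lag1, lag2, start=max_lag)
        if best is None or rms < best[0]:
            best = (rms, lag1, lag2, coef)
    return best
```

With 94 monthly readings and only four coefficients per fit, each individual system is strongly overdetermined, which is consistent with the stable solutions mentioned above; the risk of a chance fit comes rather from the large number of index pairs and lags being scanned.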

For obvious reasons, longer time series guarantee a better resolution between defining CPIs. In general, there are two sources of uncertainty associated with the difference between observed and predicted prices. First, we have taken the monthly close prices (adjusted for splits and dividends) from a large number of recorded prices: monthly and daily open, close, high, and low prices, their combinations as well as averaged prices. The second source of uncertainty is related to all kinds of measurement errors and intrinsic stochastic properties of the CPI and its components. One should also bear in mind all uncertainties associated with the CPI definition based on a fixed basket of goods and services, whose prices are tracked in a few selected places. Such measurement errors are directly mapped into the model residual errors. Both uncertainties, as related to stocks and CPI, also fluctuate from month to month.

Quarterly report: the performance of share price model for Hewlett Packard

This is a quarterly report on the performance of our share price model. Hewlett Packard (NYSE:HPQ) provides a good example of a successful share price prediction at a several month horizon. We have already published our predictions at a four month horizon five times (July 2010, January 2011, March 2011, July 2011, and September 2011). All predictions were based on our concept of share pricing as decomposition into a weighted sum of two CPI components. We calculated the evolution of the monthly closing price (adjusted for dividends and splits). Here we test and update the model using data through December 2011. The model is still accurate and robust.

Originally, the long term model for HPQ share price was defined by the index of food without beverages (FB) and that of rent of primary residency (RPR). The former CPI component led the share price by 4 months and the latter one led by 5 months. Figure 1 depicts the overall evolution of both involved indices through December 2011 (this is the reason for the time lead increase by 1 month relative to previous models, where the CPI estimates were not contemporaneous). Below we present four best-fit 2-C models for HPQ(t) obtained at different times:

HPQ(t) = -3.20FB(t-4) + 2.91RPR(t-5) + 3.64(t-1990) - 50.82, July 2010
HPQ(t) = -3.34FB(t-4) + 3.41RPR(t-5) + 0.51(t-1990) - 85.44, June 2011
HPQ(t) = -3.46FB(t-4) + 3.68RPR(t-5) - 0.72(t-1990) - 99.88, September 2011
HPQ(t) = -3.40FB(t-5) + 3.60RPR(t-6) - 0.57(t-1990) - 97.72, December 2011

where HPQ(t) is the share price in US dollars and t is calendar time. The coefficients have drifted slightly between updates but remain close to each other. This drift expresses the trade-off between the linear trend in the difference between the defining CPIs and the time trend term in the above equations.
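The equations above all share one form, which can be sketched in a few lines of Python. This is only an illustration: the class name `TwoComponentModel`, the monthly indexing of the series, and the reading of t as calendar time in years are our assumptions, and the input values below are placeholders, not real CPI data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TwoComponentModel:
    """price(t) = b1*C1(t-lag1) + b2*C2(t-lag2) + trend*(t-base_year) + const."""
    b1: float
    lag1: int      # months by which the first CPI component leads the price
    b2: float
    lag2: int      # months by which the second CPI component leads the price
    trend: float   # dollars per year (assumes t is measured in years)
    const: float
    base_year: float = 1990.0

    def predict(self, c1: Sequence[float], c2: Sequence[float],
                times: Sequence[float], i: int) -> float:
        """Price for month index i; the three sequences are monthly and index-aligned."""
        if i < max(self.lag1, self.lag2):
            raise IndexError("not enough CPI history for the requested lags")
        return (self.b1 * c1[i - self.lag1]
                + self.b2 * c2[i - self.lag2]
                + self.trend * (times[i] - self.base_year)
                + self.const)


# Coefficients of the December 2011 HPQ model quoted above.
HPQ_DEC_2011 = TwoComponentModel(b1=-3.40, lag1=5, b2=3.60, lag2=6,
                                 trend=-0.57, const=-97.72)

# Placeholder inputs (NOT real CPI data): flat index levels and a fixed time.
fb = [200.0] * 12
rpr = [250.0] * 12
times = [2000.0] * 12
example_price = HPQ_DEC_2011.predict(fb, rpr, times, 11)
```

With real FB and RPR series in place of the placeholders, the same call would reproduce the predicted curve month by month.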

The predicted curves are shown in Figure 2 (March, September, and December 2011). From Figure 2, we predict the price to fall to $20 in the first quarter of 2012 and then rise to $25 in Q2.  

Figure 1. Evolution of the price of FB and RPR.

Figure 2. Observed and predicted HPQ share prices in March, September, and December 2011. The contemporaneous prediction is shown by a solid red line. High and low prices are shown by dashed lines.


Quarterly report on Procter and Gamble

This is a quarterly update of the share price model for Procter and Gamble, based on the decomposition into a weighted sum of two consumer price indices (to be determined), a linear time trend, and a constant. The model has been valid since at least September 2009 and shows no sign of possible failure. It predicts the share price at a four-month horizon.

A share price model for Procter and Gamble (NYSE: PG) was originally published in this blog in July 2010.  According to our concept, it was defined by the index of food away from home (SEFV - CUUS0000SEFV) and that of rent of primary residency (RPR); the evolution of these indices is presented in Figure 1. In the original model, the former CPI component led the share price by 3 months and the latter one led by 8 months. The upper panel of Figure 2 depicts the original model and the monthly closing prices available in July 2010. This original model was stable for the previous 11 months, i.e. for the period from September 2009.

In April 2011, we updated the original model using some new data (closing price for March 2011) and found that the same model was also applicable with a small change in the time lead for the SEFV – it was 4 months instead of 3 months in the original model.  New coefficients were also slightly different, but very close to the original ones.

The September update uses the monthly closing price for September 2011 and CPIs for August 2011. It validates the model obtained for the previous period: the time lags are the same and the coefficients estimated by the least-squares (LSQ) technique shift only slightly. The current update for the December 2011 closing price uses the CPI data for December 2011, and thus both time shifts are one month longer. The four best-fit models for PG(t) are as follows:

PG(t) =  -5.88SEFV(t-3) + 3.43RPR(t-8)  + 17.60(t-1990) + 174.08, July 2010

PG(t) =  -5.40SEFV(t-4) + 2.93RPR(t-8)  + 18.16(t-1990) + 187.47, March 2011

PG(t) =  -4.94SEFV(t-4) + 2.47RPR(t-8)  + 18.15(t-1990) + 184.89, September 2011

PG(t) =  -4.76SEFV(t-5) + 2.27RPR(t-9)  + 18.29(t-1990) + 187.61, December 2011

where PG(t) is the monthly closing price (dividend and split adjusted) in U.S. dollars,  t is calendar time. 
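For readers who want to see how such best-fit coefficients and time leads could be obtained, here is a minimal sketch. It assumes a brute-force search over both lags with an ordinary least-squares fit at each step; the text above only states that the LSQ technique is used, so the search strategy, the function name `fit_two_component`, and the `max_lag` parameter are our assumptions.

```python
import numpy as np


def fit_two_component(price, c1, c2, years, max_lag=12, base_year=1990.0):
    """Grid-search the two lags; fit coefficients by least squares for each pair.

    All inputs are monthly series aligned by index. Every lag pair is fitted
    on the same months (index >= max_lag) so that the errors are comparable.
    """
    price = np.asarray(price, dtype=float)
    c1 = np.asarray(c1, dtype=float)
    c2 = np.asarray(c2, dtype=float)
    years = np.asarray(years, dtype=float)
    n = len(price)
    if n <= max_lag + 4:
        raise ValueError("series too short for the requested max_lag")

    idx = np.arange(max_lag, n)
    best = None
    for lag1 in range(max_lag + 1):
        for lag2 in range(max_lag + 1):
            # Design matrix: lagged C1, lagged C2, time trend, constant.
            X = np.column_stack([
                c1[idx - lag1],
                c2[idx - lag2],
                years[idx] - base_year,
                np.ones(len(idx)),
            ])
            coef, *_ = np.linalg.lstsq(X, price[idx], rcond=None)
            resid = price[idx] - X @ coef
            rmse = float(np.sqrt(np.mean(resid ** 2)))
            if best is None or rmse < best["rmse"]:
                best = {"rmse": rmse, "lag1": lag1, "lag2": lag2, "coef": coef}
    return best
```

The returned `coef` array holds the two CPI coefficients, the trend slope, and the constant, in the order they appear in the equations above.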

In the lowermost panel of Figure 2, the predicted curve leads the observed price by 5 months, with a residual error of $2.16 for the period between July 2003 and December 2011 (see Figure 3 for the model residuals). In other words, the price of a PG share is completely defined by the behaviour of these two CPI components.
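The lead of the predicted curve follows from the lags: every CPI value the model needs for the next few months is already published. A minimal sketch of this, assuming monthly series that end at the latest CPI release and t measured in years (the function name `forecast_ahead` and the input values in the usage example are hypothetical):

```python
from typing import List, Sequence, Tuple


def forecast_ahead(c1: Sequence[float], c2: Sequence[float], last_time: float,
                   coef: Tuple[float, float, float, float],
                   lag1: int, lag2: int,
                   base_year: float = 1990.0) -> List[Tuple[float, float]]:
    """Forecast (time, price) pairs for the months after the last CPI observation.

    c1 and c2 end at the same month, whose calendar time is last_time.
    The forecast horizon equals the smaller of the two lags.
    """
    b1, b2, trend, const = coef
    n = len(c1)
    if len(c2) != n:
        raise ValueError("c1 and c2 must have the same length")
    if n < max(lag1, lag2):
        raise ValueError("not enough CPI history for the requested lags")

    horizon = min(lag1, lag2)
    out = []
    for h in range(1, horizon + 1):
        i = n - 1 + h                 # index of the target month
        t = last_time + h / 12.0      # calendar time of the target month
        price = (b1 * c1[i - lag1] + b2 * c2[i - lag2]
                 + trend * (t - base_year) + const)
        out.append((t, price))
    return out


# Usage with the December 2011 PG coefficients and placeholder index values
# (NOT real CPI data), ending in December 2011.
sefv = [100.0 + k for k in range(12)]
rpr = [200.0 + k for k in range(12)]
pg_forecast = forecast_ahead(sefv, rpr, 2011 + 11 / 12,
                             (-4.76, 2.27, 18.29, 187.61), lag1=5, lag2=9)
```

With lags of 5 and 9 months, the sketch yields five monthly forecasts beyond the last CPI observation.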

The model reproduces the share price in the past and foresaw a period of no growth in the fourth quarter of 2011. In January-March 2012, the price may fall.

Figure 1. Evolution of the price of SEFV and RPR.

Figure 2. Observed and predicted PG share prices as obtained using contemporaneous data 

Figure 3. The model residual error.

The mean income gap between white males and black females grows during Democratic presidencies

Two days ago, we compared the mean income evolution of the white and black population and demonstrated that the difference did not change mu...