A. B. Freeman School of Business
ISDS 775, Spring 1997
Large-scale Data Analysis

EARNINGS EXPECTATIONS AND SECURITY PRICES
(REPLICATION FOR 1988~91)

Gary Cao
May 10, 1997


Executive Summary

 	The study conducted in this project applies available software packages for large-scale data analysis to 
test the investment strategy proposed by Hawkins, Chamberlin and Daniel in 1984.  The data are extracted 
from the IBES database, CRSP (S&P 500), and Bloomberg (T-bill rate).  To analyze the portfolios' performance 
from 1988 to 1991, we use Oracle and SAS as primary tools to create six tables and perform data analysis for 
that period. 

	We replicate the HCD (1984) portfolio selection method to initiate 16 portfolios.  The 16 portfolios' 
returns over a 12-month period are calculated and compared with S&P 500 returns.  Using regression analysis, 
we compare the risk-adjusted returns with the returns required by CAPM to determine whether the investment 
strategy can achieve excess returns. 

        The summary table (omitted) compares HCD (1984) with our replication, which is marked as 
MBA (1997).  The third column, "Market", shows the expected numbers for a random portfolio.  
The table shows that the results of our replication of HCD (1984) are mixed, as opposed to the overwhelming 
evidence in HCD (1984).  In addition, the similarity between MBA (1997) and Market means that the 
investment strategy proposed by HCD (1984) added little value (if any) for investors from 1988 to 1991.  
This result confirms our expectation at the beginning of the data mining process:  the market is not very 
efficient in reacting to changes in earnings estimates, but the market does learn from history.  A new investment 
strategy can earn abnormal returns before the whole market knows about it, but it is impossible for any 
investment strategy to do so consistently.  This is consistent with a saying about rational expectations: "You can 
fool some people for some time, but you cannot fool all the people all the time."


I.  PURPOSE STATEMENT

        Large-scale data analysis, depending on the specific goals of the user, usually requires 
collecting data from different sources, compiling them in a consistent format, and then performing various 
data analysis and presentation tasks (online analytical processing, data mining, etc.).  In this paper, we 
go through this process, replicate some of the tests done by Hawkins, Chamberlin and 
Daniel in their 1984 work Earnings Expectations and Security Prices (HCD 1984), and examine whether the 
conclusions reached by HCD can stand the test of time.

II. PREVIOUS RESEARCH AND OUR PROJECT

        HCD (1984) tried to find out whether generally available information about consensus earnings 
expectations could be used to generate risk-adjusted excess returns - a challenge to the Efficient Market 
Hypothesis (EMH).  They used the Institutional Brokers Estimate System database (denoted as IBES 
hereafter), which at the time of the research provided earnings estimates for over 2,400 stocks made by more 
than 70 brokerage firms. 

1.  Steps in HCD (1984) Research

Creation of Screen #1 Portfolios

1) Select from I/B/E/S Screen #1 the 20 stocks with the largest increase in their mean estimates and 
construct the following set of 24 equally weighted portfolios initiated at quarterly intervals (i.e., at the end 
of March, June, September and December) from 1975 to 1980:
 
Performance Before Risk Adjustment

2) Calculate the 12-month holding period returns of the above Screen #1 Portfolios, and compare the 
resulting returns with those of IBES and S&P 500.
 
3) Calculate the average cumulative total returns of the 24 Screen #1 portfolios.  Repeat this step for 
different lengths of holding period (ranging from 1 to 12 months), and compare the results with the 
corresponding return data of IBES and S&P 500.
 
Long Term Results
4) During a 6-year investment horizon, roll over portfolios for 3-month, 6-month and 12-month holding 
periods, respectively.  Compare the before-transaction-cost returns with those of IBES and S&P 500.  
Repeat this step for after-transaction-cost returns.
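
The effect of rolling over at different frequencies can be illustrated with a minimal sketch.  It assumes 
that each rollover costs a fixed proportion of portfolio value; the function name rollover_growth and the 
parameter cost_per_rollover are hypothetical and do not come from HCD (1984), which may have modeled 
transaction costs differently.

```python
def rollover_growth(period_returns, cost_per_rollover=0.0):
    """Compound one unit of wealth through successive holding periods.

    period_returns    -- one portfolio return per holding period (e.g. 0.05 = 5%)
    cost_per_rollover -- ASSUMED proportional cost charged each time a new
                         portfolio is formed (0.0 gives the before-cost result)
    """
    wealth = 1.0
    for r in period_returns:
        # Pay the assumed cost of forming the portfolio, then earn its return.
        wealth *= (1.0 - cost_per_rollover) * (1.0 + r)
    return wealth
```

Under this assumption a 6-year horizon incurs the cost 24 times with 3-month rollovers but only 6 times 
with 12-month rollovers, so a strategy that looks best before costs need not look best after them.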
 
Risk-adjusted Performance 
5) Run the regression to test if the Screen #1 portfolios can outperform the returns required by CAPM.
 Regression Model:	(Rp - Rf) = a + b * (Rm - Rf)
 where:
 	Rp = return of Screen #1 portfolio;
 	Rf = risk-free rate (3-month T-bill rate);
 	Rm = market return (S&P 500 return);
 	b  = portfolio's systematic risk;
 	a  = portfolio's abnormal return after adjustment for risk.
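
A minimal sketch of this regression, assuming the three return series are aligned lists of periodic 
returns, is given below.  It uses the closed-form ordinary least squares estimates for a single regressor; 
the function name capm_alpha_beta is illustrative and is not taken from HCD (1984).

```python
def capm_alpha_beta(rp, rm, rf):
    """Estimate a and b in (Rp - Rf) = a + b * (Rm - Rf) by ordinary least squares.

    rp, rm, rf -- aligned sequences of portfolio, market and risk-free returns
    Returns (a, b): the abnormal return and the systematic risk.
    """
    if not (len(rp) == len(rm) == len(rf)) or len(rp) < 2:
        raise ValueError("need at least two aligned observations")

    y = [p - f for p, f in zip(rp, rf)]  # portfolio excess returns
    x = [m - f for m, f in zip(rm, rf)]  # market excess returns
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("market excess return has no variation")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b = sxy / sxx            # slope: systematic risk
    a = y_bar - b * x_bar    # intercept: risk-adjusted abnormal return
    return a, b
```

A positive a means the portfolio earned more than CAPM requires for its level of systematic risk.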
  
In Comparison with Random Portfolios
6) Establish 1,000 random portfolios, each with 20 stocks randomly selected from the IBES universe.
7) After controlling for risk (by means of %MAD), compare the returns of the 24 Screen #1 portfolios with 
those of the random portfolios.
8) Assuming the returns of the 1,000 random portfolios are normally distributed, calculate the probability that 
a random portfolio can outperform a Screen #1 portfolio.
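
Steps 6 and 8 can be sketched as follows.  This is only an illustration under stated assumptions: each 
stock is represented by a single holding-period return in a dictionary, the risk control of Step 7 (%MAD) 
is left out, and the function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def random_portfolio_returns(stock_returns, n_portfolios=1000, size=20, seed=0):
    """Draw equally weighted random portfolios and return their mean returns.

    stock_returns -- dict mapping ticker to one holding-period return (assumed layout)
    """
    rng = random.Random(seed)          # fixed seed so the draw is repeatable
    tickers = sorted(stock_returns)
    return [
        mean(stock_returns[t] for t in rng.sample(tickers, size))
        for _ in range(n_portfolios)
    ]


def prob_random_beats(screen_return, random_returns):
    """P(random portfolio return > screen_return), assuming normality."""
    dist = NormalDist(mean(random_returns), stdev(random_returns))
    return 1.0 - dist.cdf(screen_return)
```

The second function fits a normal distribution to the random-portfolio returns and reports the upper-tail 
probability beyond the Screen #1 return.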

2.  Conclusion and Implication of HCD (1984)

        Results from the above HCD (1984) study were:

1) Performance Before Risk Adjustment: The Screen #1 portfolios outperformed the S&P 500 and IBES by a 
significant margin, and this margin became larger as the holding period increased.

2) Long-term Results (Roll-over strategy): Screen #1 portfolios once again outperformed both benchmarks.  
On a before-expense basis, the three-month rollover strategy offered the highest returns, while on an 
after-expense basis, the six-month one did.

3) Risk-adjusted Performance: Out of a total of 24, twenty Screen #1 portfolios achieved abnormal returns 
after adjustment for their systematic risk.

4) In Comparison with Random Portfolios: In general, Screen #1 portfolios outperformed random portfolios 
with similar riskiness.  The study showed that the probability of a random portfolio beating a Screen #1 
portfolio was less than 2%.

         With the above results, HCD (1984) concluded that the stock market was imperfect, and that the market did 
not react instantly and accurately to publicly available information on revisions of analysts' earnings 
expectations.  Investors could take advantage of revisions in consensus earnings forecasts and 
systematically earn excess returns without bearing excess risk.

3.  How HCD (1984) Fits into Our Project

        As shown above, HCD (1984) cast enormous doubt upon the Efficient Market Hypothesis.  HCD (1984) 
used market data from 1975-1980.  To examine whether HCD (1984) can stand the test of time, we 
use a different time frame, 1988 to 1991.  The steps we replicate are Steps 1, 2 and 5.  To simplify 
the process, we use only the S&P 500 as the benchmark, instead of using both the S&P 500 and the IBES universe.   
IBES provides us with most of the data needed; the S&P return data and T-bill rate are downloaded into SAS from 
other sources.

        What do we expect from our replication of the HCD (1984) research before we set off?  Under EMH, 
the market aggregates information very efficiently.  In this case, an investor cannot consistently realize abnormal 
returns by reacting to changes in earnings estimates.  However, if the market is not very efficient, as observed in 
HCD (1984), it takes time (a matter of years, perhaps) for the market to incorporate the information about 
earnings estimate changes into the price, and investors could achieve abnormal returns during this period of time.  
On the other hand, the market does learn from history, and we expect that since the publication of HCD (1984), 
the market has become more efficient in reacting to changes in earnings estimates.  If that is the case, we expect 
our following analysis to show mixed evidence, as opposed to the overwhelming results shown in HCD 
(1984).  

III.   DATA SOURCES AND PROCESSING

1.  Data Sources

(1)  Three-month T-Bill rate (monthly, 1988 to 1992) from Bloomberg;
(2)  S&P 500 monthly return data from CRSP;
(3)  Earnings estimate data from IBES.

2.  Data Processing

1) Obtaining the IBES data

        The IBES data is originally stored as an ASCII file in a professor's RS6000 account.  The IBES data 
has the following six linked files:

a) Summary;
b) Background;
c) Company ID;
d) Adjustments;
e) SIG Code;
f) Currency.

        We use a SAS program to read the six files into text files (raw data).  The SAS scripts are listed in 
Attachment 1.1.

2)  Loading IBES data to Oracle

	We then use Oracle as a tool to "massage" the raw data into a format that can be used in a SAS 
program:  (1) we create a table space for File 5 (SIG codes); (2) we write a control file to load the raw data 
obtained through a SAS program; (3) we keep a log file to check the process; (4) after the table is 
created, we run several queries to make sure the table content is correct.  The Oracle scripts are listed in 
Attachment 1.2.

3)  Transforming Data to SAS format

	Since different software packages have different data formats that are usually incompatible, we use 
PERL to transform the raw data in the Oracle table into a format that SAS can read and process.  The 
PERL scripts are listed in Attachment 1.3.

4)  SAS Programming and Calculation

	We replicate the HCD (1984) research in the following steps:

a)  We look at March, June, September, and December from 1988 to 1991 (16 quarters) and calculate 
the percentage change in the monthly mean earnings estimate for companies where the number of estimates is 
greater than or equal to three in both months (Feb.-Mar., May-June, Aug.-Sept., and Nov.-Dec.).  For each of 
the 16 periods, we choose the 20 companies with the largest percentage change to form our portfolio.
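
The selection rule in step a) can be sketched as below.  The record layout (keys ticker, prev_mean, 
curr_mean, prev_n, curr_n) is hypothetical and is not the IBES file layout.  Dividing by the absolute value 
of the earlier estimate and skipping zero estimates are assumptions of this sketch; the report does not say 
how such cases were handled in the SAS programs.

```python
def select_top_revisions(records, top_n=20, min_estimates=3):
    """Return the tickers with the largest percentage increase in mean estimate.

    records -- iterable of dicts with the HYPOTHETICAL keys
               ticker, prev_mean, curr_mean, prev_n, curr_n
    """
    candidates = []
    for rec in records:
        # Require at least min_estimates analysts in both months.
        if rec["prev_n"] < min_estimates or rec["curr_n"] < min_estimates:
            continue
        # ASSUMPTION: a percentage change from a zero estimate is undefined, so skip it.
        if rec["prev_mean"] == 0:
            continue
        # ASSUMPTION: use |prev_mean| so a rise from a negative estimate counts as positive.
        pct_change = (rec["curr_mean"] - rec["prev_mean"]) / abs(rec["prev_mean"])
        candidates.append((pct_change, rec["ticker"]))

    # Largest change first; ties are broken alphabetically by ticker.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [ticker for _, ticker in candidates[:top_n]]
```

Calling the function once per formation month (March, June, September, December) would give the 16 
portfolios of 20 tickers each.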

b)  We calculate returns of our 16 portfolios in the following pattern:
March portfolio: April to next March;
June portfolio: July to next June;
September portfolio: October to next September;
December portfolio: January to next December.
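
The holding-period pattern in step b) can be generated mechanically, as in the following sketch (the 
function name holding_months is illustrative only):

```python
def holding_months(form_year, form_month, n_months=12):
    """List the (year, month) pairs of the holding period.

    The period starts in the month after portfolio formation and runs
    for n_months consecutive calendar months.
    """
    months = []
    year, month = form_year, form_month
    for _ in range(n_months):
        month += 1
        if month > 12:      # roll over into the next calendar year
            month = 1
            year += 1
        months.append((year, month))
    return months


# The 16 formation dates: March, June, September, December of 1988 to 1991.
formation_dates = [(y, m) for y in range(1988, 1992) for m in (3, 6, 9, 12)]
```

For the last portfolio, formed in December 1991, the holding period runs through December 1992, which 
would explain why the T-bill series extends to 1992.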
	
c)  Our data file covers 16 portfolios, with 20 companies in each portfolio and 12 monthly returns for each 
company.  Based on this data file, we can calculate the cumulative return of each portfolio.  With the T-bill rate 
and S&P data, we can calculate the alpha and beta for each portfolio.
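
The cumulative return calculation in step c) can be sketched as follows.  The sketch assumes that each 
portfolio is equally weighted and rebalanced every month, so the portfolio return in a month is the simple 
average of its stocks' returns; the report does not state whether the SAS programs rebalanced monthly or 
held the initial positions, so this is an illustration rather than a description of those programs.

```python
def portfolio_monthly_returns(stock_returns):
    """Average the stocks' returns month by month (ASSUMED monthly rebalancing).

    stock_returns -- list with one list of monthly returns per stock
    """
    if not stock_returns:
        raise ValueError("portfolio has no stocks")
    n_months = len(stock_returns[0])
    if any(len(series) != n_months for series in stock_returns):
        raise ValueError("all stocks need the same number of monthly returns")
    n_stocks = len(stock_returns)
    return [
        sum(series[m] for series in stock_returns) / n_stocks
        for m in range(n_months)
    ]


def cumulative_return(monthly_returns):
    """Compound monthly returns into one holding-period return."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    return growth - 1.0
```

For example, two stocks returning 10% and 0% in the first month and 0% and 10% in the second give a 
portfolio return of 5% in each month and a cumulative return of 10.25%.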


IV.  RESULTS, IMPLICATIONS, AND CONCLUSION

1.  Results

        The results are summarized in the following table (omitted).

        We have the following findings from our research:

(1)  Seven out of the sixteen portfolios have negative alpha;
(2)  Average beta is 0.98, close to the market beta;
(3)  As the following two graphs show, portfolios with higher alpha usually have lower beta, and vice versa.


Alpha for the 16 portfolios from March 1988 to December 1991
(omitted)
 
Beta for the 16 portfolios from March 1988 to December 1991
(omitted) 

2.  Conclusion

        The above table summarizes the comparison between HCD (1984) and our replication, which is 
marked as MBA (1997).  The third column, "Market", shows the expected numbers for a random portfolio.  
The table shows that the results of our replication of HCD (1984) are mixed, as opposed to the overwhelming 
evidence in HCD (1984).  In addition, the similarity between MBA (1997) and Market means that the 
investment strategy proposed by HCD (1984) added little value (if any) for investors from 1988 to 1991.  
This result confirms our expectation at the beginning of the data mining process:  the market is not very 
efficient in reacting to changes in earnings estimates, but the market does learn from history.  A new investment 
strategy can earn abnormal returns before the whole market knows about it, but it is impossible for any 
investment strategy to do so consistently.  This is consistent with a saying about rational expectations: "You can 
fool some people for some time, but you cannot fool all the people all the time."


V. LIST OF ATTACHMENTS

1.  Transforming data
1.1  SAS Programs: read-in the raw data
1.1.1  Read File 1 (summary)
1.1.2  Read File 2 (background)
1.1.3  Read File 3 (company)
1.1.4  Read File 4 (adjustments)
1.1.5  Read File 5 (sig codes)
1.1.6  Read File 6 (currency)
1.2  Oracle SQL Programs: create table, load data
1.2.1  Create Table 1 (summary)
1.2.2  Create Table 5 (SIG code)
1.2.3  Load Table 1 (summary)
1.2.4  Queries on Table 1 (summary)
1.3  PERL: transforming data to SAS format 
1.3.1  Cleaning the data for SAS

2.  Calculation by using SAS Programs
2.1  Data Combination
2.1.1  Sorting out 16 portfolios (ticker)
2.1.2  Create a data file with company CUSIP
2.1.3  Combining the S&P data with the IBES data
2.1.4  Loading 3-month T-bill monthly rate
2.2  Data Calculation
2.2.1  Combining three data-sets (portfolio, T-bill, and S&P) into one
2.2.2  Calculate the cumulative returns; Calculate the alpha for 16 portfolios




******(end of report)******