A. B. Freeman School of Business
ISDS 775, Spring 1997
Large-scale Data Analysis
EARNINGS EXPECTATIONS AND SECURITY PRICES
(REPLICATION FOR 1988~91)
Gary Cao
May 10, 1997
Executive Summary
This project applies available software packages for large-scale data analysis to test the investment
strategy proposed by Hawkins, Chamberlin and Daniel in 1984. The data are extracted from the IBES
database, CRSP (S&P 500), and Bloomberg (T-bill rate). To analyze the portfolios' performance
from 1988 to 1991, we use Oracle and SAS as the primary tools to create six tables and perform the data
analysis for that period.
We replicate the HCD (1984) portfolio selection method to initiate 16 portfolios. The 16 portfolios'
returns over a 12-month holding period are calculated and compared with S&P 500 returns. Using regression
analysis, we compare the risk-adjusted returns with the returns required by CAPM to determine whether the
investment strategy can achieve excess returns.
The summary table compares HCD (1984) with our replication, which is
marked as MBA (1997). The third column, "Market", shows the expected numbers for a random portfolio.
The table shows that the results of our replication of HCD (1984) are mixed, as opposed to the overwhelming
evidence in HCD (1984). In addition, the similarity between MBA (1997) and Market means that the
investment strategy proposed by HCD (1984) added little value (if any) for investors during 1988 to 1991.
This result confirms our expectation at the beginning of the data mining process: the market is not very
efficient in reacting to changes in earnings estimates, but the market does learn from history. A new
investment strategy can earn abnormal returns before the whole market knows about it, but it is impossible
for any investment strategy to do so consistently. This is consistent with a saying about rational
expectations: "You can fool some people for some time, but you cannot fool all the people all the time."
I. PURPOSE STATEMENT
Large-scale data analysis, depending on the specific goals of the user, usually requires
collecting data from different sources, compiling them in a consistent format, and then performing various
data analysis and presentation tasks (online analytical processing, data mining, etc.). In this paper, we
go through this process, replicate some of the tests done by Hawkins, Chamberlin and
Daniel in their 1984 work Earnings Expectations and Security Prices (HCD 1984), and examine whether the
conclusion reached by HCD can stand the test of time.
II. PREVIOUS RESEARCH AND OUR PROJECT
HCD (1984) tried to find out whether generally available information about consensus earnings
expectations could be used to generate risk-adjusted excess returns, a challenge to the Efficient Market
Hypothesis (EMH). They used the Institutional Brokers Estimate System database (denoted as IBES
hereafter), which at the time of the research provided earnings estimates made by more than 70 brokerage
firms for over 2,400 stocks.
1. Steps in HCD (1984) Research
Creation of Screen #1 Portfolios
1) Select from I/B/E/S Screen #1 the 20 stocks with the largest increase in their mean estimates and
construct a set of 24 equally weighted portfolios initiated at quarterly intervals (i.e., at the end
of March, June, September and December) from 1975 to 1980.
Performance Before Risk Adjustment
2) Calculate the 12-month holding period returns of the above Screen #1 portfolios, and compare the
resulting returns with those of IBES and the S&P 500.
3) Calculate the average cumulative total returns of the 24 Screen #1 portfolios. Repeat this step for
different lengths of holding period (ranging from 1 to 12 months), and compare the results with the
corresponding return data of IBES and S&P 500.
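Step 3 can be illustrated with a minimal Python sketch. It assumes each portfolio is already reduced to a list of monthly portfolio-level returns (how the 20 stock returns are combined each month is not modeled here), and the function names and sample numbers are hypothetical, not taken from HCD (1984).

```python
from itertools import accumulate
from operator import mul


def cumulative_returns(monthly_returns):
    """Cumulative total return after each month, compounding monthly returns."""
    growth = accumulate((1.0 + r for r in monthly_returns), mul)
    return [g - 1.0 for g in growth]


def average_cumulative_returns(portfolios):
    """Average the cumulative return across portfolios for each holding period."""
    paths = [cumulative_returns(p) for p in portfolios]
    horizon = min(len(path) for path in paths)
    return [sum(path[h] for path in paths) / len(paths) for h in range(horizon)]


# Made-up monthly returns for two portfolios over three months (illustration only).
demo = [[0.10, -0.05, 0.02], [0.00, 0.10, 0.00]]
avg = average_cumulative_returns(demo)
```

Entry h of the result is the average cumulative return for a holding period of h + 1 months, which is the quantity compared against the benchmark for each holding length.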
Long Term Results
4) Over a 6-year investment horizon, roll over the portfolios using 3-month, 6-month and 12-month holding
periods, respectively. Compare the before-transaction-cost returns with those of IBES and the S&P 500.
Repeat this step for after-transaction-cost returns.
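The effect of transaction costs on a rollover strategy can be sketched as below. This is only an illustration under stated assumptions: the cost is modeled as a fixed fraction of wealth charged once per holding period, and the 2% figure is hypothetical. The source does not state how HCD (1984) measured transaction costs.

```python
def rollover_wealth(period_returns, round_trip_cost=0.0):
    """Terminal wealth of 1 unit rolled over successive holding periods.

    round_trip_cost is an assumed fraction of wealth paid once per holding
    period when the portfolio is replaced (hypothetical cost model).
    """
    wealth = 1.0
    for r in period_returns:
        wealth *= (1.0 + r) * (1.0 - round_trip_cost)
    return wealth


# Two holding periods earning 10% each, before and after a hypothetical 2% cost.
before_cost = rollover_wealth([0.10, 0.10])
after_cost = rollover_wealth([0.10, 0.10], round_trip_cost=0.02)
```

Because the cost is paid at every rollover, shorter holding periods pay it more often over the same horizon, so a strategy that ranks first before costs need not rank first after them.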
Risk-adjusted Performance
5) Run the regression to test if the Screen #1 portfolios can outperform the returns required by CAPM.
Regression Model: (Rp - Rf ) = a + b * (Rm - Rf)
where: Rp – return of Screen #1 portfolio;
Rf – risk-free rate (3-month T-bill rate);
Rm – market return (S&P 500 return);
b – portfolio's systematic risk;
a – portfolio's abnormal return after adjustment for risk.
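As an illustration of the regression model above, the following Python sketch estimates a and b by ordinary least squares on excess returns. The function name and the sample data are hypothetical. The sample is built so that the true values are a = 0.002 and b = 1.2.

```python
def capm_regression(rp, rm, rf):
    """OLS fit of (Rp - Rf) = a + b * (Rm - Rf); returns (a, b)."""
    y = [p - f for p, f in zip(rp, rf)]  # portfolio excess returns
    x = [m - f for m, f in zip(rm, rf)]  # market excess returns
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx          # slope: systematic risk
    a = y_bar - b * x_bar  # intercept: risk-adjusted abnormal return
    return a, b


# Made-up monthly data (illustration only, not from the study).
rf = [0.005, 0.005, 0.005, 0.005]
rm = [0.02, -0.01, 0.03, 0.00]
rp = [f + 0.002 + 1.2 * (m - f) for m, f in zip(rm, rf)]
alpha, beta = capm_regression(rp, rm, rf)
```

A positive a means the portfolio earned more than CAPM requires for its level of systematic risk. A significance test on a would also be needed before calling the excess return abnormal, which this sketch omits.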
In Comparison with Random Portfolios
6) Establish 1,000 random portfolios, each with 20 stocks randomly selected from the IBES universe.
7) After controlling for risk (by means of %MAD), compare the returns of the 24 Screen #1 portfolios with
those of the random portfolios.
8) Assuming the returns of the 1,000 random portfolios are normally distributed, calculate the probability
that a random portfolio can outperform a Screen #1 portfolio.
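Step 8 can be sketched in Python as follows, assuming a normal distribution fitted to the sample mean and standard deviation of the random portfolios' returns. The function name and numbers are hypothetical, and the %MAD risk control in step 7 is not modeled.

```python
from statistics import NormalDist, mean, stdev


def prob_random_beats(random_returns, screen_return):
    """P(random portfolio return > screen_return) under a fitted normal."""
    dist = NormalDist(mean(random_returns), stdev(random_returns))
    return 1.0 - dist.cdf(screen_return)


# Made-up 12-month returns for five random portfolios (illustration only).
random_returns = [0.08, 0.10, 0.12, 0.14, 0.16]
p_at_mean = prob_random_beats(random_returns, 0.12)
p_far_above = prob_random_beats(random_returns, 0.20)
```

A Screen #1 return equal to the random-portfolio mean gives a probability of one half. The further the Screen #1 return lies above that mean, the smaller the probability becomes.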
2. Conclusion and Implication of HCD (1984)
Results from the above HCD (1984) study were:
1) Performance Before Risk Adjustment: The Screen #1 portfolios outperformed the S&P 500 and IBES by a
significant margin, and this margin became larger as the holding period increased.
2) Long-term Results (Roll-over strategy): Screen #1 portfolios once again outperformed both benchmarks.
On a before-expense basis, the three-month rollover strategy offered the highest returns, while on an
after-expense basis, the six-month strategy did.
3) Risk-adjusted Performance: Twenty of the 24 Screen #1 portfolios achieved abnormal returns
after adjustment for their systematic risk.
4) In Comparison with Random Portfolios: In general, Screen #1 portfolios outperformed random portfolios
of similar riskiness. The study showed that the probability of a random portfolio beating a Screen #1
portfolio was less than 2%.
With the above results, HCD (1984) concluded that the stock market was imperfect, and that the market
did not react instantly and accurately to publicly available information on revisions in analysts' earnings
expectations. Investors could take advantage of revisions in consensus earnings forecasts and earn
systematic excess returns without bearing excess risk.
3. How HCD (1984) Fits into Our Project
As shown above, HCD (1984) cast enormous doubt upon the Efficient Market Hypothesis. HCD (1984)
used market data from 1975-1980. To examine whether HCD (1984) can stand the test of time, we
use a different time frame, 1988~1991. The steps we replicate are Steps 1, 2 and 5. To simplify
the process, we use only the S&P 500 as the benchmark, instead of both the S&P 500 and the IBES universe.
IBES provides most of the data needed, with S&P return data and the T-bill rate loaded into SAS from
other sources.
What do we expect from our replication of the HCD (1984) research before we set off? Under EMH,
the market aggregates information very efficiently. In this case, an investor cannot consistently realize
abnormal returns by reacting to changes in earnings estimates. However, if the market is not very efficient,
as observed in HCD (1984), it takes time (a matter of years, perhaps) for the market to incorporate the
information about earnings estimate changes into the price, and investors could achieve abnormal returns
during this period. On the other hand, the market does learn from history, and we expect that since the
publication of HCD (1984), the market has become more efficient in reacting to changes in earnings estimates.
If that is the case, we expect our analysis to show mixed evidence, as opposed to the overwhelming results
shown in HCD (1984).
III. DATA SOURCES AND PROCESSING
1. Data Sources
(1) Three-month T-Bill rate (monthly, 1988 to 1992) from Bloomberg;
(2) S&P 500 monthly return data from CRSP;
(3) Earnings estimate data from IBES.
2. Data Processing
1) Obtaining the IBES data
The IBES data is originally stored as an ASCII file in a professor's RS6000 account. The IBES data
has the following six linked files:
a) Summary;
b) Background;
c) Company ID;
d) Adjustments;
e) SIG Code;
f) Currency.
We use SAS programs to read the six files into text files (raw data). The SAS scripts are listed in
Attachment 1.1.
2) Loading IBES data to Oracle
We then use Oracle as a tool to "massage" the raw data into a format that can be used in SAS
programs: (1) we create a table space for File 5 (SIG codes); (2) we write a control file to load the raw
data obtained through the SAS programs; (3) we keep a log file to check the loading process; (4) after the
table is created, we run several queries to make sure the table content is correct. The Oracle scripts are
listed in Attachment 1.2.
3) Transforming Data to SAS format
Since different software packages use different data formats that are usually incompatible, we use
PERL to transform the raw data in the Oracle tables into a format that SAS can read and process. The
PERL scripts are listed in Attachment 1.3.
4) SAS Programming and Calculation
We replicate HCD (1984) research in the following steps:
a) We look at March, June, September, and December from 1988 to 1991 (16 quarters) and calculate
the percentage change in the monthly mean earnings estimate, keeping only companies with at least three
estimates in both months (Feb.-Mar., May-June, Aug.-Sept., and Nov.-Dec.). We choose the 20
companies with the largest percentage change to form our portfolio for each of the 16 periods.
b) We calculate returns of our 16 portfolios in the following pattern:
March portfolio: April to next March;
June portfolio: July to next June;
September portfolio: October to next September;
December portfolio: January to next December.
c) Our calculation covers 16 portfolios, with 20 companies in each portfolio and 12 monthly returns for each
company. Based on this data file, we can calculate the cumulative return of each portfolio. With the T-bill
rate and S&P data, we can calculate the alpha and beta for each portfolio.
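The selection rule in a) and the holding windows in b) can be sketched in Python as follows. This is not the SAS code from the attachments. The field names are hypothetical, and dividing by the absolute value of the earlier estimate (and skipping zero estimates) is our assumption about how to handle negative or zero earnings, which the text does not specify.

```python
def select_top_revisions(rows, top_n=20, min_estimates=3):
    """Tickers with the largest percentage change in the mean estimate.

    rows: dicts with hypothetical keys ticker, prev_mean, curr_mean,
    prev_n, curr_n (mean estimate and number of estimates in each month).
    """
    candidates = []
    for row in rows:
        if row["prev_n"] < min_estimates or row["curr_n"] < min_estimates:
            continue  # require at least three estimates in both months
        if row["prev_mean"] == 0:
            continue  # percentage change undefined (assumption: skip)
        pct = (row["curr_mean"] - row["prev_mean"]) / abs(row["prev_mean"])
        candidates.append((pct, row["ticker"]))
    candidates.sort(reverse=True)
    return [ticker for _, ticker in candidates[:top_n]]


def holding_months(year, formation_month, length=12):
    """(year, month) pairs for the months following the formation month."""
    months = []
    y, m = year, formation_month
    for _ in range(length):
        m += 1
        if m > 12:
            y, m = y + 1, 1
        months.append((y, m))
    return months


# Made-up estimate data (illustration only).
rows = [
    {"ticker": "A", "prev_mean": 1.00, "curr_mean": 1.20, "prev_n": 5, "curr_n": 5},
    {"ticker": "B", "prev_mean": 2.00, "curr_mean": 2.10, "prev_n": 4, "curr_n": 4},
    {"ticker": "C", "prev_mean": 1.00, "curr_mean": 2.00, "prev_n": 2, "curr_n": 6},
    {"ticker": "D", "prev_mean": 0.50, "curr_mean": 0.65, "prev_n": 3, "curr_n": 3},
]
picks = select_top_revisions(rows, top_n=2)
march_window = holding_months(1988, 3)
```

In this example, ticker C has the largest revision but is dropped because it has only two estimates in the earlier month. The March 1988 portfolio's window runs from April 1988 to March 1989, matching the pattern in b).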
IV. RESULTS, IMPLICATIONS, AND CONCLUSION
1. Results
The result is summarized in the following table (omitted).
We have the following findings from our research:
(1) Seven out of the sixteen portfolios have negative alpha;
(2) Average beta is 0.98, close to the market beta;
(3) As the following two graphs show, portfolios with higher alpha usually have lower beta, and vice versa.
Alpha for the 16 portfolios from March 1988 to December 1991
(omitted)
Beta for the 16 portfolios from March 1988 to December 1991
(omitted)
2. Conclusion
The above table summarizes the comparison between HCD (1984) and our replication, which is
marked as MBA (1997). The third column, "Market", shows the expected numbers for a random portfolio.
The table shows that the results of our replication of HCD (1984) are mixed, as opposed to the overwhelming
evidence in HCD (1984). In addition, the similarity between MBA (1997) and Market means that the
investment strategy proposed by HCD (1984) added little value (if any) for investors during 1988 to 1991.
This result confirms our expectation at the beginning of the data mining process: the market is not very
efficient in reacting to changes in earnings estimates, but the market does learn from history. A new
investment strategy can earn abnormal returns before the whole market knows about it, but it is impossible
for any investment strategy to do so consistently. This is consistent with a saying about rational
expectations: "You can fool some people for some time, but you cannot fool all the people all the time."
V. LIST OF ATTACHMENTS
1. Transforming data
1.1 SAS Programs: read-in the raw data
1.1.1 Read File 1 (summary)
1.1.2 Read File 2 (background)
1.1.3 Read File 3 (company)
1.1.4 Read File 4 (adjustments)
1.1.5 Read File 5 (sig codes)
1.1.6 Read File 6 (currency)
1.2 Oracle SQL Programs: create table, load data
1.2.1 Create Table 1 (summary)
1.2.2 Create Table 5 (SIG code)
1.2.3 Load Table 1 (summary)
1.2.4 Queries on Table 1 (summary)
1.3 PERL: transforming data to SAS format
1.3.1 Cleaning the data for SAS
2. Calculation by using SAS Programs
2.1 Data Combination
2.1.1 Sorting out 16 portfolios (ticker)
2.1.2 Create a data file with company CUSIP
2.1.3 Combining the S&P data with the IBES data
2.1.4 Loading 3-month T-bill monthly rate
2.2 Data Calculation
2.2.1 Combining three data-sets (portfolio, T-bill, and S&P) into one
2.2.2 Calculate the cumulative returns; Calculate the alpha for 16 portfolios
******(end of report)******