INTRODUCTION

This project is an earnest attempt by Francis and Harish, carried out as part of the "Pattern Recognition" course under the guidance of Dr. John G. Harris.

Our motivation for this project came from a discussion about the sudden dip in the stock market after the September 11 attacks. We decided that we would first try classification and subsequently try to predict the next day, based on the previous day's closing. This led to the idea of a form of financial forecasting with three classes, indicating whether a person should buy, sell, or retain stock, based on the knowledge of good and bad trends obtained from the data. The dimensions include the various sectors, such as industrial, banking, finance, transportation, telecom, biotech, and computer, as well as the current day's market value, volume, composite, high, low, advances, and declines, to name a few. We studied the data recorded for all the working days in the year 2000 for all the bull sectors listed above.

From the composite, we arrived at a rule for how the classes should be segregated, based on sensitivity and keeping in mind the rule of thumb that 'the number of data points in each class should be at least 10 times the number of dimensions'. One further point needs to be discussed. This data, and the way the classes are defined as buy, sell, or retain, is aimed mainly at an active stock dealer, or at someone who holds a large number of shares in many of the sectors we have listed. For a person doing small-time dealing, the 'do nothing' or 'retain' class should be large and the 'buy' and 'sell' classes small. In other words, only a large stock fluctuation should encourage one to sell or buy, which is what happens in practice. In our case, however, there is active dealing, and the cumulative threshold will be crossed anyway, so the 'retain' class is comparatively small. We tried the data on different classifiers and compared them by varying the dimension, the number of classes, and the size of the data in each class. We also tested our classifiers with data from the year 1999 and found that a person trading according to our classifier would have made a substantial profit.
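The rule of thumb quoted above can be checked mechanically once the class labels are assigned. The following is only a minimal sketch: the function name `classes_below_rule`, the label strings, and the sample counts are made up for illustration and are not taken from our data.

```python
from collections import Counter


def classes_below_rule(labels, n_dimensions, factor=10):
    """Return the classes that have fewer than factor * n_dimensions samples.

    labels is a list with one class label per sample. The factor of 10
    comes from the rule of thumb; everything else here is an assumption.
    """
    required = factor * n_dimensions
    counts = Counter(labels)
    return {cls: n for cls, n in counts.items() if n < required}


# Hypothetical class sizes, for illustration only.
labels = ["buy"] * 90 + ["sell"] * 80 + ["do nothing"] * 40

# With 5 dimensions the rule asks for at least 50 samples per class.
print(classes_below_rule(labels, n_dimensions=5))
```

If a class falls short, one can either reduce the number of dimensions or move the class boundaries so that the class receives more samples.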

FEATURES:
Like the DOW and the Nasdaq, which build their classifiers based on the BULL of the market, we followed the same approach. For the various BULL sectors discussed earlier, the features listed below were used as the dimensions in our program.

Last Sale

Net Change

Share Volume

Previous Close

52 week high

52 week low

P/E Ratio

Total Shares Outstanding

Earnings Per Share (EPS)

Market Cap

Current Yield

Dividend Amount

Ex Dividend Date

Beta

HOW IT WAS DONE !!!
We classified the stock information as Buy/Sell/Do nothing.
Definition:

NET change == Changes due to the features (X1, X2, …, Xn)
Based on the history, the classifier can tell us whether we should:
Buy: if NET change is above a certain threshold value.
Sell: if NET change is below a certain threshold value.
Do nothing: if NET change is within a certain range.
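The three-way rule above can be sketched in a few lines. This is an illustration under stated assumptions, not our actual program: the function name `label_from_net_change` and the threshold values of +10 and -10 are placeholders, since the text only specifies "a certain threshold value".

```python
def label_from_net_change(net_change, buy_threshold=10.0, sell_threshold=-10.0):
    """Map a NET change to one of the three classes.

    Both thresholds are placeholder values chosen for illustration.
    """
    if net_change > buy_threshold:
        return "buy"
    if net_change < sell_threshold:
        return "sell"
    # Anything between the two thresholds (inclusive) is left alone.
    return "do nothing"


for change in (20.0, -15.0, 3.0):
    print(change, "->", label_from_net_change(change))
```

Widening the gap between the two thresholds enlarges the 'do nothing' class, which would suit a small-time dealer; narrowing it suits an active dealer.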

The prediction was based on the current stock information.
Example

         X1    X2    X3    X4    ...   Xn    NET
Nov 28   100   200   32    56    ...   21    +80
Nov 29   20    100   15    56    ...   100   +100

NET Change: 100 - 80 = +20
Classify as BUY
If all X's for today are known, the classifier can predict whether tomorrow's stock market will increase or decrease, based on the classification method discussed previously.
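One way to read the example is that each day's features are paired with the class derived from the change in NET between that day and the next. The sketch below builds such training pairs. The pairing scheme, the function names, and the thresholds are assumptions made for illustration, and the feature columns elided by "..." in the table are simply left out.

```python
def label_from_net_change(net_change, buy_threshold=10.0, sell_threshold=-10.0):
    """Three-way labelling with placeholder thresholds (an assumption)."""
    if net_change > buy_threshold:
        return "buy"
    if net_change < sell_threshold:
        return "sell"
    return "do nothing"


def build_training_pairs(rows):
    """Pair each day's features with the label of the change to the next day.

    rows is a list of (features, net) tuples in date order. The last day
    has no following day, so it produces no pair.
    """
    pairs = []
    for (features, net), (_, next_net) in zip(rows, rows[1:]):
        pairs.append((features, label_from_net_change(next_net - net)))
    return pairs


# The two rows from the example; only the columns shown in the table.
rows = [
    ([100, 200, 32, 56, 21], 80),    # Nov 28
    ([20, 100, 15, 56, 100], 100),   # Nov 29
]

print(build_training_pairs(rows))
```

With the placeholder thresholds, the change of +20 from Nov 28 to Nov 29 falls in the 'buy' class, in agreement with the example.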

CLASSIFIERS
Classifiers for our Decision Maker:

Bayes Classifier

Fisher as the Linear Classifier

Parzen Windows

k-NN Classifier

Neural Networks

We will discuss all of these in detail and show the plots in the Implementation.
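To give a flavour of the simplest classifier in the list, here is a minimal k-NN sketch. It is not the code used in the project: the function name `knn_predict`, the Euclidean distance, the majority vote, and the two-dimensional toy data are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training points.

    train is a list of (feature_vector, label) pairs. Distance is plain
    Euclidean distance, which assumes the features are on similar scales.
    """
    neighbours = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up two-dimensional training points, for illustration only.
train = [
    ([1.0, 1.0], "buy"),
    ([1.2, 0.9], "buy"),
    ([5.0, 5.0], "sell"),
    ([5.1, 4.8], "sell"),
    ([3.0, 3.0], "do nothing"),
]

print(knn_predict(train, [1.1, 1.0]))
print(knn_predict(train, [5.0, 4.9]))
```

Because features such as Share Volume and P/E Ratio live on very different scales, real data would normally be normalised before a distance-based classifier like this is applied.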
