[Note for bibliographic reference: Melberg, Hans O. (1997), Advanced Econometrics for Beginners - A review of Maddala (1992), http://www.oocities.org/hmelberg/papers/970519.htm]

 

Advanced Econometrics for Beginners
A review of Maddala (1992)

by Hans O. Melberg

Introduction to Econometrics
G. S. Maddala
Macmillan Publishing Company
New York, 1992 (2nd ed, first 1988)
631 pages, ISBN: 0-02-374545-2

Introduction
In the preface to Introduction to Econometrics Maddala writes that "There has been many important developments in econometrics during the last two decades, but introductory books in the field still deal mostly with what econometrics was in the 1960s. The present book is meant to familiarize students (and researchers) with some of these developments ..." (p. xv). To assess whether the book fulfils this aim, I shall first present the virtues of the book. Next, in the second part of the review, I discuss its shortcomings.

1 Virtues
This book has two major virtues. First, Maddala is very good at criticising and showing the weaknesses of the concepts presented. Second, the book deals with many new and important topics such as exogeneity, unit roots, cointegration, error correction models, model selection and specification testing. I shall examine these in turn.

1.1 Critical attitude
Whatever one may think about this book, one cannot argue that it is uncritical of traditional econometrics. I shall give several examples, starting with the problem of significance.

Most econometricians know that there is nothing sacred about the 1% and 5% significance levels. Moreover, they know that it is easy to make a variable significant by using a very large sample. Lastly, they know that statistical significance should not be confused with economic or practical significance - or, even worse, causal significance. Yet, somehow this knowledge does not filter through to students. One source of this problem is that textbooks do not discuss the issues in any detail. Maddala's book is an exception. He explicitly faces these problems and thereby gives students a feeling for the limitations of statistics (see pp. 30-32; see also p. 495).
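The point about sample size can be made concrete with a minimal sketch. It is an illustration added here, not an example from Maddala: it assumes a simple regression of y on one x in which the sample correlation stays fixed at a tiny value, and it uses the standard identity t = r sqrt(n - 2) / sqrt(1 - r^2) for the t-statistic of the slope. The function name is hypothetical.

```python
import math

def slope_t_statistic(r, n):
    """t-statistic of the slope in a simple regression, given the
    sample correlation r between y and x and the sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Assumed for illustration: a correlation of 0.01, so x accounts for
# only 0.01% of the variation in y whatever the sample size.
r = 0.01
for n in (100, 10_000, 1_000_000):
    print(n, round(slope_t_statistic(r, n), 2))
```

With these assumed numbers the t-statistic is roughly 0.1, 1 and 10: the same practically negligible relationship goes from "insignificant" to "highly significant" purely because the sample grows.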

A less well-known problem is that of testing for multicollinearity (explanatory variables that are so highly related that it is difficult to distinguish between their effects). Sometimes high standard errors of the regression coefficients are taken to indicate multicollinearity. However, as Maddala shows, it is possible to have high standard errors without multicollinearity, and low standard errors even when there really is multicollinearity (p. 271). To understand this, consider the formulas for the variances and covariance of the estimated coefficients in the case of two explanatory variables. We have:

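$$\operatorname{Var}(b_1) = \frac{\operatorname{Var}(e)}{S_{11}\,(1 - r_{12}^{2})}$$

$$\operatorname{Var}(b_2) = \frac{\operatorname{Var}(e)}{S_{22}\,(1 - r_{12}^{2})}$$

$$\operatorname{Cov}(b_1, b_2) = \frac{-\operatorname{Var}(e)\, r_{12}^{2}}{S_{12}\,(1 - r_{12}^{2})}$$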

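As we can see, there are three factors which affect the standard error of a regression coefficient:
1. Var(e), the variance of the error term
2. $S_{11}$ (for b1), the sum of squared deviations of x1 from its mean
3. $r_{12}^2$, the squared correlation between x1 and x2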

So, even if we have multicollinearity (a high $r_{12}^2$), the standard error may be low if Var(e) is low and $S_{11}$ is high. Similarly, we may have little multicollinearity (a low $r_{12}^2$), but a high standard error due to a high Var(e) and a low $S_{11}$. Thus, using the standard error as a diagnostic for multicollinearity can be deceptive. Moreover, it is also deceptive to focus on the size of $r_{12}^2$ as a measure of how big the problem of multicollinearity is (since this ignores the size of $S_{11}$ and Var(e)). The case is even worse with more than two explanatory variables, as Maddala shows in a highly instructive example (p. 272). With three explanatory variables we may have low $r_{12}^2$, $r_{13}^2$ and $r_{23}^2$, and still have great multicollinearity problems (for example, when x3 = x1 + x2). In sum, the standard error is not a good indicator of the existence of multicollinearity or its seriousness.
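A small numerical sketch of the two-variable formula may help. The numbers below are invented for illustration and are not taken from Maddala; the function simply evaluates the square root of Var(e) / (S11 (1 - r12^2)).

```python
import math

def se_b1(var_e, s11, r12_sq):
    """Standard error of b1 with two explanatory variables:
    sqrt( Var(e) / (S11 * (1 - r12^2)) )."""
    return math.sqrt(var_e / (s11 * (1.0 - r12_sq)))

# Illustrative numbers only (assumed, not from the book).
# Case A: strong collinearity, but little noise and much variation in x1.
se_collinear = se_b1(var_e=1.0, s11=10_000.0, r12_sq=0.95)
# Case B: almost no collinearity, but noisy data and little variation in x1.
se_orthogonal = se_b1(var_e=25.0, s11=50.0, r12_sq=0.05)

print(se_collinear, se_orthogonal)
```

In this constructed case the highly collinear regression has the smaller standard error (about 0.04 against about 0.73), which is why the standard error alone says little about multicollinearity.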

A third example of Maddala's critical attitude is revealed by his discussions of various statistical terms. I have already mentioned his discussion of the term "significant." However, Maddala also objects to the term "the null hypothesis" (p. 81); he wants to replace the term "rational expectations" with "model consistent expectations" (p. 433); and he thinks the concept of "Granger causality" should be replaced with "precedence" since it does not really imply causality as we use the term in most other contexts (p. 397). Whether one agrees with this or not, the discussion itself reveals a critical mind at work.

One could go on to give many more examples of how Maddala develops a critical attitude in the reader. For example, he discusses the difficulties of interpreting dummy variables (p. 310), the problems of using polynomial lags (p. 426), and the traditional solutions to autocorrelation (p. 244). However, I hope I have shown enough examples to convince the reader that Maddala does not simply present econometric procedures uncritically.

1.2 New topics
Being critical is good, but often easy. It is much more difficult to integrate a long list of fragmented critical remarks into a systematic critique. Finally, it is even more difficult to construct an alternative methodology which is better than the procedures being criticised. In a previous review I criticised Wonnacott and Wonnacott for presenting only fragmentary criticism. Maddala cannot be accused of the same. Much of his criticism is built on a particular view of econometrics. Moreover, he provides an alternative methodology. Both issues will now be discussed in more detail.

Maddala's view of econometrics is presented in a figure (p. 7). The key to this view is its emphasis on feedback - how theory affects data and how data affect theory formation. Traditional econometrics was mainly concerned with estimation of a given model, not the feedback between theory and data. Autocorrelation, for example, was viewed as a problem for estimation (since the estimators were no longer best linear unbiased estimators - BLUE). Viewed in this light, any mechanical procedure that could reduce autocorrelation would "solve" the problem. However, this does not explicitly consider the feedback process between econometrics and economic theory. Autocorrelation is a sign that something is missing from our theory. Thus, we should revise our theory, not simply find a "mechanical" fix for the problem. Thinking about this kind of feedback led econometricians to focus on the problem of model selection (how to choose between rival models, i.e. not just how best to estimate a given model) and specification/diagnostic checking (checking the adequacy of the model - not just estimating it). It is this focus which unifies some of the criticism of traditional econometrics.

How should we choose between rival models? Should we simply look at the $R^2$ (the coefficient of determination - indicating the percentage of variation explained by the variables)? The problem with this is that if you use the data to find the model (you simply determine which variables to include by choosing those which give you the highest $R^2$), it would be circular to use $R^2$ as a selection criterion. The data cannot be used to test the adequacy of a model when the same data have been used to derive this model! Instead, Hendry suggests the following criteria for a good model: 1. Data admissible (for example, it should not predict negative prices); 2. Theory consistent (no "mechanical" fixes without justification based on economic theory); 3. Weakly exogenous regressors (more on this later); 4. Encompassing (a model is better than a rival if it explains everything the rival explains and more - especially why the rival was sometimes wrong); 5. Stability (the parameters should be stable); 6. Data coherent (errors should be random).
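The circularity can be illustrated with a small simulation. This is a sketch under stated assumptions, not something from Maddala's book: y and all 200 candidate regressors are independent random noise, so no true relationship exists, and the "model" is chosen by picking the regressor with the highest in-sample R2. All names, sample sizes and the seed are arbitrary choices.

```python
import random

def r_squared(x, y):
    """Squared sample correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def noise(n, rng):
    """n independent standard normal draws."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

rng = random.Random(0)  # seed chosen arbitrarily
n_fit, n_new, n_candidates = 20, 2000, 200

# y and every candidate regressor are pure noise: nothing to discover.
y_fit, y_new = noise(n_fit, rng), noise(n_new, rng)
candidates = [(noise(n_fit, rng), noise(n_new, rng))
              for _ in range(n_candidates)]

# "Model selection": keep the regressor with the highest in-sample R2.
in_sample = [r_squared(x_fit, y_fit) for x_fit, _ in candidates]
best = max(range(n_candidates), key=lambda i: in_sample[i])

best_in_sample_r2 = in_sample[best]
fresh_data_r2 = r_squared(candidates[best][1], y_new)
print(best_in_sample_r2, fresh_data_r2)
```

The selected regressor looks useful on the data that selected it, while on fresh data its R2 falls back towards zero. The high in-sample R2 therefore says nothing about the adequacy of the model.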

This may seem abstract, but - as I will now try to show - this approach has led to many new and interesting developments in econometrics.

Take, for example, the second criterion - theory consistency. As traditional econometricians became increasingly aware of the problem of spurious correlation (when two variables are correlated but not causally related), they began to difference the data. That is, instead of regressing y on x, they regressed $y_t - y_{t-1}$ on $x_t - x_{t-1}$. There is little wrong with that, but economic theory usually has little to say about how the differences of variables are related. Typically, economic theory tells us something about the long-run relationship - the equilibrium toward which the economy is supposed to gravitate. For example, monetarists think that inflation - at least in the long run - is always caused by the money supply. In the short run, however, the relationship is not so clear-cut. Hence, a regression of differences need not reveal a significant relationship, while a regression in levels would. The same problem appears in the consumption function. In the long run we may consume a constant proportion of our income, but in the short run consumption often deviates from this desired ratio. A regression in differences may solve some technical estimation problems, but these models are not inspired by theory, they do not always have obvious theoretical interpretations, and we lose information about levels.

A constructive alternative to the traditional approach is the new emphasis on Error Correction Models. Such a model is a regression in differences, but it also includes a long-run equilibrium. Consider first the following equation:

$$y_t = a_0 + a_1 y_{t-1} + b_0 x_t + b_1 x_{t-1} + e_t$$

(Note how this model resembles the unrestricted model of traditional solutions to autocorrelation. See my review of Wonnacott and Wonnacott for more on this.)

Subtract $y_{t-1}$ from both sides, add and subtract $b_0 x_{t-1}$ on the right hand side, and rearrange. We get:

$$y_t - y_{t-1} = a_0 + b_0 (x_t - x_{t-1}) + (a_1 - 1)\, y_{t-1} + (b_0 + b_1)\, x_{t-1} + e_t$$

Define $s = (b_0 + b_1) / (1 - a_1)$. We then get:

$$y_t - y_{t-1} = a_0 + b_0 (x_t - x_{t-1}) - (1 - a_1)(y_{t-1} - s\, x_{t-1}) + e_t$$

This is an error correction model. It is, as we can see, based on differences, but - and this is the key for the present discussion - it also has a steady-state, long-run equilibrium. This long-run equilibrium is:

$$y = s\, x + \text{constant}$$

And if the (testable) restriction $a_1 + b_0 + b_1 = 1$ holds, the long-run equilibrium is:

$$y = x + \text{constant}$$
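The derivation can be checked numerically. The following is a minimal sketch with parameter values chosen only for illustration (they are assumptions, not estimates from any study): it iterates the levels equation with x held constant and the error set to zero, compares the result with s x + a0 / (1 - a1), and confirms that the levels form and the error correction form imply the same one-period change in y.

```python
def simulate_levels(a0, a1, b0, b1, x, y_start, periods):
    """Iterate y_t = a0 + a1*y_{t-1} + b0*x_t + b1*x_{t-1} with x held
    constant and the error term set to zero."""
    y = y_start
    for _ in range(periods):
        y = a0 + a1 * y + b0 * x + b1 * x
    return y

def ecm_change(a0, a1, b0, b1, x_now, x_prev, y_prev):
    """Change in y implied by the error correction form (error set to zero)."""
    s = (b0 + b1) / (1 - a1)
    return a0 + b0 * (x_now - x_prev) - (1 - a1) * (y_prev - s * x_prev)

# Assumed parameter values with |a1| < 1; they satisfy a1 + b0 + b1 = 1,
# so the long-run multiplier s is (up to rounding) equal to one.
a0, a1, b0, b1 = 2.0, 0.6, 0.3, 0.1
s = (b0 + b1) / (1 - a1)
x = 10.0

y_long_run = simulate_levels(a0, a1, b0, b1, x, y_start=0.0, periods=200)
y_equilibrium = s * x + a0 / (1 - a1)  # s*x + constant

# One-period change from the levels equation for arbitrary inputs.
levels_change = (a0 + a1 * 7.0 + b0 * 12.0 + b1 * 10.0) - 7.0
print(y_long_run, y_equilibrium)
print(levels_change, ecm_change(a0, a1, b0, b1, 12.0, 10.0, 7.0))
```

Under these assumptions the simulated path settles at the equilibrium value, and the "constant" in the long-run relation is a0 / (1 - a1).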

An example of applied work using this model is the consumption function estimated by Davidson, Hendry, Srba and Yeo (Economic Journal, 1978), based on quarterly data from 1958 to 1975 (here $\Delta_j$ denotes the difference over j quarters):

$$\Delta_4 c_t = 0.48\, \Delta_4 y_t - 0.23\, \Delta_1 \Delta_4 y_t + 0.09\, (c - y)_{t-4} + 0.006\, DUM - 0.12\, \Delta_4 p_t - 0.31\, \Delta_1 \Delta_4 p_t$$

The details need not bother us here (all the coefficients were significant, $R^2$ = 0.85, DW = 2.0 and the standard error was 0.0062). The point I want to emphasise is that there is a long-run solution in this model - ensured by the term $(c - y)_{t-4}$.

It is time to sum up. I started with the question of how to choose between different models. I then presented Hendry's list of six criteria - one of which was theory consistency. I then showed how error correction models - unlike the traditional difference models - include a long-run equilibrium, i.e. they are more consistent with theory than the traditional difference models. This shows how the new approach to econometrics has created a good alternative to the traditional models. Finally, this example illustrates how important the issue is, and that it deserves space in a textbook like Maddala's Introduction to Econometrics.

Another recent development is the renewed interest in exogeneity - and Maddala is one of the few authors who spend more than a few sentences on exogeneity. The standard definition of an exogenous variable is one that is not correlated with the error term. As Maddala explains, this approach is unsatisfactory for three reasons. First, it is arbitrary (we simply decide a priori which variables we want to call exogenous, and two different researchers may decide to treat different variables as exogenous). Second, it excludes - in order to achieve identification - some variables that should be included (the Liu critique). Finally, the coefficients in the equation will not be independent of the exogenous variables if people are rational, forward-looking individuals (the Lucas critique). The question is then what we can do about this.

The basic answer is to distinguish between different types of exogeneity and use tests to see which concept applies in a concrete situation. Leamer, for example, has suggested that we should distinguish between exogeneity in the sense of predeterminedness (i.e. when the variable is independent of the contemporaneous and future errors in the equation) and strict exogeneity (predeterminedness plus independence of past errors too). Engle, Hendry and Richard want to divide the concept of exogeneity into three: weak, strong and superexogeneity. The distinction follows from the view that exogeneity is only relevant if we first ask "Exogenous for what?" (which variables). The concepts are somewhat technical, but relatively easy (for more, see Maddala pp. 392-393). The important point to note, however, is that only weak exogeneity is required for efficient estimation, while superexogeneity is required if we want to use our model to make policy predictions. We may then use various tests, such as the Hausman test, to determine what kind of exogeneity we have.

The last point I want to discuss is how the new approach to econometrics has widened our toolbox. When we start focusing on choosing the right model, we need to develop a different set of tests than those needed for estimating a model. Maddala's textbook reflects this. He explains a (too?) large number of new tests used to examine the quality of various models. Particularly important, of course, is the examination of the residuals of the regression, since the residuals may reveal problems such as autocorrelation, heteroskedasticity, instability and many other problems indicating a need to rethink the model (remember Hendry's data coherence criterion for choosing between rival models). Maddala discusses this, and he makes the useful distinction between the residuals from the traditional regression, the predicted residuals (residuals for observations not used to estimate the model), studentized residuals (predicted residual divided by its standard error), BLUS residuals (constructed to have zero mean, be uncorrelated and have constant variance) and recursive residuals (see p. 481 for more on all of these). Once again we see how the new approach to econometrics really does have concrete and alternative proposals.
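The difference between ordinary and predicted residuals can be shown in a short sketch. It is an illustration with made-up data, not an example from Maddala, and it is restricted to a simple regression with one explanatory variable. It uses the standard result that the predicted residual equals e_i / (1 - h_i), where h_i is the leverage of observation i, and checks this against dropping the observation and refitting. The function names are hypothetical.

```python
def fit_line(x, y):
    """OLS intercept and slope for a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def ordinary_and_predicted_residuals(x, y):
    """Ordinary residuals e_i and predicted residuals e_i / (1 - h_i),
    where h_i = 1/n + (x_i - mean(x))^2 / Sxx is the leverage."""
    n = len(x)
    mx = sum(x) / n
    sxx = sum((a - mx) ** 2 for a in x)
    intercept, slope = fit_line(x, y)
    ordinary = [b - intercept - slope * a for a, b in zip(x, y)]
    leverage = [1 / n + (a - mx) ** 2 / sxx for a in x]
    predicted = [e / (1 - h) for e, h in zip(ordinary, leverage)]
    return ordinary, predicted

def predicted_residual_by_refit(x, y, i):
    """Drop observation i, refit, and predict it from the other points."""
    intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
    return y[i] - intercept - slope * x[i]

# Made-up data; the last point lies far from the others in x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 12.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
ordinary, predicted = ordinary_and_predicted_residuals(x, y)
print(ordinary[-1], predicted[-1], predicted_residual_by_refit(x, y, 5))
```

For the high-leverage point the ordinary residual is much smaller in absolute value than the predicted residual, because the fitted line is pulled towards that point. This is one reason why the ordinary residuals alone can hide a problem with the model.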

I hope I have convinced the reader that the new approach to econometrics has produced many new and interesting developments, and that Maddala's discussion of this at a relatively accessible level is a great quality of the book. There are many more topics which might be mentioned. For example, is it best to start with a very general model and simplify it using data-based criteria, or should we start with a simple model and gradually increase its complexity? (p. 493ff) What are cointegration, unit roots, and integrated and stationary time series? (p. 577ff) What is the difference between trend-stationary processes and difference-stationary processes, and what is the importance of distinguishing between the two? (p. 259) Maddala presents all these topics without being too technical, thus making them accessible (but not without effort) to undergraduates.

2 Weak points
One basic problem with this book is that the author cannot make up his mind whether he is writing a textbook for first-time students, or an introduction to recent developments for students who already know some econometrics. The consequence of this is a long book trying to please both camps. For example, to satisfy the beginner students, he has included a chapter on introductory statistics. This chapter is redundant for those who know some econometrics, while it is too brief for the beginner student. Another example is the many appendices using matrix algebra. These are too complicated for the beginner, while the advanced reader is bored with two explanations of the same material - one without and one with matrix algebra. Lastly, the advanced reader will not be pleased with the often repeated phrase "this topic/proof is beyond the scope of this book" (see, for example, pp. 75, 162, 230, 367, 368, 403, 476, 483, 526, 584). In this way the book sometimes falls between two stools - frightening the beginner and boring the more advanced student.

One might argue that the instructor or the reader is free to skip the chapters he thinks are too easy or too advanced. This is true, and I suspect this is what will happen. Students will be assigned one standard textbook (not Maddala), and some chapters from Maddala will be used in addition to this (such as chapter 12 on Diagnostic Checking, Model Selection, and Specification Testing and chapter 14 on Vector Autoregression, Unit Roots, and Cointegration). This is perfectly acceptable, but somewhat unfortunate. It is unfortunate because Maddala could have written an excellent introduction to econometrics for beginners. By cutting the appendices and reducing the number of topics, this could have been a perfect critical introduction to econometrics. On the other hand, by eliminating the elementary statistics and enlarging the discussion of new topics, this could have been a very good book for students who have already done introductory econometrics but who want an update on recent developments. As it is, we have two books in one, and the sum is not as excellent as the two books would have been had they been written separately.

Pedagogically the book is average. The language is clear and concise, but at least three improvements could be made. First, the book could make much more extensive use of figures. For example, Wonnacott and Wonnacott make great use of figures when explaining the problem of simultaneous equations and autocorrelation; Maddala does not - though he occasionally uses figures very effectively (see for example p. 90). Second, the discussion of theory could have been much better integrated with empirical examples and well-known real-life situations. For example, Maddala has a good discussion of type I and type II errors (pp. 30-31), but he nowhere presents the intuitively simple explanation that a type I error occurs when we convict an innocent person, while a type II error is equivalent to failing to convict a person who is guilty. Similarly, Maddala's presentation of the consequences of omitting relevant variables is theoretical (p. 161ff) - unlike Wonnacott and Wonnacott, who present a wonderfully intuitive example (involving yield, rainfall and temperature; see WW p. 96). Of course, Maddala also presents examples, but the point is that these are isolated from the theory. His strategy is to present a theory and then to give several examples. A better approach, I think, is first to present an example showing the problem and then to present the general theory. Third, and last, the layout of the book could be improved. The book is very uniform, with the same fonts and backgrounds everywhere. A better alternative would be to isolate some of the many examples in (optional) boxes with a dark background, to highlight important material in frames (as he does in the beginning, but for some reason stops doing later on), and to present some of the data material in a smaller font.

On a more positive note, Maddala has included very good summaries after each chapter, and he generates interest by presenting authors with rival views (conflict is always a good way to fight boredom!). He also gives excellent references for further reading on almost every single topic in the book.

Finally, I have some small quibbles. I did not understand why he discussed superexogeneity before strong exogeneity (pp. 392-3). Intuitively it should be the other way around, since superexogeneity is the "strongest" concept. I am also sceptical of the utility of discussing the method of moments as a method for estimating the regression coefficients. The least squares method is certainly needed, but why spend time on the method of moments? I also think it would be better to use the notation of deviations (small x, not big) right away, at least before deriving the variances and covariance of the regression coefficients (p. 77). Another small point is the discussion of slope dummies, which is made unnecessarily complicated by the inclusion of intercept dummies (p. 313). It is better to keep the two separate before bringing them together. There is also, I think, a small contradiction between the advice to use the reverse regression to discover discrimination (p. 72), and the later recommendation to use the reverse regression only when this makes sense in terms of the direction of causation (p. 75) - salaries do not cause qualifications, but the regression is still informative. Lastly, I was a bit surprised at Maddala's somewhat non-standard formula for the t-statistic (with the square root of n in the numerator and not in the denominator as is more common). However, these are minor quibbles.

Altogether
Maddala has produced a good book. It is critical, comprehensive, and it deals with many new developments in a way which many undergraduates may understand. Nevertheless, I would not recommend it as a textbook for beginners with little background in statistics/econometrics. It is simply a bit too frightening for beginners - sometimes too brief, yet overall too long, and sometimes too advanced. Still, beginners could benefit from individual chapters, and the book is good (but long) as a critical review after a basic course in econometrics. As such this book is recommended.


