[Note for bibliographic reference: Melberg, Hans O. (1998), Visual presentation of non-linear correlation in n-dimensions: A speculation, www.oocities.org/hmelberg/papers/980219.htm]


Visual presentation of non-linear correlation in n-dimensions
A speculation


by Hans O. Melberg


Introduction
This is a speculation in the true meaning of the word. I do not know whether what I say can be proven or whether it is useful. Maybe it is all a big misunderstanding, in which case I simply reveal my ignorance. Yet, I want to present an idea I had while trying to visualize a figure of more than three dimensions. I then began to think that such visualizations were potentially important in discovering previously unknown patterns of causation and that computer programs could make more imaginative use of visualizations to show these connections.

An example
For instance, the quality of a soccer player depends on more than three variables: his ability to run fast, to handle the ball well, to read the game, to keep his pace over time and several other variables. Note that it does not help to be good at only one thing (e.g. running fast) if you cannot handle the ball. Moreover, there may be complex interaction effects; the value of ball-handling skills may depend on two other variables.

Now, traditional analysis could simply run a multiple regression, including interaction variables, to examine these relationships. But it is hard to create a visualized map of all the variables. Moreover, if the pattern of causation is complex enough, a simple regression may not reveal much, or it may require too many observations to reveal a reliable pattern. Maybe, just maybe, a visualization of many dimensions could help reduce this problem. Before I go into detail, and before I give more examples, a little history may be useful.

History
It is easy to see the relationship between temperature and the volume of a gas. One simply does an experiment and plots the relevant data in a figure: temperature on the horizontal axis and volume on the vertical axis. The same procedure can be repeated for any other two variables that are related in a simple fashion. Now, even if this procedure is obvious today, it has not always been so. In the 17th century R. Descartes discovered that it was very useful to combine two variables in a horizontal/vertical diagram (a Cartesian diagram/coordinate system). Moreover, this discovery made it much easier to see patterns - even infer causality - than it had been previously. Of course, large amounts of raw data could be used without the figure, but the visualization made it much easier to find patterns.

What patterns? First there is the simple linear relationship: a constant increase in the dependent variable for every unit of increase in the independent variable. Then there are many types of non-linear relationships: exponential (like population growth), hyperbolic, quadratic and so on.

If one believes that more than two variables are related in a system, one may make a three-dimensional map to visualize the relationship. Traditionally this was a bit difficult, but after the computer revolution this is no longer a problem. Using three-dimensional maps one can discover peaks and valleys which would be hard to find simply by looking at the raw data.

Now, if one wants to use more than three dimensions, one runs into a problem: it is hard to visualize. At least, this is what many people say. I am not so sure. Part of the problem is that computer-generated figures for analysing data have mainly been extensions of pen and paper figures. However, computer technology offers many more opportunities to discover patterns in data. Three of the major opportunities are movement, shape and sound. Moreover, even if we stick to traditional pen and paper approaches I think it is possible to present much more information in figures, and thereby - hopefully - discover new general classes of patterns (this time in n-dimensions).

Visualizing many dimensions: Some ideas
Imagine that the happiness of a person depends on many variables and that these variables interact in many ways. Here is one suggestion for visualizing this in a figure:
- y-axis: happiness
- x-axis: number of friends
- rotation of the box used to mark a point
- size of box
- colour (intensity) of box
- speed of blinking
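
The list above can be read as a mapping from one observation to one drawn box. The sketch below is only an illustration of that idea, not a finished program. The list does not say which variable drives rotation, size, colour or blinking, so the names `income`, `health`, `free_time` and `stress`, their value ranges, and the output ranges are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """Visual properties of one box in the figure."""
    x: float             # x-axis: number of friends
    y: float             # y-axis: happiness
    rotation_deg: float  # rotation of the box
    size: float          # size of the box
    intensity: float     # colour intensity, 0 (pale) to 1 (strong)
    blink_hz: float      # speed of blinking, in blinks per second


def scale(value, lo, hi, out_lo, out_hi):
    """Linearly rescale value from [lo, hi] to [out_lo, out_hi], clamped."""
    if hi == lo:
        return out_lo
    t = (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))
    return out_lo + t * (out_hi - out_lo)


def to_glyph(person):
    """Map one observation to a box.

    The extra variables and their ranges are assumptions for illustration.
    """
    return Glyph(
        x=person["friends"],
        y=person["happiness"],
        rotation_deg=scale(person["income"], 0, 100, 0, 90),
        size=scale(person["health"], 0, 10, 5, 25),
        intensity=scale(person["free_time"], 0, 24, 0, 1),
        blink_hz=scale(person["stress"], 0, 10, 0, 4),
    )


example = {
    "friends": 5, "happiness": 7,
    "income": 50, "health": 10, "free_time": 0, "stress": 5,
}
glyph = to_glyph(example)
```

A drawing routine would then render each `Glyph` on screen. Six variables are shown per point, and the hope is that clusters of similar-looking boxes would hint at a pattern.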


There are also many other ways of doing this. The boxes could be moving in a circle, or we could add one dimension by making the height of a box depend on one variable while the width depends on another. We could use rotating arrows (of different lengths and rotation speeds) attached to circles of different size, colour intensity and speed of blinking. More imaginative ideas include presenting data in some kind of planet system (rotation of each planet, distance between planets, rotation of planets around other planets). Alternatively, we could use sound much more actively and/or make the figure much more interactive and dynamic (allowing the user to "travel" into the figure and listen to the sounds associated with different values at that spot, and/or allowing sequences of pictures to be presented one after another to see if this reveals a pattern).
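
As a rough illustration of the sound idea, one variable could be turned into pitch. The sketch below assumes a value range chosen by the user and maps it onto two octaves starting at 220 Hz. The function name, the base frequency and the number of octaves are hypothetical choices, and actually playing the tone is left out.

```python
def value_to_frequency(value, lo, hi, base_hz=220.0, octaves=2.0):
    """Map a value in [lo, hi] to a pitch in Hz.

    Equal steps in the data give equal musical intervals, because pitch
    rises exponentially with the rescaled value. Values outside the
    range are clamped to its ends.
    """
    t = (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))
    return base_hz * 2.0 ** (octaves * t)


# Three points on a hypothetical 0-10 scale
low_tone = value_to_frequency(0, 0, 10)
mid_tone = value_to_frequency(5, 0, 10)
high_tone = value_to_frequency(10, 0, 10)
```

With these settings the middle of the scale sounds one octave above the bottom and the top sounds two octaves above it.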

One last idea for presenting multi-dimensional data, which is not mine, is Chernoff faces. In short, the various dimensions are reflected in the various features of a face (size of ears, degree of smile, amount of hair, etc.). This may well be a better idea than mine, since it is intuitively easier to understand (see E. Tufte's book about data visualization for more on this).

What is the point?
I do not know how useful these visualizations are. The figures may reveal a system of linear correlation, but this could be done in a simple ANOVA diagram too. I think the new contribution of these figures would be to make it easier to discover non-linear and unknown patterns (maybe indicating causation) which would be hard to discover using traditional techniques.
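
A small made-up example may show why a purely linear measure can miss a pattern that a picture would reveal at once. The sketch below computes the ordinary (Pearson) correlation coefficient for two invented data sets: one exactly linear and one exactly quadratic. The data are constructed for illustration and are not observations.

```python
import math


def pearson(xs, ys):
    """Ordinary (Pearson) linear correlation coefficient of two lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # y depends linearly on x
quadratic = [x * x for x in xs]    # y is fully determined by x, but not linearly

r_linear = pearson(xs, linear)        # close to 1
r_quadratic = pearson(xs, quadratic)  # 0: no linear correlation at all
```

In the quadratic case y is completely determined by x, yet the linear correlation is zero. A plot of the points shows the U-shape immediately, which is the kind of discovery I hope figures in more dimensions could make easier.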

How do I know that there are classes of non-linear correlations that can be discovered this way? Can I give some concrete examples of where this procedure would be useful? In short, I don't know. I have previously mentioned analysis of soccer players and happiness, but these were more illustrations than really good examples. A better example might be mixing metals and chemicals to create a new compound with special properties. For instance, in his reply to criticism from D. Hendry, M. Friedman said that he was skeptical of multiple regression. The reason was his experience with multiple regression when he was trying to find a compound metal with some required properties (as strong as possible, as light as possible and more). I do not want to argue that this is a good argument against D. Hendry, but I do think multi-dimensional visualizations might be of help in developing compound metals (and chemicals?). I also think the procedure could be used in the social sciences.

Conclusion
As I said in the introduction, this has been a speculation in the true meaning of the word. It may not be original, it may not be useful, and it need not be taken seriously. The way to go from here is to make a computer program to test whether the visualizations can be useful in real examples. Only then would the idea deserve to be taken seriously.



