Matrices
 Introduction

Matrices are extremely handy for writing fast 3D programs. As you'll see they are just a 4x4 list of numbers, but they do have 2 very important properties:

A tranformation is simply a way of taking a set of points and modifying them in some way to get a new set of points. For example, if the user moves 10 units forward in a certain direction then the net result is the same as if all objects in the world moved 10 units in the opposite direction.

In 3D computer graphics it is often convenient to assume that the view-point is always at the origin <0,0,0>. Not only does it make it easier to render a scene but it often makes it a lot faster as well. Instead of having the user move around the world we can keep the user at the origin and make the objects move around instead. From the users point of view it will look exactly the same. Matrices are a fast and convenient tool to perform these transformations.

A Point in Space

As mentioned above, a point or vector can be represented in 3D space as <x,y,z>. When using matrix math it helps to represent it as <x,y,z,w>. That extra w there helps make things like movement easier to deal with. If we have a point in 3D space we can convert it to this new format by setting w to 1. In this text I'll be representing all points in space as a collumn vector:

|x|
|y|
|z|
|w| <- w=1
Modifying the Position of a Point

Let's say we want to take any point <x,y,z,w> given and do something to it to get a new point. A row vector can be used to represent how we want to change a point:

These values                      |x|
contain the                       |y|        This is our
info on how we ----> |A B C D|  . |z|   <--- point in 3D
want to change                    |w|        space
the point
To figure out how the row vector |A B C D| changes a point, we can visualize lying the point's collumn vector on it's side on top of it like this:
| x  y  z  w |
| A  B  C  D |     = (x * A) + (y * B) + (z * C) + (w * D)
Compare this to the artical on basic 3D math and you'll see that we are in fact taking the dot product of the two vectors. What we do above is mutiply each top item by the item under it and add the results up to get the answer.

Let's say we want to be able to take any point <x,y,z,w> and figure out the coordinates for the point <x',y',z',1> which is exactly 4 units to the "right" of it, ie the point which is 4 units further along the x axis. We can do this by using 4 row vectors. The first one will calculate the new x point (x'), the next one the new y point (y') and so on. First let's look at calculating x'.

We know that the new x point can be calculated like this: x' = x + 4, so a row vector to calculate this would be:

                     | 1  0  0  4 |
i.e. when we multiply this out by a point <x,y,z,w> we'll get:
(x * 1) + (y * 0) + (z * 0) + (w * 4) = x + 4
We also know that y' = y, z' = z and w = 1, so we can calculate the row vectors for each of the other values, and stack them all on top of each other:
x row vector ---->    | 1  0  0  4 |    x'=1*x + 0*y + 0*z + 4 = x + 4
y row vector ---->    | 0  1  0  0 |    x'=0*x + 1*y + 0*z + 0 = y
z row vector ---->    | 0  0  1  0 |    x'=0*x + 0*y + 1*z + 0 = z
1 row vector ---->    | 0  0  0  1 |    x'=0*x + 0*y + 0*z + 1 = 1
And VOILA! That's what a matrix is! To take a point <x,y,z,w> and calculate the new point we just mutiply the point by each of the row vectors.

Here's a more generic representation of what we are doing:

|x'|         | M11  M12  M13  M14 |    | x |
|y'|    =    | M21  M22  M23  M24 |    | y |
|z'|         | M31  M32  M33  M34 |    | z |
|w'|         | M41  M42  M43  M44 |    | w |
In this case <x,y,z,w> is the point we are popping in, <x',y',z',w'> is the point we'll be getting out, and each number in the matrix is represented by Mij, where i = row number and j = collumn number.

Following the line of reasoning above, we can figure out how to calculate the new point based on the given point and the matrix:

x' = (x * M11) + (y * M12) + (z * M13) + M14
y' = (x * M21) + (y * M22) + (z * M23) + M24
z' = (x * M31) + (y * M32) + (z * M33) + M34
w' = (x * M41) + (y * M42) + (z * M43) + M44
In practice, we don't really need that last equation, we know that it will always equal 1 so we don't have to calculate it (well it better equal 1, otherwise we've done something wrong!!).

Plugging a point into a matrix like this and getting a new point out is called "tranforming" a point by a matrix, just keep that in mind so you know what to call your procedure to do it!

A few of the more commonly used matrices follow. Now that you know how matrices work you should be able to derive these yourself, but heck I may as well save you the trouble.

Doing nothing

You often need a matrix to pop out exactly the same point as was plugged in. The matrix to do this is called the identity matrix:

|  1    0    0    0 |            x' = x
|  0    1    0    0 |            y' = y
|  0    0    1    0 |            z' = z
|  0    0    0    1 |
Translation

A translation is simply adding (or subtracting) a point to any given point. Let's say you want to add TX to x, TY to y and TZ to z. The matrix to do this is:

|  1    0    0    tx |            x' = x + tx
|  0    1    0    ty |            y' = y + ty
|  0    0    1    tz |            z' = z + tz
|  0    0    0    1  |
Scaling

Sometimes we may need to scale a point, ie mutiply each axis by a given number. This is handy for things like zoom effects.

|  sx   0    0    0 |            x' = sx * x
|  0    sy   0    0 |            y' = sy * y
|  0    0    sz   0 |            z' = sz * tz
|  0    0    0    1 |
Of course, if you only want to scale along the X axis (say) then you simply set SX to the scale value and set SY and SZ to 1.

Basic Rotations

Things start getting a wee bit tricky here. For starters we can rotate things around the x axis, the y axis, or the z axis. Notice how each axis forms a line in 3D space? Well we can also rotate things around any arbitrary line. This is handy if we want to do effects like objects rotating about their own axis (I'll discuss this a bit later).

The matrix for rotating around the x axis by angle é is:

|  1      0      0     0 |            x' = x
|  0    cos é  -sin é  0 |            y' = (cos é) * y - (sin é) * z
|  0    sin é   cos é  0 |            z' = (sin é) * y + (cos é) * z
|  0      0      0     1 |
The matrix for rotating around the y axis by angle é is:
|  cos é   0    sin é   0 |            x' = (cos é) * x + (sin é) * z
|    0     1     0      0 |            y' = y
|  -sin é  0    cos é   0 |            z' = -(sin é) * x + (cos é) * z
|    0      0    0      1 |
And the matrix for rotating around the zaxis by angle é is:
|  cos é   -sin é   0    0 |            x' = (cos é) * x - (sin é) * y
|  sin é    cos é   0    0 |            y' = (sin é) * x + (cos é) * y
|    0       0      1    0 |            z' = z
|    0       0      0    1 |
Mirroring

Mirroring involves flipping all points about an arbitrary plane in 3D space. This is illustrated in figure 1.

Mirroring a point about a plane
From the diagram we can easily see that to mirror a point we need to: The distance of a point from a plane can be calculated easily with the plane equation:
dist = p * normal + k
(Normally this equation assumes that the plane normal is a unit vector. In this particular case however this is not a requirement).

Based on the information above we can see that the final mirror equation is:

p' = p - 2 * dist * normal
   = p - 2 * (p * normal + k) * normal
   = p - 2 * (p.x*normal.x + p.y*normal.y + p.z*normal.z + k) * normal
Expanding this out we get the equation for each element in p':
p'.x = p.x - 2*p.x*nx*nx + 2*p.y*ny*nx + 2*p.z*nz*nx + 2*k*nx
p'.y = p.y - 2*p.x*nx*ny + 2*p.y*ny*ny + 2*p.z*nz*ny + 2*k*ny
p'.z = p.z - 2*p.x*nx*nz + 2*p.y*ny*nz + 2*p.z*nz*nz + 2*k*nz
where <nx,ny,nz>=normal. Thus the final mirror matrix for any given plane p*<nx,ny,nz>+k=0 is:
| 1-2*nx*nx    -2*nx*ny     -2*nx*nz     -2*nx*k |
|  -2*ny*nx   1-2*ny*ny     -2*ny*nz     -2*ny*k |
|  -2*nz*nx    -2*nz*ny    1-2*nz*nz     -2*nz*k |
|      0           0            0            1   |
(Note the common terms: m[0][1]=m[1][0], m[0][2]=m[2][0], and m[1][2]=m[2][1]).

Multiplying Matrices

So now that we know how to represent each of the different types of matrix operations, how do we combine them? By multiplying two matrices together:

P = B x A
If transforming a point by A gives one effect, and transforming it by B gives another, then transforming it by P alone gives you the same result as if you had transformed it by A then by B. You can keep mutiplying a matrix by as many other matrices as you like, and each time the end product will contain the info for all of them in the correct order.

To multiply B by A, you treat each collumn in A as a seperate collumn vector, and transform it by matrix B to get the new collumn. Let's look at doing it for the first collumn in A:

|P11|       |B11  B12  B13  B14|     |A11|
|P21|   =   |B21  B22  B23  B24|     |A21|
|P31|       |B31  B32  B33  B34|     |A31|
|P41|       |B41  B42  B43  B44|     |A41|
So that first collumn of P will be:
| (A11 * B11) + (A21 * B12) + (A31 * B13) + (A41 * B14) |
| (A11 * B21) + (A21 * B22) + (A31 * B23) + (A41 * B24) |
| (A11 * B31) + (A21 * B32) + (A31 * B33) + (A41 * B34) |
| (A11 * B41) + (A21 * B42) + (A31 * B43) + (A41 * B44) |
We need to split the A matrix up into it's 4 collumn vectors and transform each collumn vector by matrix B to get 4 new collumn vectors:
|P11|       |B11  B12  B13  B14|     |A11|
|P21|   =   |B21  B22  B23  B24|     |A21|
|P31|       |B31  B32  B33  B34|     |A31|
|P41|       |B41  B42  B43  B44|     |A41|
|P12|       |B11  B12  B13  B14|     |A12|
|P22|   =   |B21  B22  B23  B24|     |A22|
|P32|       |B31  B32  B33  B34|     |A32|
|P42|       |B41  B42  B43  B44|     |A42|
|P13|       |B11  B12  B13  B14|     |A13|
|P23|   =   |B21  B22  B23  B24|     |A23|
|P33|       |B31  B32  B33  B34|     |A33|
|P43|       |B41  B42  B43  B44|     |A43|
|P14|       |B11  B12  B13  B14|     |A14|
|P24|   =   |B21  B22  B23  B24|     |A24|
|P34|       |B31  B32  B33  B34|     |A34|
|P44|       |B41  B42  B43  B44|     |A44|
The resulting matrix is then made by combing these 4 collumn vectors back into a single matrix:
          | P11  P12  P13  P14 |
B x A =   | P21  P22  P23  P24 |
          | P31  P32  P33  P34 |
          | P41  P42  P43  P44 |
One thing to keep in mind is that the bottom row in any matrix will always be [0 0 0 1], so we can use this to slighty speed up matrix multiplication. Here's a general algorithm which will multiply 2 matrices:
Let A and B be our matrices,P will be the result (B x A).
var i,,j : integer;
for i := 0 to 2 do
    for j := 0 to 3 do
        begin
            P[i][j] := 0;
            for k := 0 to 3 do
                P[i][j] := P[i][j] + B[i][k] * A[k][j];
        end;
P[3][0] := 0; { Set the bottom row to 0 0 0 1 }
P[3][1] := 0;
P[3][2] := 0;
P[3][3] := 1;
The Inverse Matrix

Let's say we have two matrices A and B. If the following is true:

                | 1   0   0   0 |
A x B = B x A = | 0   1   0   0 |
                | 0   0   1   0 |
                | 0   0   0   1 |
then we say that B is the _inverse_ of A (and visa-versa). If you transform a point P1 by matrix A then you'll get a new point P2. If you then transform P2 by matrix B, it'll return P1. The two matrixes have the same effect but in "opposite" directions, eg if A moves all points 5 spaces to the "left" then B will move them 5 spaces to the "right". Similarly if A rotates space around a particular axis one way then B will rotate by the same amount around the same axis in the opposite direction.

There are several methods for calculating the inverse of a matrix, for large matrices the Gaussian method is preferred. Gaussian uses a technique called of "row reduction", first we create a large matrix like so:

    This is the matrix
    we are trying to                This is the identity
    find the inverse of                    matrix
|    A11  A12  A13  A14               1   0   0   0       |
|    A21  A22  A23  A24               0   1   0   0       |
|    A31  A32  A33  A34               0   0   1   0       |
|    A41  A42  A43  A44               0   0   0   1       |
Our goal is to perform various "elementary row operations" on this matrix to put it in the following form:
    Identity matrix              The inverse
|    1   0   0   0           B11  B12  B13  B14    |
|    0   1   0   0           B21  B22  B23  B24    |
|    0   0   1   0           B31  B32  B33  B34    |
|    0   0   0   1           B41  B42  B43  B44    |
In the above matrices A is our initial matrix and B will be the inverse of A.

There are three kinds of elementary row operations we can use:

Here is the basic algorithm to figure out which steps to perform as well as their order (this algorithm assumes that the bottom row of a matrix is always [0 0 0 1]) :
var i,j,row : integer;
    multiple,divisor : real;
    A : array[0..3][0..7] of real;
Set values A[0..3][0..3] to our 4x4 matrix
Set values A[0..3][4..7] to the 4x4 identity matrix
{ Loop through each row }
for row := 0 to 3 do
    begin
        { Make sure this row doesn't have a 0 in A[row][row] (use RULE 1) }
        if A[row][row] = 0 then
            begin
                find a row i where (i>row) and (i<3) and (A[i][row] <> 0)
                interchange rows 'i' and 'row'
            end;
         { Divide this row by a constant so that A[row][row] = 1 (use RULE 2) }
         divisor := A[row][row];
         for j := 0 to 7 do
             A[row][j] := A[row][j] / divisor;
 
         { Make all other elements in collumn 'row' a 0 (use RULE 3) }
         for i := 0 to 2 do
             if i <> row then
                 begin
                     multiple := A[i][row];
                     for j := 0 to 7 do
                         A[i][j] := A[i][j] - multiple * A[row][j];
                 end;
     end;
Return the matrix in the values A[0..3][4..7]
There are cases where a matrix doesn't have an inverse. If this happens then the first part of the algorithm above will fail, you won't be able to find a row where A[i][row] != 0. However, if you stick to the "normal" 3D operations discussed in this article (translation, rotation, scaling etc) then the matrices produced will always have an inverse.

One other very important thing to remember is that due to round-off errors you will often get values in the matrix such as 0.000001 which are supposed to be 0. When looking for a row where A[i][row] != 0 you must keep this in mind, and instead check that the absolute value is larger than some small number (eg 0.0001).

Rotating Around an Arbitrary Axis

Let's say we need to rotate space by about an arbitrary axis defined by the line o + kd (where o is the origin point <ox,oy,oz> and d is the directional vector <dx,dy,dz>). Here are the steps needed to perform this rotation:

1) First translate all of space so that the line of rotation starts at the origin. Do this by subtracting the line's origin point from all points. Here is the matrix needed, along with it's inverse:

    |   1    0    0   -ox  |            |   1    0    0   ox  |
F = |   0    1    0   -oy  |       F' = |   0    1    0   oy  |
    |   0    0    1   -oz  |            |   0    0    1   oz  |
    |   0    0    0    1   |            |   0    0    0   1   |
2) Next rotate space around the z axis until the line of rotation lies in the x/z plane:
     |   dx/v   dy/v   0     0  |            |   dx/v  -dy/v   0     0  |
G =  |  -dy/v   dx/v   0     0  |       G' = |   dy/v   dx/v   0     0  |
     |   0       0     1     0  |            |    0      0     1     0  |
     |   0       0     0     1  |            |    0      0     0     1  |
where v = SQRT(dx*dx + dy*dy). You will now have a new line of rotation with one end at the origin and the other end at the point <v,0,dz>.

3) Now rotate space around the y axis until the line of rotation lies along the z axis:

    |   dz/w   0    -v/w   0  |            |   dz/w    0    v/w    0  |
H = |    0     1     0     0  |       H' = |    0      1     0     0  |
    |   v/w    0    dz/w   0  |            |   -v/w    0    dz/w   0  |
    |    0     0     0     1  |            |    0      0     0     1  |
where w = SQRT(v*v + dz*dz) = SQRT(dx*dx + dy*dy + dz*dz).

4) At this point the axis of rotation is a line lying along the z axis between the points <0,0,0> and <0,0,w>, so you can rotate space around the z axis by :

    |   cos A     -sin A     0       0  |
W = |   sin A      cos A     0       0  |
    |     0          0       1       0  |
    |     0          0       0       1  |
5) So to do the entire rotation you must first transform space by F,G and H to align the axis of rotation along the z axis, then you do the actual rotation around the z axis using matrix W, then you transform space by H', G' and F' to put everything back into place. By multiplying these matrices together you can create a single matrix to do the entire rotation:
P = F' x G'x H' x W x H x G x F
Any points transformed by this matrix will be rotated about the line by an angle of A.

Note that this method is NOT a good way of rotating points about an arbitrary axis. The term 1/v in the G and G' matrices can result in a singularity. This can easily be fixed by replacing G and G' with the identity matrix when v is 0, but I'll be including a much faster and safer matrix in the final Win95GPE release.

Solving Simultaneous Equations

Imagine that we have the following equations:

a*x + b*y + c*z + d = 0
e*x + f*y + g*z + h = 0
i*x + j*y + k*z + l = 0
Now imagine that we need to find a point <x,y,z> which satisfies all 4 equations. These equations contain 12 unknowns, so it's clear that it would be very difficult to solve using "regular" algebra.

Matrices on the other hand provide us with a very useful tool for solving such problems. We can create a matrix like this:

|0|       |a  b  c  d |     |x|
|0|   =   |e  f  g  h |     |y|
|0|       |i  j  k  l |     |z|
|1|       |0  0  0  1 |     |1|
When we expand this out (as I showed earlier on in this article) we get the 4 equations we are trying to solve. We also know that we can use the inverse matrix to rearrange the equation:
                       -1
|x|       |a  b  c  d |     |0|
|y|   =   |e  f  g  h |     |0|
|z|       |i  j  k  l |     |0|
|1|       |0  0  0  1 |     |1|
These equations will expand out to the following:
x = A*0 + B*0 + C*0 + D*1 = D
y = E*0 + F*0 + G*0 + H*1 = H
z = I*0 + J*0 + K*0 + L*1 = L
(I've captitalised the letters since the inverse matrix will typically contain different values than the original).

In summary, if we create a matrix from 3 parametric equations and then calculate the inverse of that matrix then the 4th collumn will contain a point which satifies all 3 equations. If the inverse matrix operation fails then we know that the problem has no solution (or an infinate number of solutions).

Incidently the fact that we'll only wind up using the 4th collumn can help in optimising this procedure since we don't need to calculate the entire matrix.

2D Matrices

In this text I've been using 4x4 matrices for 3D coordinate geometry, but it's just as easy to use a smaller matrix for 2D (or higher), we simply drop the z term.

2D matrices are of the form:

|x'|       |M11  M12  M13 |     |x|
|y'|   =   |M21  M22  M23 |     |y|
|1 |       | 0    0    1  |     |1|
ie x' = M11*x + M12 * y + M13
   y' = M21*x + M22 * y + M23
The 2D identity matrix is:
         | 1    0    0  |              x' = x
         | 0    1    0  |              y' = y
         | 0    0    1  |
To translate by <tx, ty>:
         | 1    0    tx |               x' = x + tx
         | 0    1    ty |               y' = y + ty
         | 0    0    1  |
To scale by <sx, sy>:
         | sx   0    0  |               x' = x * sx
         | 0    sy   0  |               y' = y * sy
         | 0    0    1  |
And to rotate about the origin by A radians:
         | cos(A)   -sin(A)    0  |               x' = cos(A)*x  - sin(A)*y
         | sin(A)    cos(A)    0  |               y' = sin(A)*x  + cos(A)*y
         |   0         0       1  |
Practical Applications

This section is yet to be written. With all the math out of the way I'll be showing how to actually apply this knowledge to a 3D engine.

Important Note to Direct3D Users: Direct3D matrices are similar, but in the representation used by the help file the "x" and "y" dimensions are swapped around. Thus they show the translation matrix to be:

| 1  0  0  0 |
| 0  1  0  0 |
| 0  0  1  0 |
| tx ty tz 1 |
and so on.

 
 
 
Copyright (c) 1997 Mark Feldman (pcgpe@oocities.com) - All Rights Reserved
This article is part of The Win95 Game Programmer's Encyclopedia
Please retain this footer if you distribute this file.
 
Back to Graphics