Texture Mapping

Introduction

In the PCGPE, Sean Barrett wrote an extremely good article on texture mapping. In this version I would like to elaborate on many of the ideas he discussed. (The original article had a few minor bugs, which have been corrected here.)

Texture mapping naturally requires a fair amount of math. You might want to brush up by reviewing the Win95GPE articles on 3D Math and matrices. The most important thing to understand before trying to write a texture mapping routine is that there are three different parts to "texture mapping":

1) Transforming and perspectively projecting the polygon
2) Scan converting the polygon to the screen
3) Applying the texture map

It’s important to understand what each of these steps is and how they differ from each other.

Transforming and Perspectively Projecting the Polygon

In any given 3D world the camera will have a certain position and orientation, and each polygon you draw will have its own position and orientation. For simplicity you’ll probably just store the corner points of a polygon in 3D world space. You then need to figure out where each point is relative to the viewpoint. If you’re using a matrix to store the camera’s position and orientation then you can transform all the points in your world by the matrix inverse in order to obtain their positions relative to the camera.

The next step is to clip away those parts of a polygon which are not visible on the screen. Most of this step can be accomplished in 2D during the scan conversion, but at the very least you must clip the polygon to the Z=0 plane here so that only those portions of the polygon in front of the view point will be rendered. Note that during this step an extra point may be inserted into the polygon, as in the following case:

Drawing a line through any convex polygon will always divide it into 2 smaller convex polygons, so if our original polygon is convex then the resulting polygon will also be convex. Clipping a triangle by the z=0 plane like this may return a convex quadrilateral, so if your rendering routine only renders triangles you may have to split the resulting polygon into 2 triangles by drawing a line between any two opposing corner points.

With the polygon transformed and clipped to the front of the viewport we are ready to apply the perspective projection to the corner points. The equations to use are:

screenx = DIST * x / z
screeny = DIST * y / z

Note that DIST is an arbitrary value representing the distance between the view point and the projection plane. Many people assign DIST a value of 1, thus eliminating two multiplications. Unfortunately this gives a ridiculously high field-of-view (FOV). I find that setting DIST to the same value as the screen's horizontal resolution looks nice; this results in a FOV of about 53 degrees. If you're comfortable enough with matrices then you can also work the DIST multiplications into the matrix itself, thus getting the best of both worlds.

There is a problem with the equations above though: the points which are created during the z=0 clip have a z value of 0, and this will cause a divide-by-0 error in the above calculations. This stands to reason, since objects appear larger the closer they are. At z=0 the object's size is effectively infinite on the screen. To circumvent this we actually have to clip to a z value slightly larger than 0. I usually set this value, "DELTA", to be around 0.001 or so, but you can experiment to see what kind of effect a larger value has. Another solution is to still clip to 0 but add DELTA to all your z values during the perspective projection, thus shifting the entire world back a bit.
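
To make the clipping step concrete, here is a minimal sketch of clipping a convex polygon against the z=DELTA plane. The Point3D struct and the ClipToNearPlane name are only illustrative; adapt them to whatever structures your engine uses.

struct Point3D { double x, y, z; };

const double DELTA = 0.001;                        // near clipping plane

// Clips the convex polygon in[0..numin-1] against z = DELTA and writes the
// result to out[]. Returns the number of points in the clipped polygon
// (at most numin+1).
int ClipToNearPlane(const Point3D *in, int numin, Point3D *out)
{
 int numout = 0;
 for (int i = 0; i < numin; i++)
 {
  Point3D a = in[i];
  Point3D b = in[(i + 1) % numin];                 // next point, wrapping around
  int ain = (a.z >= DELTA);
  int bin = (b.z >= DELTA);
  if (ain)
   out[numout++] = a;                              // keep points in front of the plane
  if (ain != bin)                                  // edge crosses the plane: insert a new point
  {
   double t = (DELTA - a.z) / (b.z - a.z);
   out[numout].x = a.x + t * (b.x - a.x);
   out[numout].y = a.y + t * (b.y - a.y);
   out[numout].z = DELTA;
   numout++;
  }
 }
 return numout;
}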

One very important thing to keep in mind is that in this article I use the left-handed coordinate system. This means that x increases to the right, y increases going up and z increases going into the screen, as in the following diagram:

When rendering to the video display’s memory we won’t be using a z component, but we will use x and y. On most displays y increases going down, so we’ll have to flip the sign of all y values. Also the 3D origin point should be centered on the screen, so we must also offset all screen points by the center point coordinates. For the remainder of this article whenever I refer to "screenx" I mean "middlex + screenx", and whenever I refer to screeny I mean "middley - screeny", where <middlex,middley> is the location of the center screen pixel (ie <320,240> for a 640x480 display).
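
As a sketch, the projection and screen-coordinate conversion for a 640x480 display might look like this. Note that the results are left as floating point values; the next section explains why you should not round them to integers too early.

const double DIST = 640.0;                 // projection plane distance (see above)
const int middlex = 320, middley = 240;    // centre of a 640x480 display

// Projects a camera-space point <x,y,z> (with z >= DELTA) to raw display
// coordinates <sx,sy>.
void Project(double x, double y, double z, double *sx, double *sy)
{
 double screenx = DIST * x / z;            // perspective projection
 double screeny = DIST * y / z;
 *sx = middlex + screenx;                  // centre the origin on the screen
 *sy = middley - screeny;                  // flip y so it increases going down
}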

Scan Conversion

The next step is to scan convert the polygon to the screen. This means we’ll be calculating which points on the screen must be drawn. If we were writing a flat-shaded polygon routine then we’d be filling each pixel with the same color. Starting with the next section though I’ll be describing what needs to be done to render texels to them.

As I mentioned above, drawing a line through any convex polygon splits it into 2 smaller convex polygons. For any such line there are 3 ways it can intersect the polygon:

Perspectively projecting a straight line results in a straight line on the screen. The reverse is also true, so if we render a polygon to the screen by drawing horizontal lines then each line will intersect the polygon in one of the 3 ways shown above. Each line will thus have a left edge and a right edge. During scan conversion we determine what the screenx left and right edge values are for each horizontal line on the screen. The scan conversion itself is divided into 2 smaller parts:

As mentioned above, a 2D polygon routine would implement the second task by simply filling in each pixel with a solid color, whereas a texture mapper renders texels. Both however use the same logic for the task of determining the left and right edge values. (I should point out here that it’s possible and often faster to perform both these steps with a single routine. The algorithm to do so is obviously more difficult to implement though, and I won’t be discussing it in this article).

The first step in scan conversion is to loop through each of the polygon’s edges, figure out whether it’s a left or right edge and store the appropriate values in an array somewhere. We will need to store the left and right values for possibly each horizontal line on the screen, so the array size must equal the total y resolution of the screen we are rendering to:

int left[480], right[480]; // extent array for a 640x480 display

If a polygon’s points are stored in clockwise order then its corner points will also appear in clockwise order when it’s projected to the screen. The only time this wouldn’t be the case is if the polygon was facing away from the viewpoint. Most engines however perform backface culling, so this should never happen. Now if the 2D screen polygon points appear clockwise then for all the left edges the first point’s y value will be lower than the second point’s (since all our y values are flipped this means that the actual screen y value will be higher). Similarly, for all the right edges the first y value will be higher than the second. This is handy to know since it gives us a fast way of determining which of the extent arrays to update for any given edge.

So now we have an edge and we know whether it’s a left edge or a right edge. The next step is to figure out which screeny lines it intersects and set the appropriate values in the extent array. This is not quite as straightforward as it first appears, and it’s one of the most common stumbling blocks for people writing their first 3D engine. Typically your perspective projection equations return floating point values, and yet many people immediately store the resulting screen coordinates in integer variables. If you want a 100% accurate texture mapping routine then you absolutely cannot do this. Casting the float values will cause them to "snap" to the nearest integer values. The most they will ever snap is by +- 0.5 pixels, but this is enough to cause noticeable warbling in your textures, and in extreme cases can cause your application to hang! Consider the following edge being scan converted:

Casting the corner point values to integer will cause the top corner point to snap to <4,10>. That point however is outside the actual polygon itself. During texture mapping you’ll be grabbing texels from the texture and drawing them onto the polygon. Passing an invalid point to your texture mapping routine may cause it to read a texel that is outside the texture itself, and while DOS may be forgiving of invalid memory accesses, 32-bit protected mode operating systems such as Win95 are not! One way around this is to tile your textures, but this can slow your texture mapping routine down later on. Even then, it would still cause your textures to warble a little.

So to correctly scan a polygon we must do the following:

firstscreeny = ceil(top_point); 
lastscreeny = floor(bottom_point);

This will make sure that all edge points snap in the correct direction so that they fall on the inside of the polygon.

Or will it? During an online discussion, Chris Babcock pointed out the following case to me:

Let’s assume that both the red and purple polygons share an edge which has a y value of 11.0 exactly. Using the equations above means that the highest y value for the red polygon will be 11, and the lowest y value for the purple polygon will also be 11. This means that both polygons will draw a line of pixels at y=11, which should not happen in a correct scan converter. Without going into the details, it turns out that y=11 does not in fact fall inside the red polygon, even though the bottom edge just touches it. Chris showed me that the correct equations to calculate the y extents of an edge should instead be:

firstscreeny = ceil(top_point);
lastscreeny = ceil(bottom_point)-1;

Changing my scan converter to use these equations eliminated the remaining problems I had with the 3D engine I was developing at the time.

Now what should we do about edges that are horizontal on the screen? If such an edge is at y=11.2 (say) then that will correspond to firstscreeny = 12 and lastscreeny = 11. As it turns out, we can ignore these edges altogether. The endpoints of horizontal edges are connected to other edges in the polygon, and these edges will write the correct left and right values for the endpoints of the horizontal line.

So now we have the range of y edge values to change; the remaining step is to loop down them, calculate the x value for each and adjust the extent array accordingly. The code to do this is similar to a regular 2D line drawing algorithm. However, herein lies another common mistake made while developing these routines. Look at the following code stub and see if you can spot the problem. I'll assume this is for a right edge:

// ***** THIS CODE IS FAULTY ***** 
// Calculate screen y extents
firstscreeny = ceil(y1);
lastscreeny = ceil(y2)-1;
// Make sure it’s not a horizontal line 
if (firstscreeny > lastscreeny) 
 continue;
// Initialize x and deltax values 
float x = x1;                        // get the top point 
float deltax = (x2-x1) / (y2-y1);
// Loop down the edge 
for (y=firstscreeny; y<=lastscreeny; y++) 
{ 
 // do something with the x value here
 // update x 
 x += deltax; 
}

The problem is that x is initialized to the value of x1. Consider the following edge:

The top point on the edge has coordinate <4,10.4>, but the first y value that we update in the extent array is ceil(10.4)=11. At y=11 x no longer has the same value it did at y=10.4, it’s changed a bit. To correctly scan convert a polygon we must take this into account. The initialization code above should instead look like this:

float deltax = (x2-x1) / (y2-y1);
float x = x1 + (firstscreeny-y1) * deltax;

We now have enough information to loop through each edge, calculate the y extents, calculate the corresponding x value and update the appropriate array. The last thing to keep in mind is that the x values themselves are in floating point format and must be rounded in the correct direction so that they fall inside the polygon. Our inner loop must therefore look something like this:

// leftedge is a boolean value that we set higher up 
if (leftedge) 
 for (y=firstscreeny; y<=lastscreeny; y++,x+=deltax)
  left[y] = ceil(x);
else 
 for (y=firstscreeny; y<=lastscreeny; y++,x+=deltax)
  right[y] = ceil(x)-1;

As we do this for each edge we should maintain variables for the minimum and maximum screeny values we encounter. The final step is to loop down the extent array and render each horizontal line. For a flat shaded polygon we would do something like this:

for (y=miny; y<=maxy; y++)
 if (left[y] <= right[y])
  Line(left[y], y, right[y], y, color);

The check for left[y] <= right[y] is important. For some polygons containing particularly sharp corners the areas at the top and bottom may fall "in between" pixels and must be ignored (very thin polygons experience the same problem). This is a classic example of aliasing; the polygon itself has a high spatial frequency that the video card cannot adequately display. Increasing the display resolution can reduce this effect, but in a good 3D engine the "dropped" pixels will be "filled in" by neighbouring polygons.
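
Putting the pieces above together, a scan converter for a convex polygon might look something like the following sketch. It assumes the corner points have already been projected to raw display coordinates (i.e. middlex+screenx, middley-screeny), are supplied in the clockwise order discussed above, and lie entirely on the screen; the ScreenPoint and ScanConvert names are only illustrative.

#include <math.h>

struct ScreenPoint { double x, y; };       // raw (centred, y-flipped) display coordinates

int left[480], right[480];                 // extent arrays for a 640x480 display
int miny, maxy;                            // range of display lines touched by the polygon

void ScanConvert(const ScreenPoint *pts, int numpts)
{
 miny = 480; maxy = -1;
 for (int i = 0; i < numpts; i++)
 {
  ScreenPoint p1 = pts[i];
  ScreenPoint p2 = pts[(i + 1) % numpts];
  // In raw display coordinates a left edge runs from bottom to top
  int leftedge = (p2.y < p1.y);
  if (leftedge) { ScreenPoint t = p1; p1 = p2; p2 = t; }   // make p1 the top point

  int firstscreeny = (int)ceil(p1.y);
  int lastscreeny  = (int)ceil(p2.y) - 1;
  if (firstscreeny > lastscreeny)
   continue;                               // horizontal (or sub-pixel) edge: ignore it

  double deltax = (p2.x - p1.x) / (p2.y - p1.y);
  double x = p1.x + (firstscreeny - p1.y) * deltax;

  for (int y = firstscreeny; y <= lastscreeny; y++, x += deltax)
   if (leftedge) left[y]  = (int)ceil(x);
   else          right[y] = (int)ceil(x) - 1;

  if (firstscreeny < miny) miny = firstscreeny;
  if (lastscreeny  > maxy) maxy = lastscreeny;
 }
}

The flat shaded fill loop above can then run from miny to maxy.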

Applying the Map

With the scan conversion out of the way it’s time to apply the map. The single most important thing to realize here is that texture mapping does not simply involve "drawing a 3D texture". You are mapping a texture onto a polygon. Imagine a polygon in 3D space:

(Hopefully this simple diagram isn't too confusing, I’m a programmer, not an artist. The green object is the view volume. The corner points of the blue polygon project onto the projection screen, indicated by the yellow lines).

As I said above, the texture mapping itself is totally separate from scan conversion. You can change the mapping coordinates all you like and it will not affect which pixels actually get drawn on the screen. Only the corner points of the polygon determine that.

Once we know which pixels to plot on the screen the texture mapping equations indicate which texels to draw. For the moment, imagine that the texture is a rectangle which gets laid over top of the polygon:

(The "ghost" rectangle is the texture map, I've shown it semi-transparent just for illustration purposes). The original pixels on the screen still get drawn, but the actual color written to them comes from this texture map.

Now imagine that the map itself was at a different orientation:

Here we have a problem, because parts of the polygon don’t fall inside the texture. In cases like this we need to either expand the texture out so that it covers the entire polygon, or tile it (more on this later).

From looking at the above diagrams we can see that the polygon itself can be any shape, all we need is a routine to scan convert it and figure out which pixels need to be drawn. Then we run those screen pixels through the mapping equations I include later on in this article to find out which texel to draw for each one.

In general, the texture itself should be rectangular. It’s possible to use other shapes, but for real-time games rectangular textures are the way to go. If you want, say, a hexagonal texture, then you can put it inside a larger rectangular texture and simply define a polygon on top of it which only covers the appropriate pixels:

Textures, like screen pixels, are a two dimensional array of values. Just as screen pixels can be referred to as <x,y>, texture texels can be referred to as <u,v>. When you texture map, you must specify where the texture is in 3D space. With the actual polygon you simply define the coordinate of each of the corner points. The texture map is easier though, because we know that it's a rectangle. We only need to specify the coordinate of the upper left hand corner (the point P), the direction that u (i.e. the upper edge of the texture) moves in 3D space (the vector M), and the direction that v (i.e. the left edge of the texture) moves in 3D space (the vector N). The first value is a point and the other 2 are vectors, but we can easily calculate all 4 corner points from these 3 values:
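
In terms of these three values the four corner points of the texture rectangle are simply:

upper left   =  P
upper right  =  P + M
lower left   =  P + N
lower right  =  P + M + N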

Fortunately, these 3 values are all we need to map the rectangular texture onto any polygon; they are used to calculate 3 more "magic vectors" discussed below. We could draw two polygons side by side, and if they both contained the same mapping coordinates then the texture would appear continuous across them (this is very handy for mapping complex shaped polygons, since we can split them up into simpler objects such as triangles and apply the same mapping coordinates).

At the start of this article I discussed how when you render a scene you must transform all points of a polygon to find their positions relative to the viewpoint. You also need to transform P, M and N for the texture maps. Transforming P is easy enough since it’s a regular point, simply transform it by the matrix as you would a polygon corner point. To transform the vectors M and N I simply calculate the actual upper right and lower left points for the texture map, transform those, then subtract the new P from the results. There’s probably a faster way of doing this, but I find that this operation takes up so little CPU time that I’ve never bothered to find out how. It could though make a very big difference if you need to render a whole lot of really small texture maps.
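
A minimal sketch of that approach, assuming a hypothetical Matrix type and a Transform() function that applies the camera's inverse matrix to a point (and using the CPoint3D class shown later in this article):

void TransformMapping(const Matrix &caminverse, CPoint3D &P, CPoint3D &M, CPoint3D &N)
{
 CPoint3D upperright = P + M;                 // actual corner points of the map
 CPoint3D lowerleft  = P + N;
 P          = Transform(caminverse, P);       // transform them as regular points
 upperright = Transform(caminverse, upperright);
 lowerleft  = Transform(caminverse, lowerleft);
 M = upperright - P;                          // then recover the vectors
 N = lowerleft  - P;
}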

Another important thing to realize is that the texture rectangle itself does not even have to lie on the same plane as the polygon to which it’s mapped. Let’s say you’re rendering a room, and for HSR (hidden surface removal) purposes the room must have a roof on it, but you want it to look like there is no roof and the user can see a blue sky with clouds. If you were to map a sky texture directly onto the roof it would be obvious that the roof was a low lying polygon. What you can do however is give the texture mapping origin point (P) a much higher y value. Thus, as the user moves around the room the sky texture will only appear to shift very slightly, giving the impression that the roof is much higher than it really is (just as objects appear smaller the further away they are, they also appear to shift less). By shifting P along the x and/or z axis you can even give the impression of moving clouds!

I frequently get e-mail from people asking me to clarify various aspects of this article, and it seems that the most confusing aspect is that of the texture mapping coordinates (P, M and N). I think a lot of the confusion is due to the fact that both Sean Barrett and I have written articles which assume that all polygons are rectangles and are mapped with textures the same size as themselves (hence P=P1, M=P2-P1, N=P3-P1 etc.). "Snapping" the texture to the polygon like this really restricts your engine and doesn’t allow its true power to be fully utilized. It’s great when you first start out with the engine development because it gives you a nice fast way of quickly assigning mapping coordinates, but in a real application you’ll want to let the graphics artists fiddle around with the mapping as they please (in a commercial project this unfortunately means writing tools which allow graphics artists to do this...in my experience artists usually aren’t too keen on writing script files).

So now we come to the texture mapping equations themselves. There are numerous ways you can implement this; I’ve covered the seven most common techniques I’m aware of. Below is a table of each method discussed in this article along with how I personally rate them relative to each other. These results have been obtained through my own experimentation and should in no way be considered the final say. Furthermore, some of the techniques can be tweaked to trade accuracy for speed, and vice versa.

Technique               Accuracy     Speed        Implementation
"Perfect" Mapping       Excellent    Bad          Easy
Affine Mapping          Bad          Excellent    Easy
Area Subdivision        Ok           Poor         Ok
Scanline Subdivision    Excellent    Good         Ok
Parabolic Mapping       Poor         Good         Easy
Hyperbolic Mapping      Excellent    Ok           Very Difficult
Constant-Z Mapping      Poor         Ok           Difficult

Perfect Mapping

The following algorithm is for perfect perspective correct texture mapping. The first time I saw it was in Sean Barrett's PCGPE article. Sean uses 9 "magic numbers", which I prefer to present as 3 "magic vectors".

First you need the P, M and N values I discussed above. We then plug these into the following equations to generate 3 "magic vectors":

{ x is the cross product between two vectors } 
A = P x N
B = M x P
C = N x M

For each pixel on the screen we then do this:

{ S is a vector, * is the dot product between two vectors }
S = <screenx, screeny, DIST>
a = S * A
b = S * B
c = S * C
u = texturewidth * a / c 
v = textureheight * b / c
col = texture texel at <u, v> 
Draw pixel at <screenx, screeny> with color 'col'

Once this basic texture mapping routine is working we can start to optimize it. Personally I prefer to use a C++ class with overloaded operators:

class CPoint3D 
{ 
public:
 // Constructors
  CPoint3D() {}
  CPoint3D(const CPoint3D &p) {*this = p;}
  CPoint3D(const double xi, const double yi, const double zi) {x=xi; y=yi; z=zi;}
 // Vector addition and subtraction 
  CPoint3D operator +(const CPoint3D p) {return CPoint3D(x+p.x, y+p.y, z+p.z);}
  CPoint3D operator -(const CPoint3D p) {return CPoint3D(x-p.x, y-p.y, z-p.z);}
  CPoint3D operator +=(const CPoint3D p) {*this = *this + p; return *this;}
  CPoint3D operator -=(const CPoint3D p) {*this = *this - p; return *this;}
 // Scaling 
  CPoint3D operator *(const double s) {return CPoint3D(x*s, y*s, z*s);}
  CPoint3D operator /(const double s) {return CPoint3D(x/s, y/s, z/s);}
  CPoint3D operator *=(const double s) {*this = *this * s; return *this;}
  CPoint3D operator /=(const double s) {*this = *this / s; return *this;}
 // Dot product 
  double operator *(const CPoint3D p) {return x*p.x + y*p.y + z*p.z;}
 // Cross product
  CPoint3D operator %(const CPoint3D p) {return CPoint3D(y*p.z-z*p.y, z*p.x-x*p.z, x*p.y-y*p.x);}
  CPoint3D operator %=(const CPoint3D p) {*this = *this % p; return *this;}
 // And finally, the actual x,y,z values
  double x,y,z; 
};

I've shown the most commonly used overloaded operators; my full code also contains operators for transforming a point by a matrix and so on. Feel free to add more of your own.

We can now initialize our magic vectors with the following code:

// Let P, M and N be CPoint3D structures we’ve already initialized 
CPoint3D A(P % N);
CPoint3D B(M % P);
CPoint3D C(N % M);

I use the modulus sign (%) to denote the cross product between two vectors. As far as I know the modulus operator is not defined for vectors, so we can safely overload it for this purpose.

Our inner loop for a single scan line should now look something like this:

for (screenx=left[y]; screenx<=right[y]; screenx++) 
{ 
 CPoint3D S(screenx, screeny, DIST);
 double a = S * A;
 double b = S * B;
 double c = S * C;
 int u = texture.width*a/c;
 int v = texture.height*b/c;
 color = texture.data[v * texture.width + u];
 SetPixel(screenx, screeny, color); 
}

The first thing to notice is that we are scanning from left to right and incrementing screenx by 1 after each pixel is drawn. We can use this to optimise the calculation of S and the location of the pixel address in the display buffer. Here's the new loop:

CPoint3D S(x1, screeny, DIST);          // x1 = left[y]
char far * screen = &buffer[(100-screeny)*320+x1-160]; // or whatever
for (screenx=left[y]; screenx<=right[y]; screenx++) 
{
 double a = S * A;
 double b = S * B;
 double c = S * C;
 S.x++;
 int u = texture.width*a/c;
 int v = texture.height*b/c;
 *screen++ = texture.data[v * texture.width + u];
}

The next optimization has to do with the scalar values a, b and c. By multiplying the first dot product out we see that a = S.x*A.x + S.y*A.y + S.z*A.z. The only value which changes over the course of a single scan line is S.x. We know that S.x is incremented by 1 for each iteration through the loop so all we need to do is add A.x to a. The same reasoning applies to b and c. We don't even need to use S anymore so we can remove the line that increments it:

CPoint3D S(x1, screeny, DIST);          // x1 = left[y]
char far * screen = &buffer[(100-screeny)*320+x1-160]; // or whatever 
double a = S * A;
double b = S * B;
double c = S * C;
for (x=left[y]; x<=right[y]; x++)
{
 int u = texture.width*a/c;
 int v = texture.height*b/c;
 *screen++ = texture.data[v * texture.width + u];
 a += A.x;
 b += B.x;
 c += C.x;
}

The final step is to try and eliminate the two multiplications. This can be accomplished by pre-multiplying A and B outside the loop. Note that we must pre-multiply outside the y loop as well, otherwise the vectors will be multiplied once for each line:

A *= texture.width;
B *= texture.height;
.
.
.
y loop starts somewhere here
.
.
.
CPoint3D S(x1, screeny, DIST);          // x1 = left[y]
char far * screen = &buffer[(100-screeny)*320+x1-160]; // or whatever
double a = S * A;
double b = S * B;
double c = S * C;
for (x=left[y]; x<=right[y]; x++)
{
 int u = a/c;
 int v = b/c;
 *screen++ = texture.data[v * texture.width + u];
 a += A.x;
 b += B.x;
 c += C.x;
}

One final thing to keep in mind is that a, b and c can often be stored in 16:16 fixed-point arithmetic, thus making your entire inner loop integer-only.
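
As a sketch only (and assuming the values fit comfortably in 32 bits; in practice you may need to rescale the A, B and C vectors to make them fit), the conversion might look like this:

// Convert a, b, c and their per-pixel steps to 16:16 fixed point outside
// the loop. The 16:16 scale factors cancel out in the divides.
long fa = (long)(a * 65536.0),   fb = (long)(b * 65536.0),   fc = (long)(c * 65536.0);
long da = (long)(A.x * 65536.0), db = (long)(B.x * 65536.0), dc = (long)(C.x * 65536.0);
for (x=left[y]; x<=right[y]; x++)
{
 int u = (int)(fa / fc);
 int v = (int)(fb / fc);
 *screen++ = texture.data[v * texture.width + u];
 fa += da;
 fb += db;
 fc += dc;
}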

Affine Mapping

Affine mapping uses bilinear interpolation to render. This simply involves calculating the map points at either end of a scan line and linearly stretching the texture map texels between them. Affine mapping is akin to orthographic projection. It works well when there is a small angle between the view direction and the surface normal of the object being rendered, since in these cases the z distance varies little across the face. However, as the angle increases the z distance varies accordingly and the object starts to look distorted. This is particularly noticeable when the object is close to the viewpoint, since changes in z across the face are then large relative to the distance to the viewpoint.
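
As a sketch, the affine inner loop for a single scan line might look like this, assuming the exact u and v values at the two ends (uleft, vleft and uright, vright) have already been calculated (with the perfect mapping equations above, for instance) and using the same screen pointer and texture set-up as the earlier loops:

int len = right[y] - left[y];
if (len < 1) len = 1;                      // avoid dividing by zero on 1-pixel lines
double u = uleft,  v = vleft;
double du = (uright - uleft) / len;
double dv = (vright - vleft) / len;
for (x=left[y]; x<=right[y]; x++)
{
 *screen++ = texture.data[(int)v * texture.width + (int)u];
 u += du;
 v += dv;
}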

Despite its obvious problems, affine mapping is a very useful rendering algorithm. When combined with other rendering techniques it can significantly speed up the rendering process with little to no visually noticeable distortion. How this is done will become apparent later in this article.

There are times however when it's possible and indeed advantageous to use affine mapping alone, particularly in cases where the z distance is constant along an entire scanline. Consider a 3D sprite object which is intended to look the same no matter which direction it is viewed from. To render such an object you would run it through the perspective projection equations to determine the size at which to render it, then use affine mapping to actually draw it to the screen.

The Wing Commander series took this one step further by storing sprite images viewed from a variety of different angles. To render an object it simply selected the appropriate image (based on its position and orientation relative to the viewpoint) and rendered it using affine mapping. Wolfenstein 3D took this concept another step forward and used it to render walls. The walls were drawn one vertical scanline at a time, and since the distance between the viewpoint and all the pixels in a single strip was constant, affine mapping did the job perfectly.

Major Stryker and DOOM of course took things yet another step forward and also applied it to floors. Floors were rendered using horizontal scan-lines, and again the z distance along each scanline was constant. For a while it seemed that this same technique could be applied to surfaces with ANY orientation, but this introduced some new problems which I'll discuss below (see the section on free-direction texture mapping).

I remember seeing an interesting program on the Discovery Channel in which they claimed that human beings have a tendency to naturally orient themselves "right way up". Astronauts, skydivers and gymnasts have to train themselves to overcome this in order to maneuver efficiently in 3D space. This is even evident in our architecture: surfaces we walk on tend to be horizontal while walls tend to be vertical.

This aspect of human nature is especially applicable to games. I find games such as DOOM, Duke Nukem and Quake to be very easy to maneuver in, mainly because I always know which way is "up". On the other hand, games like Descent and Wing Commander, while being very good games, tend to take some time getting used to and often leave me feeling lost and disoriented.

This is tremendously good news to programmers, since it means that for many games most of the surfaces we render will, in all likelihood, be horizontal or vertical, particularly if we design the engine to automatically align the user correctly whenever they are standing on a flat surface. Writing extra routines to handle these vertical and horizontal special cases is tedious, but well worth the extra effort.

Area Subdivision

This section is still being written. It is the same as affine mapping, but you split large polygons into smaller pieces to minimize the errors.

Scan-line Subdivision

This method was first brought to my attention by Jouni Mannonen (DarkShade on #coders). It is similar to area sub-division, but instead of subdividing an area you subdivide along the scan-line.

Affine mapping is extremely fast, but errors accumulate along the scan line as you render. Thus, its accuracy is largely dependent on the length of each scan-line. Scan-line subdivision compensates by splitting each scan-line into smaller segments and eliminating the error as it moves along. The main advantages are that the expensive perspective-correct calculations only need to be done at the segment boundaries, the error within each segment is bounded by the segment length, and the segment length gives you a simple way of trading accuracy for speed. A sketch of the idea is shown below.
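
This sketch assumes a hypothetical ExactUV() routine that returns the perspective-correct u and v at a given screen x (using the a/c and b/c equations from the perfect mapping section) and a hypothetical DrawTexel() helper:

const int SEGLEN = 16;                       // segment length: tweak for speed vs accuracy

void DrawSubdividedLine(int y, int xleft, int xright)
{
 double u0, v0, u1, v1;
 ExactUV(xleft, y, &u0, &v0);                // exact values at the start of the line
 int x = xleft;
 while (x <= xright)
 {
  int xend = x + SEGLEN - 1;
  if (xend > xright) xend = xright;
  ExactUV(xend + 1, y, &u1, &v1);            // exact values just past the segment end
  int len = xend - x + 1;
  double du = (u1 - u0) / len, dv = (v1 - v0) / len;
  double u = u0, v = v0;
  for (; x <= xend; x++)                     // affine interpolation within the segment
  {
   DrawTexel(x, y, (int)u, (int)v);
   u += du;
   v += dv;
  }
  u0 = u1; v0 = v1;                          // reuse the divides for the next segment
 }
}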

Parabolic Mapping

Parabolic mapping uses the curved shape of a parabola to render more accurately than affine mapping, at the cost of speed.

Let's assume that you want to draw 11 pixels on the screen, going from x=50 -> x=60. To keep things simple I'll assume that we bias the line by subtracting 50 from each pixel so that the range is x0=0 -> x2=10. I'll also assume that x1 is the midpoint, i.e. x1=x2/2.

I'll also assume that the quadratic equation is u = a*x^2 + b*x + c (v also has a similar equation which I won't bother with). You can determine the unknowns (a, b and c) by calculating u0, u1 and u2 for x0, x1 and x2 respectively and then solving the simultaneous equations:

u0 = a*x0^2 + b*x0 + c             (1) 
u1 = a*x1^2 + b*x1 + c             (2) 
u2 = a*x2^2 + b*x2 + c             (3)

We know that x0=0 and x1=x2/2, so this allows us to simplify these equations and put them all in terms of x2, which I'll refer to from now on as simply 'x':

u0 = a*0 + b*0 + c                 (4) 
u1 = a*(x/2)^2 + b*(x/2) + c       (5) 
u2 = a*x^2 + b*x + c               (6)

Expanding these out we get:

u0 = c                             (7) 
u1 = a * x^2 / 4 + b * x / 2 + c   (8) 
u2 = a * x^2 + b * x + c           (9)

Substituting (7) into (9):

u2 = a * x^2 + b * x + u0 
b * x = u2 - u0 - a * x^2 
b = (u2 - u0 - a * x^2) / x        (10)

Substituting (7) and (10) into (8) and rearranging we get:

    2*u0 + 2*u2 - 4*u1
a = ------------------             (11)
           x^2

Thus equations (11), (10) and (7) give us the parameters for the equation u=a*x^2 + b*x + c.

The next step is to convert these values into something we can use while rendering. Taking the first derivative we get:

du/dx = 2*a*x + b                  (12)

and the second:

ddu/dx = 2*a                       (13) 

Thus our inner loop will look like this:

PutPixel(x, y, texture[v][u]) 
u += du
du += ddu         (we must remember to also do the same with v, dv and ddv)

To start this loop we need to set u to the value of u0, i.e. u(0)=c. However, we need to start du at the value it has at x=0.5:

du(0.5) = 2*a*x + b 
        = 2*a*0.5 + b 
        = a + b

We have to be careful to update u before updating du, so our routine to render a scan-line must look something like this:

// Initialize (do the same for v, dv and ddv) 
u = c;
du = a + b; 
ddu = 2 * a;
// Render the scan line; xleft and xright are the real scan line end 
// points (e.g. 50 and 60 in the example above) 
for (x=xleft; x<=xright; x++) 
{ 
 PutPixel(x, y, texture[floor(v)][floor(u)]);
 u += du;       // update v and dv the same way
 du += ddu; 
}
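
The coefficients themselves could be set up like this (a sketch only, assuming a hypothetical ExactU() routine that returns the perspective-correct u at a given screen x on this scan line, e.g. via the a/c equation from the perfect mapping section; v's coefficients are obtained the same way, and the line is assumed to be more than one pixel long):

double x2 = (double)(xright - xleft);            // the line runs from 0 to x2 after biasing
double u0 = ExactU(xleft);
double u1 = ExactU(xleft + x2 / 2);
double u2 = ExactU(xright);
double a  = (2*u0 + 2*u2 - 4*u1) / (x2 * x2);    // equation (11)
double b  = (u2 - u0 - a * x2 * x2) / x2;        // equation (10)
double c  = u0;                                  // equation (7)
// then, as above: u = c, du = a + b, ddu = 2*a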

Hyperbolic Mapping

Many thanks to Jan Vondrak for this idea. I believe Jan originally posted an article on this technique to a newsgroup, but by the time I received it his e-mail address was no longer attached. Rather than post his original article without his permission I have decided to write a new article on it, giving him due credit here of course.

In the above section on perfect mapping I showed how the following equation will do perfect mapping:

          DIST * (u1 + (x-x1) * du)
u(x)  =   -------------------------
               z1 + (x-x1) * dz

A similar equation is used for v. (For those of you who need to brush up on your 3D math, u(x) means "the value of u for any given x"; it's simply a way of saying that we'll be finding the value of u as the value of x changes.)

This can be rearranged into the more general equation:

    A * x + B
u = ---------
    C * x + D

Now look at what happens when we multiply this equation out and rearrange it:

u * (C * x + D) = A * x + B
A * x + B - u * (C * x + D) = 0

i.e.

f(u,x) = A * x + B - u * (C * x + D) = 0

We have now rearranged this equation so that two variables are changing, u and x, and we know that f(u,x) must always equal 0 in order to achieve perfect mapping. As we move along a scan line from left to right we know that we'll always be adding 1 to x, and the value of f(u,x) will change accordingly. All that remains is to figure out how to modify u so that f(u,x) stays as close to 0 as possible (actually this isn't entirely correct, I'll discuss this further below).

Those of you familiar with Bresenham's algorithms know that he developed an algorithm similar to his line and circle algorithms which also handles general conic sections. A hyperbola, like parabolas and ellipses, is a conic section. If you optimize that algorithm for hyperbolas you will get something that looks like the equation above. However, that algorithm splits the conic into 8 quadrants and renders f(x,y). Thus each pass through the inner loop will return a new <x,y> coordinate to be plotted on the screen. Unfortunately this is not much use to us here. Sometimes we will need to skip over several u values, particularly when the surface being rendered is far away from the view point.

[This section is still being written]

Constant-Z Mapping

Recall that in the above section on perfect mapping, the value of c was obtained by taking the dot product between the screen point and the vector C:

c = <x-160, 100-y, DIST> * <Cx, Cy, Cz> 
  = (x-160)*Cx + (100-y)*Cy + DIST*Cz

Now it so happens that Cy=0 for all vertical surfaces, so moving one pixel down (i.e. incrementing y) will not change the value of c. c is therefore equal to (x-160)*Cx + DIST*Cz for every pixel in the column, so you can precalculate the value of 1/c and do a linear affine transformation along a vertical line the same way Wolf 3D does.

Similarly Cx=0 for all horizontal surfaces, and c = (100-y)*Cy + DIST*Cz for every pixel in a horizontal line. This allows us to do a linear affine transformation along horizontal lines the same way DOOM does.
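
For example, a sketch of the vertical-surface special case might look like this (using the pre-multiplied A, B and C vectors and the 320x200 centred coordinates from this section; x is the screen column being drawn and ytop/ybottom are its extents, both assumptions for illustration):

double sx = x - 160;                         // centred screen x for this column
double c  = sx * C.x + DIST * C.z;           // constant down the column since C.y = 0
double ic = 1.0 / c;                         // one divide for the whole column
double a  = sx * A.x + (100 - ytop) * A.y + DIST * A.z;
double b  = sx * B.x + (100 - ytop) * B.y + DIST * B.z;
for (int y=ytop; y<=ybottom; y++)
{
 int u = (int)(a * ic);
 int v = (int)(b * ic);
 SetPixel(x, y, texture.data[v * texture.width + u]);
 a -= A.y;                                   // screeny = 100-y drops by 1 each row
 b -= B.y;
}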

Constant-z mapping extends this idea by choosing a direction to render so that Dx*Cx + Dy*Cy = 0. Dx/Dy is the slope of this line on the screen, and as we move along this line the value of c remains constant and a linear transform is again possible. We choose the independent axis and direction based on the sign and absolute value of Dx/Dy, the same way we do when we are drawing regular lines. Instead of drawing solid lines however, we will be updating u and v using another linear scan and using the results to get the appropriate texel from the texture map.

It's a nice theory....too bad it doesn't work! Sampling theory tells us that whenever we do point sampling like this we will get an error of +- 0.5 pixels. In perfect mapping we align it so that this error is always centered at the midpoint of the pixel we are after, so the resulting uv values always fall within the same pixel. In this case however we are doing two scans and the error is present in both of them.

Another way of looking at it is in terms of the distance between successive scan lines. As we do a scan along the screen we are snapping the position of the scan line to the nearest screen pixel. Thus the distance between two scanlines does not remain constant along the entire line for angles other than 0, 45, 90 etc degrees. Constant-z is based upon the idea that this distance is always constant, but this cannot be achieved with a pixelated display system.

The errors resulting from constant-z are due to aliasing. They can be minimized by increasing the screen and/or texture resolutions, but even then they are very noticeable in areas of high contrast. It might be possible to minimize this effect using conventional anti-aliasing techniques such as filtering, although I believe this would significantly slow things down. MMX might provide an easier way of doing this, but by the time those machines are common 3D hardware acceleration will probably be standard.

As if all this wasn't enough, constant-z presents numerous other problems which can be very tedious to solve. For starters, you cannot simply pick the first point on a scan line and start scanning. Doing so will result in gaps in the texture with other pixels being rendered twice. To solve this problem you need to shift all scan lines "back and forth" to align them all correctly, calculate where along each scan line you must actually start rendering as well as the uv coordinate of that particular point.

In summary: don't use constant-z. It's not worth it. (In fairness however I should mention that Sean Barrett tells me System Shock, Flight Unlimited and Terra Nova all use constant-z).

Calculating the Distance to Each Point

Once you have calculated the u and v values it's easy to calculate where the point itself lies in 3D. You simply add the origin point, P, to the distance moved along the u axis, u*M, and the distance moved along the v axis, v*N, ie:

Q = P + u*M + v*N

Note that this assumes the u and v values are in the range 0-1, so use them before you multiply by 128 (or whatever). If you optimize this out you'll wind up with an equation of the form "distance=d/c" in the inner loop, where d is a value that remains constant across the entire polygon.
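
A one-line sketch using the CPoint3D class from earlier (with u and v still in the 0-1 range):

CPoint3D Q = P + M * u + N * v;              // the 3D point corresponding to this texel
double depth = Q.z;                          // its z distance from the viewpoint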

Round-Off Hazards

Real-time 3D engines are supposed to more-or-less model the real world. Computers aren't exactly the most ideal medium to do this however, what with them being digital and all. The FPU unfortunately doesn't store numbers with infinite accuracy, so sooner or later you may run into problems with round-off. If a value is "supposed" to be 0 and you get -1.0e-30 instead then that could cause major headaches with 32-bit applications. Even Duke Nukem apparently suffers badly from this; take a look at any angled edge and chances are you'll see pixels from opposite edges popping through.

The only solution I know of to get rid of round-off problems is to expand M and N by a small amount by tweaking the screen x values when you plug them into the mapping equation. I'm sure there's been some theoretical work done on how to calculate the maximum possible error, but I find trial and error works fine for me. Personally I prefer to tweak the horizontal screen lines to accommodate for this: if a line goes from x=10.3 to x=45.9 then I'll use those values for the scan conversion but use 10.3+EPSILON and 45.9-EPSILON for the texture mapping (EPSILON is 0.0001 in my apps). Because I'm bringing the edges in by a small amount I know they'll wind up slightly inside the polygon. This would cause problems if x1 and x2 were very close together and crossed a pixel boundary, since x2 would wind up being less than x1 and wrong u/v values would be returned. With EPSILON being so small though the chance of this happening is fairly remote; I've yet to see it happen. (Another way of accomplishing the same thing is to stretch the original P, M and N values out a bit.)

If you do tweak x1 and x2 then make sure you don't use those tweaked values during scan conversion. Doing so will cause different pixels to be drawn on the screen. E.g. if x1=10.0 and you add 0.0001 then the first pixel drawn will be at 11. You tend to wind up with the edges of your polygons blinking on and off. In the Win95GPE article "Connected Polyhedrons" I discuss how to use clipped polygons to do hidden surface removal. With this algorithm it's vitally important that clipping be done accurately. In my first attempts at this type of engine the "blinking edge" actually led to larger, more serious problems when I was on the border of two rooms (each entire room would alternately "blink" in and out of existence).

Conclusion

That’s it for now. This article still needs a lot more work, and I’ll be updating it when I get the chance. In the mean time if I’ve been a little too vague, confusing or simply long winded about anything please let me know.

Cheers,

Mark Feldman

Acknowledgements and Thanks

Sean Barrett - For being the first person to make this information understandable to us mere mortals.

Chris Babcock - Helped me iron out a lot of bugs in my initial attempts. One of those people who seems to know at least a little bit about almost everything.

Robert Pengelly - For helping me see the more confusing parts of my original article and where it could be improved.

I.O. Angell and B.J. Jones - For writing the best damn introductory 3D graphics book ever: "Advanced Graphics with the Acorn Electron" (I picked it up for 50c from a second hand book store, it was one of only two books I brought with me when I moved to the US).

Everyone who’s e-mailed me with tips, comments, corrections and suggestions.

 
Copyright (c) 1997 Mark Feldman (pcgpe@oocities.com) - All Rights Reserved
This article is part of The Win95 Game Programmer's Encyclopedia
Please retain this footer if you distribute this file.