Stosh's Web Pages -- Optimizing Software for Speed

Stosh@nospam.lycosmail.com

For Beginners

 Stosh@nospam.lycosmail.com

For Beginners

This is an introduction to Optimizing for Speed intended for beginners.  The advanced page is here.  How do you know which to look at?  See the section below titled "How Do I Know If I'm A Beginner."

As a Beginner, I am concerned with

These web pages will guide you to increase the speed at which your code executes while not increasing development time beyond what you deem acceptable.  At the same time, pointers will be given to help you remove bugs from your code even if the complexity of your code is increased.

The primary emphasis here is upon helping the beginner to make code which executes fast.


How Do I Know If I'm A Beginner?

I've characterized my view of a beginner by describing the following characteristics of a beginner.  This will help you know whether you're interested in these web pages "for a beginner" or whether you're more interested in the web pages for an Advanced Programmer.

Language

As a beginner,

Compiler

As a beginner,  

Hardware

As a beginner,

Portability

As a beginner,

Bugs

As a beginner,

 

Optimization Techniques For Beginners

don't write in assembly code

Especially if you're a beginner, do not write programs in assembly code.  When I need to write fast code (code which executes quickly), then I write in C language (not C++).  I do write in assembler, and I'll tell you more about when and why in the advanced section.
 

optimize only important code

Code which executes infrequently needs to be written with an eye for (1) easy maintenance and (2) no bugs.  Extremely optimized code is convoluted and not easy to maintain or debug.  There's no point in optimizing code which only executes infrequently (except maybe for practice!)

Thus, do not optimize user input code.  Also, common CRT displays update at 30-100 Hz, so your output code does not need to be optimized beyond a clearly defined limit.
 
The implication of the idea "optimize only frequently executing code" is that you need to know your program and what code is executed when.

time your code

Timing code is an expensive exercise.  You need to use quicker methods of determiening the speed of you r code.  At some point you will need to empirically determine (test) whether code X is faster than code Y.  This may be simply to be able to report the speed to others.  Hopefully your code can hit the "fast enough" speed before having to do much in the way of timing tests.

I don't find it at all helpful to use profilers.  Some people find them helpful to get a general trend of what their code is doing.  I think that beginners should use profilers, but there will also come a time when they only tell you what you already know.

    quicker methods
        write good code
        examine code -- loop invarient blah blah
        look at output of compiler (assembly language or intermediate code)
 

timing code inside a loop

I am not a big proponent of putting a loop around a piece of code and timing multiple iterations.  This is a good way of getting around the problem of not having a microsecond timer; but doesn't help you in all situations.  For a beginner, this can be helpful.  Be aware that looping code can deceive you as to the true speed at which code will execute.  This is due to caching as well as other issues (interrupts, disk I/O repeatability).

I find the best timing information by timing one (1) iteration of a small piece of code with a microsecond timer.  (Faster than 1 us is good, if you can get it, but slower than 1ms doesn't do much good in my mind.)  The basic code snippet would be

    before = usec_time();
    {do something}
    after = usec_time();
    printf("duration = %ld", after - before);

Note that the time stamp is sampled before the output statement.  This doesn't matter much here in a C language example, but can make quite a difference in other languages.  Avoid the construction  print("duration = ", usec_time() - before) especially when using an interpreter.
 

write code well -- don't do silly things

Learning how to write code which runs well (fast) doesn't cost development time and doesn't increase complexity.  I find that my mind gets into the mode of writing code well. So write code well and you will write well code.

Take, for example, the code

    MAX = 10
    index = 1
    while (index <= MAX)
        total = total + 1

        do stuff
        index = index + 1
    endwhile
 

as compared to

    MAX = 10
    index = 1
    while (index <= MAX)
        do stuff
        index = index + 1
    endwhile
    total = total + MAX

Knowing what code to put inside vs. outside of the loop is more a matter of habit rather than sitting down and thinking for a few hours.  Note that sitting down and thinking is often a good idea.  (I try to think every now and then.) Some compilers will handle this simple case for you.  More complicated cases cannot be handled by the compilers -- specifically when a function is being called.   This is because you may know that the function is returning a constant value; but, the compiler doesn't know that fact.

Take, for example, the code

    MAX = 10
    index = 1
    while (index <= MAX)
        total = total + sin(alpha)

        do stuff
        index = index + 1
    endwhile
 

as compared to

    MAX = 10
    index = 1
    while (index <= MAX)
        do stuff
        index = index + 1
    endwhile
    total = total + MAX * sin(alpha)

where recomputing the sine several times is easily avoided.

 

use look-up tables, but sparingly

Some algorithms (such as CRC calculations) can be improved by the use of lookup tables.  Lookup tables are a special case of pre-computing constants.
    e.g.    converting values from A-to-D inputs from counts to volts and from counts to engineering units.
    e.g.    performing CRC calculations in 8-16 bit chunks rather than 1 bit at a time.
    e.g.    precomputing weight constants for waveform filter functions which use a window size.
    e.g.    determining the execution address for a switch statement

The idea of a lookup table is that not only do you compute a value once and then use it several times later, but also that you can quickly access the value by using an array.

The drawback of lookup table isn't really the space of the array (although that's a good point).  The drawback is the complexity of the code.  You just can't optimize everything by hand -- the best use of lookup tables are the ones which are automatically performed for you -- compiler optimizations for switch statements.
        switch (code_number)
        {
            case 1:
                    answ = "one";
                    break;
            case 2:
                    answ = "two";
                    break;
            case 3:
                    answ = "three";
                    break;
            case 4:
                    answ = "four";
                    break;
            default:
                    answ = "";
                    break;
        }
 

Compiler Optimizations

Turn them on.

The "turn on all optimizations" option may actually not turn on absolutely all optimizations.  For example, "register aliasing" optimizations are probably not turned on; also, the compiler may want to optimize for size -- I prefer to optimize for speed.

Compilers have bugs.  Turning optimizations on (or off!) may make (more) bugs show up in the compiler.  That doesn't mean you turn optimizations off; but rather, you need to retest the code when changing the optimization level.  The best way to find the source of a bug (when the compiler flags make a bug show up) is to do a binary search.  More about binary searches and debugging in the debugging pages.
 

know what library code really does


 

in-line functions

 

substituting utilities for expertise

                profilers
 
 

substituting hardware for expertise

                buy a faster machine
 
 

Use code generators

LEX and YACC are the most famous code generators.  Learn about them and use them when processing textual input.

By the way, don't write lexical analysis and parsing code by hand -- just get better code generators.  For example, the FLEX program can generate much faster lexical analyzers than LEX does.  But a better optimization than both of these programs is to write your lexical specification "the right way."  More about optimizing lexical analysis in the advanced pages.
 

Buy pieces of software that you can put together to get some or all of the desired result.

 
 
 
(c) Copyright 1998 by John Muczynski.  All Rights Reserved.


 
Stosh@nospam.lycosmail.com