The Units.

CPU Information:

To work, a processor needs several modules that let it load data into its registers (memory), forward information, sort information, predict instructions, etc.
Since the birth of the 4004, processors have evolved a great deal.
Gradually, the need for new resources and computing power forced engineers to find new combinations of transistors in order to widen the range of the CPU's capabilities. It was soon realized that the main processor was weak at floating point calculations, so to reinforce the 8086, 286, 386, ... it was necessary to create the famous arithmetic coprocessor, the FPU (8087, 287, 387, ...), which can perform calculations in parallel with the integer unit: INT (integer) or ALU (arithmetic and logic unit). Thus the two units, ALU and FPU, can work at the same time (but not all the time)!
Then came a strong need for calculations in multimedia software, and MMX was born.

Soon processors will also have units adapted to 3D calculations (K6-3D, Cayenne, ...).

The arithmetic and logic unit:

It is the unit you control most of the time when you program in assembly language.
It handles additions, subtractions, multiplications, divisions, and operations on bytes and words (8, 16, or 32 bits).
It is therefore the unit most often called upon, and the one whose workload an FPU is meant to relieve.
It comprises more than one hundred instructions and is thus very general-purpose.
On the other hand, its results are limited to what a register can encode: integers.
It also has access to registers of 8, 16, or 32 bits (AL, AX, EAX, ...).
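To illustrate what "integer results in a register of 8, 16, or 32 bits" means in practice, here is a minimal sketch in Python. It is not how the ALU is built; the helper names alu_add and alu_div are hypothetical, and the code only imitates the visible behaviour: a sum that does not fit in the register wraps around and raises a carry, and a division gives an integer quotient and a remainder.

```python
def alu_add(a, b, bits):
    """Imitate an unsigned add in a register of the given width.

    Returns (result, carry). Hypothetical helper for illustration only.
    """
    mask = (1 << bits) - 1          # 0xFF, 0xFFFF or 0xFFFFFFFF
    total = (a & mask) + (b & mask)
    carry = total > mask            # the sum did not fit in the register
    return total & mask, carry


def alu_div(a, b):
    """Imitate an unsigned integer division: quotient and remainder."""
    return a // b, a % b


# The same addition in registers of different widths.
sum8 = alu_add(200, 100, 8)             # 300 does not fit in 8 bits
sum16 = alu_add(200, 100, 16)           # 300 fits in 16 bits
sum32 = alu_add(0xFFFFFFFF, 1, 32)      # largest 32-bit value plus one
quot = alu_div(7, 2)                    # no fractional part is kept
```

In the 8-bit case the result wraps around, which is why the choice of register width (AL, AX, or EAX) matters when programming this unit.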

The Floating Point Unit:

It is there to make up for the weaknesses of the ALU in floating point calculations.
The FPU is a hardware solution, which allows particularly fast calculation, even for demanding requests.
It works with its 8 registers (ST(0), ST(1), ... ST(7)) and with data in memory; it handles cos, sin, tan, comparisons, square roots, etc.
Its registers are special because they encode floating point values (4-, 8-, and 10-byte values, plus 14- and 94-byte buffers used to save the unit's state).
Intel's FPUs have typically been more powerful than those of its competitors.
This unit is used heavily by 3D computation software (raytracing, games, ...), and nowadays it is no longer sufficient.
In the past there was a connector (socket) for the FPU on the motherboard; today the ALU and the FPU are both part of the CPU and can thus communicate easily over the processor's bus at its high frequency.
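The 4-byte and 8-byte floating point sizes mentioned above can be observed with a short Python sketch. It assumes the standard struct module, whose "f" and "d" formats are the 4-byte and 8-byte IEEE formats; the 10-byte format has no struct equivalent and is left out. The helper round_trip is hypothetical and only stores a value in a given size and reads it back.

```python
import struct

SINGLE = "<f"   # 4-byte floating point format
DOUBLE = "<d"   # 8-byte floating point format


def round_trip(value, fmt):
    """Store a value in the given format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


size_single = struct.calcsize(SINGLE)
size_double = struct.calcsize(DOUBLE)

# 0.1 has no exact binary representation, so the smaller format
# keeps fewer correct digits than the larger one.
as_single = round_trip(0.1, SINGLE)
as_double = round_trip(0.1, DOUBLE)
error_single = abs(as_single - 0.1)
```

The larger the format, the more precision is kept, which is one reason the FPU offers several sizes.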

The MultiMedia unit:

MMX could have been called MMU, but this name was already used by another unit (the memory management unit).
MMX: MultiMedia eXtension.
MMX is not a coprocessor; it was not added to the architecture as a unit of its own.
In fact, the FPU is only seldom used during multimedia calculations, which is why MMX has been grafted onto the FPU registers.
These registers are: MM0, MM1, ... MM7.
The MMX processing unit is placed in the ALU execution pipeline, with the result that MMX instructions are not carried out in parallel with ALU instructions but are treated like ordinary ALU instructions.
With MMX, a calculation can be completed in only 1 cycle!
On the other hand, there is a disadvantage: because MMX uses the FPU registers, switching between the two means waiting two cycles. If you run 3D software (using the FPU) that also plays sounds (using MMX), as is the case for many games, two cycles are lost each time 3D calculations follow sound calculations (and vice versa).
The distinctive feature of MMX is that it calculates several values with only one register;
a 64-bit register is used to apply one calculation to eight 8-bit values (8 x 8 = 64).
There are other combinations, such as four 16-bit values or two 32-bit values.
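The idea of one register carrying several values can be sketched in Python, using a 64-bit integer as a stand-in for an MMX register holding eight 8-bit lanes. This is only an imitation of a byte-wise packed add with wraparound (the kind of operation the MMX instruction PADDB performs), not the hardware implementation; the function names are hypothetical.

```python
LOW7 = 0x7F7F7F7F7F7F7F7F   # low 7 bits of each 8-bit lane
HIGH = 0x8080808080808080   # top bit of each 8-bit lane


def pack_bytes(values):
    """Pack eight 8-bit values into one 64-bit integer (lane 0 = low byte)."""
    if len(values) != 8:
        raise ValueError("exactly eight values are needed")
    word = 0
    for lane, value in enumerate(values):
        word |= (value & 0xFF) << (8 * lane)
    return word


def unpack_bytes(word):
    """Split a 64-bit integer back into its eight 8-bit lanes."""
    return [(word >> (8 * lane)) & 0xFF for lane in range(8)]


def packed_add_bytes(a, b):
    """Add eight 8-bit lanes at once; each lane wraps around on its own."""
    # Add the low 7 bits of every lane (a carry cannot leave its lane),
    # then restore the top bit of every lane with an exclusive or.
    return ((a & LOW7) + (b & LOW7)) ^ ((a ^ b) & HIGH)


a = pack_bytes([1, 2, 3, 4, 250, 255, 128, 0])
b = pack_bytes([10, 20, 30, 40, 10, 1, 128, 0])
lanes = unpack_bytes(packed_add_bytes(a, b))
```

A single call to packed_add_bytes produces all eight sums, and a lane that overflows (for example 250 + 10) wraps around without disturbing its neighbours.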
MMX comprises 57 instructions.
The Pentium MMX is also called P55C.
MMX is recent, but it is nevertheless used very little by new software, because it does not bring enough innovation and is of little use in 3D (a 3D card is much faster!).

The 3D unit:

It should be part of the next processors.
The different manufacturers have different 3D projects, so software (Direct3D) will have to be adapted.
A 3D accelerator card needs a lot of FPU power.
For example, Voodoo2 requires such a powerful FPU that few CPUs are fast enough; only a PII 300 manages to extract the maximum of Voodoo2's potential.
If the 3D unit is well designed, it may be able to replace the FPU for 3D needs, allowing better-adapted calculations and faster frame calculations.
AMD: K6 3D, which has a new 3D unit.
CYRIX: new 6x86MX with an enhanced MMX unit containing new instructions for 3D data processing.