What is DLX?
DLX is a simple pipelined CPU architecture. It is mostly used in universities as a model for studying pipelining techniques.
What is a Pipeline?
In the old days, we looked at a CPU as a monolithic processing unit. Whenever we asked it to do something, we had to wait until it finished that task before we could proceed to the next one. So if your old CPU takes 10 clock cycles to do a multiplication and 1 clock cycle to do a summation, you have to wait until the multiplication finishes before you can do the summation. The total is 11 cycles (not including fetching and the other things the CPU needs to do before the actual execution).
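Today, we look at a CPU as a collection of processing units which can execute concurrently (made possible by RISC). So you can do both the multiplication and the summation at the same time, as long as they do not depend on each other. The total becomes 10 cycles instead of 11, about 9% fewer. The gain grows when there is more work to overlap: imagine 1 multiplication and 10 independent summations, where 20 cycles drop to 10.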
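The arithmetic above can be checked with a small sketch. It is an idealized model, not a real CPU: it assumes one multiplier and one adder, operations that are fully independent, and the 10-cycle and 1-cycle latencies from the example. The function names are made up for illustration.

```python
def sequential_cycles(work):
    """Monolithic CPU: every operation waits for the previous one."""
    return sum(sum(latencies) for latencies in work.values())


def concurrent_cycles(work):
    """Idealized concurrent CPU: each unit works through its own
    operations one by one, and all units run at the same time.
    Assumes no dependencies between operations (an assumption)."""
    return max(sum(latencies) for latencies in work.values())


# 1 multiplication (10 cycles) and 1 summation (1 cycle)
small = {"multiplier": [10], "adder": [1]}
print(sequential_cycles(small), concurrent_cycles(small))

# 1 multiplication and 10 summations
large = {"multiplier": [10], "adder": [1] * 10}
print(sequential_cycles(large), concurrent_cycles(large))
```

Fetching, decoding and dependencies between results are ignored here, so real numbers will be less favorable.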
So what processing units are there in DLX?
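There are not many rules in DLX. The basic units it must have are IF (instruction fetch), ID (instruction decode), EX (execute), MEM (memory access) and WB (write back).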
Here is how they are connected: IF - ID - EX - MEM - WB
Each instruction propagates through the pipeline from IF to WB.
Usually each unit takes one cycle to complete its task, except the EX and MEM units, which may take longer. You can simplify the model by letting every unit take exactly one cycle, with no exceptions.
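To see how instructions overlap in the simplified model (every unit takes exactly one cycle and nothing ever stalls), here is a rough sketch that builds the usual pipeline timing table. The function names are invented for this example.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def schedule(n_instructions):
    """For each instruction, map cycle number -> stage it occupies.
    Instruction i (0-based) enters IF in cycle i + 1."""
    return [
        {i + s + 1: stage for s, stage in enumerate(STAGES)}
        for i in range(n_instructions)
    ]


def total_cycles(n_instructions):
    """Cycles until the last instruction leaves WB (no stalls assumed)."""
    if n_instructions == 0:
        return 0
    return len(STAGES) + n_instructions - 1


def render(n_instructions):
    """Return the timing table as text, one row per instruction."""
    total = total_cycles(n_instructions)
    lines = []
    for i, row in enumerate(schedule(n_instructions)):
        cells = [row.get(cycle, "").ljust(4) for cycle in range(1, total + 1)]
        lines.append(f"I{i + 1}: " + "".join(cells).rstrip())
    return "\n".join(lines)


print(render(3))
print("total cycles:", total_cycles(3))
```

Without pipelining, 3 instructions would need 3 x 5 = 15 cycles in this model; with it they need 5 + 3 - 1 = 7.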
The instruction set has a fixed format, which you can find easily on the internet. Basically, each instruction does only one task, and you cannot use a value in main memory directly for execution (you need to load it into a register first).
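The load-first rule can be illustrated with a toy interpreter. This is only a sketch: the mnemonics LW, ADD and SW follow DLX naming, but the tuple format and the absolute addresses are simplifications assumed here, not the real DLX encoding or addressing mode.

```python
def run(program, memory):
    """Execute a list of toy instructions; returns the register file.
    ADD only accepts registers, so memory values must be loaded first."""
    regs = {}
    for instr in program:
        op = instr[0]
        if op == "LW":            # load word: register <- memory
            _, rd, addr = instr
            regs[rd] = memory[addr]
        elif op == "ADD":         # register <- register + register
            _, rd, rs1, rs2 = instr
            regs[rd] = regs[rs1] + regs[rs2]
        elif op == "SW":          # store word: memory <- register
            _, rs, addr = instr
            memory[addr] = regs[rs]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


# Adding two values that live in memory takes four instructions:
memory = {0: 5, 4: 7, 8: 0}
program = [
    ("LW", "R1", 0),
    ("LW", "R2", 4),
    ("ADD", "R3", "R1", "R2"),
    ("SW", "R3", 8),
]
run(program, memory)
print(memory[8])
```

Each instruction does one thing only, which is what keeps every pipeline unit simple.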
Another type of stall comes from branch instructions. If we wait until a branch instruction reaches EX, we might have to throw away the instructions already loaded into IF and ID. Why? Because they may have been fetched from the wrong branch. We can move the branch unit out of EX and back into ID, and use some kind of branch prediction to minimize the stall.
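A back-of-the-envelope sketch of why moving the branch unit helps. It assumes the one-cycle-per-unit model, so the number of instructions thrown away on a wrong guess equals the number of stages in front of the one that resolves the branch. The branch count and misprediction rate below are invented numbers, used only for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def flush_penalty(resolve_stage):
    """Instructions already fetched behind a branch when it is resolved.
    Resolved in EX: IF and ID are occupied, so 2. Resolved in ID: 1."""
    return STAGES.index(resolve_stage)


def branch_stall_cycles(n_branches, mispredict_rate, resolve_stage):
    """Expected wasted cycles; only wrong predictions cost anything."""
    return n_branches * mispredict_rate * flush_penalty(resolve_stage)


# Hypothetical workload: 100 branches, half of them predicted wrongly.
print(branch_stall_cycles(100, 0.5, "EX"))
print(branch_stall_cycles(100, 0.5, "ID"))
```

Resolving the branch one stage earlier halves the penalty in this model, and a better predictor lowers the rate it is multiplied by.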
If you want to know more details, check the computer architecture books in your favorite library!