What is a Linear Feedback Shift Register (LFSR)?
A n-bit LFSR is a n-bit length shift register with feedback to its input. The feedback is formed by XORing or XNORing the outputs of selected stages of the shift register - referred to as 'taps' - and then inputting this to the least significant bit (stage 0). Each stage has a common clock.The 'linear' part of the term 'LFSR' derives from the fact that XOR and XNOR are linear functions. An example of a 5-bit LFSR is shown below:
Click on thumbnail for figure1 : 5-bit LFSR
This has taps at stages 1 and 4 with XOR feedback. Note also that the LS bit of the shift register is, by convention, shown at the left hand side of the shift register, with the output being taken from the MS bit at the right hand side.
So what is it about a LFSR that makes it interesting? It will produce a pseudorandom sequence of length 2n-1 states (where n is the number of stages) if the LFSR is of maximal length. The sequence will then repeat from the initial state for as long as the LFSR is clocked.
Assume that the example LFSR above is set to $1F after initialisation. The output of the feedback XOR gate will be 0 (since 1 XOR 1 = 0) and the first clock edge will load 0 into stage 0. The following table shows the sequence :
Table 1 : 5 bit LFSR sequence
After the initial state, the bit shifted into stage0 on each clock edge is the XOR of stage4 and stage1. The LFSR passes through 31 states (25-1 = 31) and so is of maximal length.
The one state that the example 5-bit LFSR doesn't pass through is $00. If the LFSR contained $00, then the feedback would also be 0 (since 0 XOR 0 = 0) and the LFSR would never leave the $00 state i.e. it would be 'locked-up'. This is very important since in some FPGAs, the internal d-type flip-flops clear to 0 on power-up or when the global reset net is activated. To avoid this problem, the XOR feedback gate should be changed to an XNOR feedback (since 0 XNOR 0 = 1). Some FPGAs (e.g Xilinx) allow the individual flip-flops to be either set or reset on power-up or initialisation and the problem can be avoided by providing a non-zero initial value (sometimes called the 'seed value').
If the feedback had been of XNOR type, then the lock-up state would be $1F (since 1 XNOR 1 = 1). The designer should verify the power-on and/or global initialisation state of the flip-flops in the target device then choose XOR or XNOR feedback or provide a seed value which is neither all zeroes or all ones.
Note also that the sequence produced will be different for the two types of feedback. The tables below show the sequences of a 3-bit, maximal length LFSR (taps at stage0 and stage2) with seed value 001 for both XOR and XNOR feedback:
Click on thumbnail for figure 2 : 3-bit LFSR
Table 2 : 3 bit LFSR sequence with XOR feedback taps 0, 2 : lock-up state = 0
Table 3 : 3 bit LFSR sequence with XNOR feedback taps 0,2 : lock-up state = 7
There is a way however, with the addition of extra logic, to force an LFSR into the lock-up state and then out again, so cycling through all 2n states:
If you examine the sequence for the 5-bit LFSR in table 1, you can see that we detect the hex value $01, 'insert' the $00 state, then force it into the $10 state.
Table 4 : 5 bit LFSR sequence with 2n states
The modified schematic is shown below:
Click on thumbnail for figure 3 : 5-bit LFSR with 2n states
An LFSR is of 'maximal' length when the sequence it generates passes through all possible 2n-1 values.
If the taps on our 3-bit LFSR with XNOR feedback are changed to stages 0 and 1, then the LFSR will enter into a continous loop and is not of maximal length:
Table 5 : 3 bit LFSR sequence with XNOR feedback taps 0,1 : lock-up state = 7
Note also that there can be more than one combination of taps that give maximal length for each LFSR. If the taps on the 3-bit LFSR are changed to stages 1 and 2, a maximal length shift register will still be produced, but with a different sequence.
There is no easy way to decide where the taps should be for maximal length, so the designer is refered to the tables provided in various texts such as :
Error Correcting Codes
Available from Amazon
One of the most frequent uses of a LFSR inside a FPGA is as a counter. Using a LFSR instead of a binary counter can increase the clock rate considerably due to the low routing resource required to produce the next state logic. In a sequential binary counter (i.e. counts 0, 1, 2, ...) the logic required for any particular bit has an input from all of the lesser significant bits and from its own register output. For example, MS bit of a 32-bit counter would require logic with a fan-in of 32. In a FPGA, with its limited fan-in for each macro-block, this would require many levels of logic hence reducing the maximum possible clock rate.
For a LFSR on the other hand, the maximum clock frequency is only limited by the propagation delay through the feedback logic which is usually not more than a couple of XOR gates. If the LFSR is floorplanned correctly with each bit in adjacent macro-blocks, then a very fast counter can be realised.
The main problem with using LFSRs as counters is the pseudrandom nature of the sequence that they produce. In some applications this may not be acceptable, but for others, frequency division for example, it may not be important.
Another problem is that the sequence length for a n-bit maximal LFSR is only 2n-1, whereas the sequence length for a n-bit binary counter is 2n. If we want to divide an input clock by 16, a 4-bit binary counter would be sufficient, but a 4-bit LFSR would not.
So how do we make a divide-by-16 LFSR? First we will need a LFSR that will pass through a sequence of at least 16 states - hence a 5-bit maximal length LFSR would a natural choice (25-1 = 31 states). The next step is to decide where the taps will be, what the feedback type will be and the seed value. For this example we will use the 5-bit LFSR presented earlier.
Then the sequence of states must be generated, either by hand or by software (or even by a VHDL simulation) - this has already been done in Table 1.
We want the LFSR to run through only 16 states, so we must add synchronous reset logic which will detect the 16th state ($08) and force the LFSR back to the first state (the seed value, $1F) on the next clock edge. The active high reset signal is OR'd with the input to every flip-flop so that they will all be forced high on the next clock edge. The reset signal can be used as the divided down clock as it has a frequency that is 1/16th the input clock frequency.
The resulting schematic will be:
Click on thumbnail for figure 4 : 5-bit LFSR as a counter
The extra logic in the circuit obviously limits the maximum clock rate. The VHDL code for this circuit is:
ARCHITECTURE arc_lfsr_div OF lfsr_div IS SIGNAL q : std_logic_vector(4 DOWNTO 0); SIGNAL reset : std_logic; CONSTANT state16 : std_logic_vector(4 DOWNTO 0):= "01000"; BEGIN lfsr : PROCESS(clk) BEGIN IF(clk'EVENT AND clk='1') THEN -- clock with rising edge IF (reset='1') THEN q <= ( OTHERS => '1'); -- set seed value on reset ELSE q(0) <= q(1) XOR q(4); -- feedback to LS bit q(4 DOWNTO 1)<= q(3 DOWNTO 0); -- others bits shifted END IF; END IF; END PROCESS lfsr; reset <= '1' WHEN q = state_16 ELSE '0'; -- detect 16th state END ARCHITECTURE arc_lfsr_div;
Note : It is not really necessary to ensure that the LFSR runs through all 31 states since only the first 16 are used. Extending this argument a bit further, it may even be better to deliberately choose the LFSR taps such that it is not of maximal length, but runs through only 16 of the possible states - no reset logic would then be required.
Another advantage of the LFSR counter is that there is no 'rollover' of the shift register contents from all 1s to all 0s, unlike a binary counter. This rollover may in some cases produce unacceptable simultaneous switching noise.
Other uses include:
As well as the two types of feedback (XOR, XNOR) there are two different feedback topologies - the first is where many taps are combined into one feedback node (many-to-1), the other has one feedback (the MS bit) which is used in all the taps (1-to-many).
A maximal length 8-bit LFSR has taps at stages 1, 2, 3 and 7. The many-to-1 topology is shown in the figure below:
Click on thumbnail for figure 5 : 8-bit LFSR in many-to-1 topology
The need to combine many taps into a single feedback node can lead to multiple levels of logic. The maximum clock rate of the above LFSR will be dependent on the propagation delay through the feedback logic - minimising this will increase the maximum clock rate.
The figure below shows the 8-bit LFSR, but using the 1-to-many topology. This time the feedback is taken from the MS bit and combined into taps at stages 1, 2 and 3. This LFSR will still be maximal length, but has a different sequence.
Click on thumbnail for figure 6 : 8-bit LFSR in 1-to-many topology
There is now only one gate between each stage and the maximum clock rate is now dependent on the propagation delay through that one gate instead of the delay through the two levels of gates in the many-to-1 topology. One possible way of coding this in VHDL is:
ARCHITECTURE arc_1_to_many OF 1_to_many IS SIGNAL q : std_logic_vector(7 DOWNTO 0); SIGNAL reset : std_logic; CONSTANT seed : std_logic_vector(7 DOWNTO 0):= (OTHERS => '1'); BEGIN lfsr : PROCESS(clk,reset) BEGIN IF(reset='0') THEN q <= seed; -- set seed value on reset ELSIF (clk'EVENT AND clk='1') THEN -- clock with rising edge q(0) <= q(7); -- feedback to LS bit q(1) <= q(0); q(2) <= q(1) XOR q(7); -- tap at stage 1 q(3) <= q(2) XOR q(7); -- tap at stage 2 q(4) <= q(3) XOR q(7); -- tap at stage 3 q(7 DOWNTO 5)<= q(6 DOWNTO 4); -- others bits shifted END IF; END PROCESS lfsr; END ARCHITECTURE arc_1_to_many;
There are several ways of implementing LFSRs in your target device:
I have written a VHDL package which provides two functions. The first, lfsr_xor, will generate a LFSR with XOR type feedback and 1-to-many topology. The maximum length is limited to 16 bits, but it can easily be extended to any length - just add a new clause to the CASE statement. It will generate a warning for simulation if the lock-up state is ever reached. The other function, lfsr_xnor, is identical except that XNOR feedback is used. An file which shows how to use the functions is also included in the download.
So far we have seen how to implement LFSRs in VHDL such that any device can be targetted. Xilinx devices however, will allow optimal implementation of LFSRs with their internal distributed RAM. This type of RAM is available in the XC4000, Spartan/XL, Spartan-II and all Virtex families. Each CLB can be configured as a RAM and this allows very compact shift registers to be built. More information can be found in the following Xilinx sources:
Maintained by Mark Harvey. Please email me with any comments.