Linear Feedback Shift Registers

What is a Linear Feedback Shift Register (LFSR)?

A n-bit LFSR is a n-bit length shift register with feedback to its input. The feedback is formed by XORing or XNORing the outputs of selected stages of the shift register - referred to as 'taps' - and then inputting this to the least significant bit (stage 0). Each stage has a common clock.The 'linear' part of the term 'LFSR' derives from the fact that XOR and XNOR are linear functions. An example of a 5-bit LFSR is shown below:

Click on thumbnail for figure1 : 5-bit LFSR

This has taps at stages 1 and 4 with XOR feedback. Note also that the LS bit of the shift register is, by convention, shown at the left hand side of the shift register, with the output being taken from the MS bit at the right hand side.

So what is it about a LFSR that makes it interesting? It will produce a pseudorandom sequence of length 2^n-1 states (where n is the number of stages) if the LFSR is of maximal length. The sequence will then repeat from the initial state for as long as the LFSR is clocked.

Assume that the example LFSR above is set to $1F after initialisation. The output of the feedback XOR gate will be 0 (since 1 XOR 1 = 0) and the first clock edge will load 0 into stage 0. The following table shows the sequence :

LFSR stage					Hex value
0	1	2	3	4	(0:4)
1	1	1	1	1	$1F
0	1	1	1	1	$0F
0	0	1	1	1	$07
1	0	0	1	1	$13
1	1	0	0	1	$19
0	1	1	0	0	$0C
1	0	1	1	0	$16
0	1	0	1	1	$0B
0	0	1	0	1	$05
1	0	0	1	0	$12
0	1	0	0	1	$09
0	0	1	0	0	$04
0	0	0	1	0	$02
0	0	0	0	1	$01
1	0	0	0	0	$10
0	1	0	0	0	$08
1	0	1	0	0	$14
0	1	0	1	0	$0A
1	0	1	0	1	$15
1	1	0	1	0	$1A
1	1	1	0	1	$1D
0	1	1	1	0	$0E
1	0	1	1	1	$17
1	1	0	1	1	$1B
0	1	1	0	1	$0D
0	0	1	1	0	$06
0	0	0	1	1	$03
1	0	0	0	1	$11
1	1	0	0	0	$18
1	1	1	0	0	$1C
1	1	1	1	0	$1E

Table 1 : 5 bit LFSR sequence

After the initial state, the bit shifted into stage0 on each clock edge is the XOR of stage4 and stage1. The LFSR passes through 31 states (2^5-1 = 31) and so is of maximal length.

| Back to top |

Lock-up State

The one state that the example 5-bit LFSR doesn't pass through is $00. If the LFSR contained $00, then the feedback would also be 0 (since 0 XOR 0 = 0) and the LFSR would never leave the $00 state i.e. it would be 'locked-up'. This is very important since in some FPGAs, the internal d-type flip-flops clear to 0 on power-up or when the global reset net is activated. To avoid this problem, the XOR feedback gate should be changed to an XNOR feedback (since 0 XNOR 0 = 1). Some FPGAs (e.g Xilinx) allow the individual flip-flops to be either set or reset on power-up or initialisation and the problem can be avoided by providing a non-zero initial value (sometimes called the 'seed value').

If the feedback had been of XNOR type, then the lock-up state would be $1F (since 1 XNOR 1 = 1). The designer should verify the power-on and/or global initialisation state of the flip-flops in the target device then choose XOR or XNOR feedback or provide a seed value which is neither all zeroes or all ones.

Note also that the sequence produced will be different for the two types of feedback. The tables below show the sequences of a 3-bit, maximal length LFSR (taps at stage0 and stage2) with seed value 001 for both XOR and XNOR feedback:

Click on thumbnail for figure 2 : 3-bit LFSR

LFSR stage
0	1	2	Value
0	0	1	1
1	0	0	4
1	1	0	6
1	1	1	7
0	1	1	3
1	0	1	5
0	1	0	2

Table 2 : 3 bit LFSR sequence with XOR feedback taps 0, 2 : lock-up state = 0

LFSR stage
0	1	2	Value
0	0	1	1
0	0	0	0
1	0	0	4
0	1	0	2
1	0	1	5
1	1	0	6
0	1	1	3

Table 3 : 3 bit LFSR sequence with XNOR feedback taps 0,2 : lock-up state = 7

There is a way however, with the addition of extra logic, to force an LFSR into the lock-up state and then out again, so cycling through all 2ⁿ states:

Detect the state in which the MS stage is 1 and all other stages are 0. Generate an active high (logic 1) signal, 'force_lock' when this condition occurs.
'XOR' this signal with the other taps used to produce the feedback. This will cause the feedback signal to be logic 0. On the next clock edge, the LFSR will enter the all zeroes state (lock-up).
Since all stages are at logic 0, the 'force_lock' signal is still at logic 1, so the feedback signal will still be at logic 1. Therefore on the next clock edge, the LFSR will enter the state where the LS stage is logic 1 and all other stages are logic 0.
The LFSR will then continue with its sequence as normal.

If you examine the sequence for the 5-bit LFSR in table 1, you can see that we detect the hex value $01, 'insert' the $00 state, then force it into the $10 state.

LFSR stage					Hex value
0	1	2	3	4	(0:4)
.....
0	0	0	0	1	$01
0	0	0	0	0	$00	This state inserted
1	0	0	0	0	$10
0	1	0	0	0	$08
.....

Table 4 : 5 bit LFSR sequence with 2ⁿ states

The modified schematic is shown below:

Click on thumbnail for figure 3 : 5-bit LFSR with 2ⁿ states

| Back to top |

Maximal Length

An LFSR is of 'maximal' length when the sequence it generates passes through all possible 2^n-1 values.

IMPORTANT!
Only certain combinations of taps will produce a maximal length LFSR.

If the taps on our 3-bit LFSR with XNOR feedback are changed to stages 0 and 1, then the LFSR will enter into a continous loop and is not of maximal length:

LFSR stage
0	1	2	Value
0	0	1	1
1	0	0	4
0	1	0	2
0	0	1	1
1	0	0	4
0	1	0	2
0	0	1	1
1	0	0	4
etc .....

Table 5 : 3 bit LFSR sequence with XNOR feedback taps 0,1 : lock-up state = 7

Note also that there can be more than one combination of taps that give maximal length for each LFSR. If the taps on the 3-bit LFSR are changed to stages 1 and 2, a maximal length shift register will still be produced, but with a different sequence.

IMPORTANT!
The LFSR sequence depends on the seed value, the tap positions and the feedback type.

There is no easy way to decide where the taps should be for maximal length, so the designer is refered to the tables provided in various texts such as :

Error Correcting Codes
Peterson W.W. & Weldon E.J.
MIT Press, Cambridge MA. USA 1972.
ISBN : 0262160390

Available from Amazon

| Back to top |

Using LFSRs as counters & dividers

One of the most frequent uses of a LFSR inside a FPGA is as a counter. Using a LFSR instead of a binary counter can increase the clock rate considerably due to the low routing resource required to produce the next state logic. In a sequential binary counter (i.e. counts 0, 1, 2, ...) the logic required for any particular bit has an input from all of the lesser significant bits and from its own register output. For example, MS bit of a 32-bit counter would require logic with a fan-in of 32. In a FPGA, with its limited fan-in for each macro-block, this would require many levels of logic hence reducing the maximum possible clock rate.

For a LFSR on the other hand, the maximum clock frequency is only limited by the propagation delay through the feedback logic which is usually not more than a couple of XOR gates. If the LFSR is floorplanned correctly with each bit in adjacent macro-blocks, then a very fast counter can be realised.

The main problem with using LFSRs as counters is the pseudrandom nature of the sequence that they produce. In some applications this may not be acceptable, but for others, frequency division for example, it may not be important.

Another problem is that the sequence length for a n-bit maximal LFSR is only 2^n-1, whereas the sequence length for a n-bit binary counter is 2ⁿ. If we want to divide an input clock by 16, a 4-bit binary counter would be sufficient, but a 4-bit LFSR would not.

So how do we make a divide-by-16 LFSR? First we will need a LFSR that will pass through a sequence of at least 16 states - hence a 5-bit maximal length LFSR would a natural choice (2^5-1 = 31 states). The next step is to decide where the taps will be, what the feedback type will be and the seed value. For this example we will use the 5-bit LFSR presented earlier.

Then the sequence of states must be generated, either by hand or by software (or even by a VHDL simulation) - this has already been done in Table 1.

We want the LFSR to run through only 16 states, so we must add synchronous reset logic which will detect the 16th state ($08) and force the LFSR back to the first state (the seed value, $1F) on the next clock edge. The active high reset signal is OR'd with the input to every flip-flop so that they will all be forced high on the next clock edge. The reset signal can be used as the divided down clock as it has a frequency that is 1/16th the input clock frequency.

The resulting schematic will be:

Click on thumbnail for figure 4 : 5-bit LFSR as a counter

The extra logic in the circuit obviously limits the maximum clock rate. The VHDL code for this circuit is:

ARCHITECTURE arc_lfsr_div OF lfsr_div IS

 SIGNAL   q       : std_logic_vector(4 DOWNTO 0);
 SIGNAL   reset   : std_logic;
 CONSTANT state16 : std_logic_vector(4 DOWNTO 0):= "01000";

BEGIN

  lfsr : PROCESS(clk)

  BEGIN

   IF(clk'EVENT AND clk='1') THEN            -- clock with rising edge

    IF (reset='1') THEN
     q <= ( OTHERS => '1');                  -- set seed value on reset

    ELSE
     q(0) <= q(1) XOR q(4);                  -- feedback to LS bit 
     q(4 DOWNTO 1)<= q(3 DOWNTO 0);          -- others bits shifted

    END IF;

   END IF;

  END PROCESS lfsr;

  reset <= '1' WHEN  q = state_16 ELSE '0';  -- detect 16th state

END ARCHITECTURE arc_lfsr_div;

Note : It is not really necessary to ensure that the LFSR runs through all 31 states since only the first 16 are used. Extending this argument a bit further, it may even be better to deliberately choose the LFSR taps such that it is not of maximal length, but runs through only 16 of the possible states - no reset logic would then be required.

Another advantage of the LFSR counter is that there is no 'rollover' of the shift register contents from all 1s to all 0s, unlike a binary counter. This rollover may in some cases produce unacceptable simultaneous switching noise.

Other uses include:

Cyclic Redundancy Check (CRC) codes.
FIFO memory addressing - the read and write pointers to the FIFO memory could be implemented with LFSRs instead of binary counters.
PseudoRandom noise & number generators (PRNG).
Data encryption & decryption.

| Back to top |

1-to-Many and Many-to-1 topologies

As well as the two types of feedback (XOR, XNOR) there are two different feedback topologies - the first is where many taps are combined into one feedback node (many-to-1), the other has one feedback (the MS bit) which is used in all the taps (1-to-many).

A maximal length 8-bit LFSR has taps at stages 1, 2, 3 and 7. The many-to-1 topology is shown in the figure below:

Click on thumbnail for figure 5 : 8-bit LFSR in many-to-1 topology

The need to combine many taps into a single feedback node can lead to multiple levels of logic. The maximum clock rate of the above LFSR will be dependent on the propagation delay through the feedback logic - minimising this will increase the maximum clock rate.

The figure below shows the 8-bit LFSR, but using the 1-to-many topology. This time the feedback is taken from the MS bit and combined into taps at stages 1, 2 and 3. This LFSR will still be maximal length, but has a different sequence.

Click on thumbnail for figure 6 : 8-bit LFSR in 1-to-many topology

There is now only one gate between each stage and the maximum clock rate is now dependent on the propagation delay through that one gate instead of the delay through the two levels of gates in the many-to-1 topology. One possible way of coding this in VHDL is:

ARCHITECTURE arc_1_to_many  OF 1_to_many IS

 SIGNAL   q     : std_logic_vector(7 DOWNTO  0);
 SIGNAL   reset : std_logic;
 CONSTANT seed  : std_logic_vector(7 DOWNTO  0):= (OTHERS  => '1');

BEGIN

 lfsr : PROCESS(clk,reset)

 BEGIN

  IF(reset='0') THEN 
   q <= seed;                         -- set seed value on reset

  ELSIF (clk'EVENT AND  clk='1') THEN -- clock with rising edge

   q(0) <= q(7);                      -- feedback to LS bit
   q(1) <= q(0);                                
   q(2) <= q(1) XOR q(7);             -- tap at stage 1
   q(3) <= q(2) XOR q(7);             -- tap at stage 2
   q(4) <= q(3) XOR q(7);             -- tap at stage 3
   q(7 DOWNTO 5)<= q(6 DOWNTO 4);     -- others bits shifted

  END IF;
  
 END PROCESS lfsr;

END ARCHITECTURE arc_1_to_many;

| Back to top |

Coding the LFSR

There are several ways of implementing LFSRs in your target device:

Write the VHDL code for each LFSR yourself, using tap positions provided by the various texts. You will have complete control over the design, no need to rely on third-party cores or code.
Use the configurable cores available from FPGA manufacturers (such as Altera Megafunctions or Xilinx logiblox).
Use VHDL packages such as lfsrstd.vhd by Ben Cohen.
Easics provide an on-line VHDL source code generator tool for CRC functions.

I have written a VHDL package which provides two functions. The first, lfsr_xor, will generate a LFSR with XOR type feedback and 1-to-many topology. The maximum length is limited to 16 bits, but it can easily be extended to any length - just add a new clause to the CASE statement. It will generate a warning for simulation if the lock-up state is ever reached. The other function, lfsr_xnor, is identical except that XNOR feedback is used. An file which shows how to use the functions is also included in the download.

| Back to top |

Implement LFSRs in Xilinx devices

So far we have seen how to implement LFSRs in VHDL such that any device can be targetted. Xilinx devices however, will allow optimal implementation of LFSRs with their internal distributed RAM. This type of RAM is available in the XC4000, Spartan/XL, Spartan-II and all Virtex families. Each CLB can be configured as a RAM and this allows very compact shift registers to be built. More information can be found in the following Xilinx sources:

Xilinx application note XAPP052.
Xilinx application note XAPP210.
Xilinx application note XAPP211.
Xilinx Xcell journal, Issue 35 Q1/2000.

| Back to top |

Maintained by Mark Harvey. Please email me with any comments.