DATA STRUCTOR

STACKS
A stack is used to provide temporary storage space for values. It is defined as a data structure which operates on a first in, last out basis. Its uses a single pointer (index) to keep track of the information in the stack.

The basic operations associated with a stack are,

The following diagram shows an empty stack of four locations. It looks just like an array, and it has an index pointer pointing to the beginning of the stack (called the TOP of the stack).

 

     +--------+

     |        |   <------- Stack Pointer       

     +--------+ 

     |        |         

     +--------+ 

     |        |         

     +--------+

     |        |          

     +--------+

 

Pushing items onto the stack
The stack pointer is considered to be pointing to a free (empty) slot in the stack. A push operation thus involves copying data into the empty slot of the stack, then adjusting the stack pointer to point to the next free slot.

 

   Module Push

        stack[stack_pointer] = data;

        stack_pointer = stack_pointer + 1;

   Module End

 

Lets push the value 45 onto the stack. [Note: The stack could hold any data type, not necessarily decimal numbers. We have used numbers for simplicity]. The stack now looks like

 

     +--------+ 

     |   45   |         

     +--------+

     |        |   <------- Stack Pointer       

     +--------+ 

     |        |         

     +--------+ 

     |        |         

     +--------+

 

Note how the stack pointer has been adjusted to point to the next free location in the stack. [Note: for the time being we are ignoring certain problems. We will address these shortly!!!].

Removing items from the stack
To remove an item, first decrement (subtract 1 from) the stack pointer, then extract the data from that position in the stack.

 

   Module Remove

        stack_pointer = stack_pointer - 1;

        data = stack[stack_pointer];

   Module End

 

Time now to address the problems of the above implementation
There are a number of problems associated with the above routines for pushing and removing items.

There are a number of solutions to this problem. We shall present a simplified solution. We do not argue that it is the best, just that it is one of a possible number of solutions.

 

   Comment: Assume that the array elements begin at 1

   Comment: Assume that the maximum elements of the stack is MAX

 

   Var: stack[MAX] : Integer;

 

   Module Initialize

        stack_pointer = 1;

   Module End

 

   Module Push

        if stack_pointer >= MAX then

           return error

        else begin

           stack[stack_pointer] = data;

           stack_pointer = stack_pointer + 1;

        end

   Module End

 

   Module Remove

        if stack_pointer <= 1 then

           return error

        else begin

           stack_pointer = stack_pointer - 1;

           data = stack[stack_pointer];

        end

   Module End

 


QUEUES
Everybody has experience with queues as they are common place. Queues occur in cafeterias, petrol stations, shopping centres, anyplace where many people (customers) line up for few resources (cashier's, salespeople, petrol pumps etc).

The purpose of a queue is to provide some form of buffering. Typical uses of queues in computer systems are,

A queue is defined as a data structure which holds a series of items to be processed on a first in first out basis (though some queues can be sorted in priority). The operations associated with a queue are,

The following diagram shows an empty queue. It is identified as a series of ten locations, and two pointers named front and rear. These two pointers keep track of where the front and rear of the queue is.

 

          1   2   3   4   5   6   7   8   9  10 

        +---+---+---+---+---+---+---+---+---+---+ 

        |   |   |   |   |   |   |   |   |   |   | 

        +---+---+---+---+---+---+---+---+---+---+ 

        +---+ 

        | 1 | Front 

        +---+ 

        +---+ 

        | 10| Rear 

        +---+ 

 

The front pointer is used to delete items, and the rear pointer insert items. Inserting two items called A and B will rearrange the queue to look like,

 

          1   2   3   4   5   6   7   8   9  10 

        +---+---+---+---+---+---+---+---+---+---+ 

        | A | B |   |   |   |   |   |   |   |   | 

        +---+---+---+---+---+---+---+---+---+---+ 

        +---+ 

        | 1 | Front 

        +---+ 

        +---+ 

        | 2 | Rear 

        +---+ 

 

The pseudo-code for the various queue operations follows.

 

        module initialize 

               count = 0 

               head = 1 

               rear = size of queue 

        end module initialize 

 

        module insert 

               if count = size of queue 

                       queue is full and do not insert 

               else 

               begin 

                       increment count 

                       increment rear 

                       if rear > size of queue 

                               rear = 1 

                       endif 

                       insert data at queue[rear] 

               endif 

        end module insert 

 

        module remove 

               if count = 0 

                       queue is empty and do not remove 

               else 

               begin 

                       data = queue[front] 

                       decrement count 

                       increment front 

                       if front > size of queue 

                               front = 1 

                       endif 

               endif 

        end module remove 

 

        module count 

               return count 

        end module count 

 

        module free_space 

               return queue.size - count 

        end module free_space 

 

 

The implementation of this is left to the student as a programming exercise.


LINKED LISTS
Data structures can be arranged in memory in a variety of different ways. Each method has advantages and disadvantages. Arrays for example seem easy to use, where each element is stored sequentially in memory.

This type of approach is not efficient in larger computer systems, where a number of users share main memory. In such circumstances, there may not be enough contiguous main memory left to hold a sequentially stored data structure like an array (but there could be enough if all the small free blocks were moved into one large block).

One way to overcome this is to link all elements of data together. Data is arranged into records, and a special item is added to each record called a pointer (a link field), which contains a link to the next record etc. Renaming each record as a node and we have a complex data structure called a linked list.

A linked list is a series of nodes connected together via pointers. Pictorially, it looks like,

 

         Node 1          +---+ Node 2        +---+ Node n 

        +--------+--+    |  +--------+--+    |  +--------+--+ 

        |        | ------+  |        | ------+  |        |  | 

        +--------+--+       +--------+--+       +--------+--+ 

           Data   Link        Data    Lnk         Data    Lnk 

 

In Pascal, a node definition looks like,

 

        type    ptr = ^node; 

               node =  record 

                       data : string[20]; 

                       next : ptr; 

               end; 

 

The following Pascal program declares three nodes, inserts data into each of them, then using a pointer, cycles through the chain from node to node printing out the contents of each node. The value 0 is always stored in the last node of a chain, to indicate the end of the list.

 

        program linked_list; 

        type    ptr = ^node; 

               node = record 

                       data : string[20]; 

                       next : ptr; 

                       end; 

        var 

               node1, node2, node3 : node; 

               nodeptr : ptr; 

        begin 

               node1.data := 'Welcome '; 

               node1.next := @node2; 

               node2.data := 'to '; 

               node2.next := @node3; 

               node3.data := 'Linked Lists.'; 

               node3.next := nil; 

               nodeptr := @node1; 

               while nodeptr <> nil do 

               begin 

                       write( nodeptr^.data ); 

                       nodeptr := nodeptr^.next; 

               end; 

                writeln; 

        end. 

 

Linked lists are used in a wide variety of applications.

An advantage of storing data using a linked list is that the key sequence can be altered by changing the pointers, not by moving data nodes. This means sorting data is quick, but searching is slower as you must follow the chain from node to node.


HASHING
Hashing is a technique used to improve data access times. Rather than sorting the data according to a key, it computes the location of the data from the key. In simple terms, it computes an index value from the key of the desired record. At that index value, the record is stored/retrieved.

If we had an array of 1000 elements, we would expect our hash function (the method used to calculate the index value) to evenly distribute the keys over this range. In practice this is difficult to achieve. It is possible that two different search keys might generate the same hash value (ie, index), and we call this a collision.

The size of the table (array) is generally a prime number, as this has the least probability of creating collisions.

The following code segment defines an array of records in Pascal which will be used to demonstrate hashing.

 

        const   size = 511; 

        type    data = record 

                       name : string[20]; 

                       age : integer; 

               end; 

               hashtable = array[1..size] of data; 

 

Next, lets initialize the table with zero's, so this makes it easy to see if information is already stored at a particular element (important when inserting data later).

 

        procedure initialize ( var Table : hashtable ); 

        var i : integer; 

        begin 

               for i := 1 to size do 

               begin 

                       Table[i].name := '                    '; 

                       Table[i].age := 0; 

               end; 

        end; 

 

The procedure insert will take the hash value and the data and insert it into the table. If the position is already occupied (ie, a collision occurs) the table will be searched from that point onwards till an empty slot occurs.

 

        procedure insert ( Position : integer; Element : data ; var Table : hashtable ); 

        begin 

               while Table[Position].name[1] <> ' ' do 

                       Position := (Position + 1) mod size; 

               Table[Position] := Element; 

        end; 

 

The procedure hash will generate a hash value from a given data record. In deciding this, it must generate a value between 1 and 511. Obviously, we cannot use the age field of the record, this will generate too many collisions. The best method is to add up the value of all the characters in the persons name, then extract nine bits from it (2^9 = 512) to generate the index value.

 

        procedure hash ( var Position : integer ; Element : data ; var Table : hashtable ); 

        var i : integer; 

        begin 

               i := 1; 

               position := 0; 

               while i < 20 do 

                       position := position + ord(Element.name[i]); 

               position := position mod 511; 

        end; 

 

The remaining procedures to search and delete are left as a programming exercise for the student.


INDEXED SEQUENTIAL
This method seeks to make a direct access file appear like a sequence file. The sequencing is achieved via an index table, which holds the keys of the records and their position within the file.

A program reads the index sequentially, and when it locates the key it desires, it reads the record from the indicated position.

The advantages of this method are,


FILE MERGING
This is the process of merging two or more input files into a single output file (which are normally all sequential in nature). The files to be merged are first sorted using the same key sequence, which is preserved in the output file.

A pseudo-code example for a 2-way merge is shown below.

 

        program two_way merge 

        begin 

               read 1st record of infile1, name as rec1 

               read 1st record of infile2, name as rec2 

               while rec1 or rec2 is not an end-of-record 

               begin 

                       if rec1.key < rec2.key 

                       begin 

                               write rec1 to outfile 

                               read new rec1 from  infile1 

                       end 

                       else 

                       begin 

                               write rec2 to outfile 

                               read new rec2 from infile2 

                       end 

                       endif 

               endwhile 

               write end-of-record to outfile 

        end program two_way merge