Subroutines

Modular Approach For Assembly Language




Welcome

Hi! Welcome to the tenth chapter of this series. I hope that you really grasped the stack concept discussed in the last chapter. Now, I'm going to explain an important concept in assembly language: the subroutine. The processor itself has no special instruction for declaring a subroutine. Rather, the assembler provides a means to group code into subroutines. Thus, MASM and TASM each have their own subroutine format. Fortunately, the difference is very small, so you can learn both relatively easily.

 

Subroutine Syntax

OK, so what does the syntax look like? Let's suppose we'd like to declare a procedure called printout that takes a word as a parameter. Think of it as the print-out code discussed in chapter 8, which we'd now like to turn into a procedure. The only thing we pass in is the word-sized pointer to the message, which will go into DX.

 
TASM:
proc printout  msgptr:word
     mov    dx, [msgptr]
     mov    ah, 9
     int    21h
     ret
endp

MASM:
printout proc  msgptr: word
     mov    dx, msgptr
     mov    ah, 9
     int    21h
     ret
printout endp

Hmm.. it's pretty similar, isn't it? Note that MASM doesn't need the square brackets when referencing parameters, whereas TASM does. Note also that you have to write the ret instruction yourself to make sure that your subroutine properly returns to its caller. If you omit it, the processor will simply continue executing whatever instructions lie beneath it. So be careful! Beginning assembly programmers tend to forget this because high-level languages return automatically.

Then how can we invoke it?

 
TASM:
call printout, offset message

MASM:
invoke printout, offset message

No big deal. In TASM we use the call instruction directly, whereas in MASM we have to use its own invoke directive. The invoke is translated down to a plain call once the code is assembled, however.

Now for the truth: The Intel x86 processor does provide a call instruction. However, the original call does NOT take any parameters at all. So how is the assembler able to accept parameters? Well, the parameterized call or invoke is translated down to the original call, and the parameters are passed on the stack using push. The transformation is automatic (and doing it by hand tends to be pretty boring), so you don't have to do the same thing over and over again. If you're curious, the actual code would look like this: (Warning: a bit of tech notes below)

    push offset message
    call printout
    add  sp, 2

Note the last instruction: 2 is added to the SP register. Why? The push placed one word (2 bytes) on the stack for the parameter, so we must restore the stack pointer after the call. You could do a pop instead, but the pop needs a victim register, which may not be available. That's why the assembler chooses add sp, 2 instead. If you're confused about this, just forget it. The assembler takes care of it for you. :-)
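
If you're curious about the other side of the call too, here is a minimal hand-written sketch of what printout could look like without the assembler's help. It assumes a near call (as in the tiny model) and a caller that cleans up the stack; the name printout_manual is made up for illustration, and the code your assembler actually generates may differ in detail.

```asm
printout_manual:
     push   bp              ; save the caller's BP
     mov    bp, sp          ; BP now points at our stack frame
                            ;   [bp]   = saved BP
                            ;   [bp+2] = return address (near call)
                            ;   [bp+4] = the pushed parameter (msgptr)
     mov    dx, [bp+4]      ; fetch the message pointer
     mov    ah, 9
     int    21h
     pop    bp              ; restore the caller's BP
     ret                    ; return; the caller then does add sp, 2
```

Seen this way, a parameter name such as msgptr is really just a friendly alias for a BP-relative stack location.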

 

More on Parameters and Local Variables

How can we add more parameters? It's easy. Look at the example declaration below:

TASM:
proc myproc param1: word, param2: dword, param3: byte
    local  mylocal1: word
    local  mylocal2: byte

    mov    [mylocal1], ax
       :
       :
    ret
endp

MASM:
myproc proc param1: word, param2: dword, param3: byte
    local  mylocal1: word
    local  mylocal2: byte

    mov    mylocal1, ax
       :
       :
    ret
myproc endp

The example above declares the procedure myproc, which takes 3 parameters and has two local variables. Note how the local variables are declared and used. Note also that we cannot initialize local variables the way we do "global" variables, because there is no syntax like local blah:db 5. This would be illegal. Of course, you can use a mov to assign a value later on.

Again, the local variables are actually "simulated" using the stack. The assembler is pretty smart about easing the programmer's work here, so you don't have to deal with the stack yourself just to access local variables. Neat, huh?
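
To see what "simulated" means here, this is a rough hand-written sketch of a stack frame holding the two locals above. The label myproc_manual and the offsets are assumptions for illustration; the assembler picks the real offsets and may pad the byte variable differently.

```asm
myproc_manual:
     push   bp
     mov    bp, sp
     sub    sp, 4           ; make room below BP for the locals
                            ;   [bp-2] = mylocal1 (word)
                            ;   [bp-4] = mylocal2 (byte, padded to keep SP even)
     mov    [bp-2], ax      ; same effect as: mov [mylocal1], ax
     mov    [bp-4], al      ; same effect as: mov [mylocal2], al
     mov    sp, bp          ; throw the locals away
     pop    bp
     ret
```

Because the space is carved out fresh on every call, the locals start out holding whatever garbage happened to be on the stack, which is another reason they cannot be initialized in the declaration.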

 

A Word of Caution

Since procedures are built with the help of the stack, you have to remember not to tamper with SP and BP inside a subroutine, apart from balanced pushes and pops. SP holds the current stack position, and BP holds the stack position recorded on entry to the subroutine, which is how the parameters and local variables are found. If you mess around with these registers, you're toast. Remember this!

Moreover, when you modify certain registers in a subroutine, you are likely to interfere with the main program. Why? Suppose that the main program relies on the AX register a lot. Then it calls the subroutine. The subroutine modifies AX, does something, and returns. The AX register has changed, which the caller may not expect. What would happen? Chaos.

So how do we cope with this situation? It's better for us to preserve the registers we change. Let's look at our first example, printout. We're changing AH and DX, right? Not only that, we have to look at the interrupt list entry for int 21h too:

INT 21 - DOS - PRINT STRING
        AH = 09h
        DS:DX = address of string terminated by "$"
Note:  Break checked, and INT 23h called if pressed

Ah, we can see that the entry for int 21h/09h lists no output registers. So, we're only obliged to save AX and DX, and the code gets revised like this:

TASM:
proc printout  msgptr:word
     push   dx
     push   ax

     mov    dx, [msgptr]
     mov    ah, 9
     int    21h

     pop    ax
     pop    dx
     ret
endp

MASM:
printout proc  msgptr: word
     push   dx
     push   ax

     mov    dx, msgptr
     mov    ah, 9
     int    21h

     pop    ax
     pop    dx
     ret
printout endp

Ah... at the beginning we preserve DX and AX; then, before returning, we restore them. Recall the LIFO nature of the stack: that's why the pops come in the reverse order of the pushes.

If we modify a lot of registers in our subroutine, we have to do a lot of pushes and, before returning, restore them with lots of pops. This can be very tedious. On the 80286 or better, we have the pusha instruction, short for "push all", which stores all eight general-purpose registers (AX, CX, DX, BX, SP, BP, SI and DI, but not the segment registers or the flags). We also have the corresponding popa ("pop all") to pop them back into the appropriate registers.
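
As a sketch, printout could be rewritten with this pair as shown below. TASM syntax is used, and it assumes the p286n directive from our program skeleton is in effect; the MASM version would differ only in the proc/endp lines and the brackets.

```asm
proc printout  msgptr:word
     pusha                  ; push AX, CX, DX, BX, SP, BP, SI, DI
     mov    dx, [msgptr]
     mov    ah, 9
     int    21h
     popa                   ; pop them all back (the saved SP value is discarded)
     ret
endp
```

Keep in mind that popa restores AX as well, so this shortcut does not suit a subroutine that is supposed to hand a result back in a register.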

 

How About Functions?

How about functions? We certainly want subroutines that can return values too. How can we accomplish this? Usually, we designate registers to hold the output or result of our subroutine. As a convention, many programmers choose AX for this purpose. If you have more than one output from the subroutine, you can select multiple registers to hold the results.

Because of this, the output registers need not be saved or restored: the caller itself expects those designated registers to change, right? You still have to preserve the others, though. For example, let's make a subroutine to calculate 1+2+...+n.

 
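TASM:
proc addup  n:word
     push   cx

     sub    ax, ax          ; AX = 0, the running total
     mov    cx, [n]         ; CX counts down from n (n must be at least 1)
@@myloop:
     add    ax, cx          ; add n, n-1, ..., 1
     loop   @@myloop        ; decrement CX, repeat until it reaches 0

     pop    cx
     ret
endp

MASM:
addup proc  n: word
     push   cx

     sub    ax, ax          ; AX = 0, the running total
     mov    cx, n           ; CX counts down from n (n must be at least 1)
@@myloop:
     add    ax, cx          ; add n, n-1, ..., 1
     loop   @@myloop        ; decrement CX, repeat until it reaches 0

     pop    cx
     ret
addup endp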

Notice that in this subroutine, we modify CX and AX, but only CX is saved here. Why? It's because AX will hold the result. There would be no point in restoring AX too, would there?
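
Calling it could look like the sketch below. Use one form or the other, depending on your assembler. The variable total is hypothetical (declare it in your data area as total dw ?), and the routine is assumed to add up n, n-1, ..., 1 as described above.

```asm
; TASM
     call   addup, 10       ; AX = 1+2+...+10 = 55
     mov    [total], ax     ; store the result before AX gets reused

; MASM
     invoke addup, 10       ; AX = 1+2+...+10 = 55
     mov    total, ax       ; store the result before AX gets reused
```

Since the result lives in AX, grab it right after the call; the very next int 21h or subroutine call may overwrite it.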

Notice that here I introduce the label @@myloop. It's important to precede the label name with the @@ prefix. Why? In TASM, it declares that this label applies only within this subroutine, which avoids label clashes between subroutines. (MASM 6 and later already treats labels inside a procedure as local by default.) Thanks to local labels, you can have other @@myloop labels in different subroutines!
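
For instance, the following TASM sketch has two made-up routines, printstars and printdots, that each use their own @@myloop. Both expect the repeat count in CX (at least 1) and use int 21h function 02h, which prints the character in DL. Neither preserves AX, CX or DX, to keep the sketch short.

```asm
proc printstars
@@myloop:
     mov    ah, 2
     mov    dl, '*'
     int    21h             ; print one asterisk
     loop   @@myloop        ; jumps to the @@myloop inside printstars
     ret
endp

proc printdots
@@myloop:
     mov    ah, 2
     mov    dl, '.'
     int    21h             ; print one dot
     loop   @@myloop        ; jumps to the @@myloop inside printdots
     ret
endp
```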

I think it is a good habit to document a subroutine. At least put a comment above it. One example of such a comment for the subroutine above looks like this:

; -----------------------------------
; |  addup
; |     -- Adding 1 through n
; |
; |  input:
; |     parameter n: word
; |  returns:
; |     result in AX
; |  modifies:
; |     AX
; -----------------------------------

We have to clearly state which registers are modified in the subroutine so that we are not confused and know what to expect when calling it. Believe me, it will save you a lot of sleepless nights hunting bugs. :-)

 

Routine Placement

OK, you now understand how subroutines work. Now, where should we place them? Recall our very first program:

ideal
p286n
model tiny

codeseg
   org 100h
   jmp start

   ; your data here

   ; Area 1

start:
   mov ax, 4c00h
   int 21h

   ; Area 2
end

You can place your subroutines either in area 1 or area 2. Pascal programmers will probably like area 1 better, whereas C/C++ programmers will prefer area 2. Example: let's put our addup and printout examples into the main program. Putting them into area 1 looks like this:

 
TASM:
ideal
p286n
model tiny, c              ; c: language type, so the assembler sets up the parameter stack frame

codeseg
   org 100h
   jmp start

   message db 'Hello World!$'

   proc addup  n:word
        push   cx

        sub    ax, ax
        mov    cx, [n]
   @@myloop:
        add    ax, cx
        loop   @@myloop

        pop    cx
        ret
   endp

   proc printout  msgptr:word
        push   dx
        push   ax

        mov    dx, [msgptr]
        mov    ah, 9
        int    21h

        pop    ax
        pop    dx
        ret
   endp

start:
   call printout, offset message

   mov  ax, 4c00h
   int  21h
end

MASM:
.286
.model tiny, c             ; PROC parameters and invoke need a language type

.code
   org 100h
entry:
   jmp start

   message db 'Hello World!$'

   addup proc  n: word
        push   cx

        sub    ax, ax
        mov    cx, n
   @@myloop:
        add    ax, cx
        loop   @@myloop

        pop    cx
        ret
   addup endp

   printout proc msgptr:word
        push   dx
        push   ax

        mov    dx, msgptr
        mov    ah, 9
        int    21h

        pop    ax
        pop    dx
        ret
   printout endp

start:
   invoke printout, offset message

   mov ax, 4c00h
   int 21h
end entry

Or alternatively in the second area like this:

TASM:
ideal
p286n
model tiny, c              ; c: language type, so the assembler sets up the parameter stack frame

codeseg
   org 100h
   jmp start

   message db 'Hello World!$'

start:
   call printout, offset message

   mov  ax, 4c00h
   int  21h

   proc addup  n:word
        push   cx

        sub    ax, ax
        mov    cx, [n]
   @@myloop:
        add    ax, cx
        loop   @@myloop

        pop    cx
        ret
   endp

   proc printout  msgptr:word
        push   dx
        push   ax

        mov    dx, [msgptr]
        mov    ah, 9
        int    21h

        pop    ax
        pop    dx
        ret
   endp

end

MASM:
.286
.model tiny, c             ; PROC parameters and invoke need a language type

printout proto :word       ; prototype, because invoke comes before the procedure body

.code
   org 100h
entry:
   jmp start

   message db 'Hello World!$'

start:
   invoke printout, offset message

   mov ax, 4c00h
   int 21h

   addup proc  n: word
        push   cx

        sub    ax, ax
        mov    cx, n
   @@myloop:
        add    ax, cx
        loop   @@myloop

        pop    cx
        ret
   addup endp

   printout proc msgptr:word
        push   dx
        push   ax

        mov    dx, msgptr
        mov    ah, 9
        int    21h

        pop    ax
        pop    dx
        ret
   printout endp

end entry

Or you can put subroutines in both areas, whichever you prefer. Don't forget to add comments too!

 

Closing

OK, I think that's all for now. By this time, you should be able to write a lot of assembly programs with ease. Why don't you try coding your favourite subroutines as an exercise?

I'd like to thank Victor Forsyuk, who created a wonderful Norton Guides database, an excerpt of which is quoted as the interrupt list in this tutorial. See you next time.

 




Roby Joehanes © 2001