Hi! Welcome to the tenth chapter of this series. I hope that you really grasped the stack concept discussed in the last chapter. Now, I'm going to explain an important concept in assembly language: the subroutine. The processor itself has no special instruction for declaring a subroutine. Rather, the assembler provides a means to group code into subroutines. Thus, MASM and TASM each have their own subroutine format. Fortunately, the difference is very small, so you can learn both relatively easily.
OK, so, what does the syntax look like? Let's suppose we'd like to declare a procedure called printout that takes a word as a parameter. Let's say it is the print-out code discussed in chapter 8, which we'd now like to turn into a procedure. The only thing we pass here is the word-sized pointer that will go into DX.
TASM | MASM |
---|---|
`proc printout msgptr:word`<br>`mov dx, [msgptr]`<br>`mov ah, 9`<br>`int 21h`<br>`ret`<br>`endp` | `printout proc msgptr: word`<br>`mov dx, msgptr`<br>`mov ah, 9`<br>`int 21h`<br>`ret`<br>`printout endp` |
Hmm.. it's pretty similar, isn't it? Note that MASM doesn't need the square brackets when referencing the parameters whereas TASM does. Note also that you have to write the ret instruction yourself to make sure that your subroutine properly returns to its caller. If you omit it, the processor will simply continue executing whatever instructions happen to come after the subroutine. So be careful! Beginning assembly programmers tend to forget this because high level languages return automatically at the end of a routine.
Then how can we invoke it?
TASM | MASM |
---|---|
`call printout, offset message` | `invoke printout, offset message` |
No big deal. In TASM we use the call instruction directly, whereas in MASM we ought to use its own invoke directive. The invoke is turned into an ordinary call when the program is assembled into executable code, however.
Now for the truth: the Intel x86 processor does provide a call instruction. However, the original call does NOT take any parameters at all. So, how is the assembler able to accept parameters? Well, the parameterized call or invoke is translated down to the original call, and the parameters are passed on the stack: they are pushed before the call and removed afterwards. The transformation is automatic (and tends to be pretty boring too), so you don't have to do the same thing over and over again. If you're curious, the actual code would be like this: (Warning: slightly technical notes below)
    push offset message
    call printout
    add sp, 2
Note the last instruction: 2 is added to the SP register. Why? We must restore the stack pointer too, because the word we pushed as a parameter is still sitting on the stack after the call returns. You could do a pop instead, but the pop needs a victim register, which may not be available. That's why the assembler chooses add sp, 2 instead. If you're confused about this, just forget it. This is taken care of by the assembler for you. :-)
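For the curious, here is a rough sketch of what the subroutine side might look like if we wrote the parameter access by hand instead of letting the assembler do it. This is my own illustration in TASM ideal mode, not the exact code either assembler emits. It assumes a near call (so the return address takes 2 bytes) and that the caller removes the parameter, as in the add sp, 2 snippet above. The offset +4 follows from that assumed stack layout.

```asm
ideal
p286n
model tiny
codeseg
org 100h
        jmp start

message db 'Hello World!$'

; printout written without the assembler's parameter syntax.
; Assumed stack after "push bp":
;   [bp+0] = caller's BP, [bp+2] = return address, [bp+4] = msgptr
proc printout
        push bp             ; save the caller's BP
        mov bp, sp          ; BP now marks our position in the stack
        mov dx, [bp+4]      ; fetch the parameter (the message pointer)
        mov ah, 9
        int 21h             ; DOS: print the '$'-terminated string at DS:DX
        pop bp              ; restore the caller's BP
        ret
endp

start:
        push offset message ; pass the parameter on the stack (needs 286+)
        call printout
        add sp, 2           ; throw away the 2-byte parameter
        mov ax, 4c00h
        int 21h
end
```

This is also why BP matters so much inside a subroutine: every parameter is located relative to it.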
How can we add more parameters? It's easy. Look at the example declaration below:
TASM | MASM |
---|---|
`proc myproc param1: word, param2: dword, param3: byte`<br>`local mylocal1: word`<br>`local mylocal2: byte`<br>`mov [mylocal1], ax`<br>`:`<br>`ret`<br>`endp` | `myproc proc param1: word, param2: dword, param3: byte`<br>`local mylocal1: word`<br>`local mylocal2: byte`<br>`mov mylocal1, ax`<br>`:`<br>`:`<br>`ret`<br>`myproc endp` |
The example above declares the procedure myproc, which takes 3 parameters and has two local variables. Note how the local variables are declared and used. Note also that we cannot initialize local variables the way we do with "global" variables, because there is no syntax like local blah:db 5. This would be illegal. Of course you can do a mov to assign a value to it later on.
Again, the local variables are actually "simulated" using the stack. The assembler is pretty smart about making this easy for programmers, so you don't have to deal with the stack yourself just to access local variables. Neat, huh?
Since procedures are built with the help of the stack, you have to remember not to modify SP and BP yourself anywhere in a subroutine (pushes and pops that balance out are fine). SP holds the current stack position, and BP holds the stack position recorded on entering the subroutine, which is how the parameters and local variables are found. If you mess around with these registers, you're toast. Remember this!
Moreover, when you modify certain registers in a subroutine, you are likely to interfere with the main program. Why? Suppose that the main program relies on the AX register a lot. Then it calls the subroutine. The subroutine modifies AX, does something and returns. The AX register has changed, which the caller may not expect. What would happen? Chaos.
So how do we cope with this situation? It's better for us to preserve the registers we're changing. Let's look at our first example, printout. We're changing AH and DX, right? Not only that, we have to look at the interrupt list entry for int 21h too:
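To make the "simulated using the stack" part a bit more concrete, here is a minimal hand-written sketch in TASM ideal mode. It is only my illustration of the idea, not the assembler's actual output: the procedure name demo, the offsets [bp-2] and [bp-4], and the choice to give the byte-sized local a full word of room are all assumptions.

```asm
ideal
p286n
model tiny
codeseg
org 100h
        jmp start

; Hand-written stand-in for a procedure with two locals.
; Assumed layout: mylocal1 (word) at [bp-2], mylocal2 (byte) at [bp-4]
proc demo
        push bp             ; save the caller's BP
        mov bp, sp          ; BP marks the stack position on entry
        sub sp, 4           ; reserve 4 bytes below BP for the locals
        mov [bp-2], ax      ; mylocal1 = AX
        mov [bp-4], al      ; mylocal2 = AL
        mov ax, [bp-2]      ; read mylocal1 back
        mov sp, bp          ; release the locals
        pop bp              ; restore the caller's BP
        ret
endp

start:
        mov ax, 1234h
        call demo
        mov ax, 4c00h
        int 21h
end
```

The locals live only between the sub sp, 4 and the mov sp, bp, which is why they cannot be given an initial value at declaration time.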
    INT 21 - DOS - PRINT STRING
    AH = 09h
    DS:DX = address of string terminated by "$"
    Note: Break checked, and INT 23h called if pressed
Ah, we can see that the entry lists no output registers for int 21h/09h, so the call itself is not documented to change anything. So, we're only obliged to save AX and DX, and the code gets revised like this:
TASM | MASM |
---|---|
`proc printout msgptr:word`<br>`push dx`<br>`push ax`<br>`mov dx, [msgptr]`<br>`mov ah, 9`<br>`int 21h`<br>`pop ax`<br>`pop dx`<br>`ret`<br>`endp` | `printout proc msgptr: word`<br>`push dx`<br>`push ax`<br>`mov dx, msgptr`<br>`mov ah, 9`<br>`int 21h`<br>`pop ax`<br>`pop dx`<br>`ret`<br>`printout endp` |
Ah... at the beginning we preserve DX and AX, then before returning, we restore them. Recall the LIFO nature of the stack: that's why the pops are in the reverse order of the pushes.
If we modify a lot of registers in our subroutine, we have to do a lot of pushes, and before returning we have to restore them all with lots of pops. This can be very tedious. On the 80286 or better, we have the pusha instruction, an abbreviation of "push all", which stores all eight general-purpose registers (AX, CX, DX, BX, SP, BP, SI and DI) but not the segment registers or the flags. The corresponding popa ("pop all") pops them back into the appropriate registers.
How about functions? We certainly want subroutines that can return some values too. How can we accomplish this? Usually, we designate registers to hold the output or result of our subroutine. As a convention, many programmers tend to choose AX for this purpose. If you have more than one output from the subroutine, you can select multiple registers to hold the results.
Because of this, the output registers need not be saved or restored: the caller itself expects those designated registers to change, right? You still have to preserve the others, though. For example, let's make a subroutine to calculate 1+2+...+n.
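As a quick illustration, here is how printout might look with pusha and popa in place of the individual pushes and pops. This is a sketch in TASM ideal mode with the stack frame written out by hand. The [bp+4] offset assumes a near call with a single word parameter that the caller removes afterwards.

```asm
ideal
p286n
model tiny
codeseg
org 100h
        jmp start

message db 'Hello World!$'

proc printout
        push bp
        mov bp, sp
        pusha               ; 286+: pushes AX, CX, DX, BX, SP, BP, SI, DI
        mov dx, [bp+4]      ; the message pointer parameter
        mov ah, 9
        int 21h             ; DOS: print the '$'-terminated string at DS:DX
        popa                ; restores them all (the saved SP value is skipped)
        pop bp
        ret
endp

start:
        push offset message
        call printout
        add sp, 2           ; discard the parameter
        mov ax, 4c00h
        int 21h
end
```

Keep in mind that popa restores AX along with everything else, so this pair is not suitable as-is for a subroutine that needs to hand a value back to its caller in a register.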
TASM | MASM |
---|---|
`proc addup n:word`<br>`push cx`<br>`sub ax, ax`<br>`mov cx, [n]`<br>`@@myloop:`<br>`add ax, cx`<br>`loop @@myloop`<br>`pop cx`<br>`ret`<br>`endp` | `addup proc n: word`<br>`push cx`<br>`sub ax, ax`<br>`mov cx, n`<br>`@@myloop:`<br>`add ax, cx`<br>`loop @@myloop`<br>`pop cx`<br>`ret`<br>`addup endp` |
Notice that in this subroutine, we modify CX and AX, but only CX is saved here. Why? It's because AX will hold the result. There would be no point in restoring AX too, would there?
Notice that I introduce the label @@myloop here, which is the target of the loop instruction. It's important to precede the label name with the @@ prefix. Why? It declares that this label applies only within this subroutine, which avoids label name clashes elsewhere in the program. Thanks to local labels, you can have other @@myloop labels in different subroutines!
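To see the convention from the caller's side, here is a small sketch in TASM ideal mode that calls addup with n = 10 and hands the result back to DOS as the program's exit code. The stack frame is written out by hand, and the [bp+4] offset assumes a near call with one word parameter removed by the caller. n is assumed to be at least 1, because loop with CX = 0 would go around 65536 times.

```asm
ideal
p286n
model tiny
codeseg
org 100h
        jmp start

; addup: returns 1+2+...+n in AX (n is the word pushed by the caller)
proc addup
        push bp
        mov bp, sp
        push cx             ; CX is our loop counter, so preserve it
        sub ax, ax          ; AX = 0, the running total
        mov cx, [bp+4]      ; CX = n
@@myloop:
        add ax, cx          ; add n, n-1, ..., 1
        loop @@myloop       ; CX = CX - 1, repeat while CX is not zero
        pop cx
        pop bp
        ret                 ; AX is deliberately NOT restored
endp

start:
        push 10             ; n = 10 (immediate push needs 286+)
        call addup          ; AX = 55 on return
        add sp, 2           ; discard the parameter
        mov ah, 4ch         ; AL still holds 55, used as the exit code
        int 21h
end
```

The caller simply reads AX after the call. Nothing else needs to be agreed between the two sides, which is why documenting the result register matters.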
I think it is a good habit to document a subroutine. At least give a comment above it. One example of this comment for the subroutine above is like this:
    ; -----------------------------------
    ; | addup
    ; |  -- Adding 1 through n
    ; |
    ; | input:
    ; |    parameter n: word
    ; | returns:
    ; |    result in AX
    ; | modifies:
    ; |    AX
    ; -----------------------------------
We have to clearly state which registers are modified in the subroutine so that we are not confused and know what to expect when calling this routine. Believe me, it will save you a lot of sleepless nights hunting bugs. :-)
OK, you now understand how subroutines work. Now, where should we place them? Recall our very first program:
    ideal
    p286n
    model tiny
    codeseg
    org 100h
        jmp start
    ; your data here
    ; Area 1
    start:
        mov ax, 4c00h
        int 21h
    ; Area 2
    end
You can place your subroutines either in area 1 or area 2. Pascal programmers will probably prefer area 1, whereas C/C++ programmers will prefer area 2. Example: let's put our addup and printout subroutines into the main program. Putting them into area 1 will look like this:
TASM | MASM |
---|---|
`ideal`<br>`p286n`<br>`model tiny`<br>`codeseg`<br>`org 100h`<br>`jmp start`<br>`message db 'Hello World!$'`<br>`proc addup n:word`<br>`push cx`<br>`sub ax, ax`<br>`mov cx, [n]`<br>`@@myloop:`<br>`add ax, cx`<br>`loop @@myloop`<br>`pop cx`<br>`ret`<br>`endp`<br>`proc printout msgptr:word`<br>`push dx`<br>`push ax`<br>`mov dx, [msgptr]`<br>`mov ah, 9`<br>`int 21h`<br>`pop ax`<br>`pop dx`<br>`ret`<br>`endp`<br>`start:`<br>`call printout, offset message`<br>`mov ax, 4c00h`<br>`int 21h`<br>`end` | `.286`<br>`.model tiny`<br>`.code`<br>`org 100h`<br>`entry:`<br>`jmp start`<br>`message db 'Hello World!$'`<br>`addup proc n: word`<br>`push cx`<br>`sub ax, ax`<br>`mov cx, n`<br>`@@myloop:`<br>`add ax, cx`<br>`loop @@myloop`<br>`pop cx`<br>`ret`<br>`addup endp`<br>`printout proc msgptr: word`<br>`push dx`<br>`push ax`<br>`mov dx, msgptr`<br>`mov ah, 9`<br>`int 21h`<br>`pop ax`<br>`pop dx`<br>`ret`<br>`printout endp`<br>`start:`<br>`invoke printout, offset message`<br>`mov ax, 4c00h`<br>`int 21h`<br>`end entry` |
Or alternatively in the second area like this:
TASM | MASM |
---|---|
`ideal`<br>`p286n`<br>`model tiny`<br>`codeseg`<br>`org 100h`<br>`jmp start`<br>`message db 'Hello World!$'`<br>`start:`<br>`call printout, offset message`<br>`mov ax, 4c00h`<br>`int 21h`<br>`proc addup n:word`<br>`push cx`<br>`sub ax, ax`<br>`mov cx, [n]`<br>`@@myloop:`<br>`add ax, cx`<br>`loop @@myloop`<br>`pop cx`<br>`ret`<br>`endp`<br>`proc printout msgptr:word`<br>`push dx`<br>`push ax`<br>`mov dx, [msgptr]`<br>`mov ah, 9`<br>`int 21h`<br>`pop ax`<br>`pop dx`<br>`ret`<br>`endp`<br>`end` | `.286`<br>`.model tiny`<br>`.code`<br>`org 100h`<br>`entry:`<br>`jmp start`<br>`message db 'Hello World!$'`<br>`start:`<br>`invoke printout, offset message`<br>`mov ax, 4c00h`<br>`int 21h`<br>`addup proc n: word`<br>`push cx`<br>`sub ax, ax`<br>`mov cx, n`<br>`@@myloop:`<br>`add ax, cx`<br>`loop @@myloop`<br>`pop cx`<br>`ret`<br>`addup endp`<br>`printout proc msgptr: word`<br>`push dx`<br>`push ax`<br>`mov dx, msgptr`<br>`mov ah, 9`<br>`int 21h`<br>`pop ax`<br>`pop dx`<br>`ret`<br>`printout endp`<br>`end entry` |
Or you can split your subroutines across both areas, whichever you prefer. Don't forget to add comments too!
OK, I think that's all for now. By this time, you should be able to program a lot of assembly with ease. Why don't you try to code your favourite subroutines just for exercise?
I'd like to thank Victor Forsyuk, who created the wonderful Norton Guides database from which the interrupt list excerpt in this tutorial is quoted. See you next time.
Roby Joehanes © 2001