Chapter 1: COM Program Structure

Welcome

Welcome to your first assembly lesson. I suggest you read the preliminary lesson before proceeding. If you have read it, I understand that not everything is clear yet, so I will try to clarify things here. I also assume that you have the tools you need: an assembler and a text editor. If you don't have either TASM or MASM, you can use one of the free assemblers instead, although you'd have to adjust the code a bit. I assume that you are able to convert numbers between decimal, binary, and hexadecimal. You will probably also want a Norton Guide for assembly language. It acts as a quick reference when you forget a command or a "magic number".

Why Assembly?

Some of you may complain about assembly: it's difficult, error-prone, hard to debug, takes a lot of time to develop, etc. Yes, that's true. However:

Assembly is fast. Carefully hand-written assembly can run a LOT faster than the code a compiler generates.
Assembly is closer to the machine level than any other language because assembly commands map (almost) one-to-one to machine instructions.
Assembly code is small. Hand-written assembly can be a LOT smaller than the code a compiler generates.
In assembly, you have the RAW power of your machine. You can tweak it any way you want.
In assembly, you can do a lot of things that you can't do in higher-level languages, such as playing with the processor flags.

'Nuff said. Still, you must have a really good reason to write in assembly language. Don't do everything in assembly. Instead, rewrite in assembly only the parts of your code that really need the speed or the small size. Otherwise you'd end up with unfinished projects.

COM Structure

Among the simplest program structures in assembly language is the COM structure. It is quite straightforward to implement. However, some restrictions apply:

Code and data must fit within a 64 KB limit (there is a workaround for this, but that's another story).
You cannot reserve memory through the operating system (there is a workaround for this, too).

Well, don't worry about the 64 KB restriction. That is a whole lot of code and data, at least for now. Trust me! After you have written some assembly language, you'll probably understand why 640 KB once seemed like more memory than anybody would ever need (a remark often attributed to Bill Gates, though he has denied making it) :-).

OK, let's look into our first program (TASM ideal version):

ideal
p286n
model tiny

codeseg
   org 100h
   jmp start

   ; your data and subroutine here

start:
   mov ax, 4c00h
   int 21h
end

Here is the MASM version:

.286
.model tiny

.code
   org 100h
entry:
   jmp start

   ; your data and subroutine here

start:
   mov ax, 4c00h
   int 21h
end entry

The two versions are pretty much the same. Save your file as MYPROG.ASM.

To assemble it using TASM, type the following:

tasm myprog
tlink /t myprog

For MASM, type: ML myprog.asm

Here is a line-by-line explanation of the program:

ideal says that we're using TASM's ideal syntax.
p286n or .286 says that 80286 processor instructions are allowed. This program still uses only 8088 instructions, so the result also runs on a PC/XT.
model tiny or .model tiny selects the tiny memory model, in which code and data share a single segment. This is the model used for the COM format.
codeseg or .code marks the beginning of our code.
org 100h is the magic word you have to say in (almost) every normal COM program. It tells the assembler that the program starts at offset 100h, which is where DOS loads a COM program (the first 100h bytes of the segment are taken by the Program Segment Prefix). Don't worry if this is not clear yet.
In MASM, we have to define an entry point, while in TASM we don't have to. The entry point is a label. Here we have a label called entry (marked with a colon).
COM programs almost always begin with a jump to the beginning of the code. Between the jump and the beginning of your code is where you place your variables. The jump is written with the word jmp followed by a label (here we call it start).
After the label start, the next two lines are the code that terminates your program. IMPORTANT: Do not assume that your program terminates when it reaches the end point. You have to say explicitly that you want to quit.
end or end entry marks the end of your program. In MASM, you have to name the entry point again after the word end.
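
To make the space between the jump and start more concrete, here is a minimal sketch in the MASM syntax used above. The variable name exitcode and its value 3 are made up for this illustration; the lesson itself does not define any variables. The sketch also assumes the usual DOS convention that function 4Ch takes the function number in AH and returns AL as the program's exit code, which is what the two halves of 4c00h stand for.

```asm
.286
.model tiny

.code
   org 100h
entry:
   jmp start               ; skip over the data so it is never executed as code

   ; hypothetical variable, placed between the jump and start
exitcode db 3              ; one byte holding the exit code we want to return

start:
   mov al, [exitcode]      ; AL = exit code, read from our variable
   mov ah, 4ch             ; AH = 4Ch, the DOS "terminate program" function
   int 21h                 ; quit, this time with exit code 3 instead of 0
end entry
```

If it assembles as expected, the exit code can be checked from a batch file with IF ERRORLEVEL 3.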

Making Labels

In assembly, making a label is simple: pick a name and follow it with a colon (:). A label usually serves as a tag for a place you'd like to jump to, and so on. You have to pick a unique name for each label, otherwise the assembler will report an error. There is a way to make a label local: prefix the label name with @@ and still end it with a colon. However, this kind of label is meant for use inside procedures or subroutines, so don't use it right now.
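
As a small illustration of labels as jump targets, here is a sketch in MASM syntax. The label skipped is hypothetical and exists only to show that each label needs its own unique name; nothing in the lesson's program uses it.

```asm
.286
.model tiny

.code
   org 100h
entry:                     ; label marking the entry point
   jmp start               ; jump to the label start

skipped:                   ; hypothetical label; nothing jumps here
   mov ax, 4c01h           ; never executed: the jmp above skips this line

start:                     ; execution continues here after the jump
   mov ax, 4c00h
   int 21h                 ; terminate the program
end entry
```

Renaming skipped to start would give two labels with the same name, which is the kind of clash the assembler rejects.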

Note

In assembly, your program DOES NOT terminate automatically when it reaches the word end or end entry. Pay attention to this. The end directive only tells the assembler where the source ends; it generates no code. You have to say explicitly that you would like to terminate your program. The two-line sequence in the example above can be used to terminate programs.

An assembly language command on the x86 platform is usually formatted as follows:

[label:] mnemonic target, source

Mnemonic is just the jargon for an assembly command. Why is it called that? Because the commands in assembly (somewhat) resemble English words, which makes them easy to remember. The mnemonic is followed by the target, then a comma, and then the source. A label, if any, precedes the command. For example:

mov ax, 4c00h

This moves the value 4c00 hexadecimal into the register AX. Simple, right?

If there is only one parameter (as in int 21h), it usually denotes either the source or the destination, depending on the command. In jmp start, it means jump to the destination start. In int 21h, you are invoking the operating system's services through interrupt vector 21 hexadecimal.
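
The general format can be read off the lesson's own program. The sketch below is the MASM version again, only rearranged so that each label sits on the same line as its command; this layout is a choice made for this illustration.

```asm
.286
.model tiny

.code
   org 100h
entry:  jmp start          ; label, mnemonic, one operand (the destination)
start:  mov ax, 4c00h      ; label, mnemonic, target AX, source 4c00h
        int 21h            ; no label, mnemonic, one operand (the vector number)
end entry
```

A label on its own line, as in the original listing, marks the same spot as one written in front of the command.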

You will notice that many numbers in assembly language are expressed in either hexadecimal or binary. So, be really prepared.
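
For reference, here is a short sketch of how the same value can be written in the three number bases, in MASM syntax. The decimal and binary lines are added purely for illustration; only the hexadecimal form appears in the lesson.

```asm
.286
.model tiny

.code
   org 100h
entry:
   mov ax, 19456               ; decimal (no suffix)
   mov ax, 0100110000000000b   ; binary, marked with a trailing b
   mov ax, 4c00h               ; hexadecimal, marked with a trailing h
   ; a hexadecimal number that starts with a letter needs a leading 0,
   ; e.g. 0ffh, otherwise the assembler takes it for a name
   int 21h                     ; AX = 4c00h at this point, so the program quits
end entry
```

All three mov lines load the same 16-bit value, since 4c00 hexadecimal is 4 x 4096 + 12 x 256 = 19456.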

If you list the directory after you have correctly assembled your program, you will be surprised: the whole bunch of lines is squashed into only 8 bytes! And it runs! WOW! See? It could be even smaller still (5 bytes or even 2 bytes).
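
One way to reach the 5-byte figure, assuming this is what is meant, is to drop the jump, because this program has no data to jump over. The sketch below uses MASM syntax, and the byte values in the comments are the standard encodings of these two instructions.

```asm
.286
.model tiny

.code
   org 100h
entry:
   mov ax, 4c00h           ; 3 bytes: B8 00 4C
   int 21h                 ; 2 bytes: CD 21
end entry
```

The remaining bytes of the 8-byte version belong to jmp start. The 2-byte figure could be reached with a lone int 20h (bytes CD 20), an older DOS terminate call that works in COM programs; that is again an assumption about the intended program, not something the lesson states.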
