Memory      machine    assembly      comments
location    language   language
            (HEX)
21AA:0100   B409       MOV AH,09     ;display string of characters
21AA:0102   BA1701     MOV DX,0117   ;point to string
21AA:0105   CD21       INT 21        ;do it
21AA:0107   B401       MOV AH,01     ;keyboard input function
21AA:0109   CD21       INT 21        ;do it
21AA:010B   B44C       MOV AH,4C     ;exit function
21AA:010D   2C30       SUB AL,30     ;convert to number
21AA:010F   7EF6       JLE 0107      ;jump to 107 if < "1"
21AA:0111   3C09       CMP AL,09     ;compare to "9"
21AA:0113   7FF2       JG 0107       ;jump to 107 if greater
21AA:0115   CD21       INT 21        ;do exit
Memory dump
Location    HEX                                               Ascii
21AA:0100   B4 09 BA 17 01 CD 21 B4-01 CD 21 B4 4C 2C 30 7E   ......!...!.L,0~
21AA:0110   F6 3C 09 7F F2 CD 21 45-6E 74 65 72 20 6F 70 74   .<....!Enter opt
21AA:0120   69 6F 6E 3A 20 24 C1 C4-B4 00 B3 20 20 20 20 20   ion: $.....
Above is an example of an assembly language program that waits
for a keypress from 1 to 9 on the keyboard. The programmer
writes the program as shown under the columns "assembly language"
and "comments", then has an assembler read and assemble the
program into machine language. An assembler is a computer
program which can interpret an assembly language file and
convert it to machine language complete with memory locations.
Usually the assembly language program would be named something
like OPT1TO9.ASM and the output from the assembler would be a
file like OPT1TO9.COM or OPT1TO9.EXE. When the program is
loaded into memory the data in memory would be as shown in the
table above labeled "Memory dump".
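As a rough illustration of how one row of the memory dump is laid
out, the short Python sketch below formats sixteen bytes the same
way: the segment:offset location, the bytes in hex with a dash
after the eighth, and an ASCII column in which unprintable bytes
show as periods. The function name dump_line is made up for this
example and is not part of any DOS tool.

```python
def dump_line(segment: int, offset: int, chunk: bytes) -> str:
    """Format up to 16 bytes as one row of a hex/ASCII memory dump."""
    hex_bytes = [f"{b:02X}" for b in chunk]
    # A dash separates the first eight bytes from the rest of the row.
    hex_part = " ".join(hex_bytes[:8])
    if len(hex_bytes) > 8:
        hex_part += "-" + " ".join(hex_bytes[8:])
    # Only printable ASCII (20H through 7EH) is shown as a character.
    ascii_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
    return f"{segment:04X}:{offset:04X} {hex_part} {ascii_part}"


# The first sixteen bytes of the program, copied from the dump.
first_row = bytes.fromhex("B409BA1701CD21B401CD21B44C2C307E")
print(dump_line(0x21AA, 0x0100, first_row))
# prints: 21AA:0100 B4 09 BA 17 01 CD 21 B4-01 CD 21 B4 4C 2C 30 7E ......!...!.L,0~
```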
The memory location 21AA:0100 is actually determined by
multiplying the segment number 21AA (HEX) by 16 (decimal) and
adding the offset 100 (HEX), which gives 21BA0 (HEX). That's
because the chip used in IBM-PC compatible computers addresses
memory in segments which begin on 16 byte paragraphs of memory.
Memory locations are referred to as the offset in bytes from the
beginning of the segment. Programs generally begin at offset
100 (HEX) because the first 100H bytes (256 in decimal) are
reserved for information needed by the computer and are called
the Program Segment Prefix (PSP). The assembler actually only
determines the offset memory locations; the segment (paragraph)
location is determined by the DOS (Disk Operating System) as the
program is loaded from disk.
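The arithmetic above can be sketched in a few lines of Python.
The function name physical_address is an assumption made for this
illustration; the 20-bit mask reflects the one megabyte address
range of the original 8086/8088 chips.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into one address."""
    # A segment number counts 16-byte paragraphs, so multiplying by 16
    # (shifting left four bits) gives the byte address where it starts.
    # The mask keeps the result within 20 bits, as on an 8086/8088.
    return ((segment << 4) + offset) & 0xFFFFF


address = physical_address(0x21AA, 0x0100)
print(f"{address:05X}")  # prints 21BA0
```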
To write an assembly language program the programmer must know
the mnemonics for the particular chip and assembler being used.
Mnemonics are the instruction names understood by the assembler.
For example, the instruction "MOV AH,09" means to move the
hexadecimal number 9 into the AH register of the chip. The
comments in a program are for the programmer's own use and are
ignored by the assembler.
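To make the link between mnemonics and machine language concrete,
here is a toy assembler sketch in Python. It handles only the
two instruction forms used at the start of the program above
(MOV of a hex constant into a register, and INT). The function
name assemble and the two tables are invented for this example,
and a real assembler does a great deal more.

```python
# Register numbers as they appear in the low three bits of the opcode.
REG8 = {"AL": 0, "CL": 1, "DL": 2, "BL": 3, "AH": 4, "CH": 5, "DH": 6, "BH": 7}
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}


def assemble(line: str) -> bytes:
    """Encode 'MOV reg,imm' or 'INT n'; operands are hex, as in the listing."""
    mnemonic, operands = line.split(None, 1)
    mnemonic = mnemonic.upper()
    if mnemonic == "INT":
        return bytes([0xCD, int(operands, 16)])
    if mnemonic == "MOV":
        reg, imm = (part.strip().upper() for part in operands.split(","))
        value = int(imm, 16)
        if reg in REG8:
            # Opcodes B0 through B7: move a one-byte constant into a byte register.
            return bytes([0xB0 + REG8[reg], value & 0xFF])
        if reg in REG16:
            # Opcodes B8 through BF: the two-byte constant is stored low byte first.
            return bytes([0xB8 + REG16[reg]]) + (value & 0xFFFF).to_bytes(2, "little")
    raise ValueError(f"unsupported instruction: {line}")


for source in ("MOV AH,09", "MOV DX,0117", "INT 21"):
    print(f"{source:<12} {assemble(source).hex().upper()}")
# prints:
# MOV AH,09    B409
# MOV DX,0117  BA1701
# INT 21       CD21
```

The three results match the "machine language" column of the
first three lines of the program.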
BIOS, DOS, and Interrupts
Because computers manufactured by different companies and the
components they are made from may vary, each PC compatible has a
series of machine language mini programs built into its ROM
(Read Only Memory). These programs, actually subroutines, are
called the BIOS (Basic Input/Output System). They handle the
interaction between the CPU and the I/O devices built into the
system.
These routines are used by setting certain registers in the CPU
chip and calling an interrupt. The registers include AX, BX,
CX, and DX as well as others. Each of these four registers is
two bytes (16 bits) wide. The AX register, for example, is made
up of the AL (low order) byte and the AH (high order) byte.
There can be as many as 256 interrupts available to carry out
various functions. For instance, interrupt 10H (16 decimal)
provides functions to control the video display, interrupt 16H
provides keyboard access functions, and interrupt 21H provides a
plethora of MS-DOS functions. AH is set to the function number
before calling the interrupt. Function 9 of interrupt 21H, for
example, displays a string of characters on the video display.
When this function is called, it is assumed that the DX register
points to the memory location where the string of characters
begins and that the string is terminated by a 24H character,
which is a $. This is illustrated in the first three
instructions in the assembly language program above.
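A small Python sketch can show what function 9 finds when it
follows DX. The bytes are copied from the memory dump of the
program; the names IMAGE, LOAD_OFFSET and dollar_string are
assumptions made for this example, which only imitates how the
string is located and where it stops.

```python
# The 38 bytes of the program, from offset 0100 through the "$".
IMAGE = bytes.fromhex(
    "B409BA1701CD21B401CD21B44C2C307E"
    "F63C097FF2CD21456E746572206F7074"
    "696F6E3A2024"
)
LOAD_OFFSET = 0x0100  # the program starts right after the 100H-byte PSP


def dollar_string(image: bytes, dx: int) -> str:
    """Return the text starting at offset dx, up to (not including) the $."""
    start = dx - LOAD_OFFSET
    end = image.index(0x24, start)  # 24H is the "$" terminator
    return image[start:end].decode("ascii")


print(repr(dollar_string(IMAGE, 0x0117)))  # prints 'Enter option: '
print(len(IMAGE))                          # prints 38
```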
The disk operating system is the computer program which is
loaded into memory upon initial startup of the system and which
remains in memory, at least in part, during the entire time the
system is powered up. It provides access through the BIOS to a
number of commands to load programs from disk, execute programs,
and provide various disk management and other I/O functions.
The DOS internal commands include:
DIR CLS TYPE COPY COPY + DATE TIME DEL
ERASE REN VER VOL RMDIR CHDIR MKDIR CTTY
PATH PROMPT SET VERIFY EXIT CALL ECHO FOR
GOTO IF PAUSE REM SHIFT
Besides these the DOS can execute any other EXE, COM, or BAT
program if the program's name is entered at the DOS prompt.
Higher Level Languages
Although assembly language provides the most speed and the most
efficient use of memory of any programming language, it is also
the most difficult in which to program. This is because it is
the most primitive language next to the manual system of simply
calculating the bytes to put into memory for a program and
setting the memory locations. To make programming faster and
easier, a number of higher level languages have been developed
over the years. These include BASIC, C, and Pascal.
Although higher level languages are easier to understand and
use, they generally result in larger and slower programs than
does assembly language. The most efficient higher level
language in these terms is C, but it is also more difficult to
use than some other languages.
Let us discuss BASIC as a commonly available example of higher
level languages. Source code written in the BASIC language for
the above program might look like this:
10 PRINT "Enter option: ";
20 INPUT X$
30 IF X$<"1" OR X$>"9" THEN GOTO 20
40 END
This very small program takes up 74 bytes in interpreted BASIC,
but only 38 bytes in machine language, and will operate much
faster in machine language. BASIC source code can either be run
by an interpreter or be compiled into machine language, so this
program could be used in either of two ways. If it were
compiled by a BASIC compiler the output would be a machine
language EXE program which would take more memory space, 2400
bytes, but would run much faster than if interpreted. To use an
interpreter, the BASIC language interpreter would have to be
loaded into memory as a program to be run. It would then load,
interpret and execute the above program from the source code.
Another type of program which can be executed by the DOS is a
batch program. This is simply a text file listing DOS internal
commands and the names of EXE, COM, or BAT programs, which are
run in sequence.
An example batch program would be:
@ECHO OFF
SCREEN cyan black
ECHO ╒════════╡ CHOOSE A PROGRAM ╞══════════════╕
ECHO │                                          │
ECHO │   [1] - Interest calculations            │
ECHO │   [2] - Play Chess                       │
ECHO │   [3] - Return to DOS                    │
ECHO │                                          │
ECHO ╘══════════════════════════════════════════╛
OPT1TO9
ECHO Key Accepted - One moment please ...
IF ERRORLEVEL 3 GOTO END
IF ERRORLEVEL 2 GOTO B
IF ERRORLEVEL 1 GOTO A
:A
BASIC INTEREST
GOTO END
:B
CHESS
GOTO END
:END
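The batch file above depends on two facts that are easy to miss.
OPT1TO9 ends with DOS function 4C, which hands the value in AL
(the key pressed minus 30H) back to DOS as the ERRORLEVEL, and
IF ERRORLEVEL n is true when the exit code is n or higher, which
is why the tests run from 3 down to 1. The Python sketch below
mimics that logic; the function names are made up for the
illustration.

```python
def exit_code_for_key(key: str) -> int:
    """The value OPT1TO9 leaves in AL: the ASCII code of the key minus 30H."""
    return ord(key) - 0x30


def dispatch(exit_code: int) -> str:
    """Return the label the batch file reaches for a given exit code."""
    # Each test succeeds for its number or anything higher, so the
    # largest number has to be tested first.
    if exit_code >= 3:
        return "END"
    if exit_code >= 2:
        return "B"
    # An exit code of 1 jumps to :A, and anything lower would simply
    # fall through to the :A label on the next line.
    return "A"


for key in "1239":
    print(key, dispatch(exit_code_for_key(key)))
# prints:
# 1 A
# 2 B
# 3 END
# 9 END
```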
In the MS-DOS world some of the extensions at the end of file
names have standardized meanings as follows:
EXE - An executable program which can be executed from DOS.
COM - A command program which can be executed from DOS.
BAT - A batch file of DOS commands which can be executed in
      sequence from DOS.
BAS - A BASIC program which can be compiled or interpreted.
ASM - An assembly language program which can be assembled.
C   - Source code for a C program which can be compiled.
PAS - Source code for a Pascal program which can be compiled.
TXT - An ASCII text file.
DOC - An ASCII text file.
So all that is necessary to make a computer do whatever you want
is to know the syntax and vocabulary of one or several of the
above languages and to plan, write, and debug the program.