Artisan Shop
An active workshop for building future MenuetOS applications and spreading the assembly programming language.

FIRST STEPS:

Welcome to this first tutorial on assembly programming for Win32. Our aim is to build working applications in a simple way using elements of the Windows operating system, but we will concentrate mostly on general aspects so that they can be applied to other operating systems. Let us remember that our objective is wider: to develop applications for an operating system written entirely in assembly language.

Why did we choose Windows for this tutorial?
The main reason is that we have at hand a set of tools that make our work easier.
Moreover, it is an operating system known to many people and is therefore more familiar. Windows will let us understand the concepts that we will later apply in building a complete operating system.

Let us talk about the training package:

Training Package tut01.zip (50 kB)

Now is the time to download it. It contains only what is needed, in order to avoid distractions.
Remember that our commitment is to start from the ground up. There are no prerequisites for following this tutorial.

April 17, 2004



If you already have some knowledge of computer programming, which is very likely, set it aside for a while and follow the tutorial as a review. You will see that it does not resemble any other programming language tutorial. We will cover topics such as assembly, Windows API programming, data structures and the use of the necessary tools.
There are a great many concepts and this is a big challenge for us, so we ask for your patience and trust.

We make a great effort to introduce the topics in measured doses. Please avoid jumping ahead, so that you get the most out of each step.
All of the topics are large areas: the language instruction set, operating system libraries and functions, and data structures are each the subject of years of expertise and research. We will only scratch their surface, but by the end we hope to have given you information that is useful in many circumstances and the chance to go deeper into the area you find most interesting.

Main Concept
Your microprocessor is able to work only in terms of zeros and ones. The secret of computer programming is to model reality with numbers. Each one or zero is called a bit, the smallest unit of information.

There are many tricks and hacks to represent fractions, text, songs or photos as bit sequences. New ideas and formats are developed and improved daily. We will look at them as they are needed.


As promised, here is the full working example. This is the source code for a program, an executable file that does some work. An EXE in Windows is a formatted binary file containing instructions that tell the operating system exactly what to do when the user launches the program. The format in which the instructions are written into the file is known as PE (Portable Executable) and we are going to analyze it piece by piece during the tutorials.

It is time to start assembling! Launch fasm for Windows (fasmw.exe), which you will find in the training package. Then copy and paste the source below into the fasm editor.
This is probably the simplest program that shows a window containing the traditional "Hello World".
We will briefly describe the source part by part and then see it in action.

tut01en1.asm Source Code
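format PE GUI                       ; produce a Windows GUI executable


; Style flags for MessageBox
MB_OK              = 00h            ; a single OK button
MB_ICONEXCLAMATION = 30h            ; the exclamation icon


; Arguments go on the stack before the call, last parameter first
push MB_OK + MB_ICONEXCLAMATION     ; buttons and icon
push _caption                       ; address of the title text
push _message                       ; address of the message text
push 0                              ; no owner window
call [MessageBox]


push 0                              ; exit code: 0 means success
call [ExitProcess]


; Text strings, each terminated by a zero byte
_caption db 'Win32 Assembly Programming',0

_message db 'Hello World!',0


; Import table: the libraries and functions we ask Windows for
data import

; One row of five values per library; a row of zeros ends the list
dd 0,0,0,RVA kernel_name,RVA kernel_table
dd 0,0,0,RVA user_name,RVA user_table
dd 0,0,0,0,0


; Per-library function tables, each ended by a zero
kernel_table:
ExitProcess dd RVA _ExitProcess
dd 0
user_table:
MessageBox dd RVA _MessageBoxA
dd 0


; Library names
kernel_name db 'KERNEL32.DLL',0
user_name db 'USER32.DLL',0


; Function names, each preceded by a two-byte value set to 0
_ExitProcess dw 0
db 'ExitProcess',0
_MessageBoxA dw 0
db 'MessageBoxA',0

end data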

[Screenshot: Result]
format

This word is a directive. It tells the assembler which format to give the final file; in other words, the kind of program we want to produce.
PE GUI

The values given to the directive tell fasm to produce a Portable Executable for the graphical user interface.
=

The equals sign is another directive. Fasm will substitute the given name with the associated numerical value.
In the example, each time "MB_OK" appears in the file, fasm replaces it with a zero (0).
Numeric values ending in "h" are interpreted as hexadecimal, that is to say, numbers whose digits count from 0 to 15 instead of 0 to 9, using the first 6 letters of the alphabet for the extra digits. They are used so often that we will give a detailed explanation a little later. The number 030h corresponds to 48 decimal.
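As a small illustration of the notation just described, the sketch below defines the same value several ways. Only MB_ICONEXCLAMATION comes from the tutorial; the other names are hypothetical and exist only for this example. The lines define assembly-time constants only, so assembling them alone produces no output bytes.

```asm
; Hypothetical names, except MB_ICONEXCLAMATION, which is from the tutorial.
MB_ICONEXCLAMATION = 30h        ; hexadecimal, "h" suffix: 3*16 + 0 = 48
SAME_IN_DECIMAL    = 48         ; plain digits are read as decimal
SAME_WITH_PREFIX   = 0x30       ; hexadecimal again, written with the 0x prefix
SAME_IN_BINARY     = 110000b    ; binary, "b" suffix: 32 + 16 = 48

; A hexadecimal number written with the "h" suffix must start with a digit,
; otherwise fasm would read it as a name. Hence the leading zero.
ALL_BITS_OF_A_BYTE = 0FFh       ; 15*16 + 15 = 255
```

The leading zero in 030h in the tutorial source is therefore optional, while the one in 0FFh is required.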
push

This is an assembly instruction, which fasm converts into a direct order to the processor. Push "pushes" the given value onto a stack of values in memory. The stack stores values in order of arrival, with the latest one on top.

The result of the sum 0+030h is pushed onto the stack.

In other words, the first value on (the top of) the stack is 030h.
The following instruction pushes the value of a label onto the stack. This value is an address, or position, in memory. This is a fundamental concept in learning assembly language.
Fasm keeps track of the position that corresponds to the point where the "_caption" label is defined. There are three equivalent ways to define a label:

1. Use a data directive; in the case of _caption it is db.
2. Follow the name with a colon, as in the case of kernel_table:
3. Use the label directive, which we will see in another tutorial.
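The three ways of defining a label listed above can be sketched side by side. All names here are hypothetical and chosen only for this illustration. The offsets in the comments assume the fragment is assembled on its own with fasm's default flat binary output starting at address 0.

```asm
; Hypothetical labels; offsets assume flat binary output starting at 0.
first  db 'A',0          ; 1. defined by a data directive      -> first  = 0
second:                  ; 2. defined by a trailing colon      -> second = 2
       db 'B',0
label third byte         ; 3. defined by the label directive   -> third  = 4
       db 'C',0
```

In each case the label simply records the position of whatever follows it; the definitions differ only in how they are written.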
db
It is a data directive meaning Data Byte. It tells the assembler to convert the information that follows into a sequence of bytes and write them into the produced file (our program).
One byte is eight bits and can represent a value between 0 and 255.
The information can be comma-separated numbers and/or text strings. A text string is converted character by character using a code in which each symbol is paired with a number (the ASCII code).
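To make the conversion concrete, here is a minimal sketch in which text and numbers produce identical bytes. The label names and the string 'Hi' are hypothetical; the ASCII codes used are 48h for 'H' and 69h for 'i'.

```asm
; Hypothetical labels. The first two lines emit the same three bytes: 48h, 69h, 00h.
as_text    db 'Hi',0            ; characters converted through the ASCII code
as_numbers db 48h,69h,0         ; the same bytes written as numbers

; Text and numbers can be mixed in a single directive.
mixed      db 'Hi',13,10,0      ; 'H', 'i', carriage return, line feed, zero
```

The final zero is not added by fasm; it is written explicitly, exactly as in the _caption and _message lines of the tutorial source.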
call

This is another assembly instruction. Its job is to "call" a function, external or internal. A function is a portion of code that does a particular task and is referred to by a meaningful label. Please remember that a label represents an address in memory.

Calling a function tells the processor to continue with the instructions located at the label we call.

MessageBox is an external function. It belongs to Windows and can be used by any program. Its job is to show a message in a window and wait for us to click a button.

In order to work, the function needs to know which message and title to show, and which buttons and style to use. The function takes this information directly from the stack, so we must provide it exactly and in the proper order before calling.
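The "proper order" can be made explicit by annotating each push. The sketch below is the tutorial's first program with different, hypothetical text strings. The parameter names in the comments (hWnd, lpText, lpCaption, uType, uExitCode) are taken from the Win32 documentation of MessageBoxA and ExitProcess; they are not fasm symbols.

```asm
format PE GUI

MB_OK              = 00h
MB_ICONEXCLAMATION = 30h

; Documented order:  MessageBoxA(hWnd, lpText, lpCaption, uType)
; Push order:        uType, lpCaption, lpText, hWnd   (last parameter first)
push MB_OK + MB_ICONEXCLAMATION   ; uType: buttons and icon
push _caption                     ; lpCaption: address of the title
push _message                     ; lpText: address of the message
push 0                            ; hWnd: 0 means no owner window
call [MessageBox]                 ; the function removes its four arguments

push 0                            ; uExitCode
call [ExitProcess]

_caption db 'Parameter order',0              ; hypothetical text
_message db 'Pushed last, read first.',0     ; hypothetical text

data import
dd 0,0,0,RVA kernel_name,RVA kernel_table
dd 0,0,0,RVA user_name,RVA user_table
dd 0,0,0,0,0
kernel_table:
ExitProcess dd RVA _ExitProcess
dd 0
user_table:
MessageBox dd RVA _MessageBoxA
dd 0
kernel_name db 'KERNEL32.DLL',0
user_name db 'USER32.DLL',0
_ExitProcess dw 0
db 'ExitProcess',0
_MessageBoxA dw 0
db 'MessageBoxA',0
end data
```

Because the stack hands back the latest value first, pushing the last parameter first means the function finds hWnd on top, followed by the others in their documented order.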
ExitProcess is another Windows function. Its job is to terminate our program properly; a well-written Windows program always ends by calling ExitProcess.

Have you noticed the zero we put on the stack before calling this function? It tells Windows how our program ended. By convention, zero means that our program did what it was intended to do.
Here the execution of our program ends.

Some lines have not been analyzed yet. These lines are not executed. They are needed for the program to work correctly, and each one accomplishes a certain task. They look cryptic, but only at first glance. To explain them we need additional concepts that we will see at the end of this tutorial.

OK, let's go. We are going to see our program in action!

First, we save the source code to disk, assemble it and, finally, launch it. From the "Run" menu in the editor we select "Run" and fasm does the rest. It asks for the name and location in which to store the source. Click the OK button and almost immediately a message window appears, similar to the one shown in the screenshot. Our program is now working. It does almost nothing, but it already contains the essential elements common to all Windows programs. We will develop and extend the program's functionality as the tutorial advances and we acquire more concepts.

Note:
It is possible that your antivirus program will issue an alert.
Do not worry; the problem in this case is not your program but the antivirus.
- what?
It is simply confused by such a small program and guesses that it could be a virus. Viruses are normally small, so that they can reproduce while remaining hidden. What really happens is that your antivirus is trying to protect you from a possible threat.
A little later we will see a solution to this problem, and your antivirus will no longer be suspicious of us.

Improvements...



PE EXE files are organized into sections according to the type of information they hold. For example, the instructions that are executed go in one section, the data that are not executed go in another, the tables with the names of the Windows functions in yet another, and so on.

In our case everything is mixed in one section, and because we did not tell the assembler otherwise, it puts everything in a section called ".flat".
To play fair with our operating system and to avoid arousing the suspicion of antivirus software, we will keep things organized in proper sections.

tut01en2.asm Source Code
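format PE GUI
MB_OK              = 00h            ; a single OK button (not used below)
MB_YESNO           = 04h            ; Yes and No buttons
MB_ICONEXCLAMATION = 30h            ; the exclamation icon


; Instructions: readable and executable, not writeable
section '.code' code readable executable

push MB_YESNO + MB_ICONEXCLAMATION  ; buttons and icon
push _caption                       ; address of the title text
push _message                       ; address of the message text
push 0                              ; no owner window
call [MessageBox]

push 0                              ; exit code: 0 means success
call [ExitProcess]


; Data: readable and writeable, not executable
section '.data' data readable writeable

_caption db 'Win32 Assembly Programming',0

_message db 'Are you enjoying Fasm?',0


; Import table: the libraries and functions we ask Windows for
section '.idata' import data readable

dd 0,0,0,RVA kernel_name,RVA kernel_table
dd 0,0,0,RVA user_name,RVA user_table
dd 0,0,0,0,0

kernel_table:
ExitProcess dd RVA _ExitProcess
dd 0
user_table:
MessageBox dd RVA _MessageBoxA
dd 0

kernel_name db 'KERNEL32.DLL',0
user_name db 'USER32.DLL',0

_ExitProcess dw 0
db 'ExitProcess',0
_MessageBoxA dw 0
db 'MessageBoxA',0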

[Screenshot: Result]
Our program remains practically untouched; the changes are hidden in the produced result, the executable file.
The first lines of the source remain the same, apart from the new MB_YESNO constant. The equalities are outside any section. In fact, they do not affect the final result. They do not go into the file, because they serve only to make the source more readable by using meaningful names instead of numbers. The first thing the assembler does is replace each occurrence in the source with the corresponding number.
section

This directive makes fasm reserve space in the PE for a section. The order in the executable file reflects the order given in the source code.
It is followed by a name, which can be any combination of up to 8 characters; by convention, each section is named after the type of information it contains.
The first section is called ".code" and contains the instructions that the processor is going to execute. Windows needs to know this, which is why we specify that it is executable.
For security, Windows must know whether the section is read-only or read/write. In this case we specified readable because we do not intend to write to it.
One of the reasons antivirus programs are upset by our first program is that a single section marked readable/writeable/executable smells like mutant code, that is, code that can modify itself, as many viruses do.
The second version presents the solution to this problem.
The second section is called ".data" and is marked as data. The assembler knows that the following lines are not to be executed; in other words, they do not produce instructions for the processor. We mark it writeable even though we are not going to write anything. Our program is so simple that it does not even need to write. A typical program needs to write some values, so marking the section writeable is the standard.
Finally, the table with the library names and Windows functions goes in the section called ".idata". We mark it as import to indicate the type of data it contains.
Each section extends to the beginning of the next one or to the end of the source code.
It is a good idea to save this second version with a different filename in order to compare it.
After assembling and running it, we notice that the file is larger than the previous one.
The growth is caused by the way sections are stored in the PE format, and we will have the opportunity to look at it in the next tutorial.

Now let us talk a little about the last lines of our program. We specified import as the data type of the section. The purpose is to inform Windows that we want to call functions external to our program. We want to import them, in this case from the operating system.
How can it be accomplished?
Through the PE format, with a precise table in which Windows can find the labels of the functions required by our program.

In Windows, the functions of the operating system are collected in files with the extension DLL, called dynamic link libraries. Each of these libraries contains one or more functions with their respective labels, ready to be used by running programs. All we need to do is know their names and build the table strictly according to the PE format.

The table must contain the following information:

Library name: KERNEL32.DLL
Names of required functions: ExitProcess

Library name: USER32.DLL
Names of required functions: MessageBoxA (called in our source through the label MessageBox)

The key to build the table is here:
dd 0,0,0,RVA kernel_name,RVA kernel_table
dd 0,0,0,RVA user_name,RVA user_table
dd 0,0,0,0,0
The data directive dd tells fasm to turn the sequence of comma-separated values into DWORD (double word) data, numbers of 4 bytes each: the famous 32 bits.
RVA is a fasm operator that calculates the number of bytes from the base address at which the PE image is loaded to the point where the corresponding label is defined, in this case "kernel_name".
RVA means Relative Virtual Address, and it is used because Windows decides where to put the program in memory when it loads it. RVA is an advanced concept that we will treat in detail in another tutorial.

Windows finds the table and reads the relative address at which we stored the name of the library. It searches for the library in certain locations; in this case it finds it in the system directory, usually C:\Windows\System32.
Once Windows has loaded the library, it locates the functions with the given names, makes the connections, and everything is ready. Our program works.
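To see what each value in the table stands for, here is a minimal sketch of a program that imports a single function, with one value per line. The field descriptions in the comments follow Microsoft's documentation of the PE import directory table; the program itself is an illustration under those assumptions, not part of the training package.

```asm
format PE GUI

section '.code' code readable executable

push 0                          ; exit code 0
call [ExitProcess]              ; the only thing this program does

section '.idata' import data readable

; One group of five DWORDs per library
dd 0                            ; RVA of the import lookup table (left at 0 here)
dd 0                            ; time/date stamp
dd 0                            ; forwarder chain
dd RVA kernel_name              ; RVA of the library name
dd RVA kernel_table             ; RVA of the table of function addresses
dd 0,0,0,0,0                    ; a group of zeros ends the list of libraries

kernel_table:
ExitProcess dd RVA _ExitProcess ; Windows overwrites this with the function address
dd 0                            ; a zero ends the list of functions

kernel_name db 'KERNEL32.DLL',0

_ExitProcess dw 0               ; hint, a two-byte value left at 0
db 'ExitProcess',0              ; function name exactly as the library exports it
```

Once loading is complete, the DWORD at ExitProcess holds the real address of the function, which is why the source calls [ExitProcess], with brackets, rather than ExitProcess.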

A little theory...


Why were those libraries created, and what does dynamic mean?

Those libraries were created so that common tasks do not have to be implemented with repeated code. Otherwise, for example, every program that needs to show a message would do so in a different way, duplicating existing code.
Dynamic means that the libraries come into play once the program is running, without being part of our program. This is what allows our program to be so small.
When our program is launched, Windows loads it into memory, then looks at the import table to learn which libraries to load, and then locates all the needed functions using the given function names.
If a single library is not found, or a function name is missing or misspelled, the program is not allowed to run.


It is important at this point to describe what loading into memory means.
The processor in our computer can neither execute instructions nor work with data directly from the disks where they are stored. The processor needs all information to be read and placed in Random Access Memory (RAM). Windows does this when we launch a program.
The term processor is short for microprocessor or CPU, the Central Processing Unit.

RELATED INFORMATION:
This is a good moment to read more about the following concepts:

CPU and RAM
Storage Disks

Conclusion



We hope that you have enjoyed this tutorial and are starting to feel enthusiastic about assembly language: the only one that offers you total control over the machine.
We have seen merely two language instructions, but we have made solid progress in understanding how the Windows operating system works.


Our program will grow over the following tutorials, hand in hand with the capabilities of assembly, new instructions, and more operating system functions.

To follow our next tutorial, visit GETTING PRACTICE.


Feedback:
It is very important for us to know your questions and comments about this page in order to improve the tutorials. They will be addressed and classified as soon as possible.

Thank You



The programs distributed or referred to on this website are free for any use.
All rights are reserved by their respective authors. Educational material and articles are protected by the
Artisanal Artistic License
® 2004 Artisan Shop