String Instructions

MOVS, CMPS, LODS, SCAS, and STOS

Welcome

Hi! Welcome to the thirteenth chapter of this series. I hope you really understand the array concepts we discussed in the last chapter. Now I'm going to explain the basic string instructions. As you may have noticed, a string in assembly is basically an array, so the instructions that work on strings also apply to arrays. This turns out to be an invaluable concept.

There are five basic string instructions, which are also known as the "five brothers". Of course, these instructions can be "emulated" with mov, cmp, loop and jmp. However, the five brothers are more compact and usually faster, since each one is a single "built-in" instruction. OK, straight to the stuff.

LES DI and LDS SI

String instructions typically use the DS:SI pair to denote the source string and the ES:DI pair to denote the destination string. In the "tiny" memory model, everything lives in one segment, so we don't really care about setting DS and ES, right? The only thing we care about is setting SI and DI to the source and destination offsets respectively. However, you may find the instructions les di, [somestringvar] and lds si, [otherstringvar] in some programs. These instructions load a 4-byte far pointer (offset word first, then segment word) stored in the variable, and so set both ES and DI, or both DS and SI, in one go. Note that the variable must hold a pointer to the string, not the string itself. So, you may think of each as a "combo" instruction.
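To picture what lds and les actually read from memory, here is a small sketch in C (not assembly). It only models the layout of a far pointer: four bytes, little-endian, offset word first and segment word second. The names far_ptr and load_far_pointer are made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical model of a far pointer as lds/les see it. */
typedef struct {
    uint16_t offset;   /* goes into SI (lds) or DI (les) */
    uint16_t segment;  /* goes into DS (lds) or ES (les) */
} far_ptr;

/* Decode the 4 bytes stored in the pointer variable:
   bytes 0-1 hold the offset, bytes 2-3 hold the segment. */
far_ptr load_far_pointer(const uint8_t bytes[4])
{
    far_ptr p;
    p.offset  = (uint16_t)(bytes[0] | (bytes[1] << 8));
    p.segment = (uint16_t)(bytes[2] | (bytes[3] << 8));
    return p;
}
```

For example, the bytes 34h 12h 00h B8h decode to offset 1234h and segment B800h.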

Direction Flag

After setting the source and/or destination register pairs, you may want to specify how the string instruction is performed: should it go backward or forward? This may seem a bit strange to you, but the processor can carry out these instructions in both directions.

Determining which way to go involves setting the direction flag. Intel x86 assembly has two instructions for this: cld ("clear direction flag") and std ("set direction flag"). Clearing the direction flag makes the string instructions run forward: SI and DI are incremented after each step. Setting it makes them run backward: SI and DI are decremented instead. Since we typically want the string instructions to run forward, we almost always put a cld instruction after setting the register pairs.
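If the direction flag still feels abstract, the following C sketch (a model, not assembly) shows what happens to SI or DI after one string step. The function name next_index and its parameters are my own invention for this illustration.

```c
#include <stdint.h>

/* Hypothetical model: the new value of SI or DI after one string step.
   df = 0 models cld (forward), df = 1 models std (backward).
   size is 1 for the byte variants and 2 for the word variants.
   The result wraps around at 16 bits, just like the real registers. */
uint16_t next_index(uint16_t index, int df, uint16_t size)
{
    return df ? (uint16_t)(index - size) : (uint16_t)(index + size);
}
```

So with the flag clear a byte step takes an index of 100 to 101, and with the flag set a word step takes it to 98.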

MOVS

The instruction movs is used to copy the source string into the destination (yes, copy, not move). This instruction has two variants: movsb and movsw. movsb ("move string byte") copies one byte at a time, whereas movsw ("move string word") copies two bytes at a time.

Since we'd like to move several bytes at a time, these movs instructions are done in batches using the rep prefix. The number of repetitions is specified by the CX register. See the example below:

    :
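   les   di, [dest]      ; --> load ES:DI first, while DS still addresses our variables
   lds   si, [src]       ; --> lds changes DS, so it goes last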
   cld
   mov   cx, 100
   rep   movsb
    :

This example copies 100 bytes from src to dest. If you replace movsb with movsw, you copy 200 bytes instead, because CX counts elements, not bytes. If you remove the rep prefix, the CX register has no effect: you copy a single byte with movsb, or a single word (2 bytes) with movsw.
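Here is roughly what rep movsb and rep movsw do, written as a C model rather than assembly. It assumes the direction flag is clear, and the name rep_movs and its parameters are hypothetical, chosen only for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of "rep movsb" / "rep movsw" with the direction flag clear.
   elem_size is 1 for movsb and 2 for movsw; cx is the repeat count.
   Returns the number of bytes copied. */
size_t rep_movs(uint8_t *dst, const uint8_t *src, uint16_t cx, size_t elem_size)
{
    size_t copied = 0;
    while (cx != 0) {                        /* rep: repeat until CX reaches 0 */
        for (size_t i = 0; i < elem_size; i++)
            dst[copied + i] = src[copied + i];
        copied += elem_size;                 /* SI and DI both advance */
        cx--;
    }
    return copied;
}
```

With cx = 100, the model copies 100 bytes when elem_size is 1 and 200 bytes when elem_size is 2, which matches the example above.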

Assembly gurus use this instruction a lot, because arrays can be copied in the very same way. You can use this to emulate C/C++'s memcpy, or strcpy once you know the length of the string.

CMPS

The instruction cmps is used to compare two strings. It also has two variants: cmpsb and cmpsw. cmpsb compares one byte at a time and cmpsw compares two bytes at a time. Usually, cmpsb is the one we use more. Let's look at the example below:

    :
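   les   di, [dest]      ; --> load ES:DI first, while DS still addresses our variables
   lds   si, [src]       ; --> lds changes DS, so it goes last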
   cld
   mov   cx, 100
   repe  cmpsb           ; --> repeat while equal (same prefix byte as rep)
   jne   @@mismatch
@@match:
     :
     :

@@mismatch:
   dec   si
   dec   di
     :

After the repe cmpsb (repeat while equal; in front of cmps, a plain rep assembles to the same prefix), the zero flag is set if all the compared bytes are equal. If the strings are not equal, the repetition stops at the first mismatch and the zero flag is cleared. Thus, typically, after a repe cmpsb you do a jne @@somelabel to detect mismatches.

If there is a mismatch, SI and DI point one byte past the mismatch point. So, you need to decrement them by one as in the example above.

If you replace the prefix with repne, the comparison repeats while the elements differ and stops at the first pair that is equal. The repne prefix is seldom used in conjunction with cmpsb though.
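To see why SI and DI end up one byte past the mismatch, here is a C model (again, not assembly) of a repeat-while-equal cmpsb with the direction flag clear. The name repe_cmpsb and the steps parameter are hypothetical. One simplification to keep in mind: the model reports "equal" when cx is 0, whereas the real instruction simply leaves the flags as they were.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of "repe cmpsb" with the direction flag clear.
   Compares up to cx bytes. *steps receives how far SI and DI have advanced.
   Returns 1 if the zero flag would be set (last compared bytes equal), else 0. */
int repe_cmpsb(const uint8_t *src, const uint8_t *dst, uint16_t cx, size_t *steps)
{
    int zf = 1;            /* simplification for cx = 0, see the note above */
    *steps = 0;
    while (cx != 0) {
        zf = (src[*steps] == dst[*steps]);
        (*steps)++;        /* SI and DI advance even when the bytes differ */
        cx--;
        if (!zf)
            break;         /* repe stops at the first mismatch */
    }
    return zf;
}
```

Comparing "hello" with "help!" stops with steps = 4, although the mismatch sits at index 3. That is exactly the situation the dec si and dec di in the example take care of.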

C/C++ users: Why don't you use this for doing strcmp?

SCAS

The instruction scas is used to scan the string pointed to by ES:DI. So, this time DS:SI is not used. This instruction is typically used for searching for a particular character in a string. As with the other string instructions, scas has two variants: scasb and scasw. With scasb, the string at ES:DI is searched for the occurrence of the byte in register AL, whereas with scasw, the word to be searched for is in AX. Look at the following example:

   :
   les   di, [msg]
   mov   al, 65      ; --> 65 is the ASCII code for capital A.
   cld
   mov   cx, 1000    ; --> search within 1000 bytes
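   repne scasb       ; --> repeat while the byte at ES:DI is not equal to AL
   je    @@found

@@notfound:
   :
   :
@@found:
   dec   di          ; --> If we found it, DI always points 1 byte further, just like in cmps
   :

Note the repne ("repeat while not equal") prefix: the scan continues until a byte equal to AL turns up or CX runs out. A plain rep in front of scasb acts as repe and would stop at the first byte that differs from AL instead. As with the cmps instruction, we must check with either je or jne to know whether the character was really found.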
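The search loop can be modelled in C like this (a sketch, not assembly, with the direction flag assumed clear). The name repne_scasb and the steps parameter are made up for the illustration, and as a simplification the model reports "not found" when cx is 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of "repne scasb" with the direction flag clear.
   Scans up to cx bytes for the value al. *steps receives how far DI advanced.
   Returns 1 if the zero flag would be set (byte found), else 0. */
int repne_scasb(const uint8_t *str, uint8_t al, uint16_t cx, size_t *steps)
{
    int zf = 0;            /* simplification for cx = 0 */
    *steps = 0;
    while (cx != 0) {
        zf = (str[*steps] == al);
        (*steps)++;        /* DI advances even on the matching byte */
        cx--;
        if (zf)
            break;         /* repne stops at the first match */
    }
    return zf;
}
```

Searching "BANANA" for 'A' (65) ends with steps = 2, while the 'A' itself sits at index 1, hence the dec di after a successful search.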

Let's look at the following procedure. This procedure is used to calculate the string length (C/C++: strlen, Pascal: length).

; -- String length, result in AX
proc strlen strpointer: dword
   push  es
   push  di
   push  si
   push  cx

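   les   di, [strpointer]
   mov   si, di          ; --> Remember where the string starts
   sub   al, al          ; --> AL = 0, the terminator we are looking for

   cld
   mov   cx, 10000    ; --> Scanning within the first 10000 bytes
   repne scasb           ; --> Repeat while the byte at ES:DI is not 0
   je    @@found

   mov   ax, -1       ; --> When we can't find it, return -1
   jmp   @@quit

@@found:
   sub   di, si          ; --> DI is one past the terminator: DI - SI = length + 1
   mov   ax, di
   dec   ax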

@@quit:
   pop   cx
   pop   si
   pop   di
   pop   es
   ret
endp

Well, in building strcpy, you'll need this function. To invoke this function, do call strlen, @data, offset mystr (TASM) or invoke strlen, seg data, offset mystr (MASM).
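For C/C++ readers, the same idea can be written as a C sketch. It is only a mirror of the scan described above, not a replacement for the library strlen, and the name scan_strlen and the limit parameter are hypothetical.

```c
#include <stdint.h>

/* Hypothetical C mirror of the scan: look at no more than limit bytes for a 0 byte.
   Returns the string length, or -1 when no terminator is found within limit bytes. */
int scan_strlen(const char *str, uint16_t limit)
{
    uint16_t cx = limit;
    const char *di = str;
    while (cx != 0) {
        char c = *di++;                      /* like scasb: compare, then advance DI */
        cx--;
        if (c == 0)
            return (int)(di - str) - 1;      /* DI is one past the terminator */
    }
    return -1;
}
```

The "- 1" at the end plays the same role as the final adjustment of AX in the procedure: the pointer has already stepped past the terminator when the scan stops.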

STOS

The stos instruction fills the string pointed to by the ES:DI pair with the value in the accumulator. So, it is great when you'd like to initialize arrays (usually with zeroes). As with the other brothers, it has two variants: stosb and stosw. With stosb, each byte in the string at ES:DI is replaced with whatever AL contains. With stosw, the initializer is AX instead of AL.

Look at the following example:

   :
   les   di, [myarray]
   sub   ax, ax        ; --> AX = 0
   cld
   mov   cx, 100
   rep   stosw
   :

This excerpt will initialize 200 bytes (100 words) of myarray to 0.
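In C terms, the fill looks roughly like the sketch below (a model with the direction flag assumed clear, not assembly). The name rep_stosw is made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of "rep stosw" with the direction flag clear.
   Stores the word ax into cx consecutive words starting at dst.
   Returns the number of bytes written. */
size_t rep_stosw(uint16_t *dst, uint16_t ax, uint16_t cx)
{
    size_t words = 0;
    while (cx != 0) {          /* rep: repeat until CX reaches 0 */
        dst[words++] = ax;     /* store AX, then DI moves on by one word */
        cx--;
    }
    return words * sizeof(uint16_t);
}
```

Calling it with ax = 0 and cx = 100 writes 200 bytes and leaves everything after the 100th word untouched.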

LODS

The lods instruction loads a chunk (either a byte or a word) from the string pointed to by DS:SI into the accumulator. As always, it has two variants: lodsb and lodsw. Unlike the other brothers, lods is almost never combined with the rep prefix. Why? Because we are usually interested in fetching a byte (or a word) at a time and then examining it. If we used rep lodsb or rep lodsw, the value in the accumulator would be overwritten on every repetition, leaving only the last element. Thus, the rep prefix makes no sense here.

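The lods instruction is actually replaceable by a normal mov followed by an inc. However, lods is more compact (lodsb is a single-byte instruction), and on the original 8086 it is also faster; on later processors the difference may vanish or even reverse. Look at the following example: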

   :
   lds   si, [mystr]
   cld
   lodsb
   ;  now AL contains the byte DS:SI pointed to, and SI has moved on by one
   :

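With the direction flag clear, the excerpt above is equivalent to the following (except that inc also changes the flags, whereas lodsb leaves them alone):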

   :
   lds   si, [mystr]
   mov   al, [si]
   inc   si
   ;  now AL contains the byte DS:SI pointed to, and SI has moved on by one
   :

The other advantage of lods is that it can go backward or forward depending on the direction flag. The processor takes care of this automatically (which may be handy when you'd like to reverse a string, for example). In the manual way, you have to keep track of this yourself.
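As an illustration of the string-reversal idea, here is a C sketch (a model, not assembly). The read stands in for a lodsb with the direction flag set, starting at the last character; the write is an ordinary forward store, as you would do with a plain mov. The name reverse_with_lods is hypothetical.

```c
#include <stddef.h>

/* Hypothetical model: reverse a string by reading it backward and writing it forward.
   out must have room for len + 1 bytes. */
void reverse_with_lods(const char *src, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++) {
        char al = src[len - 1 - i];   /* SI starts at the last character and steps down */
        out[i] = al;                  /* plain forward store */
    }
    out[len] = '\0';
}
```

Reversing "abc" this way gives "cba". In real assembly the index bookkeeping on the reading side is what the processor would do for you after std.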

Quick Facts

After a blitz introduction to the five brothers, you probably feel a little overwhelmed. Let me summarize it for you:

String instructions usually use the DS:SI pair, the ES:DI pair, or both. These register pairs can be set using the lds si and les di instructions.
movs and cmps need both pairs. scas and stos need only the ES:DI pair. lods needs only the DS:SI pair.
String instructions use the direction flag to determine the direction of the operation. Clearing the flag with cld makes the operations run forward; setting it with std makes them run backward.
The string instructions always have two variants: one with the letter 'b' appended, the other with the letter 'w', which signify byte-sized and word-sized operation respectively.
The byte variants work one byte at a time, whereas the word variants work two bytes at a time.
The string instructions that need the accumulator (either AX or AL) are scas, stos, and lods.
In the byte variants, the accumulator means AL, whereas in the word variants, it means AX.
All of them are usually prefixed by rep or one of its variants (repe, repne), except lods.
Every instruction that uses rep or one of its variants needs CX set as the counter.
After the repetition, cmps and scas leave the zero flag set or cleared, which must be checked with je or jne to detect whether a match or mismatch was found. Use repe to stop at the first mismatch and repne to stop at the first match.

Closing

Whew! A pretty long chapter. OK, I think that's all for now. By the way, 80386 processors or better include one more variant, formed by appending the letter 'd'. So, we'll have movsd, and so on. The operation is done 4 bytes at a time, so the same block needs half as many repetitions as with the w-variant and a quarter as many as with the b-variant, which usually makes it faster. Wonderful, isn't it? If a d-variant instruction needs the accumulator, it means EAX, the extended version of AX. I'll explain more of this in the second lesson.
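The arithmetic behind that comparison is simple enough to write down as a C sketch. The helper name rep_count is made up, and it assumes the block size is a multiple of the element size.

```c
#include <stddef.h>

/* Hypothetical helper: the repeat count (the value for CX) needed to process
   total_bytes with elements of elem_size bytes: 1 for b, 2 for w, 4 for d.
   Assumes total_bytes is a multiple of elem_size. */
size_t rep_count(size_t total_bytes, size_t elem_size)
{
    return total_bytes / elem_size;
}
```

For a 200-byte block this gives 200 repetitions for the b-variant, 100 for the w-variant and 50 for the d-variant.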

See you next time.
