Assembly Language Succinctly®
by Christopher Rose

CHAPTER 4

Addressing Modes


The different types of parameters an instruction can take are called addressing modes. This term should not be confused with addresses in memory: the addressing modes cover the ways of addressing registers as well as memory. They are defined by both the CPU and the assembler, and they are the methods by which a programmer can address operands.

Register Addressing Mode

The register addressing mode is fairly self-explanatory: the operand is a register. Any of the x86 registers can be used.

mov eax, ebx      ; EAX and EBX are both registers

add rcx, rdx      ; RCX and RDX are 64-bit registers

sub al, bl        ; AL and BL are the low 8-bit registers of RAX and RBX

Immediate Addressing Mode

The immediate or literal addressing mode is where a literal number appears as a parameter to an instruction, such as mov eax, 128 where 128 would be the literal or immediate value. MASM understands literal numbers in several different bases.

Table 5: Common Bases

Base   Name          Suffix      Digits    Example
2      Binary        b           0 and 1   1001b
8      Octal         o           0 to 7    77723o
10     Decimal       d or none   0 to 9    1893 or 235d
16     Hexadecimal   h           0 to F    783ffh or 0fch

Note: When describing numbers in hexadecimal, if they begin with a letter digit (leftmost digit is A, B, C, D, E, or F), then an additional zero must be placed before it; “ffh” must be “0ffh”. This does not change the size of the operand.
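For readers more familiar with C++, the rows of the table can be cross-checked against the equivalent C++ literals. This is only a sketch for comparison: C++ marks the base with a prefix (0b, 0, 0x) where MASM uses a suffix, and the constant names below are hypothetical.

```cpp
#include <cstdint>

// The example literals from the table, written with C++ prefixes
// instead of MASM suffixes. Binary literals need C++14 or later.
constexpr std::uint32_t kBinary  = 0b1001;   // MASM: 1001b
constexpr std::uint32_t kOctal   = 077723;   // MASM: 77723o
constexpr std::uint32_t kDecimal = 1893;     // MASM: 1893 or 1893d
constexpr std::uint32_t kHex     = 0x783ff;  // MASM: 783ffh
constexpr std::uint32_t kHexLead = 0xfc;     // MASM: 0fch (leading zero required)

// Every base is just another spelling of the same number.
static_assert(kBinary == 9, "1001b is 9");
static_assert(kOctal == 32723, "77723o is 32723");
static_assert(kHex == 492543, "783ffh is 492543");

// The leading zero in 0fch adds no bits: the value still fits in one byte.
static_assert(kHexLead == 252 && kHexLead <= 0xFF, "0fch is 252");
```

The static_assert lines are checked by the compiler, so the file builds only if each conversion is right.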

In addition to using a number, you can also use mathematical expressions, so long as they evaluate to a constant. The mathematical expressions will not be evaluated by the CPU at run time, but MASM will translate them to their constant values prior to assembling.

mov rax, 29+23     ; This is fine, will become mov rax, 52

mov rcx, 32/(19-4) ; Integer division gives 2, so MASM will translate to mov rcx, 2

mov rdx, rbx*82    ; rbx*82 is not constant, this statement will not work
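C++ draws the same line between values known before the program runs and values known only while it runs, which may make the rule easier to remember. The sketch below is an analogy rather than a description of MASM itself; the names are hypothetical, and the rbx parameter stands in for the register.

```cpp
#include <cstdint>

// Like MASM, the C++ compiler folds constant expressions before any
// code runs. Integer division truncates: 32 / 15 is 2.
constexpr std::int64_t kSum      = 29 + 23;
constexpr std::int64_t kQuotient = 32 / (19 - 4);

static_assert(kSum == 52, "folded at compile time");
static_assert(kQuotient == 2, "integer division truncates");

// The value of a register is only known at run time, so rbx * 82 cannot
// be folded into an immediate. It needs real instructions, which is what
// this ordinary (non-constant) function models.
std::int64_t scale_at_run_time(std::int64_t rbx) {
    return rbx * 82;
}
```

In assembly, the run-time case would be handled by a multiply instruction such as IMUL rather than by an immediate operand.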

Implied Addressing Mode

Many instructions manipulate a register or some part of memory pointed to by a register, even though the register or memory address does not appear as a parameter. For instance, the string instructions (MOVSxx, SCASxx, LODSxx, etc.) reference memory, RAX, RCX, RSI, and RDI even though they take no parameters. This usage is called the implied addressing mode; parameters are implied by the instructions themselves and do not appear in the code.

REPNE SCASB  ; Scan up to RCX bytes of the string at [RDI] for the value in AL

CPUID       ; CPUID takes EAX as input and outputs to EAX, EBX, ECX, and EDX
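To make the implied operands visible, here is a rough C++ model of a byte scan of the kind REPNE SCASB performs. It is a simplified sketch under stated assumptions: the direction flag is clear (RDI counts upward), only the zero flag is modeled, and the ScanState struct and function name are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// The implied operands, made explicit as ordinary variables.
struct ScanState {
    const unsigned char* rdi;  // address just past the last byte examined
    std::size_t rcx;           // bytes left to scan
    bool zf;                   // true when the last comparison matched AL
};

// Compare AL with the byte at [RDI], advance RDI, decrement RCX, and stop
// when a match is found or RCX reaches zero.
ScanState repne_scasb(unsigned char al, const unsigned char* rdi, std::size_t rcx) {
    assert(rdi != nullptr || rcx == 0);
    bool zf = false;  // simplification: the real flag keeps its old value if RCX is 0
    while (rcx != 0) {
        zf = (*rdi == al);
        ++rdi;
        --rcx;
        if (zf) break;
    }
    return {rdi, rcx, zf};
}
```

Scanning the five bytes of "hello" for 'l' stops after the third byte, leaving RDI three bytes past the start and RCX at 2. None of these values appear as parameters in the assembly source, which is the point of the implied addressing mode.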

Memory Addressing Mode

There is a multitude of ways to reference memory in MASM. They all do essentially the same thing; they read or write data from some address in RAM. The most basic usage of the memory addressing mode is using a variable defined in the data segment by name.

.data
xyzVar db ?              ; Define a byte variable in the data segment

.code
SomeFunction proc
      mov al, xyzVar     ; Move the value stored at xyzVar into AL
      ; ... code continues ...
      ret
SomeFunction endp

Note: Because a label defined in the data segment is actually a pointer, some people tend not to call them variables but rather pointers or labels. The usage of “xyzVar” in the sample code is actually something like “mov al, byte ptr [xyzVar]” where xyzVar is a literal address.

It is often necessary to tell MASM what size the memory operand is, so that it knows what machine code to generate. For instance, there are many MOV instructions: there is one that moves bytes, one for words, and another for dwords. The same MOV mnemonic is used for all of them, but they generate completely different machine code and the CPU does different things for each of them.

To state the size, place a pointer size prefix to the left of the square brackets. The size prefixes are as follows.

Table 6: Pointer Size Prefixes

Size in Bytes   Prefix
1               byte ptr
2               word ptr
4               dword ptr
8               qword ptr
10              real10 ptr
16              xmmword ptr
32              ymmword ptr

Note: Signed, unsigned, or float versus integer is irrelevant here. A signed word is two bytes long, just as an unsigned word is two bytes long. These prefixes only tell MASM the amount of data in bytes; they do not need to be any more specific. For instance, to move a double (64-bit float) you can use qword ptr, since 8 bytes is a quad word and it does not matter that the data happens to be a real8. You can also use real8 ptr to move this amount of data.
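The same idea can be shown in C++: copying the 8 bytes of a double through a plain 64-bit integer loses nothing, because a move only cares how many bytes there are. This is a minimal sketch; the function names are hypothetical and it assumes an 8-byte double, which is what x64 compilers use.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Load 8 bytes from an address as a plain qword, ignoring their meaning.
std::uint64_t load_qword(const void* address) {
    assert(address != nullptr);
    std::uint64_t bits;
    std::memcpy(&bits, address, sizeof bits);
    return bits;
}

// Move a double through an integer-typed qword and back again.
double round_trip(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes an 8-byte double");
    std::uint64_t bits = load_qword(&value);  // the "qword ptr" view of the data
    double copy;
    std::memcpy(&copy, &bits, sizeof copy);   // the "real8 ptr" view of the same bytes
    return copy;
}
```

The value that comes back is bit-for-bit the one that went in, just as a MOV through qword ptr would leave a real8 intact.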

In addition to using simple variables defined in the data segment, you can use registers as pointers.

mov eax, dword ptr [rcx]  ; Move 4 bytes starting where RCX is pointing

mov bl, byte ptr [r8]     ; Move a byte from *R8 into BL

add dx, word ptr [rax]    ; Add the word at *RAX to the value in DX

You can also add two registers together inside the square brackets. This allows a base register to point to the first element of an array while a second register holds an offset that steps through the array. You can also add a literal value to a register or subtract one from it.

Note: Values being added or subtracted from a register can be complex expressions so long as they evaluate to a constant. MASM will calculate the expression prior to assembling the file.

sub rbx, qword ptr [rcx+rax]    ; Perhaps the base is RCX and RAX is an offset

add dword ptr [r8+68], r9d ; Here we have added a constant to r8

add dword ptr [r8-51], r9d ; Here we have subtracted a constant from r8

Note: Whenever values are added to or subtracted from addresses, either by using literal numbers or by using registers, the amount always represents a number of bytes. Assembly is not like C++ with its pointer arithmetic. All pointers in assembly increment and decrement a single byte at a time, whereas in C++ a pointer to a 4-byte integer will increment and decrement 4 bytes at a time automatically.
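The difference can be measured directly in C++. The following sketch, with hypothetical function names, compares how far "pointer + 1" moves for a 32-bit integer pointer and for a byte pointer; the byte pointer behaves the way every address in assembly does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Distance in bytes between two addresses, which is how assembly sees them.
std::ptrdiff_t byte_distance(const void* from, const void* to) {
    assert(from != nullptr && to != nullptr);
    return static_cast<const unsigned char*>(to) -
           static_cast<const unsigned char*>(from);
}

// C++ scales "p + 1" by the element size: a 32-bit integer pointer moves 4 bytes.
std::ptrdiff_t int_pointer_step() {
    std::int32_t values[2] = {0, 0};
    const std::int32_t* p = values;
    return byte_distance(p, p + 1);
}

// A byte pointer moves 1 byte, matching what adding 1 to a register does.
std::ptrdiff_t byte_pointer_step() {
    unsigned char bytes[2] = {0, 0};
    const unsigned char* p = bytes;
    return byte_distance(p, p + 1);
}
```

To reach the next dword in assembly you therefore add 4 to the address yourself, for example [rcx+4], where C++ would write p + 1.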

The most flexible of all memory addressing modes is perhaps the SIB (Scale, Index, Base) memory addressing mode. This involves a base register pointing to the start of some array, an index register that is being used as an offset into the array, and a scale multiplier that can be 1, 2, 4, or 8, and is used to multiply the value the index holds to properly access elements of different sizes.

mov bx, word ptr [rcx+rdx*2]    ; RCX is the base, RDX is the index, and we

                             ; are using words so the scale is 2

add qword ptr [rax+rcx*8], r12  ; RAX is the base, RCX is the index

                                   ; and we are referencing qwords so the

                                   ; scale is 8

This addressing mode is useful for stepping through arrays in a manner similar to C++. Set the base register to the first element of the array, and then increment the index register and set the scale to the data size of the elements of the array.

mov rax, qword ptr [rcx+rbx*8] ; Traverse a qword array at *RCX with RBX

mov cx, word ptr [rax+r8*2]   ; Traverse a word array at *RAX with R8
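The address arithmetic behind these examples is base + index * scale. The C++ sketch below spells that out; it is an illustration with hypothetical function names, not something MASM or the CPU exposes, and load_word mirrors the word array example by using a scale of 2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Effective address computed for [base + index*scale].
std::uintptr_t effective_address(std::uintptr_t base, std::uintptr_t index, unsigned scale) {
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);  // the only legal scales
    return base + index * scale;
}

// Read element `index` of a word array the way [base + index*2] would:
// step `index * 2` bytes from the base, then load 2 bytes.
std::uint16_t load_word(const void* base, std::size_t index) {
    const unsigned char* bytes = static_cast<const unsigned char*>(base);
    std::uint16_t value;
    std::memcpy(&value, bytes + index * 2, sizeof value);
    return value;
}
```

With a base of 1000h, an index of 3, and a scale of 8, the effective address is 1018h. The scale does the multiplication that C++ pointer arithmetic hides, so the index register can count elements instead of bytes.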
