Chp02 Assembly Language Fundamentals
References:
• Chapter 3, "Assembly Language Fundamentals", by Kip R. Irvine, 3rd edition. Most
of the examples in this edition are for 16-bit processors.
• Chapter 3, "Assembly Language Fundamentals", by Kip R. Irvine, 4th edition. Most
of the examples in this edition are for 32-bit processors.
Directive   Description
end         End of program assembly
proc        Begin procedure
endp        End of procedure
title       Title of the listing file
.code       Mark the beginning of the code segment
.data       Mark the beginning of the data segment
.model      Specify the program's memory model
.stack      Set the size of the stack segment
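The directives in the table usually appear together in one program skeleton. The following is a minimal sketch for a 16-bit DOS program in MASM syntax. The small memory model, the 100h stack size, the procedure name main, the @data symbol and the INT 21h exit call (function 4Ch) are assumptions that are not covered in this section.

```asm
title Skeleton program          ; title of the listing file
.model small                    ; assumed memory model
.stack 100h                     ; assumed stack size (256 bytes)
.data
count   db 50                   ; a sample byte variable
.code
main    proc                    ; begin procedure
        mov ax, @data           ; get the address of the data segment
        mov ds, ax              ; and load it into DS (assumed start-up step)
        mov ax, 4C00h           ; DOS function 4Ch: terminate the program
        int 21h
main    endp                    ; end of procedure
        end main                ; end of assembly, main is the entry point
```

The shorter code fragments in these notes are meant to be placed inside a skeleton of this kind.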
• Statements can be written in any column with any number of spaces between
each operand. Blank lines are permitted between statements. Details for each
part of the statement syntax will be explained later in this chapter.
An integer constant ends with a radix symbol that identifies the numeric base
(H → Hex, B → Bin, D → Decimal). If no radix symbol is given, decimal is assumed.
For octal, end the constant with q or Q.
Note: The letters for radix (H, B, D) are NOT case sensitive.
When a hexadecimal constant begins with a letter, it must contain a leading zero.
For example: 0F6h
Examples of integer constants: 26, 1Ah, 1101b, 2BH, 0F6H, etc
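As a short illustration of the radix rules, the first five definitions below all store the same value (26 decimal), and the last one shows the required leading zero. The variable names are hypothetical, and the db directive is described later in this chapter.

```asm
.data
valDec  db 26           ; decimal (default radix)
valDec2 db 26d          ; decimal with an explicit radix symbol
valHex  db 1Ah          ; hexadecimal
valBin  db 00011010b    ; binary
valOct  db 32q          ; octal
valHex2 db 0F6h         ; leading zero because the constant begins with a letter
```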
The assembler detects most errors caused by the incorrect use of keywords.
Example:
add: mov ax,5
The statement above causes an assembly error, because add is a keyword (an
instruction mnemonic) and is used here as a label.
Names
• A name identifies a label or a variable. It may contain any of the following
characters: A…Z, a…z (letters), 0…9 (digits), ? (question mark), _ (underscore),
@ (at sign), $ (dollar sign), . (period)
Variables
A variable is a location in a program’s data area that has been assigned a name
Example: count db 50
Refer to data allocation directives later in this chapter on how to define variables.
Comments
Comments can be specified in two ways:
• Single-line comments, beginning with a semicolon character (;). All characters
following the semicolon on the same line are ignored by the assembler and may be
used to comment the program.
• Block comments, beginning with the COMMENT directive and a user-specified
symbol. All subsequent lines of text are ignored by the assembler until the same
user-specified symbol appears. For example:
COMMENT !
This line is a comment.
This line is also a comment.
!
COMMENT &
This line is a comment.
This line is also a comment.
&
Labels
Labels serve as place markers when a program needs to jump or loop from one
location to another. A label can be on a blank line by itself, or it can share a line with
an instruction. It must be followed by a colon (:)
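A minimal sketch of both label placements follows. It assumes the jnz (jump if not zero) instruction, which is not covered in this section, and the label names are hypothetical.

```asm
        mov cx, 3       ; loop counter
again:                  ; label on a line by itself
        dec cx          ; CX = CX - 1
        jnz again       ; jump back to 'again' while CX is not zero
done:   mov ax, cx      ; label sharing a line with an instruction
```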
Important: The code field should be separated from the operands by at least one
space.
Examples:
var1 db 'A'
var2 db -128
var3 db 255
A variable’s initial contents may be left undefined by using a question mark for
the initializer. Example:
number db ?
• Each initializer can use a different radix (base) when a list of items is defined.
Numeric, character or string constants can be freely mixed.
Examples:
list1 db 10, 32, 41h, 00100010b
list2 db 0Ah, 20h, 'A', 22h
Note: You can continue a line onto the next line if the last character of the first
line is \ (backslash). A long list can also be continued by starting the next line
with another db directive and no label.
Example:
longArrayDef db 10h, 23h, 46h, 15h
db 17h, 63h, 77h, 89h
Note:
capital_letters db 'ABC'
is equivalent to
capital_letters db 41h, 42h, 43h ; (ASCII codes for 'A', 'B' and 'C')
A string that ends with a byte containing 0 is called a null-terminated string.
Another common termination character is $.
The string can continue on multiple lines without the necessity of supplying a
label for each line. The following is a long null-terminated string:
LongString db "This is a long string, that "
db "clearly is going to take "
db "several bytes to store",0
Note: It is possible to combine characters and numbers in one definition as:
msg db 'Hello World! ', 0Ah, 0Dh, '$'
where 0Ah is linefeed and 0Dh is carriage return.
Examples:
var1 db 20 dup(0) ; 20 bytes, all equal to zero
var2 db 20 dup(?) ; 20 bytes, un-initialized
var3 db 4 dup("ABC") ; 12 bytes: "ABCABCABCABC"
Note: A space is required before the word dup
Note: dup can also be used with dw as in var6 and var7 above.
Demo: Draw figure for var4 above.
Note: Intel uses Little Endian format (and NOT Big Endian). "Little Endian"
means that the low-order byte of the number is stored in memory at the lowest
address, and the high-order byte at the highest address. Motorola processors are
an example of processors that use Big Endian.
For example, the value 1234h would be stored in memory as follows using Little
Endian format:
Offset: 00 01
Value: 34 12
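The same rule extends to larger values. As a sketch, assuming the dd (define doubleword) directive, the PTR operator described later in this chapter, and a hypothetical variable name, a 32-bit value is also stored with its low-order byte first:

```asm
.data
dval    dd 12345678h            ; stored in memory as 78h 56h 34h 12h
.code
        mov al, byte ptr [dval]     ; AL = 78h (lowest address, low-order byte)
        mov ah, byte ptr [dval+3]   ; AH = 12h (highest address, high-order byte)
```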
A symbol defined with EQU cannot be redefined later in the program (like a
constant identifier in C/C++).
No memory is allocated for EQU names
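A minimal sketch of an EQU symbol in use, with hypothetical names (BUFSIZE, buffer). The assembler replaces the symbol with its value wherever it appears:

```asm
BUFSIZE equ 10                  ; symbolic constant, no memory is allocated
.data
buffer  db BUFSIZE dup(0)       ; 10 bytes, all equal to zero
.code
        mov cx, BUFSIZE         ; assembled as: mov cx, 10
```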
reg can be any non-segment register (for segment registers see next
paragraphs)
Instruction Pointer (IP) register and immediate values CAN NOT be a destination
operand. The sizes of both operands must be the same!!
Where segment registers are involved, the following types of moves are
possible, with the exception that CS cannot be a destination operand and
immediate value can not be moved to segment register (segreg):
mov segreg , reg16
mov segreg , mem16
mov reg16 , segreg ; segreg are 16-bits regs
mov mem16 , segreg ; for 16-bit CPUs
MOV instruction lacks the ability to use two memory operands in one statement
(mov mem,mem not allowed). In such a case, a register must be used when
copying a byte or word from one memory location to another as in:
mov ax,var1 ; where var1 and var2 are words variables
mov var2,ax
This is equivalent to var2=var1 in High level programming languages.
Destination   Source
reg           reg, segreg, mem, imm
mem           reg, segreg, imm
segreg        reg, mem
Note: CS cannot be a destination!!
Examples of MOV instruction using above operands
Data segment (details on defining the data segment come later)
.data
count db 10
total dw 4126h
Question: What about mov ah, 'A' and mov ax, 'A'? (Note: 'A' ≡ 41h)
Memory values that do not have their own labels can be addressed by adding a
displacement to the name of a memory operand. Example:
arrayB db 10h,20h
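A minimal sketch of how the second byte of arrayB, which has no label of its own, could be reached by adding a displacement to the name:

```asm
.data
arrayB  db 10h, 20h
.code
        mov al, arrayB      ; AL = 10h (first byte)
        mov al, arrayB+1    ; AL = 20h (byte at offset of arrayB plus 1)
```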
XCHG (exchange data) Instruction (refer to instruction set, page 30)
The instruction exchanges the contents of two registers, or the contents of a
register and a variable. The syntax is:
xchg reg , reg
xchg reg , mem
xchg mem , reg
This is the most efficient way to exchange two operands because no storage of
temporary value is required
Very useful in sorting (remember in H/L: temp=x; x=y; y=temp)
Examples of XCHG
xchg ax,bx ; exchange two 16-bit registers
xchg ah,al ; exchange two 8-bit registers
xchg var1,bx ; exchange 16-bit memory operand (var1) with BX
H/W: Satisfy yourself that the above three statements will do the exchange.
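The syntax above has no memory-to-memory form, so exchanging two variables needs a register in between. A minimal sketch, assuming var1 and var2 are word variables defined with dw:

```asm
        mov  ax, var1       ; AX = var1
        xchg ax, var2       ; AX = old var2, var2 = old var1
        mov  var1, ax       ; var1 = old var2
```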
ARITHMETIC INSTRUCTIONS
Examples:
inc al ; increment 8-bit register
dec bx ; decrement 16-bit register
inc byte_var ; increment memory operand
dec word_var ; decrement memory operand
Destination   Source
Register      Register, Memory, Immediate
Memory        Register, Immediate
Examples:
add cl,al ; add 8-bit register to register
add var1,ax ; add 16-bit register to memory
add bx,1000h ; add immediate value to 16-bit register
add var2,10 ; add immediate value to memory
add dx,var3 ; add 16-bit memory to register
Note: In above examples, it is assumed that the variables var1 and var3 have been
defined using dw directive.
To add mem to mem use register as follows:
mov al, byte2
add al, byte1 ; al = byte1 + byte2
Examples of SUB
sub cl,al ; subtract 8-bit register from register
sub bx,1000h ; subtract immediate value from 16-bit reg.
sub var1,10 ; subtract immediate value from memory
sub var2,ax ; subtract 16-bit register from memory
sub dx,var3 ; subtract 16-bit memory from register
Note: If either ADD or SUB generates a result of zero, the Zero flag is set. If the
result is negative, the sign flag is set
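A short sketch of how the Zero and Sign flags follow the result of ADD and SUB, using 8-bit values chosen only for illustration:

```asm
        mov al, 5
        sub al, 5       ; AL = 00h:       Zero flag = 1, Sign flag = 0
        mov al, 3
        sub al, 5       ; AL = 0FEh (-2): Zero flag = 0, Sign flag = 1
        add al, 2       ; AL = 00h:       Zero flag = 1, Sign flag = 0
```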
H/W: Satisfy yourself about the status of the above flag bits (after learning CodeView)
Homework: Write assembly code that calculates the value of z, assuming x and y are
known 16-bit variables and z is a 16-bit variable, using only add and sub instructions:
z=3x-2y+10
The OFFSET operator returns the 16-bit offset of a variable. The assembler
automatically calculates every variable’s offset from DS when a program is being
assembled.
.data
wordnum dw 1234h
.code
……….. ; some other instructions
mov bx , offset wordnum
Also: lea bx, wordnum ; Load Effective Address can be used instead (page 15
Instruction set)
.data
arrayB db 10h, 20h
arrayW dw 100h, 200h, 300h
.code
mov bx , offset arrayB ; offset of a byte array
mov al, [bx] ; al=10h ; [] means contents of
;[bx]=>register-indirect addressing mode
mov cl, [bx+1] ; cl=20h
mov bx , offset arrayW ; offset of a word
mov ax, [bx] ; ax=100h
mov cx, [bx+2] ; cx=200h
PTR operator
Operands of an instruction must be of the same type (both bytes or words) for
example in MOV, ADD, SUB instructions
If one operand is a constant (immediate value), the assembler attempts to
guess the type from the other operand. For example:
mov ax,1 is treated as a word instruction because AX is a 16-bit register; while
mov bh, 1 is a byte instruction
Often the size of an operand is not clear from the context of an instruction.
Consider the following instructions, which cause the assembler to report an
"operand must have size" error message:
inc [bx]
mov [bx], 1
The assembler doesn’t know whether bx points to a byte or a word. The PTR
operator makes the operand size clear. PTR must be used in combination with the
standard assembler data types such as BYTE, WORD, etc
……………….
mov bx, offset var1
inc byte PTR [bx]
mov word PTR [bx], 1
Example:
.data
val1 dw 1234h
…….
.code
......
mov bx, offset val1
mov al, byte ptr [bx] ; al=34h
mov cl, byte ptr [bx+1] ; cl=12h
mov al, [bx] ; H/W: al=?
mov cl, [bx+1] ; H/W: cl=?
Simple examples
Example1: Add three byte numbers and store the sum in the variable sum (How
can you implement it in C++?)
.data
var1 db 10h
var2 db 20h
var3 db 30h
sum db 0 ; assume sum of var1, var2 and var3
; can fit into an 8-bit variable sum
.code
…….. ; to be added later
mov al,var1 ; get first number
add al,var2 ; sum of var1 and var2
add al, var3 ; sum of var1, var2 and var3
mov sum,al ; store the sum
.data
arrayB db 10h,20h,30h
sum db 0 ; or sum db ?
.code
………… ; to be added later
mov bx,offset arrayB
mov al,[bx] ; get first number
add al,[bx+1] ; add 1st and 2nd numbers
add al, [bx+2] ; sum of three numbers
mov sum,al ; store the sum
Homework: Rewrite the above example without using the OFFSET operator
== END ==