INTRODUCTION TO
EMBEDDED SYSTEMS
ARM PROCESSOR
INSTRUCTION SET
Created by Mr. THOMAS KWANTWI 07/06/2025
Instructions
Instructions process data held in registers
and access memory with load and store
instructions
Classes of instructions:
Data processing
Branch instructions
Load-store instructions
Software interrupt instruction
Program status register instructions
Features of ARM instruction set
3-address data processing instructions
Conditional execution of every instruction
Load and store multiple registers
Shift, ALU operation in a single instruction
Open instruction set extension through the
coprocessor instruction
ARM data types
A word is 32 bits long.
A word can be divided into four 8-bit bytes.
ARM addresses can be 32 bits long.
An address refers to a byte.
Address 4 starts at byte 4.
The processor can be configured at power-up for either
little-endian or big-endian mode
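A minimal Python sketch of the two byte orders, assuming an arbitrary example word 0x11223344 (the value is not from the slides). It only shows how the four bytes of one word would be laid out in memory from the lowest address upward.

```python
import struct

word = 0x11223344  # arbitrary 32-bit example value (assumption)

# "<I" packs an unsigned 32-bit integer little-endian:
# the least significant byte goes to the lowest address.
little = struct.pack("<I", word)

# ">I" packs it big-endian: the most significant byte goes first.
big = struct.pack(">I", word)

print(little.hex())  # 44332211
print(big.hex())     # 11223344
```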
Data Processing Instructions
Manipulate data within registers
MOVE instructions
Arithmetic instructions (Multiply instructions)
Logical instructions
Comparison instructions
Suffix S on data processing instructions updates
flags in CPSR
Operands are 32 bits wide; they come from registers or
are specified as a literal in the instruction itself
Second operand is sent to the ALU via the barrel shifter
32-bit result is placed in a register; long multiply
instructions produce a 64-bit result
Move instruction
MOV Rd, N
Rd : destination register
N : can be an immediate value or source
register
Example: mov r7, r5
MVN Rd, N
Moves the bitwise NOT of the 32-bit source
value into Rd
Using Barrel Shifter
Enables shifting 32-bit operand in one of the
source registers left or right by a specific
number of positions within the cycle time of
instruction
Basic Barrel shifter operations
Shift left, shift right, rotate right
Facilitates fast multiplication and division by
powers of two and increases code density
Example: mov r7, r5, LSL #2
Multiplies the content of r5 by 4 and puts the
result in r7
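The basic barrel shifter operations can be modelled in a few lines of Python. This is only a sketch of the arithmetic on 32-bit values; the function names lsl, lsr and ror are illustrative helpers, not ARM APIs, and carry-out behaviour is not modelled.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits


def lsl(value, n):
    """Logical shift left; bits shifted past bit 31 are discarded."""
    return (value << n) & MASK32


def lsr(value, n):
    """Logical shift right; zeros are shifted in from the top."""
    return (value & MASK32) >> n


def ror(value, n):
    """Rotate right; bits leaving bit 0 re-enter at bit 31."""
    n %= 32
    value &= MASK32
    return ((value >> n) | (value << (32 - n))) & MASK32


r5 = 5
r7 = lsl(r5, 2)  # models: MOV r7, r5, LSL #2
print(r7)        # 20
```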
Arithmetic Instructions
Implements 32 bit addition and subtraction
3-operand form
Examples:
SUB r0, r1, r2
Subtract value stored in r2 from that of r1
and store in r0
SUBS r1, r1, #1
Subtract 1 from r1, store result in r1
and update the N, Z, C and V flags
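As a rough illustration of what the S suffix does for a subtraction, the following Python sketch computes the 32-bit result together with the four condition flags. The function name subs and the dictionary layout are assumptions made for this example only.

```python
MASK32 = 0xFFFFFFFF


def subs(a, b):
    """Model of SUBS: returns (32-bit result, flags)."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    flags = {
        "N": result >> 31,                         # bit 31 of the result
        "Z": int(result == 0),                     # result is zero
        "C": int(a >= b),                          # ARM sets C when no borrow occurs
        "V": ((a ^ b) & (a ^ result)) >> 31,       # signed overflow
    }
    return result, flags


result, flags = subs(1, 1)  # models: SUBS r1, r1, #1 with r1 = 1
print(result, flags)
```

With r1 = 1 the result is zero, so Z is set and a following BNE would fall through, which is how SUBS is typically used as a loop counter.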
With Barrel Shifter
Use of barrel shifter with arithmetic and
logical instructions increases the set of
possible available operations
Example
ADD r0, r1, r1, LSL #1
Register r1 is shifted to the left by 1 (giving
2 times r1), then added to r1, and the result
(3 times r1) is stored in r0.
Multiply Instructions
Multiply contents of a pair of registers
Long multiply generates 64 bit result
Examples:
MUL r0, r1, r2
Contents of r1 and r2 multiplied and put
in r0
UMULL r0, r1, r2, r3
Unsigned multiply of r2 and r3 with the 64-bit
result stored in r0 (low word) and r1 (high word)
Number of cycles taken to execute a multiply
instruction depends on the processor implementation
Multiply and Accumulate
Result of multiplication can be accumulated
with content of another register
MLA Rd, Rm, Rs, Rn
Rd = (Rm*Rs) + Rn
UMLAL Rdlo, Rdhi, Rm, Rs
[Rdhi, Rdlo] = [Rdhi, Rdlo] +
(Rm*Rs)
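The multiply forms can be sketched in Python as below. This is an arithmetic model only, assuming unsigned 32-bit register values; the function names mirror the mnemonics for readability and are not real library calls.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mla(rm, rs, rn):
    """MLA: Rd = (Rm * Rs) + Rn, truncated to 32 bits."""
    return (rm * rs + rn) & MASK32


def umull(rm, rs):
    """UMULL: 64-bit unsigned product returned as (RdLo, RdHi)."""
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, (product >> 32) & MASK32


def umlal(rdlo, rdhi, rm, rs):
    """UMLAL: [RdHi, RdLo] = [RdHi, RdLo] + (Rm * Rs)."""
    acc = ((rdhi & MASK32) << 32) | (rdlo & MASK32)
    acc = (acc + (rm & MASK32) * (rs & MASK32)) & MASK64
    return acc & MASK32, acc >> 32


print(mla(3, 4, 5))           # 17
print(umull(0xFFFFFFFF, 2))   # (4294967294, 1)
```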
Logical Instructions
Bitwise logical operations on the two source
registers
AND, OR (ORR), exclusive OR (EOR), bit clear (BIC)
Example: BIC r0, r1, r2
o r2 contains a binary pattern where every
binary 1 clears the corresponding bit of the
value read from r1; the result is written to r0
(r0 = r1 AND NOT r2)
o Useful in manipulating status flags and
interrupt masks
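A short Python model of BIC, assuming arbitrary 4-bit example patterns chosen for readability. It shows that the source register is left unchanged and only the destination receives the masked value.

```python
MASK32 = 0xFFFFFFFF


def bic(rn, rm):
    """BIC: destination = Rn AND NOT Rm (32-bit)."""
    return rn & ~rm & MASK32


r1 = 0b1111          # source value (example)
r2 = 0b0101          # each 1 here clears the matching bit
r0 = bic(r1, r2)     # models: BIC r0, r1, r2

print(bin(r0))       # 0b1010
```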
Compare Instructions
Enable comparison of 32-bit values
Update CPSR flags but do not affect other
registers
Examples
o CMP r0, r9
Flags set as a result of r0 - r9
o TEQ r0, r9
Flags set as a result of r0 EOR r9
o TST r0, r9
Flags set as a result of r0 AND r9
Load-Store Instructions
Transfers data between memory and processor
registers
Single register transfer
o Data types supported are signed and unsigned
words (32 bits), half-words, bytes
Multiple-register transfer
o Transfer multiple registers between memory
and the processor in a single instruction
Swap
o Swaps content of a memory location with the
contents of a register
Single Transfer Instructions
Load and store data aligned on a boundary that matches the data size
LDR, LDRH, LDRB:
load (word, half-word, byte)
STR, STRH, STRB:
store (word, half-word, byte)
Supports different addressing modes:
Register indirect: LDR r0, [r1]
Immediate: LDR r0, [r1, #4]
12-bit offset added to the base register
Register offset: LDR r0, [r1, -r2]
Address calculated using base register and
another register
More addressing modes
Scaled
Address is calculated using the base address
register and a barrel shift operation
Pre & Post Indexing
Pre-index with write back: LDR r0, [r1, #4]!
Updates the address base register with new
address
Post index: LDR r0, [r1], #4
Updates the address register after address
is used
Example
Pre-indexing with write back
LDR r0, [r1, #4]!
Before instruction execution
r0 = 0x00000000 r1 = 0x00009000
Mem32[0x00009000] = 0x01010101
Mem32[0x00009004] = 0x02020202
After instruction execution
r0 = 0x02020202
r1 = 0x00009004
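The three indexing variants can be compared with a small Python model that uses the memory contents from the example above. The dictionary mem32 and the helper names are assumptions for illustration; each helper returns the loaded value and the base register value after the instruction.

```python
# Word-addressed memory taken from the example above
mem32 = {0x00009000: 0x01010101, 0x00009004: 0x02020202}


def ldr_pre(mem, base, offset):
    """LDR r0, [r1, #offset]: offset applied, base unchanged."""
    return mem[base + offset], base


def ldr_pre_writeback(mem, base, offset):
    """LDR r0, [r1, #offset]!: offset applied, base updated."""
    address = base + offset
    return mem[address], address


def ldr_post(mem, base, offset):
    """LDR r0, [r1], #offset: base used as is, then updated."""
    return mem[base], base + offset


r0, r1 = ldr_pre_writeback(mem32, 0x00009000, 4)
print(hex(r0), hex(r1))  # 0x2020202 0x9004
```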
Multiple Register Transfer
Load-store multiple instructions transfer
multiple register contents between memory
and the processor in a single instruction
More efficient – for moving blocks of memory
and saving and restoring context and stack
These instructions can increase interrupt
latency, because ARM does not usually
interrupt an instruction while it is executing:
a pending interrupt waits until the transfer completes
Multiple Register Load-Store
Any subset of current bank of registers can
be transferred to memory or fetched from
memory
LDM: load multiple registers
STM: store multiple registers
The base register Rn determines source or
destination address
Addressing Modes
LDMIA|IB|DA|DB    ex: LDMIA Rn!, {r1-r3}
STMIA|IB|DA|DB
N is the number of registers transferred

Mode  Description        Start address    End address      Rn!
IA    Increment after    Rn               Rn + 4*N - 4     Rn + 4*N
IB    Increment before   Rn + 4           Rn + 4*N         Rn + 4*N
DA    Decrement after    Rn - 4*N + 4     Rn               Rn - 4*N
DB    Decrement before   Rn - 4*N         Rn - 4           Rn - 4*N
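The address arithmetic for the four modes can be checked with a small Python helper. The function name ldm_stm_addresses is hypothetical; it returns the lowest address accessed, the highest address accessed, and the value written back to Rn when the ! suffix is used.

```python
def ldm_stm_addresses(mode, rn, n):
    """Return (start, end, writeback) for a transfer of n registers."""
    size = 4 * n  # each register occupies one 4-byte word
    if mode == "IA":    # increment after
        return rn, rn + size - 4, rn + size
    if mode == "IB":    # increment before
        return rn + 4, rn + size, rn + size
    if mode == "DA":    # decrement after
        return rn - size + 4, rn, rn - size
    if mode == "DB":    # decrement before
        return rn - size, rn - 4, rn - size
    raise ValueError("unknown mode: " + mode)


# Three registers, as in LDMIA Rn!, {r1-r3}, with Rn = 0x1000 (example value)
for mode in ("IA", "IB", "DA", "DB"):
    start, end, wb = ldm_stm_addresses(mode, 0x1000, 3)
    print(mode, hex(start), hex(end), hex(wb))
```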
Stack Processing
A stack is implemented as a linear data
structure which grows up (ascending) or
down (descending)
Stack pointer holds the address of the current
top of the stack
Modes of Stack Operation
ARM multiple register instructions support four
stack modes:
Full ascending: grows up, SP points to the
highest address containing a valid item
Empty ascending: grows up, SP points to the
first empty location above the stack
Full descending: grows down, SP points to
the lowest address containing a valid item
Empty descending: grows down, SP points to
the first empty location below the stack.
Some Stack Instructions
Full Ascending
LDMFA: translates to LDMDA (POP)
STMFA: translates to STMIB (PUSH)
SP points to last item in stack
Empty Descending
LDMED: translates to LDMIB (POP)
STMED: translates to STMDA (PUSH)
SP points to first unused location
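A sketch of the two stack disciplines for a single word, written in Python. The memory dictionary and helper names are assumptions for illustration; each helper returns the updated stack pointer, and the comments describe when SP moves relative to the memory access.

```python
def push_fa(mem, sp, value):
    """Full ascending push: increment SP first, then store."""
    sp += 4
    mem[sp] = value
    return sp


def pop_fa(mem, sp):
    """Full ascending pop: load from SP, then decrement SP."""
    return mem[sp], sp - 4


def push_ed(mem, sp, value):
    """Empty descending push: store at SP, then decrement SP."""
    mem[sp] = value
    return sp - 4


def pop_ed(mem, sp):
    """Empty descending pop: increment SP first, then load."""
    sp += 4
    return mem[sp], sp


mem = {}
sp = push_fa(mem, 0x1000, 0xAA)   # SP now points at the stored item
value, sp = pop_fa(mem, sp)
print(hex(value), hex(sp))        # 0xaa 0x1000
```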
SWAP Instruction
Special case of load store instruction
Swap instructions:
SWP: swap a word between memory and
register
SWPB: swap a byte between memory and
register
Useful for implementing synchronization
primitives like semaphore
Control Flow Instructions
Branch Instructions
Conditional Branches
Conditional Execution
Branch and Link instructions
Subroutine Return Instructions
Branch Instruction
Branch instruction: B label
Example: B forward
Address of the label is stored in the instruction
as a signed pc-relative offset
Conditional Branch: B<cond> label
The branch has a condition associated with it
and is executed only if the condition codes have
the required value
Example: Block memory copy
Loop    LDMIA   r9!, {r0-r7}
        STMIA   r10!, {r0-r7}
        CMP     r9, r11
        BNE     Loop
r9 points to the source data, r10 points to the start
of the destination data, r11 points to the end of the
source
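The copy loop can be traced with a Python model of word-addressed memory. The function block_copy and the dictionary-based memory are assumptions for this sketch. As in the assembly, the loop body runs at least once and terminates only if the source length is a multiple of 32 bytes (8 registers of 4 bytes).

```python
def block_copy(mem, src, dst, end):
    """Model of the LDMIA/STMIA loop; returns final (r9, r10)."""
    r9, r10, r11 = src, dst, end
    while True:
        # LDMIA r9!, {r0-r7}: load 8 words, then advance r9 by 32
        regs = [mem[r9 + 4 * i] for i in range(8)]
        r9 += 32
        # STMIA r10!, {r0-r7}: store 8 words, then advance r10 by 32
        for i, value in enumerate(regs):
            mem[r10 + 4 * i] = value
        r10 += 32
        # CMP r9, r11 / BNE Loop
        if r9 == r11:
            break
    return r9, r10


# 16 example words at 0x100..0x13C, copied to 0x200 (addresses are assumptions)
memory = {0x100 + 4 * i: i for i in range(16)}
final_r9, final_r10 = block_copy(memory, 0x100, 0x200, 0x140)
print(hex(final_r9), hex(final_r10))  # 0x140 0x240
```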
Conditional Execution
An unusual feature of ARM instruction set is
that conditional execution applies not only to
branches but to all ARM instructions
Example: ADDEQ r0, r1, r2
Instruction will only be executed when the
zero flag is set to 1
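A minimal Python sketch of how a condition suffix gates an instruction. Only a handful of condition codes are modelled, and the names condition_passed and add_cond are hypothetical helpers; when the condition fails the destination register keeps its old value.

```python
def condition_passed(cond, flags):
    """Evaluate a subset of ARM condition codes against N, Z, C, V."""
    n, z, v = flags["N"], flags["Z"], flags["V"]
    checks = {
        "EQ": z == 1,               # equal
        "NE": z == 0,               # not equal
        "GE": n == v,               # signed greater than or equal
        "LT": n != v,               # signed less than
        "GT": z == 0 and n == v,    # signed greater than
        "LE": z == 1 or n != v,     # signed less than or equal
        "AL": True,                 # always
    }
    return checks[cond]


def add_cond(cond, flags, rd, rn, rm):
    """Model of ADD<cond> Rd, Rn, Rm: Rd is unchanged if the condition fails."""
    if condition_passed(cond, flags):
        return (rn + rm) & 0xFFFFFFFF
    return rd


zero_set = {"N": 0, "Z": 1, "C": 1, "V": 0}
print(add_cond("EQ", zero_set, 0, 2, 3))  # 5, models ADDEQ r0, r1, r2
```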
Advantages
Reduces the number of branches
Reduces the number of pipeline flushes
Improves performance of the code
Increases code density
Rule of thumb: when the conditional sequence is 3
instructions or fewer, it is smaller and faster to
use conditional execution than to use a
branch
Branch & Link Instruction
Perform a branch, save the address following
the branch in the link register, r14
Example: BL subroutine
For nested subroutine, push r14 and some
work registers required to be saved onto a
stack in memory
Example
        BL      sub1
        ...
sub1    STMFD   r13!, {r0-r2, r14}
        BL      sub2
Subroutine return instructions
No dedicated return instruction; a subroutine returns by restoring the return address to the PC
Example (1):
sub ……
MOV PC, r14
Example (2): when return address has been
pushed to stack
sub2 …….
LDMFD r13!,{r0-r12,PC}
Software Interrupt Instruction (SWI)
A software interrupt instruction causes a software
interrupt exception, which provides a
mechanism for applications to call OS routines
Instruction: SWI{<COND>} SWI_number
When the processor executes an SWI
instruction, it sets the program counter PC to
the offset 0x8 in the vector table
Instruction also forces the processor mode to
SVC, which allows an operating system
routine to execute
SWI
SWI is typically executed in user mode
Instruction forces processor mode to
supervisor (SVC)- this allows an OS routine to
be executed in privileged mode
Each SWI has an associated SWI number
which is used to represent a particular
function call or feature
Parameter passing – through registers; return
value is also passed using registers
Example
PRE:  cpsr = nzcvqift_USER
      pc = 0x00008000
      lr = 0x003fffff (lr = r14)
      r0 = 0x12
0x00008000 SWI 0x123456
POST: cpsr = nzcvqIft_SVC
      spsr = nzcvqift_USER
      pc = 0x00000008
      lr = 0x00008004 (lr = r14_SVC)
      r0 = 0x12
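The state changes on SWI entry can be summarised in a Python sketch. The dictionary layout and the function swi_entry are assumptions for illustration, and the model covers only an SWI executed in ARM state: the old cpsr is saved in spsr_svc, the return address goes to lr_svc, IRQs are masked, and the PC is loaded with the vector address 0x8.

```python
SWI_VECTOR = 0x00000008  # offset of the SWI entry in the vector table


def swi_entry(state):
    """Return the processor state after an SWI taken in ARM state."""
    new = dict(state)
    new["spsr_svc"] = dict(state["cpsr"])   # save the interrupted cpsr
    new["lr_svc"] = state["pc"] + 4         # address of the next instruction
    new["cpsr"] = dict(state["cpsr"], mode="SVC", I=1)  # SVC mode, IRQ masked
    new["pc"] = SWI_VECTOR
    return new


pre = {"cpsr": {"mode": "USER", "I": 0, "F": 0, "T": 0}, "pc": 0x00008000, "r0": 0x12}
post = swi_entry(pre)
print(hex(post["pc"]), hex(post["lr_svc"]), post["cpsr"]["mode"])
```

Registers such as r0 pass through unchanged, which is what allows parameters to be handed to the OS routine in registers.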
Program Status Register Instructions
Two instructions to control PSR directly
MRS: transfers the contents of either cpsr or
spsr into a register
MSR: transfers the contents of a register to cpsr or
spsr
Example
Enabling IRQ interrupts
PRE   cpsr = nzcvqIFt_SVC
      MRS r1, cpsr
      BIC r1, r1, #0x80
      MSR cpsr, r1
POST  cpsr = nzcvqiFt_SVC
The instructions execute in SVC mode
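The read-modify-write sequence amounts to clearing bit 7 (the I bit) of the cpsr. A Python sketch follows, assuming the example value 0xD3 (I and F set, SVC mode bits 0b10011), which is chosen here for illustration.

```python
I_BIT = 0x80  # bit 7 of the cpsr masks IRQ when set
F_BIT = 0x40  # bit 6 of the cpsr masks FIQ when set


def enable_irq(cpsr):
    """Clear the I bit, as BIC r1, r1, #0x80 does on the copied cpsr."""
    return cpsr & ~I_BIT & 0xFFFFFFFF


cpsr_before = 0xD3                  # I=1, F=1, T=0, mode=SVC (assumed example)
cpsr_after = enable_irq(cpsr_before)
print(hex(cpsr_after))              # 0x53
```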
Coprocessor Instructions
Used to extend the instruction set
Used by cores with a coprocessor
Coprocessor specific operations
Syntax: coprocessor data processing
CDP{<cond>} cp, opcode1, Cd, Cn, Cm {, opcode2}
o cp represents the coprocessor number, between p0 and
p15
o The opcode fields describe the coprocessor operation
o Cd, Cn, Cm are coprocessor registers
There are also coprocessor register transfer and memory
transfer instructions
Thumb
Thumb encodes a subset of the 32-bit
instruction set into a 16-bit subspace
Thumb has higher performance than ARM on
a processor with a 16-bit data bus
Thumb has higher code density
Suited to memory-constrained embedded systems
Code density
ARM divide                      Thumb divide
        MOV   r3, #0                    MOV   r3, #0
loop                            loop
        SUBS  r0, r0, r1                ADD   r3, #1
        ADDGE r3, r3, #1                SUB   r0, r1
        BGE   loop                      BGE   loop
        ADD   r2, r0, r1                SUB   r3, #1
                                        ADD   r2, r0, r1
5 * 4 = 20 bytes                6 * 2 = 12 bytes
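Both listings implement division by repeated subtraction. The Python sketch below follows each listing step by step, reading r0 as the dividend, r1 as the divisor, r3 as the quotient and r2 as the remainder (an interpretation of the code above). It assumes a non-negative dividend and a positive divisor, and models the GE condition as "result not negative".

```python
def arm_divide(r0, r1):
    """Follows the ARM listing; returns (quotient r3, remainder r2)."""
    r3 = 0                      # MOV r3, #0
    while True:
        r0 = r0 - r1            # SUBS r0, r0, r1
        ge = r0 >= 0
        if ge:
            r3 += 1             # ADDGE r3, r3, #1
        if not ge:              # BGE loop falls through
            break
    r2 = r0 + r1                # ADD r2, r0, r1
    return r3, r2


def thumb_divide(r0, r1):
    """Follows the Thumb listing; returns (quotient r3, remainder r2)."""
    r3 = 0                      # MOV r3, #0
    while True:
        r3 += 1                 # ADD r3, #1
        r0 = r0 - r1            # SUB r0, r1 (sets flags in Thumb)
        if not r0 >= 0:         # BGE loop falls through
            break
    r3 -= 1                     # SUB r3, #1 undoes the extra count
    r2 = r0 + r1                # ADD r2, r0, r1
    return r3, r2


print(arm_divide(17, 5), thumb_divide(17, 5))  # (3, 2) (3, 2)
```

The Thumb version needs one more instruction because it cannot execute ADD conditionally, yet it is still smaller because each instruction is 2 bytes instead of 4.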
Thumb instructions
Only low registers r0 to r7 fully accessible
Higher registers accessible with MOV, ADD,
CMP instructions
Only branch instructions can be conditionally
executed
Barrel shift operations are separate
instructions
ARM-Thumb Interworking
To call a Thumb routine from an ARM routine
the core has to change state
The state is held in the T bit of the CPSR
BX and BLX instructions can be used for the
switch
Example: BX r0; BLX r0
Enters Thumb state if bit 0 of the address in
the register is set to binary 1; otherwise it
enters ARM state
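The state selection performed by BX can be sketched in Python. The function bx is a hypothetical model that returns the new state and the branch target with bit 0 cleared; the addresses are example values only.

```python
def bx(address):
    """Model of BX Rm: bit 0 selects the state, the rest is the target."""
    state = "Thumb" if address & 1 else "ARM"
    target = address & 0xFFFFFFFE   # bit 0 is not part of the target address
    return state, target


print(bx(0x00008001))  # ('Thumb', 32768)
print(bx(0x00008000))  # ('ARM', 32768)
```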
Thumb (T) Architecture
The Thumb instruction decoder is placed in the
instruction pipeline, the same instruction data
path that feeds the decoder.
The change to Thumb mode happens by changing
the state of multiplexers A1.
ARMv5E Extensions
Extensions to facilitate signal processing
operations
Supports
Signed multiply accumulate instruction
Saturation Arithmetic
Greater flexibility and efficiency when
manipulating 16-bit values for applications
such as 16-bit digital audio processing
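Saturation arithmetic clamps a result to the representable range instead of letting it wrap around. A Python sketch of a 32-bit signed saturating add follows, in the spirit of the QADD instruction; the function name and the returned saturation indicator (standing in for the Q flag) are modelling choices made for this example.

```python
INT32_MAX = 0x7FFFFFFF
INT32_MIN = -0x80000000


def qadd(a, b):
    """Signed saturating add; returns (result, saturated)."""
    total = a + b
    if total > INT32_MAX:
        return INT32_MAX, True    # clamp at the largest positive value
    if total < INT32_MIN:
        return INT32_MIN, True    # clamp at the most negative value
    return total, False


print(qadd(INT32_MAX, 1))   # (2147483647, True)
print(qadd(1, 2))           # (3, False)
```

In audio processing this matters because a clipped sample sounds far less objectionable than one that wraps from a large positive value to a large negative one.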