AD Up Dig Design Be A
Course requirements
Pre-requisites to this course are
i) knowledge of
• Digital logic; principles and design.
• High level language programming.
ii) core concepts of
• Computer Organisation/Architecture
iii) and rudimentary idea of
• Operating System
What is a Microprocessor?
• It is usually a single IC device that contains the
CPU of a typical computer. So, a
microprocessor is a CPU in a single chip.
• Major functional units of a microprocessor are:
– Control Unit (CU)
– Arithmetic and logic unit (ALU)
– Instruction Decoder (ID)
– Small high speed memory (Registers)
– Bus interface units (BIU)
– Buffers, Cache and different pipelines
These functional units are connected through
internal buses and exclusive data paths.
Why Microprocessors?
Microprocessor-based design has practically displaced
discrete-IC, non-CPU-oriented design due to its
many-faceted advantages.
• Accumulator based
– Old architecture with limited hardware resources.
– One register (Accumulator) is used to hold one of
the operands and the result (1-address machine).
• General Register
– All registers are equally powerful.
– Most modern processors are of this type (2- or
3-address machines).
Processor: Resource and action
• Irrespective of the architecture, the main
resource of any processor is the memory in
which instructions and data are stored. A
processor is connected to this resource
through buses: the address and data buses.
• The processor performs very simple external
(i.e., bus) operations, namely bus read and
bus write, in order to fetch instructions and
data from the memory and to write the
computed results back into the memory.
Memory: The main resource
• Memory holds everything; data, instruction and
any other information.
• By the term memory we mean the external
semiconductor memory (RAM + ROM).
• The CPU usually has a small amount of very high
speed memory, known as registers.
• Registers hold data and addresses to reduce
CPU-memory (external) interactions for faster
processing.
• Modern CPUs also have internal cache memory.
The Memory Hierarchy
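[Figure: memory hierarchy. Levels shown: Registers, On-chip Cache, External Cache. Capacity ranges from a few bytes to mega- and gigabytes; cost/bit is high at the register end.]
[Figure: CPU and MEMORY connected by the data bus (n bits) and control lines.]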
Microprocessor: Word size
• Arithmetic and logic operations are carried out
in the Arithmetic and Logic Unit (ALU).
• The size of the operands in bits is known as the
word size.
• If a processor can process an N-bit operand at a
time (for the majority of its instructions), the word
size is N and consequently the CPU is known
as an N-bit processor.
• These days, N is an integral
multiple of 8. So, 8, 16, 32 or 64 bit CPUs are
commonly available.
Word size vs. Bus size
• Address bus size dictates the memory address
space (2^m locations for m bits; e.g., 64K for 16 bits).
• Data bus size determines the number of bits
that can be read/written in a single bus cycle. It
is usually equal to the word size of the
processor; i.e., N bits for an N-bit CPU. However,
there are exceptions where the data bus is ½ the
word size (or less) and the CPU reads/writes a
word over multiple bus cycles. The performance
penalty is accepted for ease of backward
integration with the available peripherals as well
as for cost benefits.
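The two relations above can be checked with a little arithmetic. The following is a minimal Python sketch; the function names address_space and bus_cycles_per_word are illustrative assumptions, not terms used in the slides.

```python
import math

def address_space(address_bits: int) -> int:
    """Number of distinct locations reachable with an m-bit address bus."""
    return 2 ** address_bits

def bus_cycles_per_word(word_bits: int, data_bus_bits: int) -> int:
    """Bus cycles needed to move one word over the data bus."""
    return math.ceil(word_bits / data_bus_bits)

print(address_space(16))           # 65536, i.e. 64K
print(bus_cycles_per_word(8, 8))   # 1: data bus as wide as the word
print(bus_cycles_per_word(16, 8))  # 2: data bus half the word size
```

A 16-bit CPU on an 8-bit data bus therefore pays two bus cycles for every word transfer, which is the performance penalty mentioned above.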
Should we start with 8-bit?
• 8-bit processors, though not very powerful, are
ideal for learning the basic principles and
mastering hardware design due to their
simplicity, low cost and the high availability of
support systems.
• We cannot say that a particular
processor is the best one; there are many,
with different attributes and varying computing
power.
• A particular processor may be suitable for a
particular application.
• However, the underlying principles are the same
for all processors.
Hypothetical or real life CPU
• We will use the INTEL 8085A 8-bit microprocessor
for learning the basic principles. It is a low cost,
simple CPU with high availability of all sorts of
hardware and software support for learning and
development systems.
• Sometimes a hypothetical, ideal, all-powerful
CPU is considered for training. However, this
approach suffers from the non-availability of
real life development and testing
facilities; only simulation is possible.
Hardware and Software
• For small system design (most uses fall in this
category) hardware development is easy, as
off-the-shelf, compatible components and
peripherals are available from the vendors and
the design is more of an assembly of functional
blocks.
• Software development calls for more effort
(70% or more) in the development cycle of a
product.
• S/W for a small system is (usually) developed in
assembly language; a pre-requisite to this is to
know the processor programming model.
Processor Programming Model
• The processor programming model is the graphical
representation of the different registers whose
contents can be manipulated through machine
instructions.
• For assembly language programming we also
need to know the instruction set and the
addressing modes available for a particular
CPU.
• The instruction set is the set of machine instructions
for computing and other operations, while the
addressing modes dictate the different ways of
getting the operands from memory to the CPU.
8085A: Programming Model
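[Figure: 8085A programming model. 8-bit registers (b7 to b0), including B, C, D and E, are drawn in pairs; 16-bit registers span b15 to b0.]
The 5 flags are Sign, Zero, Auxiliary carry, Parity and Carry.
8-bit register pairs (drawn side by side) may be used to hold 16-bit operands.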
Instruction Set
Instructions may be classified into 4 major groups
• Data Movement
– Any movement between registers, register to
memory or memory to register.
• Arithmetic and Logic
– Any processing involving the ALU (or Register).
• Branch
– All jumps, calls, returns & s/w interrupts; i.e.,
whenever we break the normal sequential flow of
execution.
• Special or Machine Control
– Any instruction other than the first three types.
Instruction Set: Examples
Here is a non-exhaustive list of instructions.
• Data Movement: MOV, MVI, LXI, PUSH, POP,
XCHG, XTHL, IN, OUT
• Arithmetic and Logic: ANA, ORA, XRA, CMP,
ADD, ADI, ADC, SUB, SUI, SBB, INR, DCR,
INX, DCX, DAD, DAA, CMC, STC
• Branch: JMP, JZ, JNZ, JC, JNC, JPO, JPE, JP,
JM, CALL, CZ, CNZ, CC, CNC, RET, RZ, RST
• Special or Machine control: HLT, NOP, EI, DI,
SIM, RIM
Data Movement
Some examples:
Arithmetic and Logic
Some examples:
ANA B ; A ← A AND B
XRA L ; A ← A XOR L
CMA ; complement accumulator
STC ; set carry
Branch Instructions
• All conditional and unconditional jumps,
subroutine calls and returns, as well as
software interrupts, deviate from the
sequential execution flow: the next
instruction in memory is not executed and
execution control is diverted to an instruction
stored elsewhere in the memory.
Direct Addressing Mode
Here the operand address is directly (fully) specified in
the instruction. For example:
STA 2050H ; encoding 32 50 20
Register Addressing Mode
Here the operands are held in registers. For example:
ADD B ; A ← A + B
INX H ; HL ← HL + 1
PUSH H ; (--SP) ← H, (--SP) ← L
DAD B ; HL ← HL + BC
Register Indirect Mode
Here the address of the operand is available in
the register(s). This is advantageous for address
manipulation. The number of bits required to
specify the register(s) is also much lower than a
full m-bit address (in the 8085A only 3 bits are
needed to specify any 8-bit register). e.g.,
MOV A, M ; Move the content of the memory location
; whose address is held in HL to A
PUSH H ; Push H & L onto the stack top, whose
; address comes from the SP register.
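The difference between direct and register indirect addressing can be sketched with a toy memory model. This is only an illustration in Python, not 8085 code; the dictionary regs, the helper names sta and mov_a_m, and the data value 3CH are assumptions made for the example.

```python
memory = bytearray(65536)                  # 64K address space (16-bit address)
regs = {"A": 0x3C, "H": 0x20, "L": 0x50}   # hypothetical register contents

def sta(address):
    """Direct mode: the full 16-bit address is part of the instruction."""
    memory[address] = regs["A"]

def mov_a_m():
    """Register indirect mode: the address is taken from the HL pair."""
    address = (regs["H"] << 8) | regs["L"]
    regs["A"] = memory[address]

sta(0x2050)            # STA 2050H
regs["A"] = 0          # clear A so the read-back is visible
mov_a_m()              # MOV A, M with HL = 2050H
print(hex(regs["A"]))  # 0x3c
```

Changing H and L changes the location accessed by the same MOV A, M instruction, which is why this mode suits address manipulation.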
Branch Instruction
Execution of instructions is usually sequential. However,
after executing the nth instruction we may not execute the
very next, i.e., the (n+1)th, instruction. This is done by a
branch instruction; a branch is either conditional or
unconditional. Example:
Unconditional branch
JMP 508AH ; go to location 508AH
PCHL ; Load PC with HL: basically jump to
; the location stored in HL unconditionally
Conditional branch
JZ to_main ; jump if Z flag = 1
Special or machine control
Some instructions do not really need any memory
or register operand and are used for special
purpose. Examples:
NOP ; No operation
HLT ; Halt
Addressing modes:
Register indirect mode is available only in its
crude form; offsets cannot be used. A negative
point is the absence of relative addressing mode.
Assembly Language Programming
• Assembly language is CPU specific
• It has a very simple form; a single line contains
a single instruction
• An instruction contains 3 parts
– label: a symbolic reference to a location
– op-code: part of the machine instruction
specifying the type of operation (movement,
branch etc.) to be performed.
– operand: one or more operands on which the
operation will be done.
• Comments (starting with a ; i.e., a semi-colon)
can also be added anywhere to improve
readability and are ignored by the assembler.
Instructions and addressing modes
• Note that an instruction class (say, data
movement) is available in many addressing
modes; e.g.,
MOV B, L ; register mode
MOV C, M ; register indirect mode
LDA 0B035H ; direct mode
LXI SP, 20FFH ; immediate mode
• Ideally each type of instruction should be
available for all addressing modes and all
registers should hold operands for all
instructions. However, this orthogonal property
may not be observed in all real life CPUs.
Assembly instructions: examples
LXI SP, 20FFH ; op-code and operand only
A Program
LXI SP, 2FFFH ; initialize SP
XRA A ; A ← 0 and carry ← 0
MVI C, 10 ; loop count
MVI B, 1 ; start value of the term
L1: ADC B ; add
INR B ; generate next term
DCR C ; decrease loop count
JNZ L1 ; if (count <> 0) redo
RST 1 ; return to monitor
END ; this program adds the integers 1 to 10
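To see what the loop leaves in the accumulator, the program can be traced in Python. This is a rough sketch that models only A, B, C and the carry flag; the function name sum_terms is an assumption, and the stack pointer setup and RST 1 are left out.

```python
def sum_terms(count=10, start=1):
    a, carry = 0, 0              # XRA A clears A and the carry flag
    c, b = count, start          # MVI C, 10 / MVI B, 1
    while True:
        total = a + b + carry    # L1: ADC B (add with carry)
        a, carry = total & 0xFF, total >> 8
        b = (b + 1) & 0xFF       # INR B (carry is not affected)
        c = (c - 1) & 0xFF       # DCR C (Z is set when C reaches 0)
        if c == 0:               # JNZ L1 falls through when Z = 1
            break
    return a

print(sum_terms())               # 55, i.e. 37H
```

The sum never exceeds 8 bits here, so the carry stays 0 and ADC behaves like ADD; XRA A at the start is what guarantees a clear carry for the first addition.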
Assembler directives
• PLC (program location counter) control
– ORG (origin) is used to initialize the PLC; e.g.,
ORG 2000H ; Next byte of code/data will be
; assembled from 2000H
– $ gives the current value of the PLC; e.g.,
ORG $+16 ; PLC ← PLC + 16
($ appears in the operand field here; a special use)
Constants
EQU (equate) is used to define a constant.
NANDB MACRO
ANA B
CMA
ENDM
• Name of the macro is NANDB.
• MACRO and ENDM are directives indicating the
beginning and the end of the Macro.
Calling a Macro
Calling a macro is done by referring to its name as a
pseudo op-code in an instruction. e.g.,
NANDB
NOP
NANDB
Would be expanded by the macro-processor as:
ANA B
CMA
NOP
ANA B
CMA
So each call is replaced by the macro body. Note that
call and return overheads of the subroutine are absent.
Macro with parameter
NAND MACRO &R
ANA &R
CMA
ENDM
• This is a more powerful macro and can perform
NAND operation with Accumulator and any
other 8 bit register. e.g.,
NAND L ; NAND of A and L
NAND B ; NAND of A and B
• Multiple parameters can also be used if
necessary.
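Macro expansion is plain text substitution done before assembly, which a few lines of Python can imitate. This is a minimal sketch and not how any particular macro assembler is implemented; the function expand and the table layout of macros are assumptions for illustration.

```python
def expand(source, macros):
    """Replace each macro call by its body, substituting the parameters."""
    out = []
    for line in source:
        name, *args = line.replace(",", " ").split()
        if name in macros:
            params, body = macros[name]
            for body_line in body:
                for param, arg in zip(params, args):
                    body_line = body_line.replace(param, arg)
                out.append(body_line)
        else:
            out.append(line)     # ordinary instruction: copy unchanged
    return out

# NAND MACRO &R / ANA &R / CMA / ENDM
macros = {"NAND": (["&R"], ["ANA &R", "CMA"])}
program = ["NAND L", "NOP", "NAND B"]
print(expand(program, macros))
# ['ANA L', 'CMA', 'NOP', 'ANA B', 'CMA']
```

Every call costs two instructions of code space, but no CALL/RET is executed at run time.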
Conditional assembly
Other than copy-code and parameter passing the
third facility offered by macro assembler is
conditional assembly.
• It is similar to an IF statement in high level
language to test a condition and either assemble
a block of instruction or bypass it.
• This is useful to configure part(s) of a system
software to accommodate different h/w facilities
for the same platform. So, conditional assembly
is important and useful for the so called system
generation.
8085A Hardware
• 40 pin DIP package
• 8-bit data and 16-bit address lines
• Separate address space for Memory and I/O
• 5 interrupting inputs
• Multiplexed data and address (lower order) bus
• Built-in serial I/O lines (SID and SOD)
• Single +5 V power supply
• On-chip clock generator circuit
• DMA facility
• Ready input for interfacing with slow devices
CPU: Pins (Function-wise)
• Address lines
• Data lines
• Control lines
• Status lines
• Interrupt input and acknowledge lines
• DMA/Bus arbitration lines
• Interfacing with slow peripherals line
• Master reset line
• Clock out, power supply and ground lines etc.
• Special purpose lines
8085A Functional pin diagram
Execution of Instructions
• Each instruction is fetched from the memory.
• Instruction is decoded in ID and decision is
taken about the next steps for execution; this
includes getting necessary operands from the
memory and carrying out the desired operation
(e.g., ADD, MOV etc.).
• An instruction contains all information like the
operation to be carried out (op-code) and the
details of the operands and addressing modes.
• The length of an instruction has no connection with
its complexity.
BUS operation
• All external activities of the CPU can be
explained in terms of Bus operations.
• If we disregard CPU internal operations,
execution of instruction too can be seen in
terms of BUS operations (BUS read/write etc.).
• BUS operations are represented primarily as
BUS READ or BUS WRITE (these are also
known as machine cycles)
• Through BUS operations the CPU interacts with
the memory; hence the basic machine cycles
are MEMORY READ and MEMORY WRITE.
Machine cycles
• Instruction cycle consists of a number of machine
cycles
• Basic machine cycles are read and write
• Intel 8085A defines 7 machine cycles
– Op-code Fetch (OF)
– Memory read (MR)
– Memory write (MW)
– I/O read (IOR)
– I/O write (IOW)
– Interrupt acknowledge (INA)
– Bus idle (BI)
• These 7 cycles may be mapped to 2 cycles only.
Machine and BUS cycles
Considering the bus activity the 7 machine cycles
can be mapped either as memory read or
memory write due to the following facts.
• I/O devices are logically equivalent to memory
device.
• Bus activity is stopped during BI; so it may be
ignored as an external operation.
• The OF machine cycle is the same as MR, with extra
time needed to decode the instruction, during
which bus activity is stopped.
• In the INA machine cycle a memory read operation is
done to get the device identification number.
Memory Read
• It takes 3 T states.
• The low order address is latched during T1.
• Memory needs time to drive the data bus with
the required data, during which the data bus is
in the High Z state.
• Throughout the cycle the Write and Ready lines
are high.
Memory Write
• It is similar to the MR cycle.
• Address and data are both supplied by the CPU.
• Data is available on the BUS right from the
beginning of T2, so there is no high Z state
between T1 and T2 (a notable difference from MR,
where the memory supplies the data and takes time
to drive the data bus, hence the high Z state
between T1 and T2).
Op-code Fetch machine cycle
• It takes 4T states.
• Operation in the first 3T states is similar to MR.
• One more T state is required to decode the
instruction.
• Many instructions do not take extra time, as they
call for an internal operation that can be carried
out just after the decoding. OF for a few
instructions takes 6T for complex internal
operations.
Machine cycles/Instruction
• Each instruction consists of a number of
machine cycles (min. 1 to max. 5).
• Each machine cycle consists of a number of
clock cycles (3, 4 or 6); e.g.,
STA 2050H ; encoding 32 50 20
• This instruction consists of 4 machine
cycles (OF, MR, MR and MW) and 13T states.
• OF fetches the op-code (32H); the CPU then knows that
it requires 3 more machine cycles: two MRs to
read the address (50H and then 20H) and
finally an MW to write the contents of the
A register to the direct address 2050H.
Machine cycles/Instruction
For instruction MOV A, B (code 78H):
• We need only the OF (4T) machine cycle.
• The op-code is fetched within the first 3T states.
• In the next T state the code is deciphered and
the contents of B are copied to A.
• As the registers are internal to the CPU it is
possible to decode and execute within 1T.
However, MOV A, M (code 7EH):
• Takes OF and MR (i.e., 4T + 3T or 7T states).
• OF fetches the instruction, which calls for a
memory read (external operation) to transfer
the contents from memory to A.
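The T-state totals quoted for these instructions follow directly from their machine-cycle sequences. A small Python sketch makes the bookkeeping explicit; the table names and the 3 MHz clock in the usage line are assumptions for illustration, and only the basic cycle lengths (OF = 4T, MR = MW = 3T) are modelled.

```python
T_STATES = {"OF": 4, "MR": 3, "MW": 3}    # basic machine cycle lengths

INSTRUCTION_CYCLES = {
    "STA 2050H": ["OF", "MR", "MR", "MW"],
    "MOV A, B":  ["OF"],
    "MOV A, M":  ["OF", "MR"],
}

def t_states(instruction):
    """Total T states = sum of the lengths of the machine cycles."""
    return sum(T_STATES[cycle] for cycle in INSTRUCTION_CYCLES[instruction])

def exec_time_us(instruction, clock_mhz):
    """Execution time in microseconds at the given clock frequency."""
    return t_states(instruction) / clock_mhz

for name in INSTRUCTION_CYCLES:
    print(name, t_states(name))           # 13, 4 and 7 T states
print(round(exec_time_us("STA 2050H", 3.0), 2))  # assumed 3 MHz clock
```

Instructions whose op-code fetch takes 6T, and any wait states inserted through the Ready line, would need extra entries in such a table.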
Instruction complexity
• Most OF cycles are 4T long; however, some
take 6T. To be precise, fetching the op-code is
similar to a memory read and takes 3T.
However, a complex instruction takes more time
to decode and also needs extra time to complete
its internal operations.
; Jump table: branch to routine RTNi, where i is in A
LXI H, JMPTAB
MVI B, 0
ADD A ; A ← A*2
MOV C, A
DAD B
MOV E, M
INX H
MOV D, M
XCHG
PCHL
:
RTN1: <instr>
:
RTNn: <instr>
:
JMPTAB: DW RTN0
DW RTN1
:
DW RTNn
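The jump-table listing above doubles the routine number in A, adds it to the table base, reads a 16-bit address from the table and loads it into the PC. The same steps can be followed in Python; this is a sketch only, and the table address 2100H and the routine addresses are hypothetical values chosen for the example.

```python
ROUTINES = {0x3000: "RTN0", 0x3040: "RTN1", 0x3080: "RTN2"}  # hypothetical
JMPTAB = 0x2100                                              # hypothetical

# Build the table as DW would: each entry is 2 bytes, low byte first.
memory = {}
for i, addr in enumerate(ROUTINES):
    memory[JMPTAB + 2 * i] = addr & 0xFF
    memory[JMPTAB + 2 * i + 1] = addr >> 8

def dispatch(a):
    hl = JMPTAB + 2 * a      # LXI H, JMPTAB / ADD A / MOV C, A / DAD B
    e = memory[hl]           # MOV E, M
    d = memory[hl + 1]       # INX H / MOV D, M
    pc = (d << 8) | e        # XCHG / PCHL: PC <- address from the table
    return ROUTINES[pc]

print(dispatch(1))           # RTN1
```

The doubling step is needed because every table entry occupies two bytes.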
Peripherals
• For any real life computing system we need to
connect peripheral devices to the basic CPU-
Memory computing backbone. These
peripherals are primarily I/O devices of different
types. To reduce the burden on the CPU,
off-the-shelf peripheral controllers are used to
interface them. Acting under instruction from the
CPU, these controllers increase the throughput of
the system by freeing the CPU from routine work:
they respond to the needs of the devices in a
structured and synchronised way and ensure fair
and maximum utilisation of them.
Peripheral controllers
• Practically for any I/O device we thus need
peripheral controllers.
• These controllers are programmable; i.e., the
CPU initializes them and sends basic operating
instructions.
• Peripheral controllers are logically the extended
arm of the CPU, relieving it from the daily
chores of managing the needs of the I/O devices
connected to the system.
• Off-the-shelf peripheral controllers are PIO,
PIC, DMA, SIO, CRT and Keyboard controllers.
Interrupts
• Interrupts are requests made by external
devices to get some service from the CPU.
• After getting an interrupt request the CPU suspends
the current task, identifies the interrupting
device and executes a selected Interrupt
Service Routine (ISR) to serve the device.
• The suspended task is resumed once the ISR
is done.
• Other than external h/w interrupts, s/w
interrupts and internal h/w interrupts are also
possible.
External H/W Interrupts
• By the term interrupt we normally mean external
hardware interrupts from the I/O devices.
• These are asynchronous external events and are
used to increase the I/O throughput of the
system.
• In case of multiple simultaneous interrupts the
CPU applies some priority logic to decide which
to serve first.
• The CPU may ignore (mask) interrupt
requests while it is busy doing something important.
• A non-maskable interrupt input is usually
available in the CPU.
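The priority and masking rules above can be sketched as a small selection function. The example uses the five 8085A interrupt inputs in their fixed priority order (TRAP highest, INTR lowest); the function name next_interrupt and its parameters are assumptions, and the model is simplified in that edge/level triggering and the exact SIM mask bits are not represented.

```python
PRIORITY = ["TRAP", "RST7.5", "RST6.5", "RST5.5", "INTR"]  # highest first
NON_MASKABLE = {"TRAP"}

def next_interrupt(pending, masked=(), enabled=True):
    """Return the request to serve first, or None if all are held off."""
    for line in PRIORITY:
        if line not in pending:
            continue
        if line in NON_MASKABLE:
            return line                    # cannot be disabled or masked
        if enabled and line not in masked:
            return line
    return None

print(next_interrupt({"INTR", "RST6.5"}))                # RST6.5
print(next_interrupt({"RST5.5", "TRAP"}, enabled=False)) # TRAP
print(next_interrupt({"RST7.5"}, masked={"RST7.5"}))     # None
```

A masked or disabled request simply stays pending in this model; it is served once the mask is removed and interrupts are enabled again.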
Interrupt lines 8085A
Programming Model
[Figure: MCS-48 memory map]
Program memory (up to 07FFH); loc. no. 0, 3 & 7 are special:
0: Reset
3: Interrupt
7: Timer interrupt
Data memory (locations 0 to 127):
0 – 7: Reg. Bank 0 (RB0), 8 registers R0 to R7
8 – 23: 16 byte Stack
24 – 31: Reg. Bank 1 (RB1)
R0 & R1 can act as address registers.
Stack
MCS-48 uCs have a limited data memory stack (16 bytes
only), allowing the user up to 8 levels of nesting. For all
practical purposes this is enough. However, implementing
a recursive routine is risky due to the small stack space.
The PC bits and some flag bits (PSW7-4) are stored on the
stack automatically.
PSW (bit 7 to bit 0): CY AC F0 BS 1 S2 S1 S0
Addressing Modes
MCS-48 is equipped with all the standard
addressing modes as well as special modes
like paged mode. Here are examples through
instructions.
• Direct ; JMP address (12 bits)
• Immediate ; MOV A, #data
• Register ; ADD A, Ri (i=0,1,…,7)
• Register Indirect ; ADD A, @Rx (x=0,1)
• Paged ; MOVP A, @A
• Relative ; J(cond) address (8 bits)
– several conditions are possible; e.g.,
JC/JNC/JZ/JNZ/JBb etc.
Instructions
• In comparison to the 8085, MCS-48 instructions are
short but powerful.
• Most of the instructions (over 90%) are 1 byte
and executed in a single cycle.
• A bit testing facility (e.g., JBb <addr>)
allows the user to test any bit (b = 0 to 7) of the
accumulator and branch to an address.
• Logic operations can be done directly at the I/O
ports, which facilitates control programs (e.g.,
the instruction ORL P0, #data does a logical OR
of the current data at port P0 with
the mask specified by #data).
Sample programs
pakdig: ; packs bits 0 – 3 of locations 50-51 into
; location 50.
MOV R0, #50
MOV R1, #51
XCHD A, @R0 ; exch. bits 0-3 of Acc
; with location 50
SWAP A ; exch. bits 0-3 & 4-7
; of Acc
XCHD A, @R1
MOV @R0, A
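The effect of the pakdig routine is easiest to see by tracing the nibbles. The Python sketch below imitates XCHD and SWAP on a small RAM array; the helper names xchd and swap, the 64-byte RAM size and the test values 07H and 09H are assumptions for illustration.

```python
def xchd(a, byte):
    """XCHD: exchange the low nibbles of A and a memory byte."""
    return (a & 0xF0) | (byte & 0x0F), (byte & 0xF0) | (a & 0x0F)

def swap(a):
    """SWAP A: exchange the two nibbles of A."""
    return ((a << 4) | (a >> 4)) & 0xFF

def pakdig(ram, a=0):
    r0, r1 = 50, 51                  # MOV R0, #50 / MOV R1, #51
    a, ram[r0] = xchd(a, ram[r0])    # XCHD A, @R0
    a = swap(a)                      # SWAP A
    a, ram[r1] = xchd(a, ram[r1])    # XCHD A, @R1
    ram[r0] = a                      # MOV @R0, A
    return ram[r0]

ram = bytearray(64)
ram[50], ram[51] = 0x07, 0x09
print(hex(pakdig(ram)))              # 0x79
```

The low nibble of location 50 ends up as the high nibble of the result and the low nibble of location 51 as its low nibble; the previous contents of the accumulator do not reach location 50.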
More examples
LOC3: JNI INIT ; Jump to routine INIT if
; interrupt input is 0
INIT: MOV R7, A ; save A
SEL RB1
MOV R7, #0FAH
:
SEL RB0
MOV A, R7 ; restore A
RETR ; return from interrupt,
; restoring PC and PSW
More examples
This routine disables the interrupt, but jumps to the interrupt
routine after 8 overflows and stops the timer.
JMP MAIN
COUNT: INC R7
MOV A, R7
JB3 INT
JMP MAIN
INT: STOP TCNT
JMP 7H
More examples
mov128: mov A, #128
movp A, @A
These two instructions move the contents of the 129th
location (offset 128) of the current page to the
accumulator.
page3: mov a, #0B8H
anl a, #7FH ; A ← 38H
movp3 a, @a
This transfers the contents of location 38H of
page 3 to the accumulator. The MOVP and MOVP3 instructions are useful
for accessing data stored permanently in program
memory.
Logic operations at I/O ports
• Logic operations can be done directly at the ports. Let us
assume that one bit of an 8-bit 8255 port, say bit 3, is controlling a process.
Now you would like to set bit 3 without disturbing the other bits.
; 8085 example
; MCS-48 example
CONWORD DS 1 ; allocate a byte
Powerful CPUs
• 8-bit processors are good for simple
applications. Natural extension to these 8-bit
CPUs are the 16-bit processors which may be
used to design general purpose low cost
computers.
• Now 32 or even 64 bit CPUs are available in
the market and are extensively used in
computing.
• It may be noted that the general principles are
the same for any processor; however, the higher
order processors are more powerful and have more
throughput due to various factors.
Powerful CPUs
• No practical limitation on address space and
data width.
• Superscalar performance exploiting pipeline and
other mechanism for parallel operation.
• Use of multiple level cache to reduce the CPU
memory speed gap.
• Special Hardware/Software features to
implement multiprocessing/multitasking OS.
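[Figure: 8086 pin diagram; surviving labels: AD2, AD3, AD4, M/IO, DT/R (S1, pin 27).]
[Figure: 8086 programming model, 16-bit registers (bit 15 to bit 0)]
General registers: AX (AH, AL), BX (BH, BL), CX (CH, CL), DX (DH, DL)
Index and pointer registers: SI, DI, BP, SP
Segment registers: CS, DS, SS, ES
Flag register (F) and IP
Flag register MS byte: OF DF IF TF
Flag register LS byte: same as 8085A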
8086 Operand addressing modes
Mode Forms and alternatives Examples
other_data segment
unpacked_number db 8, 7, 6, 5, 4, 3, 2, 1
other_data ends
prog_code segment
assume cs:prog_code, ds:main_data, es:other_data
prog_start: mov ax, main_data
mov ds, ax
mov ax, other_data
mov es, ax
mov bx, offset packed_number
mov si, 0
mov di, si
mov cx, 4 ; 4 packed bytes to produce
pack: mov ax, word ptr es:unpacked_number[si] ; al = digit si, ah = digit si+1
mov dx, cx ; save loop count (cl is needed for the shift)
mov cl, 4
shl ah, cl ; move the second digit to the high nibble
add al, ah ; combine the two digits
mov [bx][di], al ; store the packed byte
add si, 2
inc di
mov cx, dx ; restore loop count before LOOP
loop pack
hlt ; a better option is a syscall to OS
prog_code ends
end prog_start
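The data transformation done by the pack loop can be restated in a few lines of Python. This sketch mirrors only the arithmetic of the loop (two unpacked digits per word read, second digit shifted into the high nibble); the function name pack_bcd is an assumption, and segments and registers are not modelled.

```python
def pack_bcd(unpacked):
    """Pack pairs of unpacked digits into single bytes."""
    packed = bytearray(len(unpacked) // 2)
    for di in range(len(packed)):
        al = unpacked[2 * di]         # low byte of the word that is read
        ah = unpacked[2 * di + 1]     # high byte of the word that is read
        packed[di] = ((ah << 4) + al) & 0xFF   # shl ah, 4 / add al, ah
    return packed

print(pack_bcd(bytes([8, 7, 6, 5, 4, 3, 2, 1])).hex())  # 78563412
```

Because the digits are stored least significant first, the packed bytes 78H, 56H, 34H, 12H represent the number 12345678.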
More examples
; One of the common uses of the stack is
; parameter passing between functions. The
; following program fragments show an example.
; procedure prologue
push bp ; save bp
mov bp, sp ; establish base pointer
push bx
push cx ; save caller's
pushf ; reg & flags
sub sp, 6 ; allocate local storage
; end of prologue
RISC Processor
The processors we have discussed are known as
CISC (Complex Instruction Set Computing). In a
CISC processor there are many types of
instructions with many addressing modes.
The motive of the designers was to
reduce the semantic gap between higher level
languages and the machine level (or assembly
level) facilities. In the 1970s the community
raised the question of whether CISC processors
were worth their complexity.
RISC Processor contd.
It was found that complex instructions and
addressing modes are rarely used by
compilers. In real life, too, bricks are the building
blocks of walls and of subsequent bigger and more
complex structures; the reverse is not true. This
observation gave rise to the quest for a less
complex CPU with more computing throughput.
RISC Processor contd.
It may be noted that the distinction between CISC
and RISC is getting blurred; in modern
CPUs the philosophy is ‘utilise the best of both worlds’
and the long-standing debate over which one is
better is no longer pursued. Some of the RISC
features are given below:
• Reduced no. of instructions and addr. modes
• Fixed length instructions
• LOAD/STORE machine
• H/W control
• More on-chip registers (more L1 cache)
More on RISC