MCES Notes

MICROCONTROLLER AND EMBEDDED SYSTEMS
2023
Mr. Chetan R
Reference Textbooks
1. Andrew N. Sloss, Dominic Symes, and Chris Wright, ARM System Developer's Guide, Elsevier/Morgan Kaufmann Publishers, 2008.
2. Shibu K. V., Introduction to Embedded Systems, Tata McGraw Hill Education Private Limited, 2nd Edition.
03.09.2022
IV Semester
These are sample strategies that teachers can use to accelerate the attainment of the various course
outcomes.
1. The lecturer method (L) does not mean only the traditional lecture method, but different types of
teaching methods may be adopted to develop the outcomes.
2. Show video/animation films to explain the functioning of various concepts.
3. Encourage collaborative (group learning) learning in the class.
4. Ask at least three HOT (Higher Order Thinking) questions in the class, which promote critical
thinking.
5. Adopt Problem Based Learning (PBL), which fosters students' analytical skills and develops thinking
skills such as the ability to evaluate, generalize, and analyze information rather than simply recall
it.
6. Topics will be introduced in multiple representations.
7. Show the different ways to solve the same problem and encourage the students to come up with
their own creative ways to solve them.
8. Discuss how every concept can be applied to the real world; wherever possible, this helps
improve the students' understanding.
Module-1
Microprocessors versus Microcontrollers, ARM Embedded Systems: The RISC design philosophy, The
ARM Design Philosophy, Embedded System Hardware, Embedded System Software.
ARM Processor Fundamentals: Registers, Current Program Status Register, Pipeline, Exceptions,
Interrupts, and the Vector Table, Core Extensions
C Compilers and Optimization: Basic C Data Types, C Looping Structures, Register Allocation, Function Calls, Pointer Aliasing
ARM programming using Assembly language: Writing Assembly code, Profiling and cycle counting,
instruction scheduling, Register Allocation, Conditional Execution, Looping Constructs
Textbook 1: Chapter-5,6
Laboratory Component:
1. Write a program to arrange a series of 32-bit numbers in ascending/descending order.
2. Write a program to count the number of ones and zeros in two consecutive memory
locations.
3. Display “Hello World” message using Internal UART.
issues – Racing and Deadlock, Concept of Binary and counting semaphores (Mutex example without any
program), How to choose an RTOS, Integration and testing of Embedded hardware and firmware,
Embedded system Development Environment – Block diagram (excluding Keil),
Disassembler/decompiler, simulator, emulator and debugging techniques, target hardware debugging,
boundary scan.
Textbook 2: Chapter-10 (Sections 10.1, 10.2, 10.3, 10.4 , 10.7, 10.8.1.1, 10.8.1.2, 10.8.2.2, 10.10
only), Chapter 12, Chapter-13 ( block diagram before 13.1, 13.3, 13.4, 13.5, 13.6 only)
Laboratory Component:
1. Demonstration of IoT applications by using Arduino and Raspberry Pi
Teaching-Learning Process:
1. Chalk and board for numericals and discussion
2. Significance of real-time operating systems (RTOS) using Raspberry Pi
Course outcome (Course Skill Set)
At the end of the course, the student will be able to:
CO 1. Explain C compilers and optimization.
CO 2. Describe the ARM microcontroller's architectural features and program module.
CO 3. Apply the knowledge gained from programming on ARM to different applications.
CO 4. Program the basic hardware components and their application selection method.
CO 5. Demonstrate the need for a real-time operating system for embedded system applications.
Assessment Details (both CIE and SEE)
The weightage of Continuous Internal Evaluation (CIE) is 50% and for Semester End Exam (SEE) is 50%.
The minimum passing mark for the CIE is 40% of the maximum marks (20 marks). A student shall be
deemed to have satisfied the academic requirements and earned the credits allotted to each subject/
course if the student secures not less than 35% (18 Marks out of 50) in the semester-end examination
(SEE), and a minimum of 40% (40 marks out of 100) in the sum total of the CIE (Continuous Internal
Evaluation) and SEE (Semester End Examination) taken together.
Continuous Internal Evaluation:
Three Unit Tests each of 20 Marks (duration 01 hour)
1. First test at the end of 5th week of the semester
2. Second test at the end of the 10th week of the semester
3. Third test at the end of the 15th week of the semester
Two assignments each of 10 Marks
4. First assignment at the end of 4th week of the semester
5. Second assignment at the end of 9th week of the semester
Practical Sessions need to be assessed by appropriate rubrics and viva-voce method. This will contribute
to 20 marks.
Rubrics for each experiment, averaged over all lab components – 15 marks.
Viva-voce – 5 marks (with more emphasis on demonstration topics).
The sum of three tests, two assignments, and practical sessions will be out of 100 marks and will be
scaled down to 50 marks
(to have a less stressed CIE, the portion of the syllabus should not be common /repeated for any of the
methods of the CIE. Each method of CIE should have a different syllabus portion of the course).
CIE methods /question paper has to be designed to attain the different levels of Bloom’s taxonomy
as per the outcome defined for the course.
1. The question paper will have ten questions. Each question is set for 20 marks.
2. There will be 2 questions from each module. Each of the two questions under a module (with a
maximum of 3 sub-questions), should have a mix of topics under that module.
3. The students have to answer 5 full questions, selecting one full question from each module.
Marks scored shall be proportionally reduced to 50 marks
MODULE 3
1. Explain how the layout of a frequently used structure can have a significant impact on its performance and code
density.
2. Write a note on bit-fields.
3. What is the best way to deal with endian and alignment problems?
4. Explain how the division operation is done on the ARM.
5. Write a note on portability issues.
MODULE 1
Microprocessors versus Microcontrollers, ARM Embedded Systems: The RISC design philosophy, The ARM
Design Philosophy, Embedded System Hardware, Embedded System Software.
ARM Processor Fundamentals: Registers, Current Program Status Register, Pipeline, Exceptions, Interrupts,
and the Vector Table, Core Extensions
Textbook 1: Chapter 1 - 1.1 to 1.4, Chapter 2 - 2.1 to 2.5
Textbook: Andrew N. Sloss, Dominic Symes, and Chris Wright, ARM System Developer's Guide, Elsevier/Morgan Kaufmann Publishers, 2008.
SMVITM,UDUPI Page 1
RISC VS CISC
■ The ARM processor controls the embedded device. Different versions of the ARM processor are available to
suit the desired operating characteristics. An ARM processor comprises a core plus the surrounding
components that interface it with a bus. These components can include memory management and caches.
■ Controllers coordinate important functional blocks of the system. Two commonly found controllers are
interrupt and memory controllers.
• Memory controllers connect different types of memory to the processor bus. On power-up a memory
controller is configured in hardware to allow certain memory devices to be active. These memory
devices allow the initialization code to be executed.
• An interrupt controller provides a programmable governing policy that allows software to determine
which peripheral or device can interrupt the processor at any specific time by setting the appropriate
bits in the interrupt controller registers.
■ The peripherals provide all the input-output capability external to the chip and are responsible for the
uniqueness of the embedded device. ARM peripherals are memory mapped. Peripherals range from a simple
serial communication device to a more complex 802.11 wireless device.
■ A bus is used to communicate between different parts of the device.
Memory
An embedded system has to have some form of memory to store and execute code. You have to compare price,
performance, and power consumption when deciding upon specific memory characteristics, such as hierarchy,
width, and type.
Hierarchy
The fastest memory, the cache, is physically located nearest the ARM processor core, and the slowest, secondary
memory, is set farthest away. Generally, the closer memory is to the processor core, the more it costs and the
smaller its capacity. The cache is placed between main memory and the core. It is used to speed up data transfer
between the processor and main memory.
Width
The memory width is the number of bits the memory returns on each access, typically 8, 16, or 32 bits. The
memory width has a direct effect on the overall performance and cost ratio.
Types
Flash ROM can be written to as well as read. Its main use is for holding the device firmware or storing
long-term data that needs to be preserved after power is off.
SRAM cell                                       DRAM cell
Made up of 6 CMOS transistors (MOSFETs)         Made up of a MOSFET and a capacitor
Doesn't require refreshing                      Requires refreshing
Low capacity (less dense)                       High capacity (highly dense)
More expensive                                  Less expensive
Fast in operation; typical access time 10 ns    Slow in operation due to refresh requirements;
                                                typical access time 60 ns
                                                Write operation is faster than read operation
Finally, an application performs one of the tasks required for a device. The software components can run
from ROM or RAM. ROM code that is fixed on the device (for example, the initialization code) is called
firmware.
Initialization Code
It is common for ARM-based embedded systems to provide for memory remapping because it allows the system to start
the initialization code from ROM at power-up. The initialization code then redefines or remaps the memory map to
place RAM at address 0x00000000—an important step because then the exception vector table can be in RAM and
thus can be reprogrammed.
The initialization code handles a number of administrative tasks prior to handing control over to an operating
system image. We can group these different tasks into three phases: initial hardware configuration, diagnostics,
and booting.
Operation Modes
The ARM processor has two operation modes and two privilege levels. The operation modes (thread mode and handler mode)
determine whether the processor is running a normal program or running an exception handler like an interrupt
handler or system exception handler.
Software in the privileged access level can switch the program into the user access level using the control register
(CPSR). When an exception takes place, the processor will always switch back to the privileged state and return
to the previous state when exiting the exception handler. A user program cannot change back to the privileged
state by writing to the control register. It has to go through an exception handler that programs the control register
(CPSR) to switch the processor back into the privileged access level when returning to thread mode.
The processor mode determines which registers are active and the access rights to the cpsr register itself. Each
processor mode is either privileged or non-privileged: A privileged mode allows full read-write access to the cpsr.
Conversely, a non-privileged mode only allows read access to the control field in the cpsr but still allows
read-write access to the condition flags.
There are seven processor modes in total: six privileged modes (abort, fast interrupt request, interrupt request,
supervisor, system, and undefined) and one nonprivileged mode (user).
The processor enters abort mode when there is a failed attempt to access memory.
Fast interrupt request and interrupt request modes correspond to the two interrupt levels available on the
ARM processor.
Supervisor mode is the mode that the processor is in after reset and is generally the mode that an operating
system kernel operates in.
System mode is a special version of user mode that allows full read-write access to the cpsr.
Undefined mode is used when the processor encounters an instruction that is undefined or not supported
by the implementation.
User mode is used for programs and applications.
Interrupt Masks:
Interrupt masks are used to stop specific interrupt requests from interrupting the processor. There are two
interrupt request levels available on the ARM processor core—interrupt request (IRQ) and fast interrupt request
(FIQ).
The cpsr has two interrupt mask bits, 7 and 6 (or I and F ), which control the masking of IRQ and FIQ,
respectively. The I bit masks IRQ when set to binary 1, and similarly the F bit masks FIQ when set to binary 1.
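As a sketch of how these mask bits might be tested in C (the macro and function names here are assumptions, not ARM-defined symbols; on real hardware the cpsr is read with an MRS instruction, not passed around as a C value):

```c
#include <stdint.h>

/* Assumed mask names for cpsr bit 7 (I) and bit 6 (F). */
#define CPSR_I_BIT (1u << 7)   /* set to 1: IRQ masked */
#define CPSR_F_BIT (1u << 6)   /* set to 1: FIQ masked */

/* Return nonzero if the corresponding interrupt request is
   masked in the given cpsr value. */
int irq_masked(uint32_t cpsr) { return (cpsr & CPSR_I_BIT) != 0; }
int fiq_masked(uint32_t cpsr) { return (cpsr & CPSR_F_BIT) != 0; }
```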
‘T’ State:
The ARM instruction set is only active when the processor is in ARM state (T=0). Similarly the Thumb
instruction set is only active when the processor is in Thumb state (T=1).
Registers
General-purpose registers hold either data or an address. They are identified with the letter r prefixed to the
register number; there are up to 37 registers in total. The ARM processor has three registers assigned to a particular task
or special function: r13, r14, and r15. They are frequently given different labels to differentiate them from the
other registers.
Register r13 is used as the stack pointer (sp) and stores the head of the stack in the current processor
mode.
Register r14 is called the link register (lr) and is where the core puts the return address whenever it calls a
subroutine.
Register r15 is the program counter (pc) and contains the address of the next instruction to be fetched by
the processor.
Every processor mode except user mode can change mode by writing directly to the mode bits of the cpsr. All
processor modes except system mode have a set of associated banked registers that are a subset of the main 16
registers. A banked register maps one-to-one onto a user mode register. If you change processor mode, a banked
register from the new mode will replace an existing register.
For example, when the processor is in the interrupt request mode, the instructions you execute still access
registers named r13 and r14. However, these registers are the banked registers r13_irq and r14_irq. The user
mode registers r13_usr and r14_usr are not affected by the instruction referencing these registers. A program still
has normal access to the other registers r0 to r12.
Saved program status register (spsr) stores the previous mode’s cpsr. You can see in the diagram the cpsr being
copied into spsr_xxx. Note that the spsr can only be modified and read in a privileged mode. There is no spsr
available in user mode.
Condition flags
Condition flags are updated by comparisons and the result of ALU operations. Most ARM instructions can be
executed conditionally on the value of the condition flags. These flags are located in the most significant bits in
the cpsr. These bits are used for conditional execution.
Conditional execution controls whether or not the core will execute an instruction. Most instructions have a
condition attribute that determines if the core will execute it based on the setting of the condition flags. Prior to
execution, the processor compares the condition attribute with the condition flags in the cpsr. If they match, then
the instruction is executed; otherwise the instruction is ignored.
The condition attribute is postfixed to the instruction mnemonic and is encoded into the instruction. The table
below lists the conditional execution code mnemonics.
Pipeline
Using a pipeline speeds up execution by fetching the next instruction while other instructions are being decoded
and executed.
The figure shows a sequence of three instructions being fetched, decoded, and executed by the processor. Each instruction
takes a single cycle to complete after the pipeline is filled.
As the pipeline length increases, the amount of work done at each stage is reduced, which allows the processor to
attain a higher operating frequency. This in turn increases the performance.
The memory map address 0x00000000 is reserved for the vector table, a set of 32-bit words. On some processors
the vector table can be optionally located at a higher address in memory (starting at the offset 0xffff0000).
When an exception or interrupt occurs, the processor suspends normal execution and starts loading instructions
from the exception vector table (see Table). Each vector table entry contains a form of branch instruction pointing
to the start of a specific routine:
Reset vector is the location of the first instruction executed by the processor when power is applied. This
instruction branches to the initialization code.
Undefined instruction vector is used when the processor cannot decode an instruction.
Software interrupt vector is called when you execute a SWI instruction. The SWI instruction is frequently
used as the mechanism to invoke an operating system routine.
Interrupt request vector is used by external hardware to interrupt the normal execution flow of the
processor. It can only be raised if IRQs are not masked in the cpsr.
Core Extensions
The hardware extensions covered in this section are standard components placed next to the ARM core. They
improve performance, manage resources, and provide extra functionality and are designed to provide flexibility in
handling particular applications. Each ARM family has different extensions available.
Von Neumann–style cores combine both data and instructions into a single unified cache, as shown in the figure. For
simplicity, we have called the glue logic that connects the memory system to the AMBA bus logic and control.
Tightly coupled memory (TCM) is fast SRAM located close to the core and guarantees the clock cycles required
to fetch instructions or data—critical for real-time algorithms requiring deterministic behavior. TCMs appear as
memory in the address map and can be accessed as fast memory. An example of a processor with TCMs is shown
in Figure.
By combining both technologies, ARM processors can have both improved performance and predictable real-time
response. The figure below shows an example core with a combination of caches and TCMs.
MODULE 2
ARM INSTRUCTION SET
Prepared by: Mr. Chetan R, Sr. Asst. Professor, ECE Dept.
Move Instructions
Arithmetic Instructions
Logical Instructions
Comparison Instructions
Multiply Instructions
Branch Instructions
Load-Store Instructions
Single-Register Load-Store Addressing Modes
Loading Constants
Multiple-Register Transfer
Swap Instruction
Coprocessor Instructions
Shift and Rotate Instructions
ARM7 ASSEMBLY LEVEL PROGRAMS
PREPARED BY: MR. CHETAN R, SR. ASST. PROFESSOR
; TO READ FROM MEMORY
        AREA ARMPGM,CODE,READONLY
        ENTRY
        LDR R0,=MEMORY
        LDR R1,[R0]
HERE    B HERE
MEMORY  DCD 0X12345678
        END

; TO READ FROM MEMORY AND WRITE TO MEMORY (ANOTHER TYPE)
        AREA ARMPGM,CODE,READONLY
        ENTRY
        LDR R0,=MEMORY
        LDR R1,[R0]
        LDR R0,DEST
        STR R1,[R0]
HERE    B HERE
MEMORY  DCD 0X12345678
DEST    DCD 0X40000000
        END

; TO READ FROM MEMORY AND WRITE TO MEMORY (USING PRE-INDEXED ADDRESSING MODE)
        AREA ARMPGM,CODE,READONLY
        ENTRY
        LDR R0,=0X40000000
        LDR R1,[R0]
        STR R1,[R0,#4]
HERE    B HERE
        END

; TO READ FROM MEMORY AND WRITE TO MEMORY
        AREA ARMPGM,CODE,READONLY
        ENTRY
        LDR R0,=0X40000000
        LDR R1,[R0]
        LDR R0,=0X40000010
        STR R1,[R0]
HERE    B HERE
        END
C COMPILERS & OPTIMIZATION
Basic C Data Types, C Looping Structures, Register Allocation, Function Calls, Pointer Aliasing
In Table 5.1, loads that act on 8- or 16-bit values extend the value to 32 bits before writing to an ARM register.
Unsigned values are zero-extended, and signed values are sign-extended. This means that the cast of a loaded value
to an int type does not cost extra instructions.
The following code checksums a data packet containing 64 words. It shows why you should avoid using char
for local variables.
At first sight it looks as though declaring i as a char is efficient. You may be thinking that a char uses less
register space or less space on the ARM stack than an int. On the ARM, both these assumptions are wrong. All
ARM registers are 32-bit and all stack entries are at least 32-bit.
Case i: Consider the compiler output for this function. Case ii: Now compare this to the compiler output where
instead we declare i as an unsigned int.
In the first case, the compiler inserts an extra AND instruction to reduce i to the range 0 to 255 before the
comparison with 64. This instruction disappears in the second case.
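The two versions being compared are not reproduced in these notes; a minimal sketch (the function names follow the book's checksum_v1-style numbering but are assumptions here) might look like:

```c
/* char counter: on ARM the compiler must insert an extra AND to
   wrap i into the range 0..255 before every comparison with 64. */
int checksum_v1(int *data)
{
    char i;
    int sum = 0;
    for (i = 0; i < 64; i++)
        sum += data[i];
    return sum;
}

/* unsigned int counter: the extra AND disappears, since all ARM
   registers and stack entries are 32-bit anyway. */
int checksum_v2(int *data)
{
    unsigned int i;
    int sum = 0;
    for (i = 0; i < 64; i++)
        sum += data[i];
    return sum;
}
```

Both versions compute the same checksum; only the compiled code differs.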
The armcc output for add_v1 shows that the compiler casts the return value to a short type, but does not cast the
input values. It assumes that the caller has already ensured that the 32-bit values r0 and r1 are in the range of the
short type. This shows narrow passing of arguments and return value.
Whatever the merits of different narrow and wide calling protocols, you can see that char or short type function
arguments and return values introduce extra casts. These increase code size and decrease performance. It is
more efficient to use the int type for function arguments and return values, even if you are only passing an 8-bit
value.
C LOOPING STRUCTURES
This shows how the compiler treats a
loop with incrementing count i++.
■ An ADD to increment i
■ A compare to check if i is less than 64
■ A conditional branch to continue the loop if i < 64
This is not efficient. The decrementing loop needs only:
■ A subtract to decrement the loop counter, which also sets the condition code flags on the result
■ A conditional branch instruction
The key point is that the loop counter should count down to zero rather than counting up to some arbitrary limit.
Then the comparison with zero is free since the result is stored in the condition flags. Since we are no longer
using i as an array index, there is no problem in counting down rather than up.
Below example shows the improvement if we switch to a decrementing loop rather than an incrementing loop
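The decrementing loop itself is not reproduced in these notes; a sketch of the two usual forms (function names assumed) is:

```c
/* Decrementing for loop: the SUBS that decrements N sets the flags,
   so the comparison with zero is free and the separate CMP disappears. */
int checksum_v3(int *data, unsigned int N)
{
    int sum = 0;
    for (; N != 0; N--)
        sum += *data++;
    return sum;
}

/* do-while variant: also removes the initial test for N == 0,
   assuming the caller guarantees a nonempty array. */
int checksum_v4(int *data, unsigned int N)
{
    int sum = 0;
    do {
        sum += *data++;
    } while (--N != 0);
    return sum;
}
```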
Notice that the compiler checks that N is nonzero on entry to the function. Often this check is unnecessary since
you know that the array won’t be empty.
In this case a do-while loop gives better performance and code density than a for loop.
Use a do-while loop to remove the test for N being zero that occurs in a for loop.
LOOP UNROLLING
Each loop iteration costs two instructions in addition to the body of the loop: a subtract to decrement the
loop count and a conditional branch. We call these instructions the loop overhead. On ARM7 or ARM9
processors the subtract takes one cycle and the branch three cycles, giving an overhead of four cycles per
loop.
You can save some of these cycles by unrolling a loop—repeating the loop body several times, and reducing
the number of loop iterations by the same proportion. For example, let’s unroll our packet checksum
example four times.
The following code unrolls our packet checksum loop by four times. We assume that the number of words in the
packet N is a multiple of four.
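The unrolled loop is not reproduced above; a sketch (name assumed) of the four-times-unrolled checksum, valid when N is a nonzero multiple of four:

```c
int checksum_unrolled(int *data, unsigned int N)
{
    int sum = 0;
    do {
        /* four accumulates share one subtract and one branch */
        sum += *data++;
        sum += *data++;
        sum += *data++;
        sum += *data++;
        N -= 4;
    } while (N != 0);
    return sum;
}
```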
We have reduced the loop overhead from 4N cycles to (4N)/4=N cycles. On the ARM7TDMI, this
accelerates the loop from 8 cycles per accumulate to 20/4 = 5 cycles per accumulate, nearly doubling the
speed! For the ARM9TDMI, which has a faster load instruction, the benefit is even higher.
Unrolling loops that are not performance-critical, however, increases the code size with little performance
benefit, and may even reduce performance by evicting more important code from the cache.
REGISTER ALLOCATION
The compiler attempts to allocate a processor register to each local variable you use in a C function. It will try to
use the same register for different local variables if the uses of the variables do not overlap. When there are more
local variables than available registers, the compiler stores the excess variables on the processor stack. These
variables are called spilled or swapped out variables since they are written out to memory (in a similar way
virtual memory is swapped out to disk). Spilled variables are slow to access compared to variables allocated to
registers.
To ensure good assignment to registers, you should try to limit the internal loop of functions to using at most 12
local variables.
FUNCTION CALLS
The ARM Procedure Call Standard (APCS) defines how to pass function arguments and return values in
ARM registers.
The first point to note about the procedure call standard is the four-register rule. Functions with four or
fewer arguments are far more efficient to call than functions with five or more arguments. For functions
with four or fewer arguments, the compiler can pass all the arguments in registers. For functions with more
arguments, both the caller and callee must access the stack for some arguments.
The first four integer arguments are passed in the first four ARM registers: r0, r1, r2, and r3. Subsequent
integer arguments are placed on the full descending stack, ascending in memory as in Figure 5.1. Function
return integer values are passed in r0.
If a C function needs more than four arguments, or your C++ method more than three explicit arguments, then
it is almost always more efficient to use structures. Group related arguments into structures, and pass a
structure pointer rather than multiple arguments.
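A hypothetical illustration of the four-register rule (all names invented for this sketch): the six-argument call must spill two arguments to the stack, while the structure-pointer version passes everything through r0.

```c
typedef struct {
    int x0, x1, x2, x3, x4, x5;
} params_t;

/* Six arguments: the first four go in r0-r3, the last two on the
   stack, so both caller and callee must touch memory. */
int sum_args(int a, int b, int c, int d, int e, int f)
{
    return a + b + c + d + e + f;
}

/* One structure pointer: the single argument travels in r0 and the
   callee loads the fields with cheap base-plus-offset accesses. */
int sum_struct(const params_t *p)
{
    return p->x0 + p->x1 + p->x2 + p->x3 + p->x4 + p->x5;
}
```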
POINTER ALIASING
Two pointers are said to alias when they point to the same address. If you write to one pointer, it will affect the
value you read from the other pointer. In a function, the compiler often doesn’t know which pointers can alias
and which pointers can’t.
Note that the compiler loads from step twice. Usually a compiler optimization called common subexpression
elimination would kick in so that *step was only evaluated once, and the value reused for the second
occurrence. However, the compiler can’t use this optimization here. The pointers timer1 and step might alias
one another. In other words, the compiler cannot be sure that the write to timer1 doesn’t affect the read from
step. In this case the second value of *step is different from the first and has the value *timer1. This forces the
compiler to insert an extra load instruction.
The same problem occurs if you use structure accesses rather than direct pointer access. The following code also
compiles inefficiently:
The compiler evaluates state->step twice in case state->step and timers->timer1 are at the same memory
address. The fix is easy: Create a new local variable to hold the value of state->step so the compiler only
performs a single load.
In the code for timers_v3 we use a local variable step to hold the value of state->step. Now the compiler does
not need to worry that state may alias with timers.
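The timers example discussed above might look like the following sketch (structure layouts assumed; the book's actual listing may differ):

```c
typedef struct { int timer1, timer2; } timers_t;
typedef struct { int step; } state_t;

/* Aliasing-prone version: after writing timers->timer1 the compiler
   must reload state->step, since the two pointers might overlap. */
void timers_v1(timers_t *timers, state_t *state)
{
    timers->timer1 += state->step;
    timers->timer2 += state->step;
}

/* Fixed version: step is read exactly once into a local variable,
   so no reload is needed whatever the pointers alias. */
void timers_v3(timers_t *timers, state_t *state)
{
    int step = state->step;
    timers->timer1 += step;
    timers->timer2 += step;
}
```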
MODULE 3
C COMPILERS & OPTIMIZATION
Structure Arrangement, Bit-fields, Unaligned Data and Endianness, Division, Floating Point, Inline Functions and Inline
Assembly, Portability Issues.
The way you lay out a frequently used structure can have a significant impact on its performance and code density.
There are two issues concerning structures on the ARM: alignment of the structure entries and the overall size of the
structure.
ARM compilers will automatically align the start address of a structure to a multiple of the largest access width used
within the structure (usually four or eight bytes) and align entries within structures to their access width by inserting
padding. For example, consider the structure
For a little-endian memory system the compiler will lay this out, adding padding to ensure that the next
object is aligned to the size of that object:
This reduces the structure size from 12 bytes to 8 bytes, with the following new layout:
Therefore, it is a good idea to group structure elements of the same size, so that the structure layout doesn’t contain
unnecessary padding. The armcc compiler does include a keyword __packed that removes all padding. For example,
the structure
However, packed structures are slow and inefficient to access. The compiler emulates unaligned load and store
operations by using several aligned accesses with data operations to merge the results. Only use the __packed
keyword where space is far more important than speed and you can't reduce padding by rearrangement. Also use it for
porting code that assumes a certain structure layout in memory.
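The padding behavior described above can be checked directly with sizeof; a sketch (field names invented, sizes assuming a typical ABI with a 4-byte, 4-byte-aligned int):

```c
/* char, int, char: the int forces 4-byte alignment, so the compiler
   inserts 3 bytes of padding after each char - 12 bytes in total. */
struct badly_ordered {
    char a;
    int  b;
    char c;
};

/* Grouping the two chars lets them share one word: 8 bytes. */
struct well_ordered {
    char a;
    char c;
    int  b;
};
```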
BIT-FIELDS
The compiler can choose how bits are allocated within the bit-field container. Different compilers can assign the same
bit-field different bit positions in the container. It is also a good idea to avoid bit-fields for efficiency.
Bit-fields are structure elements and usually accessed using structure pointers; consequently, they suffer from the
pointer aliasing problems described in Section 5.6. Every bit-field access is really a memory access.
Possible pointer aliasing often forces the compiler to reload the bit-field several times. Compilers also do not
tend to optimize bit-field testing very well.
You can generate far more efficient code by using an integer rather than a bit-field. Use enum or #define masks to
divide the integer type into different fields.
Now that a single unsigned long type contains all the bit-fields, we can keep a copy of their values in a single local
variable stages, which removes the memory aliasing problem. The compiler generates the following code giving a
saving of 33% over the previous version using ANSI bit-fields:
You can also use the masks to set and clear the bit-fields, just as easily as for testing them. The following code shows
how to set, clear, or toggle bits using the STAGE masks:
These bit set, clear, and toggle operations take only one ARM instruction each, using ORR, BIC, and EOR
instructions, respectively. Another advantage is that you can now manipulate several bit-fields at the same time, using
one instruction. For example:
stages |= (STAGEA | STAGEB); /* enable stages A and B */
stages &= ∼(STAGEA | STAGEC); /* disable stages A and C */
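Putting the pieces together (the STAGE mask values are assumptions for this sketch):

```c
/* Assumed mask definitions replacing the ANSI bit-fields. */
#define STAGEA (1UL << 0)
#define STAGEB (1UL << 1)
#define STAGEC (1UL << 2)

/* Each operation below compiles to a single ARM instruction:
   ORR to set, BIC to clear, EOR to toggle. */
unsigned long demo_stages(void)
{
    unsigned long stages = 0;
    stages |= STAGEA;               /* set stage A           */
    stages &= ~STAGEA;              /* clear stage A         */
    stages ^= STAGEC;               /* toggle stage C        */
    stages |= (STAGEA | STAGEB);    /* set A and B in one go */
    return stages;
}
```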
Unaligned data and endianness are two issues that can complicate memory accesses and portability. Is the array
pointer aligned? Is the ARM configured for a big-endian or little-endian memory system?
The ARM load and store instructions assume that the address is a multiple of the type you are loading or storing. If
you load or store to an address that is not aligned to its type, then the behavior depends on the particular
implementation. The core may generate a data abort or load a rotated value. For well-written, portable code you
should avoid unaligned accesses.
You are likely to meet alignment problems when reading data packets or files used to transfer information between
computers. Network packets and compressed image files are good examples. Two- or four-byte integers may appear at
arbitrary offsets in these files. Data has been squeezed as much as possible, to the detriment of alignment.
Endianness (or byte order) is also a big issue when reading data packets or compressed files. The ARM core can be
configured to work in little-endian (least significant byte at lowest address) or big-endian (most significant byte at
lowest address) modes. Little-endian mode is usually the default.
The endianness of an ARM is usually set at power-up and remains fixed thereafter. Tables 5.6 and 5.7 illustrate how
the ARM’s 8-bit, 16-bit, and 32-bit load and store instructions work for different endian configurations. We assume
that byte address A is aligned to the size of the memory transfer. The tables show how the byte addresses in memory
map into the 32-bit register that the instruction loads or stores.
SMVITM,UDUPI Page 2
C COMPILERS & OPTIMIZATION
What is the best way to deal with endian and alignment problems? If speed is not critical, then use functions like
readint_little and readint_big in Example 5.10, which read a four-byte integer from a possibly unaligned address in
memory. The address alignment is not known at compile time, only at run time.
If you’ve loaded a file containing big-endian data such as a JPEG image, then use readint_big. For a byte stream
containing little-endian data, use readint_little. Both routines work correctly regardless of the memory endianness
the ARM is configured for.
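A sketch along the lines of the book's Example 5.10: reading byte by byte makes the result independent of both the pointer's alignment and the ARM's configured endianness.

```c
/* Read a 32-bit value from a possibly unaligned address, byte by byte. */
unsigned int readint_little(const unsigned char *p)   /* data stored little-endian */
{
    return (unsigned int)p[0]
         | ((unsigned int)p[1] << 8)
         | ((unsigned int)p[2] << 16)
         | ((unsigned int)p[3] << 24);
}

unsigned int readint_big(const unsigned char *p)      /* data stored big-endian */
{
    return ((unsigned int)p[0] << 24)
         | ((unsigned int)p[1] << 16)
         | ((unsigned int)p[2] << 8)
         |  (unsigned int)p[3];
}
```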
DIVISION
The ARM does not have a divide instruction in hardware. Instead the compiler implements divisions by calling
software routines in the C library.
There are many different types of division routine that you can tailor to a specific range of numerator and denominator
values. The standard integer division routine provided in the C library can take between 20 and 100 cycles, depending
on implementation, early termination, and the ranges of the input operands.
Division and modulus (/ and %) are such slow operations that you should avoid them as much as possible. However,
division by a constant and repeated division by the same denominator can be handled efficiently. This section
describes how to replace certain divisions by multiplications and how to minimize the number of division calls.
Circular buffers are one area where programmers often use division, but you can avoid these divisions completely.
Suppose you have a circular buffer of size buffer_size bytes and a position indicated by a buffer offset. To advance the
offset by increment bytes you could write
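The two versions can be sketched as follows (both assume 0 ≤ offset < buffer_size and increment < buffer_size):

```c
/* Version 1: a modulus, which costs a division call on every update */
unsigned int advance_mod(unsigned int offset, unsigned int increment,
                         unsigned int buffer_size)
{
    return (offset + increment) % buffer_size;
}

/* Version 2: no division; at most one wrap is ever needed */
unsigned int advance_sub(unsigned int offset, unsigned int increment,
                         unsigned int buffer_size)
{
    offset += increment;
    if (offset >= buffer_size)
    {
        offset -= buffer_size;
    }
    return offset;
}
```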
The first version may take 50 cycles; the second will take 3 cycles because it does not involve a division.
If you can’t avoid a division, then try to arrange that the numerator and denominator are unsigned integers. Signed
division routines are slower since they take the absolute values of the numerator and denominator and then call the
unsigned division routine. They fix the sign of the result afterwards.
Many C library division routines return the quotient and remainder from the division. In other words a free remainder
operation is available to you with each division operation and vice versa.
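For example, the remainder falls out of the quotient with a cheap multiply-subtract, so a second division call is never needed (a small sketch):

```c
/* One division call yields both quotient and remainder. */
unsigned int quotient(unsigned int n, unsigned int d, unsigned int *rem)
{
    unsigned int q = n / d;      /* single call into the division routine */
    *rem = n - q * d;            /* remainder for free: no second call */
    return q;
}
```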
The following routine, scale, shows how to convert divisions to multiplications in practice. It divides an array of N
elements by the denominator d. We first calculate the scaling value s = (2^32 − 1)/d. Then we replace each divide by d
with a multiplication by s, taking the top 32 bits of the product. The 64-bit multiply is cheap because the ARM has an
instruction, UMULL, which multiplies two 32-bit values, giving a 64-bit result.
Here we have assumed that the numerator and denominator are 32-bit unsigned integers. Of course, the algorithm works
equally well for 16-bit unsigned integers using a 32-bit multiply, or for 64-bit integers using a 128-bit multiply. You
should choose the narrowest width for your data. If your data is 16-bit, then set s = (2^16 − 1)/d and estimate q using a
standard integer C multiply.
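A sketch of the scale routine described above, assuming 32-bit unsigned data. The estimate q never overshoots the true quotient, so a small correction loop fixes it.

```c
/* Divide each of the N elements of x[] by the same denominator d
   by multiplying with s = (2^32 - 1)/d, computed once. */
void scale(unsigned int *x, unsigned int N, unsigned int d)
{
    unsigned int s = 0xFFFFFFFFu / d;            /* one division, up front */
    while (N--)
    {
        unsigned int n = *x;
        /* q = (s * n) >> 32: this 64-bit multiply maps to a single UMULL */
        unsigned int q = (unsigned int)(((unsigned long long)s * n) >> 32);
        unsigned int r = n - q * d;              /* estimate never overshoots */
        while (r >= d)                           /* correct the estimate */
        {
            q++;
            r -= d;
        }
        *x++ = q;
    }
}
```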
FLOATING POINT
The majority of ARM processor implementations do not provide hardware floating-point support, which saves on
power and area when using ARM in a price-sensitive, embedded application.
With the exceptions of the Floating Point Accelerator (FPA) used on the ARM7500FE and the Vector Floating Point
accelerator (VFP) hardware, the C compiler must provide support for floating point in software.
In practice, this means that the C compiler converts every floating-point operation into a subroutine call. The C library
contains subroutines to simulate floating-point behavior using integer arithmetic.
This code is written in highly optimized assembly. Even so, floating-point algorithms will execute far more slowly
than corresponding integer algorithms.
If you need fast execution and fractional values, you should use fixed-point or block floating-point algorithms.
Fractional values are most often used when processing digital signals such as audio and video.
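For instance, fractional values in the range [0, 1) can be held in a 16-bit Q15 fixed-point format and multiplied using only integer arithmetic (an illustrative sketch, not from the text):

```c
typedef short q15_t;            /* Q15: real value = raw / 32768.0 */

/* Multiply two Q15 fractions: the 32-bit product is in Q30 format,
   so shift right by 15 to return to Q15. */
q15_t q15_mul(q15_t a, q15_t b)
{
    return (q15_t)(((int)a * b) >> 15);
}
```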
You can remove the function call overhead completely by inlining functions. Additionally many compilers allow you
to include inline assembly in your C source code. Using inline functions that contain assembly you can get the
compiler to support ARM instructions and optimizations that aren’t usually available.
The inline assembler is part of the C compiler. The C compiler still performs register allocation, function entry, and
exit. The compiler also attempts to optimize the inline assembly you write, or deoptimize it for debug mode. Although
the compiler output will be functionally equivalent to your inline assembly, it may not be identical.
The main benefit of inline functions and inline assembly is to make accessible in C operations that are not usually
available as part of the C language. It is better to use inline functions rather than #define macros because the latter
doesn’t check the types of the function arguments and return value.
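A small illustration of that pitfall (hypothetical names):

```c
#define SQUARE_MACRO(x) ((x) * (x))   /* textual expansion: no type checking,
                                         and the argument appears twice */

static inline int square(int x)       /* argument and result are type-checked,
                                         and x is evaluated exactly once */
{
    return x * x;
}

int compare(void)
{
    int i = 3;
    int s = square(i++);   /* well-defined: s = 9, i becomes 4 */
    /* SQUARE_MACRO(i++) would expand to ((i++) * (i++)): undefined behaviour */
    return s + i;
}
```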
PORTABILITY ISSUES
■ The char type. On the ARM, char is unsigned rather than signed as on many other processors. A common problem concerns
loops that use a char loop counter i with the continuation condition i ≥ 0: since an unsigned value is always ≥ 0, the loop never
terminates. In this situation, armcc produces a warning of unsigned comparison with zero. You should either use a compiler
option to make char signed or change loop counters to type int.
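The effect can be reproduced on any compiler by declaring the counter unsigned char explicitly (a host-side sketch; the cap exists only so the demonstration terminates):

```c
/* Count loop iterations; use_int selects the portable int counter. */
int iterations(int use_int)
{
    int count = 0;

    if (use_int)
    {
        for (int i = 3; i >= 0; i--)        /* terminates after 4 passes */
        {
            count++;
        }
    }
    else
    {
        /* unsigned char mimics ARM's default char: i >= 0 is always true,
           because decrementing past 0 wraps around to 255 */
        for (unsigned char i = 3; i >= 0; i--)
        {
            if (++count > 1000)             /* safety cap for the demo */
            {
                break;
            }
        }
    }
    return count;
}
```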
■ The int type. Some older architectures use a 16-bit int, which may cause problems when moving to ARM’s 32-bit int type
although this is rare nowadays. Note that expressions are promoted to an int type before evaluation.
■ Unaligned data pointers. Some processors support the loading of short and int typed values from unaligned addresses. A C
program may manipulate pointers directly so that they become unaligned, for example, by casting a char * to an int *. ARM
architectures up to ARMv5TE do not support unaligned pointers. To detect them, run the program on an ARM with an
alignment checking trap.
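One portable way to read through a potentially unaligned pointer is to go through memcpy, which the compiler lowers to whatever access the target supports (a sketch):

```c
#include <string.h>

/* Safe unaligned read: never dereferences a misaligned int pointer. */
int read_int_at(const char *p)
{
    int v;
    memcpy(&v, p, sizeof v);   /* compiler emits byte or word accesses as needed */
    return v;
}
```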
■ Endian assumptions. C code may make assumptions about the endianness of a memory system, for example, by casting a
char * to an int *.
■ Function prototyping. The armcc compiler passes arguments narrow, that is, reduced to the range of the argument type. If
functions are not prototyped correctly, then the function may return the wrong answer. Other compilers that pass arguments
wide may give the correct answer even if the function prototype is incorrect.
■ Use of bit-fields. The layout of bits within a bit-field is implementation and endian dependent. If C code assumes that bits are
laid out in a certain order, then the code is not portable.
■ Inline assembly. Using inline assembly in C code reduces portability between architectures. You should separate any inline
assembly into small inlined functions that can easily be replaced. It is also useful to supply reference, plain C implementations
of these functions that can be used on other architectures, where this is possible.
MODULE-4
Embedded System Components: Embedded Vs General computing system, History of embedded systems,
Classification of Embedded systems, Major application areas of embedded systems, Purpose of embedded
systems.
Core of an Embedded System including all types of processor/controller, Memory, Sensors, Actuators, LED, 7
segment LED display, stepper motor, keyboard, Push button switch, Communication Interface (onboard and
external types), Embedded firmware, Other system components.
Textbook 2: Chapter 1 (Sections 1.2 to 1.6), Chapter 2 (Sections 2.1 to 2.6)
Text book: Shibu K V, “Introduction to Embedded Systems”, Tata McGraw Hill Education Private Limited, 2nd
Edition
Every embedded system is unique, and the hardware as well as the firmware is highly specialized to the
application domain. Embedded systems are becoming an inevitable part of any product or equipment in all
fields including household appliances, telecommunications, medical equipment, industrial control, consumer
products, etc.
EMBEDDED SYSTEM COMPONENTS
1. On generation
First generation (1G):
Built around 8-bit microprocessors & microcontrollers.
Simple hardware circuits & firmware.
Examples: Digital telephone keypads.
Second generation (2G):
Built around 16-bit μp & 8-bit μc.
They are more complex & powerful than 1G μp & μc.
Examples: SCADA systems
Third generation (3G):
Built around 32-bit μp & 16-bit μc.
Concepts like Digital Signal Processors (DSPs) and Application Specific Integrated Circuits (ASICs)
evolved.
Examples: Robotics, Media, etc.
Fourth generation:
Built around 64-bit μp & 32-bit μc.
The concept of System on Chips (SoC), Multicore Processors evolved.
Highly complex & very powerful.
Examples: Smart Phones.
3. On deterministic behaviour
This classification is applicable for “Real Time” systems.
The task execution behaviour for an embedded system may be deterministic or non-deterministic.
Based on execution behaviour, Real Time embedded systems are divided into Hard and Soft.
4. On triggering
Embedded systems which are “Reactive” in nature can be classified based on triggering.
Reactive systems can be:
Event triggered
Time triggered
The application areas and the products in the embedded domain are countless.
1. Data collection/Storage/Representation
2. Data communication
3. Data (signal) processing
4. Monitoring
5. Control
6. Application specific user interface
1. Data Collection/Storage/Representation:
An embedded system designed for the purpose of data collection performs acquisition of data from the
external world.
Data collection is usually done for storage, analysis, manipulation and transmission.
Data can be analog or digital.
Embedded systems with analog data capturing techniques collect data directly in the form of analog signals,
whereas an embedded system with a digital data collection mechanism converts the analog signal to a
digital signal using analog to digital converters.
2. Data communication:
Embedded data communication systems are deployed in applications ranging from complex satellite
communication to simple home networking systems.
The transmission of data is achieved either by a wire-line medium or by a wire-less medium.
Data can either be transmitted by analog means or by digital means.
Wireless modules-Bluetooth, Wi-Fi.
Wire-line modules-USB, TCP/IP.
Network hubs, routers, switches are examples of dedicated data transmission embedded systems.
4. Monitoring:
Almost all embedded products in the medical domain have monitoring functions.
An electrocardiogram (ECG) machine monitors the heartbeat of a patient but cannot
impose control over the heartbeat.
Other examples with monitoring functions are digital CROs, digital multimeters, and logic analyzers.
5. Control:
A system with control functionality contains both sensors and actuators.
Sensors are connected to the input port for capturing changes in the environmental variables, and the
actuators connected to the output port are controlled according to the changes in the input variables.
An air conditioner, which controls the room temperature to a specified limit, is a typical example
of the CONTROL purpose.
Embedded systems are basically designed to regulate a physical variable (such as the temperature in an air
conditioner) or to manipulate the state of some devices by sending signals to the actuators or devices connected
to the output ports (such as a microwave oven), in response to input signals provided by the end users or by
sensors connected to the input ports.
Embedded systems are domain and application specific and are built around a central core. The core of the
embedded system falls into any of the following categories:
1. General purpose and Domain Specific Processors
Microprocessors
Microcontrollers
Digital Signal Processors
2. Application Specific Integrated Circuits. (ASIC)
3. Programmable Logic Devices (PLDs)
4. Commercial off-the-shelf Components (COTS)
1.1 Microprocessors
A microprocessor is a silicon chip representing a central processing unit. A microprocessor is a
dependent unit: it requires the combination of other hardware like memory, a timer unit, an interrupt
controller, etc. for proper functioning. Developers of microprocessors:
o Intel – Intel 4004 – November 1971(4-bit).
o Intel – Intel 4040.
o Intel – Intel 8008 – April 1972.
o Intel – Intel 8080 – April 1974(8-bit).
o Motorola – Motorola 6800.
o Intel – Intel 8085 – 1976.
o Zilog - Z80 – July 1976.
1.2 Microcontrollers
A microcontroller is a highly integrated chip that contains a CPU, scratch pad RAM, special and general
purpose register arrays, on-chip ROM/FLASH memory for program storage, timer and interrupt control
units and dedicated I/O ports.
Texas Instruments’ TMS 1000 is considered the world’s first microcontroller.
Some embedded system applications require only 8-bit controllers, whereas some requiring superior
performance and computational capability demand 16/32-bit controllers.
The instruction set of a microcontroller can be RISC or CISC.
Microcontrollers are designed for either general purpose application requirements or domain specific
application requirements.
Microprocessors/controllers based on the Von Neumann architecture share a single common bus for fetching
both instructions and data. Program instructions and data are stored in a common main memory. Von Neumann
architecture based processors/controllers first fetch an instruction and then fetch the data to support the
instruction from code memory. The two separate fetches slow down the controller’s operation. The Von Neumann
architecture is also referred to as the Princeton architecture, since it was developed at Princeton University.
Microprocessors/controllers based on the Harvard architecture will have separate data bus and instruction bus.
This allows the data transfer and program fetching to occur simultaneously on both buses. With Harvard
architecture, the data memory can be read and written while the program memory is being accessed. These
separated data memory and code memory buses allow one instruction to execute while the next instruction is
fetched (“pre-fetching”). The pre-fetch theoretically allows much faster execution than the Von Neumann
architecture. Since some additional hardware logic is required to generate the control signals for this type
of operation, it adds silicon complexity to the system. Fig 2.2 explains the Harvard and Von Neumann
architecture concepts.
Endianness specifies the order in which data is stored in memory by processor operations in a multibyte
system (processors whose word size is greater than one byte). Suppose the word length is two bytes; then data can
be stored in memory in two different ways:
(1) The higher-order byte of the data at the higher memory address and the lower-order byte at the location just
below it.
(2) The lower-order byte of the data at the higher memory address and the higher-order byte at the location just
below it.
Little-endian (Fig. 2.3) means the lower-order byte of the data is stored in memory at the lowest address, and the
higher-order byte at the highest address. (The little end comes first.)
For example, a 4 byte long integer Byte3 Byte2 Byte1 Byte0 will be stored in the memory as shown below:
Big-endian (Fig. 2.4) means the higher-order byte of the data is stored in memory at the lowest address, and the
lower-order byte at the highest address. (The big end comes first.) For example, a 4 byte long integer Byte3
Byte2 Byte1 Byte0 will be stored in the memory as follows :
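The two layouts can be observed directly in C by viewing an integer through a byte pointer (a host-side sketch):

```c
/* Returns 1 on a little-endian machine, 0 on a big-endian machine. */
int is_little_endian(void)
{
    unsigned int word = 0x03020100u;            /* Byte3..Byte0 = 03 02 01 00 */
    unsigned char *bytes = (unsigned char *)&word;
    /* little-endian: bytes[0] holds Byte0 (0x00) at the lowest address;
       big-endian:    bytes[0] holds Byte3 (0x03) at the lowest address */
    return bytes[0] == 0x00;
}
```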
Advantages of PLDs:
1) PLDs offer customers much more flexibility during the design cycle.
2) PLDs do not require long lead times for prototypes or production parts, because PLDs are already on a
distributor’s shelf and ready for shipment.
Stepper Motor:
A stepper motor is an electromechanical device which generates discrete displacement (motion) in response
to DC electrical signals.
It differs from the normal DC motor in its operation: a DC motor produces continuous rotation on applying
DC voltage, whereas a stepper motor produces discrete rotation in response to the DC voltage applied to it.
Stepper motors are widely used in industrial embedded applications, consumer electronic products and
robotics control systems.
The paper feed mechanism of a printer/fax makes use of stepper motors for its functioning.
Based on the coil winding arrangements, a two phase stepper motor is classified into
Unipolar
Bipolar
Unipolar:
A unipolar stepper motor contains two windings per phase. The direction of rotation (clockwise or
anticlockwise) of a stepper motor is controlled by changing the direction of current flow. Current in one
direction flows through one coil and in the opposite direction through the other coil. It is easy to reverse the
direction of rotation by just switching the terminals to which the coils are connected.
Bipolar:
A bipolar stepper motor contains a single winding per phase. For reversing the motor rotation, the current flow
through the windings is reversed dynamically. This requires complex circuitry for current flow reversal.
In the wave step mode only one phase is energized at a time, and the coils are energized one after another in sequence.
The coils A, B, C, and D are energized in the order A, B, C, D.
The rotation of the stepper motor can be reversed by reversing the order in which the coils are energised.
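A sketch of the wave-drive sequence in C; the assignment of coils A–D to port bits 0–3 is an assumption, and the actual port write is hardware-specific:

```c
/* Wave drive: one coil energised at a time, in the order A -> B -> C -> D.
   Coils A..D are assumed to sit on port bits 0..3. */
static const unsigned char wave_seq[4] = { 0x01, 0x02, 0x04, 0x08 };

/* Pattern to write to the (hypothetical) motor port for step number n.
   Reversing the motor means stepping n downwards instead of upwards. */
unsigned char wave_step(unsigned int n)
{
    return wave_seq[n & 3u];
}
```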
Two-phase unipolar stepper motors are the popular choice for embedded applications. The current requirement for a
stepper motor is relatively high, and hence the port pins of a microcontroller/processor may not be able to drive them
directly. Also, the supply voltage required to operate a stepper motor normally varies in the range 5V to 24V. Depending
on the current and voltage requirements, special driving circuits are required to interface the stepper motor with
microcontrollers/processors.
Commercial off-the-shelf stepper motor driver ICs are available in the market and can be directly interfaced to the
microcontroller port. The ULN2803 is an octal peripheral driver array, available from Texas Instruments and
STMicroelectronics, suitable for driving a 5V stepper motor. A simple driving circuit can also be built using transistors.
The circuit diagram in Fig. 2.20 illustrates the interfacing of a stepper motor through a driver circuit connected to the
port pins of a microcontroller/processor.
Keyboard
A keyboard is an input device for user interfacing. If the number of keys required is very limited, push
button switches can be used and they can be directly interfaced to the port pins for reading.
However, there may be situations demanding a large number of keys for user input. In such situations it
may not be possible to interface each key to a port pin due to the limitation in the number of general
purpose port pins available for the processor/controller in use; moreover, it wastes port pins.
Matrix keyboard is an optimum solution for handling large key requirements.
It greatly reduces the number of interface connections.
In a matrix keyboard, the keys are arranged in matrix fashion (i.e. they are connected in a row and column
style).
For detecting a key press, the keyboard uses the scanning technique, where each row of the matrix is pulled
low and the columns are read.
After reading the status of each column corresponding to a row, that row is pulled high, the next row is
pulled low, and the status of the columns is read. This process is repeated until the scanning of all rows is
completed.
When a row is pulled low, if a key connected to that row is pressed, reading the column to which the key
is connected will give logic 0.
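The scanning technique described above can be sketched as follows; read_cols stands in for the hardware-specific port access, and a 4x4 matrix is assumed:

```c
/* Scan a 4x4 matrix keypad. read_cols(row_mask) drives the rows with
   row_mask (exactly one row pulled low) and returns the column lines;
   a pressed key reads back 0 on its column. Returns 0..15, or -1. */
int scan_keypad(unsigned char (*read_cols)(unsigned char row_mask))
{
    for (int row = 0; row < 4; row++)
    {
        unsigned char mask = (unsigned char)~(1u << row);  /* this row low */
        unsigned char cols = read_cols(mask);
        for (int col = 0; col < 4; col++)
        {
            if (!(cols & (1u << col)))                     /* key press: 0 */
            {
                return row * 4 + col;
            }
        }
    }
    return -1;                                             /* no key pressed */
}

/* Simulated port read for illustration: pretend the key at row 2,
   column 1 (key index 9) is held down. */
static unsigned char sim_read_cols(unsigned char row_mask)
{
    if (!(row_mask & (1u << 2)))          /* row 2 is the one pulled low */
    {
        return (unsigned char)~(1u << 1); /* its column 1 reads back 0 */
    }
    return 0xFF;                          /* all columns high otherwise */
}
```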
MEMORY
Memory is an important part of a processor/controller based embedded system. Some
processors/controllers contain built-in memory, referred to as on-chip memory. Others do not
contain any memory inside the chip and require external memory to be connected to the controller/processor
to store the control algorithm; this is called off-chip memory. Also, some working memory is required for
holding data temporarily during certain operations. This section deals with the different types of memory used
in embedded system applications.
The code memory retains its contents even after the power to it is turned off. It is generally known as
non-volatile storage memory. Depending on the fabrication, erasing, and programming techniques,
non-volatile memories are classified into the following types.
Masked ROM (MROM) Masked ROM is a one-time programmable device. Masked ROM makes use of
the hardwired technology for storing data. The device is factory programmed by masking and metallisation
process at the time of production itself, according to the data provided by the end user. The primary
advantage of this is low cost for high volume production. They are the least expensive type of solid state
memory. Different mechanisms are used for the masking process of the ROM, like
(1) Creation of an enhancement or depletion mode transistor through channel implant.
(2) By creating the memory cell using either a standard transistor or a high threshold transistor. In the high
threshold mode, the supply voltage required to turn ON the transistor is above the normal ROM IC
operating voltage. This ensures that the transistor is always off and the memory cell always stores logic 0.
Masked ROM is a good candidate for storing the embedded firmware for low cost embedded devices.
Once the design is proven and the firmware requirements are tested and frozen, the binary data (The firmware
cross compiled/assembled to target processor specific machine code) corresponding to it can be given to the
MROM fabricator. The limitation with MROM based firmware storage is the inability to modify the device
firmware against firmware upgrades. Since the MROM is permanent in bit storage, it is not possible to alter
the bit information.
Read/Write Memory (RAM)
Static RAM (SRAM) Static RAM stores data in the form of voltage. SRAM cells are made up of flip-flops.
Static RAM is the fastest form of RAM available. In a typical implementation, an SRAM cell (bit) is realised
using six transistors (or 6 MOSFETs). Four of the transistors are used for building the latch (flip-flop) part
of the memory cell and two for controlling the access. SRAM is fast in operation due to its resistive
network and switching capabilities. In its simplest representation an SRAM cell can be visualised as
shown in Fig. 2.10.
[Fig. 2.10: SRAM cell implementation — cross-coupled transistors Q2/Q4, access transistors Q5/Q6, Vcc, Word Line]
This implementation in its simpler form can be visualised as two cross-coupled inverters with read/write
control through transistors. The four transistors in the middle form the cross-coupled inverters. This can be
visualised as shown in Fig. 2.11.
[Fig. 2.11: Visualisation of the SRAM cell — two cross-coupled inverters with write control and read control
transistors, and data-to-write and data-to-read lines]
From the SRAM implementation diagram, it is clear that access to the memory cell is controlled by the
Word Line, which controls the access transistors (MOSFETs) Q5 and Q6. The access transistors control the
connection to the bit lines B & B\. In order to write a value to the memory cell, apply the desired value to the
bit control lines (for writing 1, make B = 1 and B\ = 0; for writing 0, make B = 0 and B\ = 1) and assert the
Word Line (make the Word Line high). This operation latches the written bit in the flip-flop. For reading the
content of the memory cell, assert both the B and B\ bit lines to 1 and set the Word Line to 1.
The major limitations of SRAM are low capacity and high cost. Since a minimum of six transistors
are required to build a single memory cell, imagine how many memory cells we can fabricate on a silicon
wafer.
Dynamic RAM (DRAM) Dynamic RAM stores data in the form of charge. DRAM cells are made up of MOS
transistor gates. The advantages of DRAM are its high density and low cost compared to SRAM. The
disadvantage is that, since the information is stored as charge, it leaks away with time; to prevent this, DRAM
cells need to be refreshed periodically. Special circuits called DRAM controllers are used for the refresh
operation, which is performed periodically at millisecond intervals. Figure 2.12 illustrates the typical
implementation of a DRAM cell.
[Fig. 2.12: DRAM cell implementation — a single MOSFET and a storage capacitor connected to bit line B and
the word line]
The MOSFET acts as the gate for the incoming and outgoing data, whereas the capacitor acts as the bit
storage unit. The table given below summarises the relative merits and demerits of SRAM and DRAM
technology.
NVRAM Non-volatile RAM is a random access memory with battery backup. It contains static RAM based
memory and a minute battery for supplying power to the memory in the absence of an external power supply.
The memory and battery are packed together in a single package. NVRAM is used for the non-volatile
storage of results of operations or for setting up flags, etc. The life span of NVRAM is expected to be
around 10 years. The DS1744 from Maxim/Dallas is an example of a 32KB NVRAM.
COMMUNICATION INTERFACES
I2C (Inter-Integrated Circuit) Bus
Developed and patented by Philips for connecting low speed peripherals to a motherboard, embedded
system or cell phone.
Two-wire bus, half duplex, serial, synchronous communication, with data rates up to 100 kbits/sec.
Serial data line (SDA)
Serial clock line (SCL)
The master controls the clock for the slaves.
Each connected slave has a unique 7-bit address.
BLUETOOTH
1. Low cost, low power, short range wireless technology for data and audio communication.
2. Proposed by Ericsson in 1994.
3. Operates at 2.4 GHz and uses frequency hopping spread spectrum (FHSS).
4. Data rates from 1 Mbps to 24 Mbps.
5. Range from 30 to 100 feet.
6. Two layers: physical layer & protocol layer (user defined protocols).
7. Each Bluetooth device has a 48-bit unique identification number.
8. Supports point-to-point (master–slave) and point-to-multipoint (piconet, with a maximum of 7 slaves) connections.
9. Applications: file transfer between mobiles, the medical sector, etc.
For communicating with devices over a Wi-Fi network, a device, when its Wi-Fi radio is turned ON,
searches for the available Wi-Fi networks in its vicinity and lists the Service Set Identifiers (SSIDs) of the
available networks. If a network is security enabled, a password may be required to connect to a particular
SSID. Wi-Fi employs different security mechanisms like Wired Equivalent Privacy (WEP), Wi-Fi
Protected Access (WPA), etc. for securing the data communication.
ZIGBEE
ZigBee Coordinator (ZC)/Network Coordinator: The ZigBee coordinator acts as the root of the ZigBee
network. The ZC is responsible for initiating the ZigBee network and it has the capability to store
information about the network.
ZigBee Router (ZR)/Full Function Device (FFD): Responsible for passing information from one device to
another device or to another ZR.
ZigBee End Device (ZED)/Reduced Function Device (RFD): An end device containing ZigBee
functionality for data communication. It can talk only with a ZR or ZC and doesn’t have the capability to act
as a mediator for transferring data from one device to another. The diagram shown in Fig. 2.34 gives an
overview of ZC, ZED and ZR in a ZigBee network.
Universal Asynchronous Receiver Transmitter (UART) based data transmission is an asynchronous form
of serial data transmission.
The serial communication settings (baud rate, number of bits per byte, parity, number of start and stop
bits, and flow control) for both transmitter and receiver should be set identically.
The start and stop of communication is indicated through inserting special bits in the data stream. While
sending a byte of data, a start bit is added first and a stop bit is added at the end of the bit stream. The least
significant bit of the data byte follows the ‘start’ bit.
The ‘start’ bit informs the receiver that a data byte is about to arrive. The receiver device starts polling its
‘receive line’ as per the baudrate settings. If the baudrate is ‘x’ bits per second, the time slot available for
one bit is 1/x seconds. If parity is enabled for communication, the UART of the transmitting device adds a
parity bit (bit value is 1 for odd number of 1s in the transmitted bit stream and 0 for even number of 1s).
The UART of the receiving device calculates the parity of the bits received and compares it with the
received parity bit for error checking.
The UART of the receiving device discards the ‘Start’, ‘Stop’ and ‘Parity’ bit from the received bit stream
and converts the received serial bit data to a word (In the case of 8 bits/byte, the byte is formed with the
received 8 bits with the first received bit as the LSB and last received data bit as MSB).
For proper communication, the ‘Transmit line’ of the sending device should be connected to the ‘Receive
line’ of the receiving device.
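The framing described above can be sketched as bit manipulation (an 8N1 frame is assumed; the parity helper shows the rule from the text but is not inserted into this frame):

```c
/* Build an 8N1 frame: start bit (0), eight data bits LSB first, stop bit (1).
   Bit 0 of the result is transmitted first. */
unsigned int uart_frame(unsigned char data)
{
    unsigned int frame = 0;                /* bit 0: start bit = 0 */
    frame |= (unsigned int)data << 1;      /* bits 1..8: data, LSB first */
    frame |= 1u << 9;                      /* bit 9: stop bit = 1 */
    return frame;
}

/* Parity bit as described: 1 when the data byte has an odd number of 1s. */
unsigned char parity_bit(unsigned char b)
{
    b ^= b >> 4;                           /* fold the 1-count down to bit 0 */
    b ^= b >> 2;
    b ^= b >> 1;
    return b & 1u;
}
```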
RS-232 C
RS-232 C (Recommended Standard number 232, revision C) is a full duplex, wired, asynchronous serial
communication interface.
The RS-232 interface was developed by the Electronic Industries Association (EIA) during the early 1960s.
RS-232 extends the UART communication signals for external data communication.
UART uses the standard TTL/CMOS logic (Logic ‘High’ corresponds to bit value 1 and Logic ‘Low’
corresponds to bit value 0) for bit transmission whereas RS-232 follows the EIA standard for bit
transmission.
As per the EIA standard, a logic ‘0’ is represented with a voltage between +3 and +25V, and a logic ‘1’ is
represented with a voltage between −3 and −25V.
RS-232 is a point-to-point communication interface and the devices involved in RS-232 communication are
called ‘Data Terminal Equipment (DTE)’ and ‘Data Communication Equipment (DCE)’.
If no data flow control is required, only TXD and RXD signal lines and ground line (GND) are required for
data transmission and reception. The RXD pin of DCE should be connected to the TXD pin of DTE and vice
versa for proper data transmission.
As per the EIA standard, RS-232 C supports baudrates up to 20Kbps (upper limit 19.2Kbps). The commonly
used baudrates are 300bps, 1200bps, 2400bps, 9600bps, 11.52Kbps and 19.2Kbps; 9600 is the
popular baudrate setting used for PC communication. The maximum operating distance supported by RS-
232 is 50 feet at the highest supported baudrate.
Embedded devices contain a UART for serial communication and generate signal levels conforming to
TTL/CMOS logic. A level translator IC like the MAX232 from Maxim/Dallas Semiconductor is used for
converting the signal lines from the UART to RS-232 signal lines for communication. On the receiving side,
the received data is converted back to digital logic levels by a converter IC. Converter chips contain
converters for both the transmitter and the receiver.
EMBEDDED FIRMWARE
Embedded firmware refers to the control algorithm (program instructions) and/or the configuration settings
that an embedded system developer dumps into the code (program) memory of the embedded system.
There are various methods available for developing the embedded firmware. They are listed below.
(1) Write the program in high level languages like Embedded C/C++ using an Integrated Development
Environment. (2) Write the program in Assembly language using the instructions supported by your
application’s target processor/controller.
The process of converting the program written in either a high level language or processor/controller
specific Assembly code to machine readable binary code is called ‘HEX File Creation’.
The method used for ‘HEX File Creation’ depends on the programming technique used. If
the program is written in Embedded C/C++ using an IDE, the cross compiler included in the IDE converts it
into corresponding processor/controller understandable ‘HEX File’.
If you are following the Assembly language based programming technique (method 2), you can use the
utilities supplied by the processor/controller vendors to convert the source code into a ‘HEX File’. Third
party tools, some of which are free of cost, are also available for this conversion.
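For concreteness, the ‘HEX File’ is commonly in Intel HEX format, where every record ends in a checksum byte: the two’s complement of the low byte of the sum of all the record’s bytes. A small sketch of that checksum computation (the sample record below is illustrative, not from any particular build):

```c
#include <stdint.h>

/* Value of one hexadecimal digit, or -1 if the character is invalid. */
static int hex_val(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Checksum byte of an Intel HEX record: pass the characters after the
 * leading ':' but WITHOUT the final two checksum digits. The result is
 * the two's complement of the low byte of the sum of the record bytes. */
static uint8_t ihex_checksum(const char *record) {
    unsigned sum = 0;
    for (int i = 0; record[i] && record[i + 1]; i += 2)
        sum += (unsigned)(hex_val(record[i]) * 16 + hex_val(record[i + 1]));
    return (uint8_t)((0x100 - (sum & 0xFFu)) & 0xFFu);
}
```

For example, the data record `:0300300002337A1E` carries checksum 0x1E, and the standard end-of-file record `:00000001FF` carries 0xFF.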
For a beginner in the embedded software field, it is strongly recommended to use the high level language
based development technique. The reasons are that writing code in a high level language is easy, and code
written in a high level language is highly portable.
The embedded software development process in assembly language is tedious and time consuming. The
developer needs to know about all the instruction sets of the processor/controller or at least s/he should carry
an instruction set reference manual with her/him.
Two types of control algorithm design exist in embedded firmware development.
o The first type of control algorithm development is known as the infinite loop or ‘super loop’ based
approach, where the control flow runs from top to bottom and then jumps back to the top of the
program in a conventional procedure. It is similar to the while (1) { }; based technique in C.
o The second method deals with splitting the functions to be executed into tasks and running these
tasks using a scheduler which is part of a General Purpose or Real Time Embedded Operating
System (GPOS/RTOS).
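The ‘super loop’ approach can be sketched in C as below. The hardware access functions here are simulated stand-ins (real firmware would read and write port and peripheral registers), and the loop body is factored into run_one_cycle() only so it can be exercised off target:

```c
/* Simulated hardware hooks -- on real hardware these would touch
 * port registers, ADCs, timers, etc. (names are illustrative only). */
static int sensor_value = 0;                    /* simulated input  */
static int actuator_value = 0;                  /* simulated output */

static void init_hardware(void)       { sensor_value = 0; }
static int  read_inputs(void)         { return ++sensor_value; }
static int  control_algorithm(int in) { return in * 2; }
static void update_outputs(int o)     { actuator_value = o; }

/* One pass through the loop body, separated out so that it can be
 * unit tested without running forever. */
static int run_one_cycle(void) {
    int out = control_algorithm(read_inputs());
    update_outputs(out);
    return actuator_value;
}

/* The super loop itself: control flows top to bottom, then jumps back
 * to the top -- the while (1) { } technique. Never returns. */
void super_loop(void) {
    init_hardware();
    while (1)
        run_one_cycle();
}
```

On target, main() would simply call super_loop() after power-on initialisation.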
RESET CIRCUIT
The reset circuit is essential to ensure that, during system power ON, the device does not operate at a voltage
level where it is not guaranteed to operate.
The reset signal brings the internal registers and the different hardware systems of the processor/ controller
to a known state and starts the firmware execution from the reset vector (Normally from vector address
0x0000 for conventional processors/controllers).
The reset signal can be either active high (The processor undergoes reset when the reset pin of the processor
is at logic high) or active low (The processor undergoes reset when the reset pin of the processor is at logic
low).
Since the processor operation is synchronised to a clock signal, the reset pulse should be wide enough to
give time for the clock oscillator to stabilise before the internal reset state starts.
The Zener diode Dz and transistor Q form the heart of this circuit. The transistor conducts whenever the
supply voltage Vcc is greater than the sum of VBE and Vz (the Zener voltage).
The transistor stops conducting when the supply voltage falls below the sum of VBE and Vz. Select the
Zener diode with required voltage for setting the low threshold value for Vcc.
Microprocessor supervisor ICs like the DS1232 from Maxim Dallas provide brown-out protection.
OSCILLATOR UNIT
Oscillator unit of the embedded system is responsible for generating the precise clock for the processor.
Certain processors/controllers integrate a built-in oscillator unit and simply require an external ceramic
resonator/quartz crystal for producing the necessary clock signals.
The speed of operation of a processor is primarily dependent on the clock frequency. However we cannot
increase the clock frequency blindly for increasing the speed of execution.
The total system power consumption is directly proportional to the clock frequency. The power consumption
increases with increase in clock frequency.
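This proportionality follows from the first-order CMOS dynamic power model P = C·V²·f (static leakage ignored). The model itself is standard, though it is not spelled out in these notes; a sketch with convenient units:

```c
/* First-order CMOS dynamic power: P = C_eff * Vdd^2 * f.
 * Units chosen so that nF * V^2 * MHz comes out in mW
 * (1e-9 F * 1e6 Hz = 1e-3 W). Leakage power is ignored. */
static double dynamic_power_mw(double c_eff_nf, double vdd, double f_mhz) {
    return c_eff_nf * vdd * vdd * f_mhz;
}
```

For example, a hypothetical 1 nF effective switched capacitance at 3.3 V and 16 MHz dissipates about 174 mW, and doubling the clock frequency doubles the dissipation.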
The accuracy of program execution depends on the accuracy of the clock signal.
WATCHDOG TIMER
In desktop Windows systems, if we feel our application is behaving in an abnormal way or if the system
hangs up, we have the ‘Ctrl + Alt + Del’ to come out of the situation.
In embedded systems, we have a watchdog to monitor the firmware execution and reset the system
processor/microcontroller when the program execution hangs up.
A watchdog timer is a hardware timer for monitoring the firmware execution.
Depending on the internal implementation, the watchdog timer increments or decrements a free running
counter with each clock pulse and generates a reset signal if the counter reaches its overflow/underflow
value before the firmware refreshes (‘kicks’) it.
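The mechanism can be modelled in plain C. This is a simulation only: a real watchdog is a hardware counter kicked through device-specific registers, and the names and timeout value below are invented:

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of a down-counting watchdog: the counter is loaded
 * with a timeout value and decremented on every clock tick; if it
 * reaches zero, the watchdog asserts a processor reset. Live firmware
 * must periodically reload ("kick") the counter before it expires. */
#define WDT_RELOAD 1000u

static uint32_t wdt_counter = WDT_RELOAD;
static bool reset_asserted = false;

static void wdt_kick(void) { wdt_counter = WDT_RELOAD; }

static void wdt_clock_tick(void) {
    if (wdt_counter == 0 || --wdt_counter == 0)
        reset_asserted = true;          /* firmware hung: reset the CPU */
}

/* Run 'ticks' clock pulses, kicking the watchdog every 'kick_period'
 * ticks (0 = never kick, i.e. a hung program). Returns true if the
 * watchdog fired. */
static bool simulate(uint32_t ticks, uint32_t kick_period) {
    wdt_counter = WDT_RELOAD;
    reset_asserted = false;
    for (uint32_t t = 1; t <= ticks; ++t) {
        if (kick_period && t % kick_period == 0)
            wdt_kick();
        wdt_clock_tick();
    }
    return reset_asserted;
}
```

A program that kicks the watchdog regularly runs forever; one that stops kicking gets reset once the counter expires.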
MODULE 5
The operating system acts as a bridge between the user applications/tasks and the underlying system resources
through a set of system functionalities and services.
The OS manages the system resources and makes them available to the user applications/tasks on a need basis.
A normal computing system is a collection of different I/O subsystems, working memory and storage memory.
Figure 10.1 gives an insight into the basic components of an operating system and their interfaces with rest of the
world.
The Kernel
The kernel is the core of the operating system and is responsible for managing the system resources and the
communication among the hardware and other system services.
Kernel acts as the abstraction layer between system resources and user applications.
Kernel contains a set of system libraries and services.
For a general purpose OS, the kernel contains different services for handling the following.
Process Management: It includes setting up the memory space for the process, loading the process’s code into the
memory space, allocating system resources, scheduling and managing the execution of the process, setting up and
managing the Process Control Block (PCB), Inter Process Communication and synchronisation, process termination/
deletion, etc.
Primary Memory Management: The term primary memory refers to the volatile memory (RAM) where processes are
loaded and variables and shared data associated with each process are stored.
The Memory Management Unit (MMU) of the kernel is responsible for
Keeping track of which part of the memory area is currently used by which process
Allocating and De-allocating memory space on a need basis (Dynamic memory allocation).
File System Management: File is a collection of related information. A file could be a program (source code or executable),
text files, image files, word documents, audio/video files, etc. Each of these files differ in the kind of information they
hold and the way in which the information is stored. The file operation is a useful service provided by the OS.
The file system management service of the kernel is responsible for the creation, deletion and alteration of
files and directories, saving files in the secondary storage, and providing automatic allocation of file space
based on the amount of free space available.
I/O System (Device) Management: Kernel is responsible for routing the I/O requests coming from different user applications
to the appropriate I/O devices of the system. In a well-structured OS, direct access to I/O devices is not allowed;
access to them is provided through a set of Application Programming Interfaces (APIs) exposed by the kernel.
The kernel maintains a list of all the I/O devices of the system. This list may be available in advance, at the time of
building the kernel. Some kernels dynamically update the list of available devices as and when a new device is installed
(e.g. Windows NT kernel keeps the list updated when a new plug ‘n’ play USB device is attached to the system).
The service ‘Device Manager’ of the kernel is responsible for handling all I/O device related operations. The kernel talks
to the I/O device through a set of low-level system calls, which are implemented in services called device drivers.
The device drivers are specific to a device or a class of devices. The Device Manager is responsible for
Loading and unloading of device drivers
Exchanging information and the system specific control signals to and from the device
Secondary Storage Management The secondary storage management deals with managing the secondary storage memory
devices, if any, connected to the system. Secondary memory is used as backup medium for programs and data since the
main memory is volatile. In most of the systems, the secondary storage is kept in disks (Hard Disk).
Protection Systems Most of the modern operating systems are designed in such a way to support multiple users with
different levels of access permissions (e.g. Windows 10 with user permissions like ‘Administrator’, ‘Standard’,
‘Restricted’, etc.).
Protection deals with implementing the security policies to restrict the access to both user and system resources by
different applications or processes or users. In multiuser supported operating systems, one user may not be allowed to
view or modify the whole or portions of another user’s data or profile details. In addition, some applications may not be
granted permission to make use of some of the system resources. This kind of protection is provided by the
protection services running within the kernel.
Interrupt Handler The kernel provides a handler mechanism for all external/internal interrupts generated by the system.
These are some of the important services offered by the kernel of an operating system; it does not mean that a kernel
contains no more than the components/services explained above.
Depending on the type of the operating system, a kernel may contain fewer or more components/services. In addition
to the components/services listed above, many operating systems offer a number of add-on system
components/services: network communication, network management, user-interface graphics, timer services (delays,
timeouts, etc.), error handling, database management, etc. are examples of such components/services.
The kernel exposes the interface to the various applications/services it hosts to the user applications through
a set of standard Application Programming Interfaces (APIs). User applications can use these API calls to access the
various kernel applications/services.
2. Differentiate between hard real time and soft real time operating system with an example for each.
Hard Real-Time:
Real-Time Operating Systems that strictly adhere to the timing constraints for a task are referred to as ‘Hard Real-Time’
systems.
A Hard Real-Time system must meet the deadlines for a task without any slippage.
Missing any deadline may produce catastrophic results for Hard Real-Time Systems, including permanent data loss
and irrecoverable damage to the system or its users.
A system can have several such tasks and the key to their correct operation lies in scheduling them so that they meet
their time constraints.
Air bag control systems and Anti-lock Brake Systems (ABS) of vehicles are typical examples for Hard Real-Time
Systems.
The air bag control system should swing into action and deploy the air bags when the vehicle meets with a severe
accident. Ideally, the time for triggering the air bag deployment task, when an accident is sensed by the
air bag control system, should be zero, and the air bags should be deployed exactly within the time frame
predefined for the air bag deployment task.
Soft Real-Time:
Real-Time Operating Systems that do not guarantee meeting deadlines, but offer the best effort to meet them,
are referred to as ‘Soft Real-Time’ systems.
Missing deadlines for tasks is acceptable for a Soft Real-Time system if the frequency of deadline misses is within
the compliance limit of the Quality of Service (QoS).
Automatic Teller Machine (ATM) is a typical example for Soft-Real-Time System.
If the ATM takes a few seconds more than the ideal operation time, nothing fatal happens.
An audio-video playback system is another example for Soft Real-Time system.
The term ‘task’: It is defined as the program in execution and the related information maintained by the operating
system for the program. Task is also known as ‘Job’. A program or part of it in execution is also called a ‘Process’. The
terms ‘Task’, ‘Job’ and ‘Process’ refer to the same entity in the operating system and most often they are used
interchangeably.
A ‘Process’ is a program, or part of it, in execution. Process is also known as an instance of a program in execution.
Multiple instances of the same program can execute simultaneously. A process requires various system resources like
CPU for executing the process, memory for storing the code corresponding to the process and associated variables, I/O
devices for information exchange, etc. A process is sequential in execution.
A thread is the primitive that can execute code. A thread is a single sequential flow of control within a process. ‘Thread’
is also known as lightweight process. A process can have many threads of execution. Different threads, which are part of
a process, share the same address space; meaning they share the data memory, code memory and heap memory area.
Threads maintain their own thread status (CPU register values), Program Counter (PC) and stack.
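The shared-address-space point can be seen directly with POSIX threads (assuming a pthreads platform): both workers below update one shared global variable, while each keeps its private loop counter on its own stack. The function names are illustrative:

```c
#include <pthread.h>
#include <stdint.h>

/* Two threads of one process share data memory (globals) but each has
 * its own stack and program counter. A mutex serialises the access. */
static long shared_counter = 0;                  /* shared data memory */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    long n = (long)(intptr_t)arg;    /* private: lives on this thread's stack */
    for (long i = 0; i < n; ++i) {
        pthread_mutex_lock(&lock);
        shared_counter++;            /* visible to every thread of the process */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

static long run_two_threads(long n_each) {
    pthread_t t1, t2;
    shared_counter = 0;
    pthread_create(&t1, NULL, worker, (void *)(intptr_t)n_each);
    pthread_create(&t2, NULL, worker, (void *)(intptr_t)n_each);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

Both threads see every increment made by the other, which is exactly what two separate processes would not get without explicit shared memory.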
The concept of ‘Process’ leads to concurrent execution (pseudo parallelism) of tasks and thereby the efficient
utilisation of the CPU and other system resources.
Concurrent execution is achieved through the sharing of CPU among the processes.
A process mimics a processor in properties and holds a set of registers, process status, a Program Counter (PC) to
point to the next executable instruction of the process, a stack for holding the local variables associated with the
process and the code corresponding to the process. This can be visualised as shown in Fig. 10.4.
A process which inherits all the properties of the CPU can be considered as a virtual processor, awaiting its turn to
have its properties switched into the physical processor. When the process gets its turn, its registers and the program
counter register becomes mapped to the physical registers of the CPU.
From a memory perspective, the memory occupied by the process is segregated into three regions, namely, Stack
memory, Data memory and Code memory (Fig. 10.5).
The ‘Stack’ memory holds all temporary data such as variables local to the process.
Data memory holds all global data for the process.
The code memory contains the program code (instructions) corresponding to the process.
The state at which a process is being created is referred as ‘Created State’. The Operating System recognises a
process in the ‘Created State’ but no resources are allocated to the process.
The state, where a process is incepted into the memory and awaiting the processor time for execution, is known as
‘Ready State’. At this stage, the process is placed in the ‘Ready list’ queue maintained by the OS.
The state wherein the instructions corresponding to the process are being executed is called ‘Running
State’. Running state is the state at which the process execution happens.
‘Blocked State/Wait State’ refers to a state where a running process is temporarily suspended from execution and
does not have immediate access to resources. The blocked state might be invoked by various conditions like: the
process enters a wait state for an event to occur (e.g. Waiting for user inputs such as keyboard input) or waiting for
getting access to a shared resource.
A state where the process completes its execution is known as ‘Completed State’.
The transition of a process from one state to another is known as ‘State transition’.
When a process changes its state from ready to running, from running to blocked or terminated, or from blocked to
running, the CPU allocation for the process may also change.
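The states and legal transitions above can be captured in a small table-driven checker. This follows the common five-state model, in which a blocked process first re-enters the ready queue when its awaited event occurs:

```c
#include <stdbool.h>

/* Process life-cycle states and their legal transitions. */
typedef enum { CREATED, READY, RUNNING, BLOCKED, COMPLETED } proc_state_t;

static bool transition_allowed(proc_state_t from, proc_state_t to) {
    switch (from) {
    case CREATED:   return to == READY;                  /* incepted into memory   */
    case READY:     return to == RUNNING;                /* scheduler dispatch     */
    case RUNNING:   return to == READY || to == BLOCKED  /* preempted / waiting    */
                        || to == COMPLETED;              /* finished execution     */
    case BLOCKED:   return to == READY;                  /* awaited event occurred */
    case COMPLETED: return false;                        /* terminal state         */
    }
    return false;
}
```

An OS scheduler enforces exactly this kind of rule set when it moves a Process Control Block between its queues.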
5. Explain multithreading
A process/task in embedded application may be a complex or lengthy one and it may contain various suboperations
like getting input from I/O devices connected to the processor, performing some internal calculations/operations,
updating some I/O devices etc. If all the sub functions of a task are executed in sequence, the CPU utilisation may
not be efficient. For example, if the process is waiting for a user input, the CPU enters the wait state for the event,
and the process execution also enters a wait state.
Instead of this single sequential execution of the whole process, if the task/process is split into different threads
carrying out the different subfunctionalities of the process, the CPU can be effectively utilised: when the thread
corresponding to the I/O operation enters the wait state, other threads which do not require the I/O event for their
operation can be switched into execution. This leads to speedier execution of the process and efficient
utilisation of the processor time and resources.
The multithreaded architecture of a process can be better visualised with the thread-process diagram shown in Fig.
10.8.
If the process is split into multiple threads, which executes a portion of the process, there will be a main thread and
rest of the threads will be created within the main thread.
Better memory utilisation. Multiple threads of the same process share the address space for data memory. This
also reduces the complexity of inter thread communication since variables can be shared across the threads.
Since the process is split into different threads, when one thread enters a wait state, the CPU can be utilised by
other threads of the process that do not depend on the event being waited for. This speeds up the execution of
the process.
Efficient CPU utilisation. The CPU is kept engaged all the time.
Multiprocessing:
Systems which are capable of performing multiprocessing, are known as multiprocessor systems. Multiprocessor
systems possess multiple CPUs and can execute multiple processes simultaneously.
Multitasking:
The ability of the operating system to have multiple programs in memory, which are ready for execution, is referred
as multiprogramming. In a uniprocessor system, it is not possible to execute multiple processes simultaneously.
However, it is possible for a uniprocessor system to achieve some degree of pseudo parallelism in the execution of
multiple processes by switching the execution among different processes.
The ability of an operating system to hold multiple processes in memory and switch the processor (CPU) from
executing one process to another process is known as multitasking.
Context Switching
Multitasking creates the illusion of multiple tasks executing in parallel. Multitasking involves the switching of the CPU
from executing one task to another.
In a multitasking environment, when task/process switching happens, the virtual processor (task/process) gets its
properties converted into that of the physical processor.
The switching of the virtual processor to the physical processor is controlled by the scheduler of the OS kernel. Whenever
a CPU switch happens, the current context of execution should be saved, to be retrieved at a later point of time when
the CPU resumes the process which was interrupted due to the execution switching.
The context saving and retrieval is essential for resuming a process exactly from the point where it was interrupted
due to CPU switching.
The act of switching CPU among the processes or changing the current execution context is known as ‘Context
switching’.
The act of saving the current context which contains the context details (Register details, memory details, system
resource usage details, execution details, etc.) for the currently running process at the time of CPU switching is
known as ‘Context saving’.
The process of retrieving the saved context details for a process, which is going to be executed due to CPU
switching, is known as ‘Context retrieval’. Multitasking involves ‘Context switching’ (Fig. 10.11), ‘Context
saving’ and ‘Context retrieval’.
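Standard C’s setjmp()/longjmp() pair gives a miniature demonstration of context saving and retrieval within a single process: setjmp stores the CPU context (registers, program counter, stack pointer) in a jmp_buf, and longjmp restores it so execution resumes exactly at the saved point — the same save-and-resume idea a scheduler applies to whole tasks:

```c
#include <setjmp.h>

static jmp_buf saved_context;   /* holds the saved execution context */
static int resume_count;

static void interrupted_work(void) {
    longjmp(saved_context, 1);  /* "context retrieval": jump back */
}

static int run_with_context_save(void) {
    resume_count = 0;
    if (setjmp(saved_context) == 0) {   /* "context saving": returns 0 on the save */
        interrupted_work();             /* never returns normally */
    } else {
        resume_count++;                 /* resumed from the saved point */
    }
    return resume_count;
}
```

A real context switch also saves memory-management and resource-usage state and is performed by the kernel, not by library calls, but the save/restore principle is the same.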
7. Explain the concept of deadlock with a neat diagram and mention how to avoid deadlock
In a multiprogramming environment several processes may compete for a finite number of resources. A process
requests resources; if the resources are not available at that time, the process enters a waiting state. A waiting
process may never change its state again, because the resources it has requested are held by other waiting
processes. This situation is known as deadlock.
Deadlock Characteristics: In a deadlock, processes never finish executing and system resources are tied up. A deadlock
situation can arise if the following four conditions hold simultaneously in a system.
Mutual Exclusion: Only one process can use a resource at a time. If another process requests that resource, the
requesting process must wait until the resource has been released.
Hold and wait: A process must be holding at least one resource and waiting to acquire additional resources that are
currently held by other processes.
No Preemption: Resources allocated to a process cannot be forcibly taken away from it; a resource is released only
voluntarily by the process holding it, after completing its task.
Circular Wait: A set {P0, P1, …, Pn} of waiting processes must exist such that P0 is waiting for a
resource that is held by P1, P1 is waiting for a resource that is held by P2, …, P(n – 1) is waiting for a
resource that is held by Pn, and Pn is waiting for a resource that is held by P0.
Deadlock Handling: A smart OS may handle deadlocks in one of the following ways.
Ignore deadlocks: Pretend the system is deadlock free and do nothing (the ‘Ostrich algorithm’).
Detect and recover: Let deadlocks occur, detect them and break them, similar to a traffic policeman clearing a traffic jam.
Avoid deadlocks: Allocate resources carefully so that the system never enters an unsafe state.
Prevent deadlocks: Negate one of the four necessary conditions, for instance through sleep and wakeup based resource access.
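One concrete prevention technique is to break the circular-wait condition by imposing a global lock ordering: every thread acquires shared resources in the same fixed order, so no cycle of waiting threads can form. A POSIX-threads sketch (assuming a pthreads platform; names are illustrative):

```c
#include <pthread.h>

/* Deadlock prevention by lock ordering: every thread takes resource A
 * before resource B, so a circular wait (A-holder waiting for B while
 * a B-holder waits for A) is impossible. */
static pthread_mutex_t res_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t res_b = PTHREAD_MUTEX_INITIALIZER;
static int work_done = 0;

static void *task(void *unused) {
    (void)unused;
    for (int i = 0; i < 1000; ++i) {
        pthread_mutex_lock(&res_a);   /* always A first...           */
        pthread_mutex_lock(&res_b);   /* ...then B: no circular wait */
        work_done++;
        pthread_mutex_unlock(&res_b);
        pthread_mutex_unlock(&res_a);
    }
    return NULL;
}

static int run_tasks(void) {
    pthread_t t1, t2;
    work_done = 0;
    pthread_create(&t1, NULL, task, NULL);
    pthread_create(&t2, NULL, task, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return work_done;                 /* both tasks complete: no deadlock */
}
```

If one task instead locked B before A, the two tasks could each grab one mutex and wait forever for the other — exactly the circular wait described above.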
Message passing is a
synchronous/asynchronous information exchange mechanism used for Inter Process/Thread Communication.
The major difference between shared memory and message passing technique is that, through shared memory lots of
data can be shared whereas only limited amount of info/data is passed through message passing.
Also message passing is relatively fast and free from the synchronisation overheads compared to shared memory.
Based on the message passing operation between the processes, message passing is classified into
Message Queues: A process which wants to talk to another process posts its message to a First-In-First-Out (FIFO)
queue called a ‘Message queue’, which stores the messages temporarily in a system defined memory object, to pass
it to the desired process. Messages are sent and received through send (name of the process to which the message is
to be sent, message) and receive (name of the process from which the message is to be received, message) methods.
The messages are exchanged through a message queue. The implementation of the message queue and of the send
and receive methods is OS kernel dependent.
Mailbox: A mailbox is a special implementation of a message queue, usually used for one way communication. Only a
single message is exchanged through a mailbox, whereas a ‘message queue’ can be used for exchanging multiple
messages. One task/process creates the mailbox and other tasks/processes can subscribe to this mailbox to get
message notifications. The implementation of the mailbox is OS kernel dependent. The MicroC/OS-II RTOS
implements the mailbox as a mechanism for inter task communication.
Signalling: Signals are used as an asynchronous notification mechanism, mainly for the execution
synchronisation of processes/tasks. Signals do not carry any data and are not queued. The implementation of
signals is OS kernel dependent; the VxWorks RTOS kernel implements ‘signals’ for inter process communication.
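The message queue mechanism above reduces to a small FIFO model. The sketch below is single threaded for clarity — a real kernel implementation would also handle blocking, wakeup and per-process addressing:

```c
#include <stdbool.h>

/* A minimal First-In-First-Out message queue: send() posts a message
 * at the tail, receive() takes the oldest message from the head. */
#define QUEUE_SIZE 8

static int msg_queue[QUEUE_SIZE];
static int head = 0, tail = 0, count = 0;

static bool mq_send(int msg) {
    if (count == QUEUE_SIZE) return false;   /* queue full */
    msg_queue[tail] = msg;
    tail = (tail + 1) % QUEUE_SIZE;
    count++;
    return true;
}

static bool mq_receive(int *msg) {
    if (count == 0) return false;            /* nothing to receive */
    *msg = msg_queue[head];
    head = (head + 1) % QUEUE_SIZE;
    count--;
    return true;
}
```

A mailbox, as described above, is the degenerate case of this structure with a capacity of one message.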
A semaphore is a sleep and wakeup based mutual exclusion implementation for shared resource access.
A semaphore is a system resource; a process which wants to access a shared resource first acquires this
system object, to indicate to the other processes wanting the shared resource that it is currently
acquired.
Resources shared among processes can be either for exclusive use by one process at a time or for use by a number
of processes at a time.
The display device of an embedded system is a typical example for the shared resource which needs exclusive
access by a process.
The Hard disk (secondary storage) of a system is a typical example for sharing the resource among a limited
number of multiple processes. Various processes can access the different sectors of the hard-disk concurrently.
Based on the implementation of the sharing limitation of the shared resource, semaphores are classified into two;
namely ‘Counting Semaphore’ and ‘Binary Semaphore’.
The ‘Counting Semaphore’
It limits the access of resources by a fixed number of processes/threads.
It maintains a count between zero and a maximum value.
It limits the usage of the resource to the maximum value of the count supported by it.
A real world example for the counting semaphore concept is the dormitory system for accommodation (Fig.
10.34). A dormitory contains a fixed number of beds (say 5) and at any point of time it can be shared by the
maximum number of users supported by the dormitory. If a person wants to avail the dormitory facility, he/she
can contact the dormitory caretaker for checking the availability. If beds are available in the dorm the caretaker
will hand over the keys to the user. If beds are not available currently, the user can register his/her name to get
notifications when a slot is available. Those availing the dormitory share the dorm facilities like TV,
telephone, toilet, etc. When a dorm user vacates, he/she gives the keys back to the caretaker. The caretaker informs
the users who booked in advance about the dorm availability.
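The dormitory analogy maps directly onto a counting semaphore. A single-threaded model of the count behaviour (the function names here are illustrative, not the POSIX sem_* API, and blocking is reduced to a failed acquire):

```c
#include <stdbool.h>

/* Counting semaphore modelled on the dormitory: the count starts at
 * the number of beds; each successful acquire takes one, each release
 * returns one. When the count is zero, an acquire fails -- a real
 * semaphore would put the caller to sleep until a release. */
#define BEDS 5

static int available = BEDS;

static bool sem_acquire(void) {      /* "check in" (P operation) */
    if (available == 0) return false;   /* dorm full: must wait   */
    available--;
    return true;
}

static void sem_release(void) {      /* "check out" (V operation) */
    if (available < BEDS)
        available++;
}
```

A binary semaphore is simply the special case with a maximum count of one.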
Functional Requirements
Processor Support It is not necessary that all RTOSs support all kinds of processor architectures. It is essential to
ensure processor support by the RTOS.
Memory Requirements The OS requires ROM for holding the OS files, normally stored in a non-volatile memory
like FLASH, and working memory (RAM) for loading the OS services. Since embedded systems are memory
constrained, it is essential to evaluate the minimal ROM and RAM requirements of the OS under consideration.
Real-time Capabilities It is not mandatory that the operating system for every embedded system be real-time, and not
all embedded operating systems are ‘Real-time’ in behaviour. The task/process scheduling policies play an important
role in the ‘Real-time’ behaviour of an OS. Analyse the real-time capabilities of the OS under consideration and the
standards met by the operating system for real-time capabilities.
Kernel and Interrupt Latency The kernel of the OS may disable interrupts while executing certain services and it may
lead to interrupt latency. For an embedded system whose response requirements are high, this latency should be
minimal.
Inter Process Communication and Task Synchronisation The implementation of Inter Process Communication and
Synchronisation is OS kernel dependent. Certain kernels may provide a bunch of options whereas others provide very
limited options. Certain kernels implement policies for avoiding priority inversion issues in resource sharing.
Modularisation Support Most operating systems provide a bundle of features, not all of which may be necessary for
the functioning of an embedded product. It is very useful if the OS supports modularisation, wherein the developer
can choose the essential modules and re-compile the OS image. Windows CE is an example of a highly modular
operating system.
Support for Networking and Communication The OS kernel may provide stack implementation and driver support for a
bunch of communication interfaces and networking. Ensure that the OS under consideration provides support for all
the interfaces required by the embedded product.
Development Language Support Certain operating systems include the run time libraries required for running
applications written in languages like Java and C#. A Java Virtual Machine (JVM) customised for the Operating
System is essential for running java applications. Similarly the .NET Compact Framework (.NETCF) is required for
running Microsoft® .NET applications on top of the Operating System. The OS may include these components as
built-in component, if not, check the availability of the same from a third party vendor for the OS under consideration.
Non-functional Requirements
Custom Developed or Off the Shelf Depending on the OS requirement, it is possible to go for the complete development
of an operating system suiting the embedded system needs or use an off the shelf, readily available operating system,
which is either a commercial product or an Open Source product, which is in close match with the system
requirements. Sometimes it may be possible to build the required features by customising an Open source OS. The
decision on which to select is purely dependent on the development cost, licensing fees for the OS, development time
and availability of skilled resources.
Cost The total cost for developing or buying the OS and maintaining it in terms of commercial product and custom
build needs to be evaluated before taking a decision on the selection of OS.
Development and Debugging Tools Availability The availability of development and debugging tools is a critical decision
making factor in the selection of an OS for embedded design. Certain Operating Systems may be superior in
performance, but the availability of tools for supporting the development may be limited. Explore the different tools
available for the OS under consideration.
Ease of Use How easy it is to use a commercial RTOS is another important feature that needs to be considered in the
RTOS selection.
After Sales For a commercial embedded RTOS, after sales in the form of e-mail, on-call services, etc. for bug fixes,
critical patch updates and support for production issues, etc. should be analysed thoroughly.
11. Explain the role of Integrated Development Environment (IDE) for Embedded Software Development
In embedded system development context, Integrated Development Environment (IDE) stands for an integrated
environment for developing and debugging the target processor specific embedded firmware.
An IDE is a software package which bundles a ‘Text Editor (Source Code Editor)’, a ‘Cross-compiler’ (for cross
platform development; a compiler for same platform development), a ‘Linker’ and a ‘Debugger’.
IDEs used in embedded firmware development are slightly different from the generic IDEs used for high level
language based development for desktop applications.
In Embedded Applications, the IDE is either supplied by the target processor/controller manufacturer or by third party
vendors or as Open Source.
MPLAB is an IDE tool supplied by microchip for developing embedded firmware using their PIC family of
microcontrollers.
Keil µVision5 from ARM Keil is an example of a third party IDE, which is used for developing embedded
firmware for ARM family microcontrollers.
CodeWarrior Development Studio is an IDE for ARM family of processors/MCUs and DSP chips from Freescale.
It should be noted that in embedded firmware development each IDE is designed for a specific family of
controllers/processors, and it may not be possible to develop firmware for all families of controllers/processors
using a single IDE.
However, there is a rapid move happening towards the open source Eclipse IDE for embedded development. Most of
the processor/controller manufacturers and third party IDE providers are trying to build their IDEs around the popular
Eclipse open source platform.
12. Explain boundary scanning for hardware testing with diagram
As the complexity of the hardware increases, the number of chips present on the board and the interconnections among
them also increase.
The device packages used in the PCB become miniature to reduce the total board space occupied by them and
multiple layers may be required to route the interconnections among the chips. With miniature device packages and
multiple layers for the PCB it will be very difficult to debug the hardware using magnifying glass, multimeter, etc. to
check the interconnection among the various chips.
Boundary scan is a technique used for testing the interconnection among the various chips, which support JTAG
interface, present in the board.
Chips which support boundary scan associate a boundary scan cell with each pin of the device.
A JTAG port which contains the five signal lines namely TDI, TDO, TCK, TRST and TMS form the Test Access
Port (TAP) for a JTAG supported chip. Each device will have its own TAP.
The PCB also contains a TAP for connecting the JTAG signal lines to the external world. A boundary scan path is
formed inside the board by interconnecting the devices through JTAG signal lines.
The TDI pin of the TAP of the PCB is connected to the TDI pin of the first device.
The TDO pin of the first device is connected to the TDI pin of the second device. In this way all devices are
interconnected and the TDO pin of the last JTAG device is connected to the TDO pin of the TAP of the PCB.
The clock line TCK and the Test Mode Select (TMS) line of the devices are connected to the clock line and Test
mode select line of the Test Access Port of the PCB respectively. This forms a boundary scan path. Figure 13.41
illustrates the same.
As mentioned earlier, each pin of the device has a boundary scan cell associated with it. The boundary scan cell is a
multipurpose memory cell. The boundary scan cells associated with the input pins of an IC are known as ‘input cells’ and
the boundary scan cells associated with the output pins of an IC are known as ‘output cells’.
The boundary scan cells can be used for capturing the input pin signal state and passing it to the internal circuitry,
capturing the signals from the internal circuitry and passing it to the output pin, and shifting the data received from the
Test Data In pin of the TAP.
The boundary scan cells associated with the pins are interconnected and they form a chain from the TDI pin of the
device to its TDO pin.
The boundary scan cells can be operated in Normal, Capture, Update and Shift modes.
In the Normal mode, the input of the boundary scan cell appears directly at its output.
In the Capture mode, the boundary scan cell associated with each input pin of the chip captures the signal from the
respective pin to the cell, and the boundary scan cell associated with each output pin of the chip captures the signal
from the internal circuitry.
In the Update mode, the boundary scan cell associated with each input pin of the chip passes the already captured data to
the internal circuitry, and the boundary scan cell associated with each output pin of the chip passes the already captured
data to the respective output pin.
In the Shift mode, data is shifted from the TDI pin to the TDO pin of the device through the boundary scan cells. ICs
supporting boundary scan contain additional boundary scan related registers for facilitating the boundary scan
operation. Instruction Register, Bypass Register, Identification Register, etc. are examples of boundary scan related
registers.
Disassembler is a utility program which converts machine codes into target processor specific Assembly
codes/instructions. The process of converting machine codes into Assembly code is known as ‘Disassembling’.
De-compilers reproduce the code in a high level language. Frequently, this high level language is C, because C is
simple and primitive enough to facilitate the decompilation process. Decompilation does have its drawbacks: much of
the data and many readability constructs are lost during the original compilation process, and they cannot be reproduced.
Since the science of decompilation is still young, the results are "good" but not "great".
Debugging in embedded application is the process of diagnosing the firmware execution, monitoring the target
processor’s registers and memory while the firmware is running and checking the signals from various buses of the
embedded hardware.
Debugging process in embedded application is broadly classified into two, namely; hardware debugging and
firmware debugging.
Hardware debugging deals with the monitoring of various bus signals and checking the status lines of the target
hardware.
Firmware debugging deals with examining the firmware execution, execution flow, and changes to various CPU
registers and status registers on execution of the firmware to ensure that the firmware is running as per the design.
Emulator is a self-contained hardware device which emulates the target CPU. The emulator hardware contains
necessary emulation logic and it is hooked to the debugging application running on the development PC on one end
and connects to the target board through some interface on the other end.
Simulator is a software application that precisely duplicates (mimics) the target CPU and simulates the various
features and instructions supported by the target CPU.