02 - Embedded and Edge Hardware

The document outlines a lecture on Embedded and Edge Hardware for Artificial Intelligence, covering technological platforms, algorithms, and the architecture of embedded systems. It discusses the role of sensors, data types, and memory requirements in embedded AI applications, as well as various sensor types and their functionalities. Additionally, it highlights practical examples of AI applications in different domains, including environmental monitoring and human interaction.

Technologies for Artificial Intelligence

Prof. Manuel Roveri – [Link]@[Link]

Lecture 2 – Embedded and Edge Hardware


Course Topics

1. Introduction to technological platforms for AI


2. Embedded and Edge AI
a. The technology
b. The Algorithms
c. Machine Learning for Embedded and Edge AI
d. Deep Learning for Embedded and Edge AI
3. Cloud computing and AI
a. Cloud computing and the "as-a-service" approach
b. Machine and Deep Learning as a service
c. Time-series: analysis and prediction
d. Generative AI

2
An overview of an embedded and edge AI system

The hardware (sensors and actuators) interfaces with the software pipeline: preprocessing and feature extraction (extracting the features and reducing the noise), a machine or deep learning algorithm performing the classification, and postprocessing/decision making (e.g., opening a door). Example applications: wake-word detection, person detection, gesture recognition.

3
An overview of an embedded and edge AI system

This lecture focuses on the hardware side of the system: the sensors, the actuators, and the computing platform they are attached to.

4
An overview of an embedded and edge AI system

The next lecture focuses on the software side: preprocessing, feature extraction, the machine and deep learning algorithms, and postprocessing/decision making.

5
The embedded perspective

• Embedded systems are the computers that control the electronics of all sorts of physical devices

• Embedded software is software that runs on embedded systems

• Unlike general-purpose devices such as mobile phones and laptops, an embedded system typically runs one single application

6
Differently from general-purpose computers (e.g., a laptop or a smartphone), embedded systems are usually meant to perform one specific, dedicated task.

7
Embedded and Edge AI Hardware architecture

The figure shows the overall hardware architecture of an embedded and edge AI device: an application processor with built-in coprocessors that run specific operations faster, volatile memory, non-volatile memory whose content is preserved even when the device is turned off, discrete coprocessors, and peripherals.

Image taken from [1]


8
Embedded and Edge AI Hardware architecture

The application processor: the general-purpose processor running the embedded application. It connects to sensors and other hardware through the peripherals, and its built-in coprocessors can speed up specific operations (e.g., the multiplications at the core of ML workloads).

9
Embedded and Edge AI Hardware architecture

Built-in coprocessors: additional hardware meant to provide highly efficient computation on certain operations (e.g., a floating-point unit, FPU, to perform FP operations; without an FPU, each FP operation must be emulated through multiple integer operations).

10
Embedded and Edge AI Hardware architecture

RAM: the working memory meant to support program execution (additional external RAM could be available). It is typically the bottleneck of any embedded and edge AI system.

11
RAM is typically the bottleneck in Embedded and Edge AI

• Very fast memory, but very energy consuming
• Volatile: its content is lost when the power shuts down
• Costly
• Takes up a large physical space on the device

12
Embedded and Edge AI Hardware architecture

The non-volatile memory is typically a Flash memory. It is used to store things that do not change often and that must be preserved when the system shuts down (e.g., the software program, the device configuration). It is usually larger but slower than the volatile memory: slow to read, and extremely slow to write.
13
Embedded and Edge AI Hardware architecture

Discrete coprocessors are external co-processors supporting efficient and fast mathematical operations (e.g., a GPU or a TPU).

14
Embedded and Edge AI Hardware architecture

Peripherals provide an interface with the rest of the system (e.g., sensor and network hardware) through standardized technological solutions (e.g., GPIO, I2C, SPI, and UART).
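As an illustration of how such an interface is used from software, here is a minimal sketch assuming a Linux-based edge device (such as the SoCs discussed later in this lecture) that exposes the bus as the standard /dev/i2c-* character device. The bus number, the 7-bit address 0x29, and the register 0x0F are placeholders rather than a specific sensor; a bare-metal microcontroller would use its vendor HAL instead.

```c
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/i2c-dev.h>

int main(void) {
    /* Placeholders: I2C bus 1, 7-bit device address 0x29, register 0x0F */
    int fd = open("/dev/i2c-1", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    if (ioctl(fd, I2C_SLAVE, 0x29) < 0) {      /* select the target device */
        perror("ioctl"); close(fd); return 1;
    }

    uint8_t reg = 0x0F;                        /* register to read          */
    uint8_t value = 0;
    if (write(fd, &reg, 1) != 1 ||             /* set the register pointer  */
        read(fd, &value, 1) != 1) {            /* read one byte back        */
        perror("i2c transfer"); close(fd); return 1;
    }

    printf("register 0x%02X = 0x%02X\n", reg, value);
    close(fd);
    return 0;
}
```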

15
Embedded and Edge AI Hardware architecture: the sensors

(Note on coprocessors: an integrated coprocessor is built into the CPU, while a discrete coprocessor is a separate chip; a gaming PC, for instance, uses a discrete GPU such as an NVIDIA RTX 4090 for high-quality rendering.)

Let's see them in detail …

16
17
Sensors and signals: the edge AI perspective

• Sensors allow acquiring measurements from the environment or from humans
• Sensors generate streams of data
• Other sources of data are available beyond sensors, such as digital device logs, network packets, and radio transmissions
• Sensors can provide different data formats: time series, audio, image, video, and others

17
How to store data values?
18

• Boolean (1 bit): a number with 2 possible values
• 8-bit integer: a non-decimal number with 256 possible values
• 16-bit integer: a non-decimal number with 65,536 possible values
• 32-bit floating point: can represent a wide range of numbers with about 7 significant decimal digits, with a maximum of 3.4028235 × 10^38

Quantization techniques (see the dedicated lecture) will allow reducing the memory demand of each specific value
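As a quick check of these sizes, the following C sketch (not part of the slides; it assumes a standard C toolchain where float is the 32-bit IEEE-754 type) prints the storage required by each type and the largest representable 32-bit float:

```c
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <float.h>

int main(void) {
    /* Storage required by the data types listed above */
    printf("bool      : %zu byte(s)\n", sizeof(bool));    /* holds 1 bit of information, stored in 1 byte */
    printf("int8_t    : %zu byte(s)\n", sizeof(int8_t));  /* 256 possible values      */
    printf("int16_t   : %zu byte(s)\n", sizeof(int16_t)); /* 65,536 possible values   */
    printf("float     : %zu byte(s)\n", sizeof(float));   /* 32-bit floating point    */
    printf("max float : %e\n", (double)FLT_MAX);          /* ~3.4028235e+38           */
    return 0;
}
```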

18
Time series data
19

• A time series is a sequence of data points indexed in chronological order: X = (x_1, x_2, …, x_N, …)
• In other words, a time series is a series of data points taken at successive, equally spaced points in time
• Key aspects: the sampling period (seconds, minutes, hours, days, etc.) and the number of bits n used for the representation
• Memory requirements: n bits per sample (e.g., a 4-Byte floating-point value per sample)

19
Audio data
20

• A special case of time series data, audio signals represent the oscillation of sound waves as they travel through the air
• Key aspects: sampling rate (Hz), quantization (number of bits), length of the signal (in s), and number of channels (e.g., mono or stereo)

Memory requirements:
length x sampling rate x Bytes per sample x channels

For example, a 10 s audio signal sampled at 100 Hz, quantized over 32 bits and acquired in stereo, requires:
10 s x 100 Hz x 4 Byte x 2 = 8 KB

(8 KB out of a 320 KB RAM)
20
Image data
21

• Images are data that represent the measurements taken by a sensor that captures an entire scene
• Images have two or more dimensions. In their typical form, they can be thought of as a grid of "pixels", where the value of each pixel represents some property of the scene at the corresponding point in space (stored in N bits)
• Memory requirements: W x K x N x channels (with W and K the image dimensions)

For example, a 128 x 128 RGB image with 1 Byte per pixel per channel requires:
128 x 128 x 1 Byte x 3 = 49 KB

(49 KB out of a 320 KB RAM)
21
Video data
22

A sequence of (still) images reproduced at a suitably high frame rate

• Similarly to images, we have W, K, N and the number of channels. In addition, we have a frame rate (in frames/s) and the length of the video (in s)
• Memory requirements: W x K x N x channels x frame rate x length

For example, a 10 s video recorded at 30 fps, with 128 x 128 RGB frames and 1 Byte per pixel per channel, requires:
30 x 10 x 128 x 128 x 1 Byte x 3 = 14.7 MB (uncompressed)

(14.7 MB, far beyond a 320 KB RAM)
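The three memory formulas above can be wrapped in a few helper functions. The C sketch below is not from the slides: the function names are illustrative, and it follows the slides' decimal convention (1 KB = 1000 Byte) to reproduce the audio, image, and video examples.

```c
#include <stdio.h>
#include <stdint.h>

/* Memory demand in Bytes, following the formulas on the slides.
 * The helper names are illustrative, not a standard API. */
static uint64_t audio_bytes(double length_s, double rate_hz,
                            uint32_t bytes_per_sample, uint32_t channels) {
    return (uint64_t)(length_s * rate_hz) * bytes_per_sample * channels;
}

static uint64_t image_bytes(uint32_t w, uint32_t k,
                            uint32_t bytes_per_pixel, uint32_t channels) {
    return (uint64_t)w * k * bytes_per_pixel * channels;
}

static uint64_t video_bytes(uint32_t w, uint32_t k, uint32_t bytes_per_pixel,
                            uint32_t channels, double fps, double length_s) {
    return (uint64_t)(fps * length_s) * image_bytes(w, k, bytes_per_pixel, channels);
}

int main(void) {
    /* Slide examples (uncompressed) */
    printf("audio: %llu Byte (~8 KB)\n",
           (unsigned long long)audio_bytes(10.0, 100.0, 4, 2));          /* 8,000 Byte      */
    printf("image: %llu Byte (~49 KB)\n",
           (unsigned long long)image_bytes(128, 128, 1, 3));             /* 49,152 Byte     */
    printf("video: %llu Byte (~14.7 MB)\n",
           (unsigned long long)video_bytes(128, 128, 1, 3, 30.0, 10.0)); /* 14,745,600 Byte */
    return 0;
}
```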
22
Exercise slide
23

• Compute the memory demand for a 5 s audio signal sampled at 44.1 kHz over one channel (e.g., wake-word detection) with 2 Bytes per sample

• Compute the memory demand for a 28 x 28 grayscale image (1 Byte per pixel), see MNIST

• Compute the memory demand for a 224 x 224 RGB image (1 Byte per channel), see IMAGENET

For all the exercises consider uncompressed data!

23
Exercise slide (solution)
24

• Compute the memory demand for a 5 s audio signal sampled at 44.1 kHz over one channel (e.g., wake-word detection) with 2 Bytes per sample:
5 s x 44,100 samples/s x 2 Byte/sample = 441 KB
• Compute the memory demand for a 28 x 28 grayscale image (1 Byte per pixel), see MNIST:
28 x 28 x 1 Byte/pixel = 784 Byte
• Compute the memory demand for a 224 x 224 RGB image (1 Byte per channel), see IMAGENET:
224 x 224 x 3 channels x 1 Byte/pixel = 150 KB
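For completeness, a short C check of these three results (again using the decimal convention of the slides, and hard-coding the exercise parameters) could look like this:

```c
#include <stdio.h>

int main(void) {
    /* 5 s of mono audio at 44,100 samples/s, 2 Byte per sample */
    printf("audio   : %d Byte (441 KB)\n", 5 * 44100 * 2);
    /* 28 x 28 grayscale image, 1 Byte per pixel (MNIST) */
    printf("MNIST   : %d Byte\n", 28 * 28 * 1);
    /* 224 x 224 RGB image, 1 Byte per pixel per channel (IMAGENET) */
    printf("IMAGENET: %d Byte (~150 KB)\n", 224 * 224 * 3);
    return 0;
}
```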

24
Types of sensors and signals
25

• There are thousands of different types of sensors on the market


• Families of sensors for embedded and edge AI:
1. Acoustic and vibration
2. Visual and scene
3. Motion and position
4. Force and tactile
5. Optical, electromagnetic, and radiation
6. Environmental and chemical

25
1) Acoustic and vibration
26

• "Hearing" vibrations is a crucial ability in the field of embedded and edge AI
• Such an ability allows detecting the effects of movement, vibration, and human and animal communication at a distance
• Acoustic sensors measure the effect of vibrations travelling through a medium:
  • air (microphones)
  • water (hydrophones)
  • ground (geophones and seismometers)
• Information is distributed across frequencies (the sampling frequency is crucial for a given application scenario)

Acoustic sensors generally produce audio data describing the variation of pressure in the medium

26
Birdsong detection and recognition (in the wild)
TinyML-based IoT units endowed with CNNs to detect and recognize bird vocalizations

Sampling frequency: 2 kHz
Memory footprint: 133 KB
Accuracy: approx. 76%
Inference time: 3.5 s
Lifetime: 12 days (3200 mAh battery)

Hardware: Cortex-M7 @ 480 MHz, 512 KB RAM

S. Disabato, G. Canonaco, P. Flikkema, M. Roveri, C. Alippi, “Birdsong Detection at the Edge with Deep Learning”, in Proc. 7th IEEE International Conference on Smart Computing (SMARTCOMP 2021)

27
2) Visual and scene
28

Cameras acquire information without contact (passively), ranging from tiny, low-power cameras to super-high-quality multi-megapixel sensors.
• Image sensors capture light using a grid of sensor elements
• Features of cameras:
  • Color channels—grayscale or color (red, green, and blue, or RGB)
  • Spectral response—e.g., infrared radiation for thermal cameras to "see" heat
  • Pixel size—larger sensor elements can capture more light per pixel, increasing their sensitivity
  • Sensor resolution—the more elements on a sensor, the more detail it captures
  • Frame rate—how frequently a sensor can capture an image, typically in frames per second

The output of cameras is an image (2D/3D) or a video

28
2) Visual and scene (part 2)
29

LIDAR and RADAR

29
People detection on the conveyor belt in Malpensa Airport (T2)

30
UWB-based smart sensors for retirement homes
Privacy-preserving behaviour recognition in patients

M. Pavan, A. Caltabiano and M. Roveri, “TinyML for UWB-radar based presence detection,” accepted for publication in IEEE World Congress of Computational Intelligence 2022 (IEEE WCCI 2022)

31
3) Motion and position
32

Many different types of motion and position sensors:


• Tilt sensor—a mechanical switch measuring the orientation of an object
• Accelerometer—measures the acceleration (the change in velocity over time) of an object across one or more axes (e.g., in smart watches or predictive maintenance)
• Gyroscopes—measure the rate of rotation of an object
• Time-of-flight sensors—use electromagnetic emission (light or radio) to measure the distance from the sensor to whatever object is directly in its line of sight
• Real-time locating systems—multiple transceivers in fixed locations around a building to track the position of individual objects
• Global Positioning System (GPS)—satellites to determine the location of a device

Measurements are typically represented as a time series

32
Building and critical infrastructure monitoring (Italy)
A network of TinyML-based IoT units for intelligent structural monitoring

33
Internet-of-Birds tracking system (Italy)
Tracking greater flamingos through ultra-lightweight smart IoT devices

One year of working activity, 4 greater flamingos, sampling period of 20 h, weight of the device < 20 g

C. Alippi, R. Ambrosini, D. Cogliati, V. Longoni, M. Roveri, “A lightweight and energy-efficient Internet-of-Birds Tracking System”, in Proc. IEEE International Conference on Pervasive Computing and Communications (PerCom 2017), Hawaii, USA, 13-17 March 2017.

34
Time-of-Flight Sensors for cup recognition

35
4) Force and tactile
36

Helpful in facilitating user interaction, understanding the flow of liquids and gases, or
measuring the mechanical strain on an object:
• Buttons and switches— provide a binary signal
• Capacitive touch sensors—measure the amount that a surface is being touched
by a conductive object, like a human finger.
• Strain gauges—measure how much an object is being deformed
• Load cells—these measure the amount of physical load that is applied
• Flow sensors—measure the rate of flow in liquids and gases
• Pressure sensors—used to measure pressure of a gas or liquid, either
environmental or inside a system (e.g., inside a car tire).

Measurements are typically represented as a time series

36
Rock-collapse and landslide forecasting (Italy and Switzerland)

A TinyML-based monitoring system to identify critical situations

C. Alippi, R. Camplani, C. Galperti, A. Marullo, M. Roveri, "A high frequency sampling monitoring system for environmental and structural applications", ACM Transactions on Sensor Networks, Vol. 9, No. 4, Art. 41, 32 pages, July 2013

37
5) Optical, electromagnetic, and radiation

Sensors measuring electromagnetic radiation and magnetic fields, as well as current and voltage:
• Photosensors—detect light at various wavelengths, both visible and invisible to the human eye
• Color sensors—photosensors that measure the precise color of a surface, helpful for recognizing different types of objects
• Spectroscopy sensors—measure the way that various wavelengths of light are absorbed and reflected by materials, giving an edge AI system insight into their composition
• Magnetometers—measure the strength and direction of magnetic fields (e.g., a digital compass)
• Inductive proximity sensors—use an electromagnetic field to detect nearby metal (e.g., to detect vehicles for traffic monitoring)
• Electromagnetic field (EMF) meters—measure the strength of electromagnetic fields (e.g., those emitted by industrial equipment)
• Current and voltage sensors—measure electrical current and voltage

Measurements are typically represented as a time series

38
Aquatic monitoring system (Great Barrier Reef, Australia)
A TinyML-based system to monitor water pollution and forecast hurricanes

(Deployment: node units and a gateway in the deployment area, connected to the Center for Marine Studies)

C. Alippi, R. Camplani, C. Galperti, M. Roveri, “A robust, adaptive, solar powered WSN framework for aquatic environmental monitoring”, IEEE Sensors Journal, Vol. 11, No. 1, pp. 45-55, Jan. 2011.

39
6) Environmental, biological, and chemical
40

• Temperature sensors
• Gas sensors—e.g., humidity or carbon dioxide sensors.
• Particulate matter sensor—monitor pollution levels.
• Biosignals sensors—e.g., measurement of electrical activity in the human
heart (electrocardiography) and brain (electroencephalography).
• Chemical sensors—measure the presence or concentration of specific
chemicals.

Measurements are typically represented as a time series

40
Rock collapse and landslide forecasting (Torrioni di Rialba)

41
42
Let's go back to the application processors…

The general purpose processor running the embedded application

42
43
Microcontrollers and digital signal processors

• Microcontrollers are the basic technological building block of our pervasive applications
• They are tiny and cheap computers used for single-purpose applications
• No operating system:
• The "firmware", i.e., the software running on a microcontroller, is directly executed on the hardware and includes the low-level instructions to interface with the peripherals (a minimal sketch is shown below)
• But… the key aspect of microcontrollers is the fact that they embed all the components in a single piece of silicon!
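A minimal sketch of what such a firmware can look like is given here. board_init(), sensor_read(), model_predict(), and actuator_set() are hypothetical stand-ins for vendor-specific HAL calls, not a real API, and the "superloop" replaces the missing operating system.

```c
#include <stdint.h>

/* Hypothetical peripheral helpers: stand-ins for the vendor-specific,
 * low-level instructions the firmware uses to talk to the hardware. */
static void    board_init(void)         { /* configure clocks, GPIO, peripherals */ }
static int16_t sensor_read(void)        { return 0; /* e.g., an ADC or I2C read  */ }
static uint8_t model_predict(int16_t s) { return (uint8_t)(s > 0); /* placeholder model */ }
static void    actuator_set(uint8_t c)  { (void)c; /* e.g., drive a GPIO pin     */ }

int main(void) {
    board_init();
    for (;;) {                                 /* the "superloop": no operating system */
        int16_t sample   = sensor_read();      /* acquire data from a peripheral       */
        uint8_t decision = model_predict(sample); /* run the embedded ML model         */
        actuator_set(decision);                 /* act on the decision                  */
    }
}
```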

43
The (embedded) architecture of a micro-controller

• Microcontrollers have a fixed hardware architecture built around a central processing unit (CPU)
• The CPU controls a range of peripherals, which may provide both digital and analog functions such as timers and analog-to-digital converters
• Small devices usually include both volatile and non-volatile memory on the chip, but larger processors may need separate memory
• Their operation is usually programmed using a machine-level language like Assembly or a high-level language such as C

Microcontrollers are the best candidates for embedded systems, because their on-chip peripherals and memory enable the embedded system designers to save circuit space and overall dimensions, by not having to add these components on the board.

44
Microcontrollers vs microprocessors

• A microcontroller contains a microprocessor (uP), program memory, data memory, and a number of internal peripheral devices, such as timers, serial ports, general-purpose input/output (I/O) pins, counters, analog-to-digital converters (ADCs), etc., all inside a single silicon chip
• A microprocessor (uP) is a silicon chip containing an advanced Central Processing Unit (CPU) that fetches program instructions from an external memory and executes them

(Block diagrams: the microcontroller integrates the CPU, data memory, program memory, and peripherals on one chip; the microprocessor contains only the CPU—ALU, control unit, and register array—and relies on external data memory, program memory, and peripherals.)

45
The wide range of microcontrollers… (Low-end MCUs vs High-end MCUs)

• Description: low cost, small size, and energy efficiency / widely used microcontrollers
• Architecture: 4-bit to 16-bit / 32-bit
• Clock speed: <100 MHz / <1000 MHz
• Optional HW: none / floating-point unit (FPU), single-instruction-multiple-data (SIMD) instructions
• Flash memory: 2-64 KB / 16 KB to 2 MB
• RAM memory: 64 Bytes to 2 KB / 2 KB to 1 MB
• Input-output: digital / digital and analog
• Current draw: running tens of mA at 1.5-5 V, sleeping µA / running several tens of mA at 1.5-5 V, sleeping µA
• Cost: 1-2 $ / ~10 $
• Applications: running conditional logic, acquiring and transmitting data / signal processing and embedded machine learning

46
The wide range of microcontrollers… High-end MCUs

• Faster clock and 32-bit architecture
• SIMD instructions to run several computations in parallel (very useful for Signal Processing and ML)
• Larger RAM (RAM is always the bottleneck)
• Reasonably good energy efficiency
• Examples of high-end MCUs (based on ARM Cortex-M cores): STM32, Nordic, Arduino

47
A different perspective: System on Chip (SoC)

SoC devices aim at integrating all the functionalities of a traditional computing system into a single chip:
• They run OSs
• The tools and libraries used to write server and desktop applications can be reused

Drawbacks:
• Low energy efficiency (w.r.t. microcontrollers)
• Larger costs

SoC at a glance (same categories as the MCU table):
• Description: powerful edge computing system
• Architecture: 64-bit
• Clock speed: >1 GHz
• Optional HW: hardware accelerators (GPUs, TPUs)
• Flash memory: in the order of GBs
• RAM memory: in the order of GBs
• Input-output: high performance, digital and analog
• Current draw: running hundreds of mA at 5 V, sleeping µA
• Cost: tens of $ per unit
• Applications: the technological scenario of mobile phones, smart cars, etc. (e.g., Qualcomm Snapdragon)

48
A technological overview for embedded and edge AI

Device                        | Low-freq. time series | High-freq. time series | Audio | Low-res image | High-res image | Video
Low-end MCU                   | Limited | Limited | None | None | None | None
High-end MCU                  | Full | Full | Full | Full | Limited | Limited
High-end MCU with accelerator | Full | Full | Full | Full | Full | Limited
SoC                           | Full | Full | Full | Full | Full | Full
SoC with accelerator          | Full | Full | Full | Full | Full | Full
Edge server                   | Full | Full | Full | Full | Full | Full
Cloud                         | Full | Full | Full | Full | Full | Full

49
