Technologies for Artificial Intelligence
Prof. Manuel Roveri – [Link]@[Link]
Lecture 2 – Embedded and Edge Hardware
Course Topics
1. Introduction to technological platforms for AI
2. Embedded and Edge AI
a. The technology
b. The Algorithms
c. Machine Learning for Embedded and Edge AI
d. Deep Learning for Embedded and Edge AI
3. Cloud computing and AI
a. Cloud computing and the "as-a-service" approach
b. Machine and Deep Learning as a service
c. Time-series: analysis and prediction
d. Generative AI
An overview of an embedded and edge AI system
[Pipeline figure. Hardware: sensors (input) and actuators (output) are part of the hardware. Software: Preprocessing and Feature Extraction (reducing the noise and extracting the features), the Machine and Deep Learning Algorithm (performing, e.g., classification), Postprocessing, and Decision Making (e.g., opening a door). Example applications: wake-word detection, person detection, gesture recognition.]
An overview of an embedded and edge AI system
[Same pipeline figure, with the hardware side (sensors, actuators, and the processing hardware) highlighted: this lecture!]
An overview of an embedded and edge AI system
[Same pipeline figure, with the software side (preprocessing, feature extraction, the machine and deep learning algorithm, postprocessing) highlighted: next lecture!]
The embedded perspective
• Embedded systems are the computers that control the electronics of all sorts of physical devices, typically running one single, dedicated application
• Embedded software is the software that runs on embedded systems
Differently from general-purpose computers (e.g., a laptop or a smartphone), embedded systems are usually meant to perform one specific, dedicated task.
Embedded and Edge AI Hardware architecture
[Architecture figure: processor, hardware accelerators (performing specific operations faster), RAM, and non-volatile memory (whose content is retained even when the device is turned off). Image taken from [1].]
Embedded and Edge AI Hardware architecture
The application processor: the general-purpose processor running the embedded application. It connects to sensors and other hardware through dedicated interfaces.
Embedded and Edge AI Hardware architecture
Built-in additional hardware meant to provide highly efficient computation for certain operations (e.g., a floating-point unit (FPU) to perform FP operations; without it, each floating-point operation must be emulated through multiple integer operations).
Embedded and Edge AI Hardware architecture
RAM: the working memory meant to support program execution (additional external RAM could be available). It is typically the bottleneck of any embedded and edge AI system.
RAM is typically the bottleneck in Embedded and Edge AI
✓ Very fast memory, but very energy consuming
✓ Volatile: its content is lost when power shuts down
✓ High cost
✓ Occupies large physical space on the device
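As a quick sanity check on this constraint, the fit of a raw data buffer against a RAM budget can be sketched in a few lines of Python (the 320 KB budget and the helper name are illustrative, not part of the slides):

```python
# Sketch: check whether a raw data buffer fits a given RAM budget.
# The 320 KB budget is illustrative, typical of a high-end MCU.
RAM_BUDGET = 320 * 1024  # bytes

def fits_in_ram(num_values: int, bytes_per_value: int,
                budget: int = RAM_BUDGET) -> bool:
    """Return True if the buffer fits within the RAM budget."""
    return num_values * bytes_per_value <= budget

# A 128x128 RGB image at 1 byte per channel: 49,152 bytes -> fits.
print(fits_in_ram(128 * 128 * 3, 1))   # True
# One second of stereo audio at 44.1 kHz, 4 bytes/sample: 352,800 bytes -> too big.
print(fits_in_ram(44_100 * 2, 4))      # False
```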
Embedded and Edge AI Hardware architecture
The non-volatile memory is typically a Flash memory. It is used to store things that do not change often and that must be preserved when the system shuts down (e.g., the software program, the device configuration). It is larger than the volatile memory, but slow to read and extremely slow to write.
Embedded and Edge AI Hardware architecture
Discrete coprocessors are external co-processor supporting efficent and fast matemathical operations
(e.g., a GPU or a TPU)
14
15
Embedded and Edge AI Hardware architecture
Provide an interface with the rest of the system (e.g., sensor and network hardware) through
technological standardized solutions (e.g, GPIO, I2C, SPI, and UART)
peripheral are interfaces
15
16
Embedded and Edge AI Hardware architecture: the sensors
Note on coprocessors: they can be integrated (built into the CPU) or discrete (a separate chip). For example, a gaming PC uses a discrete GPU (NVIDIA RTX 4090) for high-quality rendering.
Let's see the sensors in detail…
Sensors and signals: the edge AI perspective
• Sensors allow us to acquire measurements from the environment or from humans
• Sensors generate streams of data
• Other sources of data are available beyond sensors, such as digital device logs, network packets, and radio transmissions
• Sensors can provide different data formats: time series, audio, image, video, others…
How to store data values?
• Boolean (1 bit): a number with 2 possible values
• 8-bit integer: a non-decimal number with 256 possible values
• 16-bit integer: a non-decimal number with 65,536 possible values
• 32-bit floating point: can represent a wide range of numbers with up to 7 decimal digits of precision, with a maximum of 3.4028235 × 10^38
Quantization techniques (see specific lecture) will make it possible to reduce the memory demand for each specific value.
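A minimal sketch of these per-value costs, using NumPy's dtypes (note that NumPy stores a Boolean in one full byte, not one bit), plus a naive linear quantization step of the kind the dedicated lecture will cover; the scaling scheme here is illustrative only:

```python
import numpy as np

# Storage cost per value for the representations listed above.
for name in ("bool", "int8", "int16", "float32"):
    dt = np.dtype(name)
    print(f"{dt.name}: {dt.itemsize} byte(s)")

# Naive linear quantization: mapping float32 values onto int8 cuts memory 4x.
x = np.random.randn(1_000).astype(np.float32)
scale = np.abs(x).max() / 127.0          # map the largest magnitude onto 127
x_q = np.round(x / scale).astype(np.int8)
print(x.nbytes, "->", x_q.nbytes)        # 4000 -> 1000
```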
Time series data
• A time series is a sequence of data points indexed in chronological order: X = (x1, x2, …, xN, …)
• In other words, a time series is a series of data points taken at successive, equally spaced points in time.
• Key parameters: the sampling period (seconds, minutes, hours, days, etc.) and the number of bits (n) used for the representation
• Memory requirements: n bits per sample (e.g., a 4-byte FP value per sample)
Audio data
• A special case of time series data, audio signals represent the oscillation of sound waves as they travel through the air.
• Sampling rate (Hz), quantization (number of bits), length of the signal (in s), and number of channels (e.g., mono or stereo) are the key aspects
Memory requirements: length x sampling x bits x channels
For example, a 10 s audio clip sampled at 100 Hz, quantized over 32 bits and acquired in stereo, requires: 10 s x 100 Hz x 4 Byte x 2 = 8 KB (out of, e.g., 320 KB of available RAM).
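The formula above can be wrapped in a small helper to reproduce the slide's example (the function name is ours, not from the slides):

```python
def audio_bytes(length_s: float, sampling_hz: int,
                bytes_per_sample: int, channels: int) -> int:
    """Memory demand of uncompressed audio: length x sampling x bytes x channels."""
    return int(length_s * sampling_hz * bytes_per_sample * channels)

# The slide's example: 10 s, 100 Hz, 32-bit (4-byte) samples, stereo.
print(audio_bytes(10, 100, 4, 2))  # 8000 bytes = 8 KB
```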
Image data
• Images are data that represent the measurements taken by a sensor that captures an entire scene
• Images have two or more dimensions. In their typical form, they can be thought of as a grid of "pixels", where the value of each pixel represents some property of the scene at the corresponding point in space (stored in N bits)
• Memory requirements: W x K x N x channels
For example, a 128 x 128 RGB image with 1 Byte per pixel per channel requires: 128 x 128 x 1 Byte x 3 = 49 KB (out of, e.g., 320 KB of available RAM).
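The same kind of helper reproduces the image example (again, the function name is ours):

```python
def image_bytes(width: int, height: int,
                bytes_per_pixel: int, channels: int) -> int:
    """Memory demand of an uncompressed image: W x K x N x channels."""
    return width * height * bytes_per_pixel * channels

# The slide's example: 128 x 128 RGB, 1 byte per pixel per channel.
print(image_bytes(128, 128, 1, 3))  # 49152 bytes, i.e. roughly 49 KB
```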
Video data
• A video is a sequence of (fixed) images reproduced at a suitably high frame rate
• Similarly to images, we have W, K, N, and channels. In addition, we have a frame rate (in frames/s) and the length of the video (in s)
• Memory requirements: W x K x N x channels x frame rate x length
For example, a 10 s video recorded at 30 fps, with 128 x 128 RGB frames and 1 Byte per pixel per channel, requires: 30 x 10 x 128 x 128 x 1 Byte x 3 = 14.7 MB (uncompressed), far more than the 320 KB of RAM of the example device.
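Extending the per-frame computation with frame rate and length reproduces the video example (helper name ours):

```python
def video_bytes(width: int, height: int, bytes_per_pixel: int,
                channels: int, fps: int, length_s: int) -> int:
    """Memory demand of uncompressed video: W x K x N x channels x fps x length."""
    return width * height * bytes_per_pixel * channels * fps * length_s

# The slide's example: 10 s at 30 fps, 128 x 128 RGB frames, 1 byte per channel.
print(video_bytes(128, 128, 1, 3, 30, 10))  # 14745600 bytes, about 14.7 MB
```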
Exercise slide
• Compute the memory demand of a 5 s audio clip sampled at 44.1 kHz over one channel (e.g., wake-word detection) with 2 Bytes per sample
• Compute the memory demand of a 28 x 28 grayscale image (1 Byte per pixel), see MNIST
• Compute the memory demand of a 224 x 224 RGB image (1 Byte per channel), see ImageNet
For all the exercises consider uncompressed data!
Exercise slide (solution)
• 5 s audio clip sampled at 44.1 kHz over one channel (e.g., wake-word detection) with 2 Bytes per sample:
5 s x 44,100 samples/s x 2 Bytes/sample = 441 KB
• 28 x 28 grayscale image (1 Byte per pixel), see MNIST:
28 x 28 x 1 Byte/pixel = 784 Bytes
• 224 x 224 RGB image (1 Byte per channel), see ImageNet:
224 x 224 x 3 channels x 1 Byte/pixel ≈ 150 KB
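The three results can be checked in a few lines of Python (decimal convention: 1 KB = 1000 bytes, as used on the slide):

```python
# Exercise checks, uncompressed data throughout.
audio = 5 * 44_100 * 2 * 1       # 5 s, 44.1 kHz, 2 bytes/sample, mono
mnist = 28 * 28 * 1              # 28x28 grayscale, 1 byte/pixel
imagenet = 224 * 224 * 3 * 1     # 224x224 RGB, 1 byte per channel

print(audio)     # 441000 bytes = 441 KB
print(mnist)     # 784 bytes
print(imagenet)  # 150528 bytes, roughly 150 KB
```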
Types of sensors and signals
• There are thousands of different types of sensors on the market
• Families of sensors for embedded and edge AI:
1. Acoustic and vibration
2. Visual and scene
3. Motion and position
4. Force and tactile
5. Optical, electromagnetic, and radiation
6. Environmental and chemical
1) Acoustic and vibration
• "Hearing" vibrations is a crucial ability in the field of embedded and edge AI
• Such an ability makes it possible to detect the effects of movement, vibration, and human and animal communication at a distance.
• Acoustic sensors measure the effect of vibrations travelling through a medium:
  • air (microphones)
  • water (hydrophones) or
  • ground (geophones and seismometers)
• Information is distributed across frequencies (the sampling frequency is crucial for a given application scenario)
Acoustic sensors generally produce audio data describing the variation of pressure in the medium
Birdsong detection and recognition (in the wild)
TinyML-based IoT units endowed with CNNs to detect and recognize bird vocalizations:
• Sampling frequency: 2 kHz
• Memory footprint: 133 KB
• Accuracy: approx. 76%
• Inference time: 3.5 s
• Lifetime: 12 days (3200 mAh battery)
• Hardware: Cortex M7 @ 480 MHz, 512 KB RAM
S. Disabato, G. Canonaco, P. Flikkema, M. Roveri, C. Alippi, "Birdsong Detection at the Edge with Deep Learning", in Proc. 7th IEEE International Conference on Smart Computing (SMARTCOMP 2021)
2) Visual and scene
Acquiring information without contact (passively):
• tiny, low-power cameras
• super-high-quality multi-megapixel sensors
Image sensors capture light using a grid of sensor elements. Features of cameras:
✓ Color channels—grayscale or color (red, green, and blue, or RGB).
✓ Spectral response—e.g., infrared radiation for thermal cameras to "see" heat.
✓ Pixel size—larger sensor elements can capture more light per pixel, increasing their sensitivity.
✓ Sensor resolution—the more elements on a sensor, the more detail is captured.
✓ Frame rate—how frequently a sensor can capture an image, typically in frames per second.
The output of cameras is an image (2D/3D) or a video
2) Visual and scene (part 2)
[Figure: examples of LIDAR and RADAR sensors]
People detection on the conveyor belt in Malpensa Airport (T2)
UWB-based smart sensors for retirement homes
Privacy-preserving behaviour recognition in patients
M. Pavan, A. Caltabiano and M. Roveri, "TinyML for UWB-radar based presence detection," accepted for publication in IEEE World Congress of Computational Intelligence 2022 (IEEE WCCI 2022)
3) Motion and position
Many different types of motion and position sensors:
• Tilt sensor—a mechanical switch measuring the orientation of an object
• Accelerometer—measures the acceleration (the change in velocity over time) of an object across one or more axes (e.g., smart watches or predictive maintenance).
• Gyroscope—measures the rate of rotation of an object.
• Time of flight—uses electromagnetic emission (light or radio) to measure the distance from a sensor to whatever object is directly in its line of sight.
• Real-time locating systems—multiple transceivers in fixed locations around a building to track the position of individual objects.
• Global Positioning System (GPS)—uses satellites to determine the location of a device.
Measurements are typically represented as a time series
Building and critical infrastructure monitoring (Italy)
A network of TinyML-based IoT units for intelligent structural monitoring
Internet-of-Birds tracking system (Italy)
Tracking greater flamingos through ultra-lightweight smart IoT devices: 1 year of working activities, 4 greater flamingos, sampling rate 20 h, weight of the device < 20 g
C. Alippi, R. Ambrosini, D. Cogliati, V. Longoni, M. Roveri, "A lightweight and energy-efficient Internet-of-Birds Tracking System", in Proc. IEEE International Conference on Pervasive Computing and Communications (PerCom 2017), Hawaii, USA, 13-17 March 2017.
Time-of-Flight sensors for cup recognition
4) Force and tactile
Helpful in facilitating user interaction, understanding the flow of liquids and gases, or measuring the mechanical strain on an object:
• Buttons and switches—provide a binary signal
• Capacitive touch sensors—measure the amount that a surface is being touched by a conductive object, like a human finger.
• Strain gauges—measure how much an object is being deformed.
• Load cells—measure the amount of physical load that is applied.
• Flow sensors—measure the rate of flow in liquids and gases.
• Pressure sensors—measure the pressure of a gas or liquid, either environmental or inside a system (e.g., inside a car tire).
Measurements are typically represented as a time series
Rock-collapse and landslide forecasting (Italy and Switzerland)
A TinyML-based monitoring system to identify critical situations
C. Alippi, R. Camplani, C. Galperti, A. Marullo, M. Roveri, "A high frequency sampling monitoring system for environmental and structural applications", ACM Transactions on Sensor Networks, Vol. 9, No. 4, Art. 41, 32 pages, July 2013
5) Optical, electromagnetic, and radiation
Sensors measuring electromagnetic radiation and magnetic fields, as well as current and voltage:
• Photosensors—detect light at various wavelengths, both visible and invisible to the human eye.
• Color sensors—photosensors measuring the precise color of a surface, helpful for recognizing different types of objects.
• Spectroscopy sensors—measure the way that various wavelengths of light are absorbed and reflected by materials, giving an edge AI system insight into their composition.
• Magnetometers—measure the strength and direction of magnetic fields (e.g., a digital compass).
• Inductive proximity sensors—use an electromagnetic field to detect nearby metal (e.g., detecting vehicles for traffic monitoring).
• Electromagnetic field (EMF) meters—measure the strength of electromagnetic fields (e.g., those emitted by industrial equipment).
• Current and voltage sensors
Measurements are typically represented as a time series
Aquatic monitoring system (Great Barrier Reef, Australia)
A TinyML-based system to monitor water pollution and forecast hurricanes
[Figure: the deployment area, with node units and a gateway connected to the Center for Marine Studies]
C. Alippi, R. Camplani, C. Galperti, M. Roveri, "A robust, adaptive, solar powered WSN framework for aquatic environmental monitoring", IEEE Sensors Journal, Vol. 11, No. 1, pp. 45-55, Jan. 2011.
6) Environmental, biological, and chemical
• Temperature sensors
• Gas sensors—e.g., humidity or carbon dioxide sensors.
• Particulate matter sensors—monitor pollution levels.
• Biosignal sensors—e.g., measuring the electrical activity of the human heart (electrocardiography) or brain (electroencephalography).
• Chemical sensors—measure the presence or concentration of specific chemicals.
Measurements are typically represented as a time series
Rock collapse and landslide forecasting (Torrioni di Rialba)
Let's go back to the application processors…
The general-purpose processor running the embedded application
Microcontrollers and digital signal processors
• Microcontrollers are the basic technological block of our pervasive applications
• Tiny and cheap computers used for single-purpose applications
• No operating system:
  • The "firmware", i.e., the software running on a microcontroller, is directly executed on the hardware and includes the low-level instructions to connect to the peripherals
• But… the key aspect of microcontrollers is the fact that they embed all the components in a single piece of silicon!
The (embedded) architecture of a micro-controller
• Microcontrollers have a fixed hardware architecture built around a central processing unit (CPU)
• The CPU controls a range of peripherals, which may provide both digital and analog functions such as timers and analog-to-digital converters.
• Small devices usually include both volatile and non-volatile memory on the chip, but larger processors may need separate memory
• Their operation is usually programmed using a low-level language like Assembly or a high-level language such as C
Microcontrollers are the best candidates for embedded systems, because their on-chip peripherals and memory enable embedded system designers to save circuit space and overall dimensions by not having to add these components on the board.
Microcontrollers vs microprocessors
• A microcontroller contains a microprocessor (uP), Program Memory, Data Memory, and a number of internal peripheral devices such as timers, serial ports, general-purpose input/output (I/O) pins, counters, analog-to-digital converters (ADCs), etc., all inside a single silicon chip.
• A microprocessor (uP) is a silicon chip containing an advanced Central Processing Unit (CPU) that fetches program instructions from an external memory and executes them.
[Figure: block diagrams. Microcontroller: CPU (ALU, Control Unit, Register Array), Data Memory, Program Memory, and Peripherals, all on one chip. Microprocessor: the CPU alone; Data Memory, Program Memory, and Peripherals are external.]
The wide range of microcontrollers…
Low-end MCUs:
• Description: low cost, small size, and energy efficiency
• Architecture: 4-bit to 16-bit
• Clock speed: <100 MHz
• Optional HW: -
• Flash memory: 2-64 KB
• RAM: 64 Bytes to 2 KB
• Input/output: digital
• Current draw: running, tens of mA at 1.5-5 V; sleeping, uA
• Cost: 1-2 $
• Applications: running conditional logic, acquiring and transmitting data
High-end MCUs:
• Description: widely used microcontrollers
• Architecture: 32-bit
• Clock speed: <1000 MHz
• Optional HW: floating-point unit (FPU), single-instruction multiple-data (SIMD) instructions
• Flash memory: 16 KB to 2 MB
• RAM: 2 KB to 1 MB
• Input/output: digital and analog
• Current draw: running, several tens of mA at 1.5-5 V; sleeping, uA
• Cost: ~10 $
• Applications: signal processing and embedded machine learning
The wide range of microcontrollers…
High-end MCUs:
• Faster clock and 32-bit architecture
• SIMD to run several computations in parallel (very useful for signal processing and ML)
• Larger RAM (which is always the bottleneck)
• Reasonably good energy efficiency
• Examples of high-end MCUs (based on ARM Cortex-M cores): STM32, Nordic, Arduino
A different perspective: System on Chip (SoC)
SoC devices aim at integrating all the functionalities of a traditional computing system into a single chip:
• They run OSs
• Tools and libraries to write server and desktop applications can be used
• High performance
Drawbacks:
• Low energy efficiency (w.r.t. microcontrollers)
• Larger costs
Typical figures of a SoC:
• Powerful edge computing system
• 64-bit architecture, >1 GHz clock speed
• Hardware accelerators (GPUs, TPUs)
• Flash and RAM in the order of GBs
• Digital and analog input/output
• Current draw: running, hundreds of mA at 5 V; sleeping, uA
• Cost: tens of $ per unit
• Technological scenario of mobile phones, smart cars, etc. (e.g., Qualcomm Snapdragon)
A technological overview for embedded and edge AI

Device                       | Low-freq time series | High-freq time series | Audio | Low-res image | High-res image | Video
Low-end MCU                  | Limited              | Limited               | None  | None          | None           | None
High-end MCU                 | Full                 | Full                  | Full  | Full          | Limited        | Limited
High-end MCU with accelerator| Full                 | Full                  | Full  | Full          | Full           | Limited
SoC                          | Full                 | Full                  | Full  | Full          | Full           | Full
SoC with accelerator         | Full                 | Full                  | Full  | Full          | Full           | Full
Edge server                  | Full                 | Full                  | Full  | Full          | Full           | Full
Cloud                        | Full                 | Full                  | Full  | Full          | Full           | Full
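The table lends itself to a small lookup: given a data modality, find the first device tier (in table order, which roughly tracks cost) with full support. A sketch under that assumption, with names of our choosing:

```python
# The capability table above, encoded as a lookup.
# Columns follow the table's order of data modalities.
MODALITIES = ("low-freq time series", "high-freq time series", "audio",
              "low-res image", "high-res image", "video")

CAPABILITIES = {  # insertion order roughly tracks device cost
    "low-end MCU":                   ("Limited", "Limited", "None", "None", "None", "None"),
    "high-end MCU":                  ("Full", "Full", "Full", "Full", "Limited", "Limited"),
    "high-end MCU with accelerator": ("Full", "Full", "Full", "Full", "Full", "Limited"),
    "SoC":                           ("Full", "Full", "Full", "Full", "Full", "Full"),
    "SoC with accelerator":          ("Full", "Full", "Full", "Full", "Full", "Full"),
    "edge server":                   ("Full", "Full", "Full", "Full", "Full", "Full"),
    "cloud":                         ("Full", "Full", "Full", "Full", "Full", "Full"),
}

def cheapest_device(modality: str) -> str:
    """First device (in table order) with Full support for the given modality."""
    i = MODALITIES.index(modality)
    for device, support in CAPABILITIES.items():
        if support[i] == "Full":
            return device
    raise ValueError(f"no device fully supports {modality}")

print(cheapest_device("audio"))           # high-end MCU
print(cheapest_device("high-res image"))  # high-end MCU with accelerator
print(cheapest_device("video"))           # SoC
```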