Module 4 Continued
$w_i \leftarrow w_i + \Delta w_i$
where $\Delta w_i = \eta (t - o) x_i$
Where:
t = c(x) is target value
o is perceptron output
$\eta$ is small constant (e.g., 0.1) called learning rate
Can prove it will converge
If training data is linearly separable and $\eta$ is sufficiently small
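The training rule above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: the function names, the +1/-1 output convention, the constant bias input x[0] = 1, and the AND data set are all assumptions chosen for the example.

```python
def perceptron_output(weights, x):
    """Threshold unit: +1 if w . x > 0, otherwise -1 (assumed convention)."""
    s = sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > 0 else -1


def train_perceptron(examples, eta=0.1, max_epochs=100):
    """Apply w_i <- w_i + eta * (t - o) * x_i until no example is misclassified."""
    n = len(examples[0][0])
    weights = [0.0] * n
    for _ in range(max_epochs):
        errors = 0
        for x, t in examples:
            o = perceptron_output(weights, x)
            if o != t:
                errors += 1
                for i in range(n):
                    weights[i] += eta * (t - o) * x[i]
        if errors == 0:  # converged: every example is classified correctly
            break
    return weights


# Logical AND with targets in {-1, +1}; x[0] = 1 is a constant bias input.
AND_DATA = [([1, 0, 0], -1), ([1, 0, 1], -1), ([1, 1, 0], -1), ([1, 1, 1], 1)]
w = train_perceptron(AND_DATA)
```

AND is linearly separable, so the loop stops once every example is classified correctly.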
Gradient descent
Derivation of gradient descent
◆ Gradient descent
- Error (for all training examples):
$E = -\sum_{d \in D} \left[ t_d \log o_d + (1 - t_d) \log(1 - o_d) \right]$
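As a small sketch, the error above can be evaluated directly. The function name is hypothetical, and the outputs are assumed to lie strictly between 0 and 1 (for example, sigmoid outputs) so that the logarithms are defined.

```python
import math


def cross_entropy_error(targets, outputs):
    """E = -sum_d [ t_d * log(o_d) + (1 - t_d) * log(1 - o_d) ]."""
    return -sum(
        t * math.log(o) + (1 - t) * math.log(1 - o)
        for t, o in zip(targets, outputs)
    )


uninformed = cross_entropy_error([1, 0], [0.5, 0.5])  # outputs carry no information
confident = cross_entropy_error([1, 0], [0.9, 0.1])   # outputs agree with targets
```

Outputs that agree with the targets give a smaller error than uninformative outputs of 0.5.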
Recurrent Networks
[Figure: panels (a), (b), (c)]
Architecture of SOM
Kohonen SOM (Self-Organizing Maps)
Structure of Neighborhoods
Neighborhoods do not wrap around from one side of the
grid to the other side, which means missing units are simply
ignored.
Algorithm:
Architecture
neuron i
Kohonen layer
wi
Winning neuron
Input vector X
X = [x1, x2, …, xn] ∈ R^n
wi = [wi1, wi2, …, win] ∈ R^n
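The slide's algorithm did not survive in the text, so the following is only a sketch of one commonly used SOM training step, under stated assumptions: the winning neuron is the one whose weight vector is closest to X in squared Euclidean distance, the neighborhood is a square of a given radius on a rectangular grid, and every unit in it moves toward X by a fraction alpha. All names are hypothetical. Grid positions outside the map are skipped, matching the no-wrap-around rule.

```python
def find_winner(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda pos: sum((wi - xi) ** 2 for wi, xi in zip(weights[pos], x)),
    )


def neighborhood(winner, radius, rows, cols):
    """Square neighborhood; positions off the grid are ignored (no wrap-around)."""
    r0, c0 = winner
    return [
        (r, c)
        for r in range(r0 - radius, r0 + radius + 1)
        for c in range(c0 - radius, c0 + radius + 1)
        if 0 <= r < rows and 0 <= c < cols
    ]


def som_step(weights, x, rows, cols, alpha=0.5, radius=1):
    """Move the winner and its neighbors toward x: w <- w + alpha * (x - w)."""
    winner = find_winner(weights, x)
    for pos in neighborhood(winner, radius, rows, cols):
        weights[pos] = [wi + alpha * (xi - wi) for wi, xi in zip(weights[pos], x)]
    return winner


# Tiny 1x2 map: the second unit is closest to the input and is pulled toward it.
grid = {(0, 0): [0.0, 0.0], (0, 1): [1.0, 1.0]}
win = som_step(grid, [1.0, 0.8], rows=1, cols=2, alpha=0.5, radius=0)
```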
Example
Convolutional Neural Network
History
• In 1995, Yann LeCun
and Yoshua Bengio
described convolutional neural
networks in "Convolutional Networks for Images, Speech, and Time Series",
building on LeCun's earlier work from 1989.
“Deep Learning doesn’t do different things,
it does things differently”
Performance vs Sample Size
[Figure: performance (y-axis) against size of data (x-axis), with a curve labeled "Traditional ML algorithms"]
Outline
Supervised Learning
Convolutional Neural Network
Sequence Modelling: RNN and its extensions
Unsupervised Learning
Autoencoder
Stacked Denoising Autoencoder
• Unsupervised Learning (+Supervised)
Generative Adversarial Networks
Reinforcement Learning
Deep Reinforcement Learning
GAN
[Figure: a Generator Network turns an input into generated (fake) Shakespeare poetry; a Discriminator Network receives the generated poetry and real Shakespeare poetry and outputs Real or Fake]
Supervised Learning
Traditional pattern recognition models work with hand crafted
features and relatively simple trainable classifiers.
[Figure: Extract Hand Crafted Features -> Trainable Classifier (e.g. SVM, Random Forest) -> Output (e.g. Outdoor: Yes or No)]
Limitations
Image
Pixel Edge Texture Motif Part Object
Text
Character Word Word-group Clause Sentence Story
[Figure: feed-forward network with inputs x1..x4, hidden units a1(1)..a4(1), and output unit a1(2) producing Y; weights such as w4 label the input connections]
$a_1^{(1)} = f(w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4)$
ReLU: $\max(0, x)$
$a_1^{(1)} = \max(0,\; w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4)$
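A minimal sketch of the hidden-unit computation above, with hypothetical function names and made-up weights and inputs. No bias term is included, matching the formula as written.

```python
def relu(z):
    """ReLU activation: max(0, z)."""
    return max(0.0, z)


def hidden_unit(weights, inputs, activation=relu):
    """Weighted sum of the inputs passed through the activation function."""
    return activation(sum(w * x for w, x in zip(weights, inputs)))


positive = hidden_unit([1, 1, 1, 1], [1, 2, 3, 4])     # weighted sum is 10
clipped = hidden_unit([1, -2, 0.5, 0], [1, 4, 2, 3])   # weighted sum is -6, clipped to 0
```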
Number of Parameters
Softmax
[Figure: the same network with inputs x1..x4, hidden units a1(1)..a4(1), and output a1(2) producing Y]
4*4 + 4 + 1 = 21
If the input is an Image?
[Figure: the same network with a 400 X 400 X 3 image as input: inputs x1..x480000, hidden units a1(1)..a480000(1), output a1(2) producing Y]
Number of Parameters
With 480,000 hidden units: 480000*480000 + 480000 + 1 = approximately 230 billion !!!
With 1,000 hidden units: 480000*1000 + 1000 + 1 = approximately 480 million !!!
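The counts above follow one pattern, inputs × hidden units + hidden units + 1, which a short sketch can reproduce. The function name is hypothetical and the formula is taken from the slide as given, with a single hidden layer and one output unit.

```python
def slide_param_count(n_inputs, n_hidden):
    """Parameter count as written on the slide: n_inputs*n_hidden + n_hidden + 1."""
    return n_inputs * n_hidden + n_hidden + 1


n_pixels = 400 * 400 * 3                             # a 400 x 400 RGB image, flattened
small_net = slide_param_count(4, 4)                  # the 4-input example network
full_image = slide_param_count(n_pixels, n_pixels)   # about 230 billion
reduced = slide_param_count(n_pixels, 1000)          # about 480 million
```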
Let us see how convolutional layers
help.
Convolutional Layers
Filter:
 0  1  0
 1 -4  1
 0  1  0
[Figure: grid of normalized pixel intensities (values between 0 and 1) for the input image]
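Input (4x4):    Filter (2x2):    Output:
a b c d         w1 w2            h1 h2
e f g h         w3 w4
i j k l
m n o p
$h_2 = f(b \cdot w_1 + c \cdot w_2 + f \cdot w_3 + g \cdot w_4)$
(the leading f is the activation function; the f inside the sum is the input pixel)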
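A sketch of sliding a 2x2 filter over the 4x4 input, in pure Python. The function name and the numeric stand-ins for a..p (1 to 16) are assumptions for illustration. As in the h2 formula, the filter is applied without flipping, and the activation defaults to the identity.

```python
def slide_filter(image, kernel, activation=lambda z: z):
    """Apply the kernel at every position where it fits entirely inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            activation(
                sum(
                    image[r + i][c + j] * kernel[i][j]
                    for i in range(kh)
                    for j in range(kw)
                )
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# a..p replaced by 1..16; with w1 = w4 = 1 and w2 = w3 = 0, h1 = a + f and h2 = b + g.
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0], [0, 1]]
feature_map = slide_filter(image, kernel)
```

The same four weights are reused at every position, which is why the layer needs far fewer parameters than a fully connected one.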
[Figure: Input Image -> Layer 1 Feature Map (Filter 1: w1 w2 / w3 w4) -> Layer 2 Feature Map (Filter 2: w5 w6 / w7 w8)]
▪ In convolutional neural networks, each hidden unit is connected only to a local receptive field.
Pooling
Max pooling: reports the maximum output within a rectangular
neighborhood.
Average pooling: reports the average output of a rectangular
neighborhood.
[Figure: CNN for scene classification: stacks of filters with 64, 128, 256, and 512 channels, interleaved with Max Pool, followed by Fully Connected Layers; output classes: Living Room, Bed Room, Kitchen, Bathroom, Outdoor]
Convolutional Neural Networks
Output: Binary, Multinomial, Continuous, Count
Input: fixed size; padding can be used to make all images the same
size.
Architecture: the choice is ad hoc and
requires experimentation.
Optimization: Backpropagation
Parameters of a very deep model can be estimated properly only if you
have a very large number of training images.
Use an architecture and trained parameters from other papers
(ImageNet models, Microsoft/Google APIs, etc.)
Computing Power: Buy a GPU!!
Automatic Colorization of Black and White Images
Optimizing Images
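Kernel
1D (continuous, discrete):
$(f * g)(x) = \int_{-\infty}^{\infty} f(\tau)\, g(x - \tau)\, d\tau$
$(f * g)(x) = \sum_{\tau=0}^{N-1} f(\tau)\, g(x - \tau)$
Output is sometimes called Feature map
2D (continuous, discrete):
$(f * g)(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\tau_1, \tau_2)\, g(x - \tau_1, y - \tau_2)\, d\tau_1\, d\tau_2$
$(f * g)(x, y) = \sum_{\tau_1=0}^{N-1} \sum_{\tau_2=0}^{N-1} f(\tau_1, \tau_2)\, g(x - \tau_1, y - \tau_2)$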
Convolution Properties
• Commutative:
f*g = g*f
• Associative:
(f*g)*h = f*(g*h)
• Homogeneous:
f*(λg) = λ(f*g)
• Additive (Distributive):
f*(g+h) = f*g + f*h
• Shift-Invariant:
f*g(x-x0, y-y0) = (f*g)(x-x0, y-y0)
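The 1D discrete definition and the first four properties can be checked numerically with a short sketch. The function name and the sample sequences are hypothetical, and sequences are treated as zero outside their stored range.

```python
def convolve_1d(f, g):
    """Discrete convolution: out[x] = sum over tau of f[tau] * g[x - tau]."""
    out = [0] * (len(f) + len(g) - 1)
    for x in range(len(out)):
        for tau in range(len(f)):
            if 0 <= x - tau < len(g):  # g is zero outside its stored range
                out[x] += f[tau] * g[x - tau]
    return out


f, g, h = [1, 2, 3], [0, 1, 4], [2, 5, 1]
scale = 3

commutative = convolve_1d(f, g) == convolve_1d(g, f)
associative = convolve_1d(convolve_1d(f, g), h) == convolve_1d(f, convolve_1d(g, h))
homogeneous = convolve_1d(f, [scale * v for v in g]) == [scale * v for v in convolve_1d(f, g)]
distributive = convolve_1d(f, [a + b for a, b in zip(g, h)]) == [
    a + b for a, b in zip(convolve_1d(f, g), convolve_1d(f, h))
]
```

Integer inputs keep the comparisons exact; with floating-point values a tolerance would be needed.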
ConvNet
• ConvNet architectures for images:
– fully-connected structure does not scale to large
images
– the explicit assumption that the inputs are images
allows us to encode certain properties into the
architecture.
– These then make the forward function more efficient
to implement
– and vastly reduce the number of parameters in the
network.
• 3D volumes: neurons arranged in 3 dimensions:
width, height, depth.
Convnets
[Figure: an image and its translated copy]
32x32x3 image: 32 height, 32 width, 3 depth
Convolutions: More detail
32x32x3 image
5x5x3 filter
Convolution Layer
32x32x3 image, 5x5x3 filter
1 number:
the result of taking a dot product between the
filter and a small 5x5x3 chunk of the image
(i.e. 5*5*3 = 75-dimensional dot product + bias)
Convolution Layer
[Figure: the 5x5x3 filter applied across the 32x32x3 image produces a 28x28x1 activation map]
consider a second, green filter
[Figure: a second 5x5x3 filter applied to the 32x32x3 image produces a second 28x28x1 activation map]
For example, if we had 6 5x5 filters, we’ll get 6 separate activation maps:
[Figure: the Convolution Layer maps the 32x32x3 image to 6 activation maps, stacked into a 28x28x6 volume]
[Figure: 32x32x3 input -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6]
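A sketch of the bookkeeping for one convolution layer: the output volume and the number of learned parameters. The function name is hypothetical. One bias per filter is assumed, consistent with the "75-dimensional dot product + bias" description, and stride 1 with no padding is the default.

```python
def conv_layer_summary(in_h, in_w, in_depth, num_filters, f, stride=1, pad=0):
    """Return ((out_h, out_w, out_depth), parameter_count) for a CONV layer."""
    out_h = (in_h + 2 * pad - f) // stride + 1
    out_w = (in_w + 2 * pad - f) // stride + 1
    # Each filter spans the full input depth and has one bias.
    params = num_filters * (f * f * in_depth + 1)
    return (out_h, out_w, num_filters), params


first_shape, first_params = conv_layer_summary(32, 32, 3, num_filters=6, f=5)
second_shape, second_params = conv_layer_summary(28, 28, 6, num_filters=10, f=5)
```

The output depth equals the number of filters, which is why a following layer's filters must be 5x5x6 rather than 5x5x3.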
Preview: ConvNet is a sequence of Convolutional Layers, interspersed with activation
functions
[Figure: 32x32x3 -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6 -> CONV, ReLU (e.g. 10 5x5x6 filters) -> 24x24x10 -> CONV, ReLU -> ….]
Preview [From recent Yann LeCun slides]
one filter => one activation map
[Figure: example 5x5 filters (32 total) and the activation maps they produce]
A closer look at spatial dimensions:
7x7 input (spatially)
assume 3x3 filter
=> 5x5 output
A closer look at spatial dimensions:
7x7 input (spatially)
assume 3x3 filter
applied with stride 2
=> 3x3 output!
A closer look at spatial dimensions:
7x7 input (spatially)
assume 3x3 filter
applied with stride 3?
doesn’t fit!
cannot apply 3x3 filter on
7x7 input with stride 3.
Output size:
(N - F) / stride + 1
where N is the input size and F is the filter size
e.g. N = 7, F = 3:
stride 1 => (7 - 3)/1 + 1 = 5
stride 2 => (7 - 3)/2 + 1 = 3
stride 3 => (7 - 3)/3 + 1 = 2.33 :\
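The output-size rule, including the "doesn't fit" case, can be sketched as a small helper. The function name and the choice to return None when the stride does not divide (N - F) evenly are assumptions for illustration.

```python
def conv_output_size(n, f, stride):
    """Output size (N - F) / stride + 1, or None when the filter does not fit evenly."""
    if (n - f) % stride != 0:
        return None  # e.g. N = 7, F = 3, stride 3
    return (n - f) // stride + 1


sizes = {stride: conv_output_size(7, 3, stride) for stride in (1, 2, 3)}
```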
In practice: Common to zero pad the border
e.g. input 7x7
3x3 filter, applied with stride 1
pad with 1 pixel border => what is the output?
7x7 output!
in general, common to see CONV layers with
stride 1, filters of size FxF, and zero-padding with
(F-1)/2. (will preserve size spatially)
e.g. F = 3 => zero pad with 1
F = 5 => zero pad with 2
F = 7 => zero pad with 3
Output size with padding:
(N + 2*padding - F) / stride + 1
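A short sketch of the padded output-size formula and the (F-1)/2 padding rule. The function names are hypothetical, and the rule is applied to odd filter sizes at stride 1, as in the examples.

```python
def size_preserving_padding(f):
    """Zero-padding (F - 1) / 2 that keeps the spatial size at stride 1 (odd F)."""
    return (f - 1) // 2


def padded_output_size(n, f, stride, padding):
    """Output size (N + 2*padding - F) / stride + 1."""
    return (n + 2 * padding - f) // stride + 1


example = padded_output_size(7, 3, stride=1, padding=1)
paddings = {f: size_preserving_padding(f) for f in (3, 5, 7)}
preserved = all(
    padded_output_size(32, f, 1, size_preserving_padding(f)) == 32 for f in (3, 5, 7)
)
```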
3. Spatial Pooling
[Figure: panels labeled Max and Sum]
• Sum or max over non-overlapping / overlapping regions
• Role of pooling:
• Invariance to small transformations
• Larger receptive fields (neurons see more of input)
Pooling Layer
• Insertion of pooling layer:
– reduce the spatial size of the representation
– reduce the number of parameters and amount of computation in the network, and
hence also control overfitting.
• The Pooling Layer operates independently on every depth slice of
the input and resizes it spatially, using the MAX operation.
• The most common form is a pooling layer with filters of size 2x2
applied with a stride of 2, which downsamples every depth slice in the
input by 2 along both width and height.
• The MAX operation would in this case take a max over 4 numbers (little 2x2
region in some depth slice).
• The depth dimension remains unchanged.
General pooling layer
• Accepts a volume of size W1×H1×D1
• Requires two hyperparameters:
– the spatial extent F
– the stride S
• Produces a volume of size W2×H2×D2 where:
– W2=(W1−F)/S+1
– H2=(H1−F)/S+1
– D2=D1
• Introduces zero parameters
• Other pooling functions: Average pooling, L2-norm pooling
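A minimal sketch of the general pooling layer applied to one depth slice, with the output size (W1−F)/S+1 computed as above. The function name and the sample values are hypothetical. In a full layer the same operation would run independently on each depth slice, leaving the depth unchanged.

```python
def pool2d(depth_slice, f=2, stride=2, op=max):
    """Pool one depth slice with an f x f window; there are no learned parameters."""
    h, w = len(depth_slice), len(depth_slice[0])
    out_h = (h - f) // stride + 1
    out_w = (w - f) // stride + 1
    return [
        [
            op(
                [
                    depth_slice[r * stride + i][c * stride + j]
                    for i in range(f)
                    for j in range(f)
                ]
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def mean(values):
    return sum(values) / len(values)


slice_4x4 = [[1, 1, 2, 4], [5, 6, 7, 8], [3, 2, 1, 0], [1, 2, 3, 4]]
max_pooled = pool2d(slice_4x4)            # 2x2 max pooling, stride 2
avg_pooled = pool2d(slice_4x4, op=mean)   # average pooling on the same windows
```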
General pooling
Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale
image recognition." arXiv preprint arXiv:1409.1556 (2014).
Fully-connected layer
• Neurons in a fully connected layer have full connections to all
activations in the previous layer
• Their activations can hence be computed with a matrix
multiplication followed by a bias offset.
• Converting FC layers to CONV layers
• the only difference between FC and CONV layers is that the
neurons in the CONV layer are connected only to a local
region in the input, and that many of the neurons in a CONV
volume share parameters.
• However, the neurons in both layers still compute dot
products, so their functional form is identical.
Converting FC layers to CONV layers
• For any CONV layer there is an FC layer that implements the same forward
function.
• The weight matrix would be a large matrix that is mostly zero except for at
certain blocks (due to local connectivity) where the weights in many of the
blocks are equal (due to parameter sharing).
• Conversely, any FC layer can be converted to a CONV layer.
• For example, an FC layer with K=4096 that is looking at some input volume
of size 7×7×512 can be equivalently expressed as a CONV layer with
F=7, P=0, S=1, K=4096.
• In other words, we are setting the filter size to be exactly the size of the
input volume, and hence the output will simply be 1×1×4096 since only a
single depth column “fits” across the input volume, giving identical result
as the initial FC layer.
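The equivalence in the example can be checked with simple arithmetic: both layers hold the same number of weights, and the CONV version yields a 1×1×4096 output. The function names are hypothetical, and one bias per output neuron or filter is assumed.

```python
def fc_param_count(in_w, in_h, in_d, k):
    """FC layer: every one of K neurons connects to the whole input volume, plus a bias."""
    return in_w * in_h * in_d * k + k


def conv_param_count(f, in_d, k):
    """CONV layer: K filters of size F x F x in_d, plus one bias each."""
    return f * f * in_d * k + k


def conv_output_volume(in_w, in_h, f, p, s, k):
    """Output volume of a CONV layer with filter size F, padding P, stride S, K filters."""
    return ((in_w + 2 * p - f) // s + 1, (in_h + 2 * p - f) // s + 1, k)


fc_params = fc_param_count(7, 7, 512, 4096)
conv_params = conv_param_count(7, 512, 4096)
out_volume = conv_output_volume(7, 7, f=7, p=0, s=1, k=4096)
```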
ConvNet Architectures
Layer Patterns
• The most common architecture
• stacks a few CONV-RELU layers,
• follows them with POOL layers,
• and repeats this pattern until the image has been merged spatially
to a small size.
• At some point, it is common to transition to fully-connected layers.
The last fully-connected layer holds the output, such as the class
scores. In other words, the most common ConvNet architecture
follows the pattern:
INPUT -> [[CONV -> RELU]*N -> POOL?]*M ->[FC -> RELU]*K -> FC
• N >= 0 (and usually N <= 3), M >= 0, K >= 0
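As an illustration, the pattern above can be expanded into a concrete layer list for given N, M, and K. The function name is hypothetical, and the optional POOL (the "?" in the pattern) is modeled with a boolean flag.

```python
def convnet_pattern(n, m, k, pool=True):
    """Expand INPUT -> [[CONV -> RELU]*N -> POOL?]*M -> [FC -> RELU]*K -> FC."""
    layers = ["INPUT"]
    for _ in range(m):
        layers += ["CONV", "RELU"] * n
        if pool:
            layers.append("POOL")
    layers += ["FC", "RELU"] * k
    layers.append("FC")  # the last fully-connected layer holds the class scores
    return layers


linear_classifier = convnet_pattern(n=0, m=0, k=0)
small_net_layers = " -> ".join(convnet_pattern(n=2, m=1, k=1))
```

With N = M = K = 0 the pattern reduces to INPUT -> FC, a linear classifier.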
Prefer a stack of small-filter CONV layers to one CONV layer with a large receptive field.
Compare three layers of 3x3 CONV with a single CONV layer with 7x7
receptive fields.
• The receptive field size is identical in spatial extent (7x7), but
the single 7x7 layer has several disadvantages.
1. Its neurons would be computing a linear function over the input,
while the three stacked CONV layers contain non-linearities that
make their features more expressive.
2. If we suppose that all the volumes have C channels, the single 7x7
CONV layer would contain C×(7×7×C)=49C^2 parameters, while the
three 3x3 CONV layers would contain 3×(C×(3×3×C))=27C^2
parameters.
• Intuitively, stacking CONV layers with tiny filters as opposed to
having one CONV layer with big filters allows us to express
more powerful features of the input, and with fewer
parameters.
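The comparison can be verified numerically with a small sketch. The function names are hypothetical, the receptive-field formula assumes stride-1 layers, biases are ignored as in the text, and C = 64 is an arbitrary example value.

```python
def stacked_receptive_field(filter_size, num_layers):
    """Receptive field of num_layers stacked stride-1 CONV layers with the same filter size."""
    return 1 + num_layers * (filter_size - 1)


def conv_weight_count(filter_size, channels):
    """Weights in one CONV layer with C input and C output channels (biases ignored)."""
    return channels * (filter_size * filter_size * channels)


C = 64
rf_three_3x3 = stacked_receptive_field(3, 3)
single_7x7 = conv_weight_count(7, C)
three_3x3 = 3 * conv_weight_count(3, C)
```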
Recent Departures
• The conventional paradigm of a linear list of layers
has recently been challenged, in
1. Google’s Inception architectures
2. current (state of the art) Residual Networks from
Microsoft Research Asia.
• Both of these feature more intricate and different
connectivity structures.
DIABETIC RETINOPATHY
LEARNING OBJECTIVES
• Recognize the importance of diabetic retinopathy as a public
health problem
• Discuss diabetic retinopathy as a leading cause of blindness in
developed countries
• Identify the risk factors for diabetic retinopathy
• Describe and distinguish between the stages of diabetic
retinopathy
• Understand the role of risk factor control and annual dilated eye
exams in the prevention of vision loss
DIABETES MELLITUS
Diabetes Mellitus is a group of diseases characterized by high blood glucose
levels. Diabetes results from defects in the body's ability to produce and/or use
insulin.
• Type 1 diabetes is usually diagnosed in children and young adults, and was
previously known as juvenile diabetes. In type 1 diabetes, the body does not
produce insulin. 5% of people with diabetes have this form of the disease.
• In Type 2 diabetes, either the body does not produce enough insulin or the
cells ignore the insulin. This is the most common form of diabetes.
DIABETIC RETINOPATHY (DR)
DEFINITION
• Progressive dysfunction of the retinal blood vessels
caused by chronic hyperglycemia.
• DR can be a complication of diabetes type 1 or
diabetes type 2.
• Initially, DR is asymptomatic; if not treated, though, it
can cause low vision and blindness.
http://www.mdconsult.com/das/book/pdf/282715756-3/978-0-323-04332-8/4-u1.0-B978-0-323-04332-8..00092-5..DOCPDF.pdf?isbn=978-0-323-04332-8&eid=4-u1.0-B978-0-323-04332-8..00092-5..DOCPDF
WHAT IS THE RETINA?
• The retina is a multilayered, light sensitive neural tissue
lining the inner eye ball. Light is focused onto the retina
and then transmitted to the brain through the optic
nerve.
• The macula is a highly sensitive area in the center of
the retina, responsible for central vision. The macula is
needed for reading, recognizing faces and executing
other activities that require fine, sharp vision.
RETINA
[Figure: Healthy Retina vs. Diabetic Retinopathy]
DIABETIC RETINOPATHY
EPIDEMIOLOGY
• Duration of diabetes
• Poor Blood Sugar control
• Hypertension (HTN)
• Hyperlipidemia
• Barriers to care
http://jama.ama-assn.org/content/304/6/649.short?rss=1
The Effect of Intensive Diabetes Treatment
On the Progression of Diabetic Retinopathy
In Insulin-Dependent Diabetes Mellitus
http://one.aao.org/CE/PracticeGuidelines/PPP_Content.aspx?cid=d0c853d3-219f-487b-a524-326ab3cecd9a
HOW DIABETES CAUSES VISION LOSS
[Flowchart boxes: Macular edema; Clinically significant macular edema; Preproliferative DR; Proliferative DR; Vitreous hemorrhage and/or Retinal detachment and/or neovascular glaucoma]
PATHOPHYSIOLOGY
Diabetic Retinopathy is a microvasculopathy that
causes:
• Retinal capillary occlusion
• Retinal capillary leakage
MICROVASCULAR OCCLUSION
Microvascular occlusion is caused by:
• Thickening of capillary basement membranes
• Abnormal proliferation of capillary endothelium
• Increased platelet adhesion
• Increased blood viscosity
• Defective fibrinolysis
[Flowchart boxes: Ischemia; Infarction; Increased VEGF; Neovascularization; Vitreous hemorrhage; Neovascular glaucoma; Fibrovascular bands; Tractional retinal detachment]
Source: Retina in Systemic Disease: A Color Manual of Ophthalmoscopy / Homayoun Tabandeh, Morton F. Goldberg, 2009
MICROVASCULAR LEAKAGE
Microvascular leakage is caused by:
• Impairment of endothelial tight junctions
• Loss of pericytes
• Weakening of capillary walls
• Elevated levels of vascular endothelial growth factor (VEGF)
[Flowchart boxes: Retinal hemorrhage; Edema; Hard exudates]
RECOMMENDED EYE EXAMINATION SCHEDULE
Diabetic Eye Disease Key Points
Diabetes Type | Recommended Time of First Examination | Recommended Follow-up*
Microaneurysms
MODERATE NONPROLIFERATIVE DIABETIC
RETINOPATHY (NPDR)
Characteristics
• More than just microaneurysms but less than severe NPDR
[Figure labels: microaneurysm, hard exudates, flame-shaped hemorrhage]
SEVERE NONPROLIFERATIVE
DIABETIC RETINOPATHY (NPDR)
Any of the following:
• More than 20 intraretinal hemorrhages in each of four
quadrants
• Definite venous beading in two or more quadrants
• Prominent Intraretinal Microvascular Abnormalities
(IRMA) in one or more quadrants
• And no signs of proliferative retinopathy
[Figure label: venous beading]
Proliferative Diabetic Retinopathy (PDR)
Characteristics
• Neovascularization
• Vitreous/preretinal hemorrhage
[Figure labels: cotton-wool spot, neovascularization, hard exudate, blot hemorrhage]
HIGH-RISK PROLIFERATIVE DIABETIC
RETINOPATHY
Basic and Clinical Science Course, Section 12: Retina and Vitreous AAO
DIABETIC MACULAR EDEMA
• Diabetic macular edema is the leading cause of legal
blindness in diabetics.
• Diabetic macular edema can be present at any stage of
the disease, but is more common in patients with
proliferative diabetic retinopathy.
Meta-analysis and review on the effect of bevacizumab in diabetic macular edema
Graefes Arch Clin Exp Ophthalmol (2011) 249:15-27
Why is Diabetic macular edema so important?
• The macula is responsible for central vision.
• Diabetic macular edema may be asymptomatic at
first. As the edema moves into the fovea (the center
of the macula), the patient will notice blurry central
vision. The ability to read and recognize faces will be
compromised.
[Figure: Normal vs. Macular Edema; labels: Macula, Fovea]
CLINICALLY SIGNIFICANT MACULAR EDEMA
(CSME)
• Thickening of the retina at or within 500 µm of the
center of the macula.
• Hard exudates at or within 500 µm of the center of the
macula, if associated with thickening of the adjacent
retina.
• Area of retinal thickening 1 disc area or larger, within 1
disc diameter of the center of the macula.
ETDRS
INTERNATIONAL CLINICAL DIABETIC MACULAR EDEMA
DISEASE SEVERITY SCALE
Secondary prevention
Annual eye exams
Tertiary prevention
Retinal Laser photocoagulation
Vitrectomy
DIABETIC RETINOPATHY TREATMENT
Diabetic Retinopathy is
preventable through strict
glycemic control and annual
dilated eye exams by an
ophthalmologist.
The Guerrilla Eye Service of the UPMC Eye Center is dedicated
to eliminating barriers to eye care for patients in the Western
Pennsylvania area.
Self-driving cars
Question
How would you define a self-driving car?
Definition: What is an autonomous car?
● Autonomous Car: A driverless vehicle capable of fulfilling the main
transportation capabilities of a traditional car.
Classifications of Autonomy according to the NHTSA.
● Level 0: The driver completely controls the vehicle at all times.
● Level 1: Individual vehicle controls are automated, such as electronic stability
control or automatic braking.
● Level 2: At least two controls can be automated in unison, such as adaptive
cruise control in combination with lane keeping.
● Level 3: The driver can fully cede control of all safety-critical functions in certain
conditions and the car provides a "sufficiently comfortable transition time" for the
driver to do so.
● Level 4: The vehicle performs all safety-critical functions for the entire trip, with
the driver not expected to control the vehicle at any time.
Purpose
What kinds of things does a self-driving car need to be able to do?
Purpose
● navigate to a given destination based on passenger-provided instructions
● The range finder mounted on the top is a Velodyne 64-beam laser. This laser
allows the vehicle to generate a detailed 3D map of its environment.
● The car uses data collected from these mechanisms to drive itself.
Google’s Technology
How it works: Lidar system
● Laser + radar
● The system detects obstacles and tells the car when to avoid them to
navigate safely.
● It uses a 3D point cloud output to provide the necessary data for robot software
to determine where potential obstacles exist in the environment and where
the car is located relative to those obstacles.
How it works: Velodyne
● The company started experimenting with laser distance measurement in 2005 with the DARPA
Grand Challenge
● Since then, they have vastly reduced the size of the sensor and weight while
improving its performance.
● It is a premier lidar system
How does communication among driverless cars
work?
● vehicles and roadside units as the communicating nodes
○ DSRC devices: 5.9 GHz band with bandwidth of 75 MHz, range of 1000 m
Communication among driverless cars cont.
● Smart intersections
Bosch Peugeot
Nissan Uber
Renault Google
Toyota Tesla
Mercedes Benz
Audi
Tesla’s Current Auto Pilot
Potential advantages
● being able to get things done while in traffic or on the road
● fewer traffic collisions. Experts estimate 300,000 lives can be saved per
decade
Potential obstacles
● Software reliability
● Loss of privacy
Legislation
In the United States, state vehicle codes generally do not envisage — but do not
necessarily prohibit — highly automated vehicles.
Public Opinion
What do you think?
● By 2018, Elon Musk expects Tesla Motors to have developed a mature serial
production version of fully self-driving cars, where the driver can fall asleep
behind the wheel.
Predictions: Possible Developments
● By 2018, Nissan anticipates having a feature that allows the vehicle
to maneuver its way on multi-lane highways.
● By 2020, the head of Google's autonomous car project aims to have all outstanding
problems with the autonomous car resolved.
U.S. SMART SPEAKER CONSUMER ADOPTION REPORT
MARCH 2019
GIVING VOICE TO A REVOLUTION
Table of Contents
Introduction
Smart Speaker Ownership
Smart Speaker Use Cases
Voice Assistants on Smart Phones
Voice App Discovery
Consumer Sentiment about Smart Speakers
Conclusion
Additional Resources

About Voicebot
Voicebot produces the leading online publication, newsletter and podcast focused on the voice and AI industries. Thousands of entrepreneurs, developers, investors, analysts and other industry leaders look to Voicebot each week for the latest news, data, analysis and insights defining the trajectory of the next great computing platform. At Voicebot, we give voice to a revolution.

About Voicify
Voicify is the market leader in voice experience management software that combines voice optimized content management, cross-platform deployment, and voice-specific customer insights. The Voicify Voice Experience Platform™ enables marketers to connect with their customers by creating highly engaging and personalized voice experiences that are automatically deployed to a broad array of voice platforms such as voice assistants (Amazon Alexa, Google Assistant and Microsoft Cortana), chatbots and other services. The platform enables non-technical users to deploy feature-rich voice applications quickly and efficiently while offering the flexibility of unlimited customization.
Voicify.com

Methodology
The survey was conducted online during the first week of January 2019 and was completed by 1,038 U.S. adults age 18 or older that were representative of U.S. Census demographic averages. Because we reached only online adults which represent 89% of the population according to Pew Research Center, some totals are adjusted downward to provide device and usage numbers relevant to the entire adult population. Other findings are relative to device ownership and do not require adjustment.
Regardless, when more than one-in-four consumers are using a device and its voice assistant, the
media, brands, game makers, service providers, independent developers, and even governments are
sure to take notice. This recognition is playing out with more voice apps published. The number of
Alexa skills rose by 2.2 times to nearly 60,000 in the U.S. alone in 2018. During the same period Google
Actions grew at a slightly faster rate of 2.5 times to over 4,000.
A Different Smart Speaker Ecosystem, but the Same Leaders
Voicebot reported in the fall of 2018 that Phase 1 of smart speaker adoption was over and we were entering Phase 2. The second phase is characterized by the influx of more casual users but also by the introduction of new product form factors and new manufacturers.

The most significant of these changes has been the emergence of smart displays. When Amazon was the only manufacturer of these voice-first devices with display screens, adoption was minimal. However, the introduction of Google Assistant enabled smart displays has helped drive sales, including Amazon, as it brought more attention to the product category.

There are also many more manufacturers today than in 2017. Big names in audio such as Bose, Bang & Olufsen, and Klipsch all entered the smart speaker segment in 2018 offering more consumer choice. However, the most significant new smart speaker launch in 2018 was Apple HomePod. That appears to have captured a significant number of new sales in Q1 and Q2, but seems to have tapered off in Q3 and Q4. Although Apple was threatening to break up the smart speaker duopoly, it appears that Amazon and Google enter 2019 nearly as strong as they did in 2018 by maintaining 85% in total installed base market share.

Smart Speakers are Solidly in the Early Majority Market
One way we can put the current state of smart speaker adoption in perspective is to consider a standard technology adoption life cycle first developed in the 1950’s at Iowa State University and popularized in the 1990’s by Geoffrey Moore.

The model posits that about 16% of the user population will be “innovators” and “early adopters” followed by 34% that will be among the “early majority.” With more than 26% population adoption, smart speakers are securely in the “early majority” segment today.

An interesting aspect of moving along the adoption curve is that later adopters have different preferences than early adopters. Two areas of difference are typically placing higher value in broader feature sets and integrations with other devices. You should expect to see smart speaker makers emphasize features, convenience of access, and third-party integrations more in the coming year.
Amazon continued to have the leading installed base of smart speakers in 2018 despite its market share shrinking from about 72% to 61%. Google was a big mover shifting from 18.4% to nearly 24%, accounting for precisely half of Amazon’s market share decline.

The “Other” category was led by Apple and Sonos, and overall the non-Amazon, non-Google device market share rose by 50% over 2017. More than half of this growth is attributed to Apple HomePod which had a strong debut in the first half of 2018, but then tapered off in sales as the year went on.

There were several new smart speakers introduced in 2018 and many focused on sound quality. It appears consumers are open to adding these higher end smart speakers to their device collection as over three-quarters of “Other” category smart speaker owners also report having either an Amazon Echo or Google Home device.

Sonos went public in 2018 and was clear in its investor documents that voice assistant integration was critical to the company’s future competitiveness. However, the inability to launch a Google Assistant enabled speaker may have hurt its appeal with consumers as the company’s overall smart speaker market share fell during the year. We can surmise that most of the Sonos fans that wanted an Alexa-based speaker already bought their device in 2017. As the overall market expanded in 2018, few additional Sonos One devices were purchased and the company’s relative market share fell. Adding Google Assistant support in 2019 may help reverse this market share slide.

U.S. Smart Speaker Market Share by Brand, January 2018 & 2019
Year    Amazon    Google    Other
2019    61.1%     23.9%     15.0%
2018    71.9%     18.4%     9.7%
Source: Voicebot Smart Speaker Consumer Adoption Report Jan 2019
Amazon Echo Dot is the most widely adopted smart speaker by a significant
margin. The sub $50 list price device is frequently available for less than $30 and
refurbished models can be acquired for under $20. This device has proven more
popular than Amazon’s higher priced offerings such as the Echo, Echo Plus, Echo
Spot, and Echo Show.

In the Google portfolio, the Home and Home Mini appear to be equally popular
with 11.2% share each. There are likely to be more Home Minis in use today in
terms of total devices as this analysis reflects the number of users with access to
a device. If you have one Home and three Minis, you are counted as one in each
category. And, this may be common as 87% of Google smart speaker owners
report having both devices.

Apple HomePod and Sonos One lead with smart speaker market share in the
“Other” category. It appears that smart displays with Google Assistant along with
the introduction of Apple HomePod in February 2018 were the key drivers leading
to a 50% growth in this category during the year. Keep in mind that aside from
HomePod, the “Other” category devices all have Alexa or Google Assistant on
board, so the dominance of Amazon and Google voice assistants extends beyond
their own products.

U.S. Smart Speaker Market Share by Device - January 2019
Amazon Echo Dot: 31.4%
Echo or Plus: 23.2%
Google Home: 11.2%
Home Mini: 11.2%
Other: 10.0%
Echo Spot: 3.5%
Amazon Echo Show: 3.0%
Apple HomePod: 2.7%
Sonos One: 2.2%
Home Hub: 1.2%
Home Max: 0.2%
Source: Voicebot Smart Speaker Consumer Adoption Report Jan 2019
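
The counting rule described for the Home and Home Mini (each respondent counts once per device category, regardless of how many units they own) can be illustrated with a minimal sketch. The respondent data below is hypothetical and is not drawn from the survey; it only shows how two models can tie on users with access while differing in units in use.

```python
from collections import Counter

# Hypothetical respondents (NOT survey data): device model -> units owned.
respondents = [
    {"Home": 1, "Home Mini": 3},  # the "one Home and three Minis" case from the text
    {"Home Mini": 1},
    {"Home": 1},
]

users_with_access = Counter()  # what the share figures reflect
units_in_use = Counter()       # total devices, which the share figures do not capture

for owned in respondents:
    for model, units in owned.items():
        users_with_access[model] += 1   # counted once per category
        units_in_use[model] += units    # every unit counted

print("Users with access:", dict(users_with_access))
print("Units in use:", dict(units_in_use))
```

In this made-up sample both models have the same number of users with access, yet there are twice as many Home Minis in use, which is the effect the report suggests may be occurring.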
New Smart Speaker Owners are Less Likely to be Daily Users

Maybe the biggest change in the composition of smart speaker owners is the
influx of more casual users of the devices. Nearly 64% of device owners in
January 2018 reported being daily users. In January 2019, that number fell to only
about 47%. Monthly users were fairly similar with the offsetting difference being
the infrequent users which rose from 13% to over 26%.

This seems like a natural progression. Early adopters of technology are more likely to
incorporate them quickly into their daily habits than consumers that tend to adopt later.
However, this will be a metric to monitor going forward. Three out of four smart speaker
owners still report being monthly active users. As long as we see that type of consistent
usage along with continued growth, smart speakers will continue to grow in importance as a
voice assistant channel for consumer engagement.

U.S. Smart Speaker Frequency of Use
Never or rarely: 12.9% (2018), 26.5% (2019)
Monthly: 23.5% (2018), 26.1% (2019)
Daily: 63.6% (2018), 47.4% (2019)
Source: Voicebot Smart Speaker Consumer Adoption Report Jan 2019

The data indicate that the industry sold about 48 million smart speakers in the U.S. in 2018 bringing the total
in use to about 133 million up from about 85 million at the end of 2017. Of the 19 million new smart speaker
owners, 31% have purchased multiple devices. That compares to 49% of U.S. adults that have owned smart
speakers for more than a year and have multiple devices.

2018: 1 device 65.7%, 2 devices 19.3%, 3 devices 8.0%
2019: 1 device 58.1%, 2 devices 23.2%, 3 devices 14.4%
Source: Voicebot Smart Speaker Consumer Adoption Report Jan 2019
© VOICEBOT.AI - All Rights Reserved 2019
SMART SPEAKER CONSUMER ADOPTION REPORT
For the second straight year, the living room was the most common location for smart speakers.
At just under 45% it was ahead of the bedroom at 37.6% which had about the same percentage as
last year but moved up from third to the second most popular spot. Third place went to the
kitchen. At right around 33%, the kitchen seems to have fallen from favor a bit among smart
speaker owners. It’s still popular, but down from 41% in January 2018.

Most of the other locations were fairly similar to last year with the exception of the home
office which grew by about one-third. As consumers have been adding more smart speakers to
their collection, the home office seems to be a common second location.

Where Consumers Have Smart Speakers
Living Room: 44.4%
Bedroom: 37.6%
Kitchen: 32.7%
Home Office: 14.4%
Dining Room: 6.5%
Bathroom: 6.2%
Garage: 2.3%
Work Office: 2.0%
Note: Multiple responses accepted, numbers total more than 100%
Source: Voicebot Smart Speaker Consumer Adoption Report Jan 2019
Amazon Prime and Gmail Users More Likely to be Smart Speaker Owners

AMAZON PRIME
It will surprise few people that Amazon Prime members are 50% more likely to own a smart
speaker and more likely to own an Echo branded device. Amazon Echo smart speakers command
a 70% market share among Prime members, but, surprisingly, Prime members also adopt Google
Home products at almost a 22% rate. Non-Prime members are more likely to adopt third party
smart speakers made by manufacturers other than Amazon or Google.

GMAIL USERS
Gmail users are also more likely than all users to own a smart speaker, in this case by about
31%. However, Gmail users are no more likely than all users to own a Google Home device. In
fact, they are almost exactly representative of all users when it comes to Amazon, Google, and
third-party branded smart speakers. Whereas a Prime membership and Gmail use suggest a bias
toward early technology adoption, only the Prime membership seems to materially influence
consumer choice of smart speakers.

Frequency Sometimes More Important Than Trial

For the second straight year, asking general questions is the top use case most commonly tried
by smart speaker owners. However, it is not the top use case employed on a monthly or daily
basis. That distinction goes to listening to streaming music services as it did in 2018. Third
place both years was asking about the weather which is followed by Timers and Alarms in the
fourth and fifth positions. Number six in 2019 was listening to the radio.

You may have noticed that four of the top five use cases are what are considered first-party
services. That means they are provided by the voice assistant natively. Two of the top six use
cases involve music which are third-party entertainment services. Positions 7-9 all go to the
more traditional third-party services, many of which were made by independent developers of
Alexa skills and Google Actions. So, the order of use frequency at a category level is
first-party utilities, third-party entertainment, and third-party apps and services.

always. When you look at monthly and daily use, it provides a far more accurate indication of
why consumers are using the devices and in many cases why they may be buying a second or third
smart speaker for the home. For example, nearly one-in-four smart speaker owners say they set
an alarm to play on their smart speaker daily. That would suggest a device location for the
bedroom may become increasingly important.

The biggest variance is smart home control which is ninth in terms of “ever tried” and fourth
for “daily active use.” You must have a smart home device to use this feature so that
automatically eliminates some people from trial. However, controlling lights or thermostats is
already a daily function and if you have smart home devices for these features, then switching
your habits from smartphone app control to voice interaction is a relatively easy change. What
smart speaker and voice assistant developers want to see is consumers using these devices
frequently and incorporating them into daily routines. This not only leads to a higher
perception of value by consumers but also leads to stickiness which means the devices are less
likely to be removed or swapped out by consumers for a competing product.

VOICE COMMERCE
Voice commerce was a mover for a different reason. It had the lowest frequency of use cases
tracked this year. However, it also showed relative growth in monthly active users during
2018. Monthly active users rose 10.5% from 13.6% to 15.0%. This is still a relatively new use
case that consumers are becoming accustomed to, but the growth is indicative of the utility of
shopping by voice. And, this isn’t just users searching for products. The responses were
specific to making purchases.

When it comes to product search, over 40% of users have attempted this use case on smart
speakers and 28% do so monthly. These are figures that are increasingly difficult for
consumer brands to ignore.

Monthly Active U.S. Smart Speaker Voice Commerce Users
2017: 13.6%
2018: 15.0%
Source: Voicebot Smart Speaker Consumer Adoption Report Jan 2019
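
The growth figure for voice commerce is a relative change, not a percentage-point change, and the distinction is easy to check. The sketch below uses only the two rounded chart values; the helper name `relative_growth` is illustrative. With rounded inputs the result is roughly 10.3%, so the report's 10.5% presumably rests on unrounded survey figures (an assumption, since those are not published here).

```python
def relative_growth(old_share: float, new_share: float) -> float:
    """Return the relative change from old_share to new_share, in percent."""
    return (new_share - old_share) / old_share * 100


share_2017 = 13.6  # monthly active voice commerce users, percent (chart value)
share_2018 = 15.0  # monthly active voice commerce users, percent (chart value)

point_change = share_2018 - share_2017           # change in percentage points
growth = relative_growth(share_2017, share_2018)  # change relative to the 2017 base

print(f"Percentage-point change: {point_change:.1f}")
print(f"Relative growth: {growth:.1f}%")
```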
Smart home devices are popular among smart speaker owners. More than 55% of smart speaker
owners say they have at least one smart home device that they control by voice. Of course,
being able to interact by voice with your smart home devices doesn’t mean you are going to
use it as about one-in-five consumers with smart home devices have never tried controlling
them with their smart speaker.

Smart Home Devices Used by U.S. Smart Speaker Owners
Smart TV: 33.3%
Smart Lights: 21.2%
Smart media controller, game console or cable box: 14.4%
Smart Thermostat: 12.4%

Smart speaker owners are 10% more likely to have used a voice assistant on a
smartphone. They are also more likely to be daily users. Of all voice assistant
users on smartphones 27.6% report being daily users. Among smart speaker
owners that figure rises to 39.8%.

One-third of smart speaker owners say after purchasing the device they are using
voice assistants on their smartphones more frequently. Just over 50% say they
are using smartphone-based voice assistants about the same and only 14% say
they are using them less. There is a growing consensus that smart speakers are
displacing time normally spent on smartphones and many people posit that this
will help reduce screen time. An accelerant for reducing screen time may be using
voice assistants on smartphones as well. This reduces the touch, swipe, and look
for many use cases.

Smart speaker owners are about as likely as non-owners to be monthly voice
assistant users on smartphones. However, they are twice as likely to be daily
users. Data is consistently showing that usage of voice on one platform increases
usage of voice on other platforms.

Voice Assistant Use Frequency on Smartphones by Smart Speaker Ownership
Non Smart Speaker Owners: Rarely 49.3%, At least monthly 31.5%, At least daily 19.1%
Smart Speaker Owners: Rarely 26.8%, At least monthly 33.3%, At least daily 39.8%
Source: Voicebot Smart Speaker Consumer Adoption Report Jan 2019

Voice App Discovery
For the second year in a row, just under half of smart speaker owners say they
don’t actually discover new voice apps. One-in-four device owners rely on friends
to introduce them to new voice apps followed by just 15% that note social media
as a discovery channel.

A smaller group of users between 11-14% cite Amazon and Google’s primary
promotion channels as key sources of discovery, such as their in-app and online
stores and weekly emails. Not far behind these channels is advertising which was
less visible in previous years, but now is a source of voice app discovery for
one-in-ten smart speaker owners.

Discovery is the top issue facing third-party voice app publishers today. The voice
assistant user base is growing quickly, but about half of these users are only
discovering first-party solutions provided by the voice assistants themselves
such as Alexa and Google Assistant. Many third-parties are having more trouble
capturing new users. Word-of-mouth appears to be the most effective channel,
but is the hardest to tap into. So, most voice app publishers should focus on a
variety of techniques ranging from news media coverage and social media to
advertising to drive discovery today.

How Smart Speaker Owners Discover Voice Apps
I don’t: 49.7%
Friends: 26.8%
Social media: 15.4%
Alexa skill store / Google Assistant discover section: 13.7%
Email newsletter from Alexa or Google Assistant: 11.1%
Ads / commercials: 10.5%
News media: 7.2%
Other: 2.9%
Source: Voicebot Smart Speaker Consumer Adoption Report Jan 2019

Only 14% of smart speaker owners have ever left a review of a third-party voice
app. This is up from just 11% at the end of 2017 and is a reflection of the fact that
most reviews must be submitted through a visual interface for a user experience
designed for no visual interaction.

Amazon introduced voice ratings in late 2018 which enabled users to offer a star
rating for an Alexa skill by voice after using it. However, this was limited to a few
skills and not available for skill publishers to set as a feature on their own. By
contrast, Google Assistant users are more likely to use the voice assistant both on
smartphones and smart speakers. That multimodal use profile might explain why
Google Home owners are about 11% more likely to have left a review than those
with Amazon Echo devices.

With all of that said, 14% seems like a small number of smart speaker owners
leaving reviews until you consider the fact that only 48.7% say they have even
used a third-party voice app. That means 29% of device owners that have tried a
third-party voice app have left a review. This is a promising figure given the friction
involved in actually leaving a review provided voice app publishers can increase
the proportion of smart speaker owners that try third-party apps.

U.S. Smart Speaker Owners That Have Left a Voice App Review
Once: 6.9%
More than once: 7.2%
Never: 85.9%
Source: Voicebot Smart Speaker Consumer Adoption Report Jan 2019
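
The 29% figure follows from dividing the share of owners who left a review by the share who have tried a third-party voice app. The sketch below reproduces that arithmetic from the published percentages; it assumes that everyone who left a review is also among those who tried a third-party app, which the report implies but does not state outright. Variable names are illustrative.

```python
# Published percentages of all U.S. smart speaker owners.
left_review_once = 6.9
left_review_more_than_once = 7.2
tried_third_party_app = 48.7

# Assumption: every reviewer is also counted among third-party app users.
left_any_review = left_review_once + left_review_more_than_once
review_rate_among_triers = left_any_review / tried_third_party_app * 100

print(f"Owners who left any review: {left_any_review:.1f}%")
print(f"Third-party app users who left a review: {review_rate_among_triers:.0f}%")
```

Because the rate is conditional on trial, raising the share of owners who try third-party apps raises the absolute number of reviews even if the rate itself stays flat.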
Consumer Sentiment
About Smart Speakers
Consumers Want to Be Understood

Two-thirds of consumers say how well a voice assistant understands them is an important
quality. That is followed by sound quality at 54.9%, how much it can do at 50.7%, and the
speed of response at 45.1%. Every other quality is well behind these top four characteristics.

Amazon, Apple, and Google executives have spoken many times about their focus on adding
personality to voice assistants despite the fact that it is considered important by only
15.4% of smart speaker owners. That lower rating may be influenced by the fact that
personality is offered by all of the leading voice assistant providers, but it is clearly
not something having an impact today.

It is also notable that smartphone ownership only influenced smart speaker selection for
about one-in-ten consumers. Apple and Google would like that linkage to be higher given their
dominance of smartphone-based voice assistants worldwide.

What Qualities U.S. Users Value in Smart Speakers
How well it understands me when I speak: 67.0%
Sound quality: 54.9%
How much it can do: 50.7%
How fast it responds: 45.1%
Its personality: 15.4%
Whether it has my favorite media entertainment: 14.4%
Whether the voice assistant is the same as my mobile device: 10.1%
I am not interested in a smart speaker: 9.8%
Whether it has good games: 4.6%
Source: Voicebot Smart Speaker Consumer Adoption Report Jan 2019