
DEEP LEARNING

Lab Manual
For

B.TECH(R-20 Regulation)

(IV YEAR – I SEM)

(2023-2024)

Department of CSE (AI & ML)

CMR ENGINEERING COLLEGE


(Autonomous Institution – UGC, Govt. of India)
Recognized under 2(f) and 12(B) of UGC ACT 1956

(Affiliated to JNTUH, Hyderabad, Approved by AICTE,
Accredited by NBA & NAAC, ISO 9001:2015 Certified)

Kandlakoya, Medchal – 501401, Telangana.

CSE (AI & ML) Department


Vision:
 To produce admirable and competent graduates and experts in CSE (Artificial Intelligence & Machine Learning) through quality technical education, innovation and research, in order to improve the lifestyle and meet the needs of society.

Mission:
 M1: To impart value-based technical education in CSE specific to AI & ML through innovative teaching and learning methods.
 M2: To produce outstanding professionals by imparting quality training, hands-on experience and value-based education.
 M3: To produce competent graduates suitable for industries and organizations at the global level, including research and development with social responsibility.

CSE (AI & ML) Outcomes (POs):


Engineering Graduates will be able to satisfy these NBA graduate attributes:
 Engineering knowledge: An ability to apply knowledge of
computing, mathematics, science and engineering
fundamentals appropriate to the discipline.
 Problem analysis: An ability to analyze a problem, and
identify and formulate the computing requirements
appropriate to its solution.
 Design/development of solutions: An ability to design,
implement, and evaluate a computer-based system,
process, component, or program to meet desired needs
with appropriate consideration for public health and
safety, cultural, societal and environmental
considerations.
 Conduct investigations of complex problems: An ability to
design and conduct experiments, as well as to analyze
and interpret data.
 Modern tool usage: An ability to use current techniques,
skills, and modern tools necessary for computing practice.
 The engineer and society: An ability to analyze the local
and global impact of computing on individuals,
organizations, and society.
 Environment and sustainability: Knowledge of
contemporary issues.
 Ethics: An understanding of professional, ethical, legal,
security and social issues and responsibilities.
 Individual and team work: An ability to function effectively individually and in teams, including diverse and multidisciplinary teams, to accomplish a common goal.
 Communication: An ability to communicate effectively
with a range of audiences.
 Project management and finance: An understanding of
engineering and management principles and apply these
to one’s own work, as a member and leader in a team, to
manage projects.
 Life-long learning: Recognition of the need for, and an ability to engage in, continuing professional development.

CSE (AI & ML) Program Educational Outcomes


(PEOs)
PEO1. Excel in professional career and higher education
by acquiring knowledge of mathematical computing and
engineering principles.
PEO2. To provide intellectual environment to successfully
pursue higher education specific to AI & ML.
PEO3. To impart knowledge of cutting-edge Artificial Intelligence technologies on par with industrial standards.
PEO4. To create an atmosphere to explore research areas and produce outstanding contributions in various areas of Artificial Intelligence and Machine Learning.

CSE (AI & ML) Program specific Outcome (PSOs)

1. Professional Skills and Foundations of Software


development: Ability to use knowledge in emerging
technologies in identifying research gaps and provide
solutions with innovative ideas.
2. Applications of Computing and Research Ability: Ability
to analyze the problem to provide optimal solution by
fundamental knowledge and skills in Professional,
Engineering Sciences.

AI704PC: DEEP LEARNING LAB


B.Tech. IV Year I Sem. L T P C
0 0 3 1.5

Prerequisite:
• Any programming language Python/R/Matlab
Course Objectives:
• To implement ANN and Deep Learning concepts

List of Experiments:
1. Implementation of MP model
2. Implementation of feed forward neural network
3. Implementation of back propagation neural network
4. Implement all activation function of neural network for any pattern recognition
application
5. Implement any one of ImageNet or GoogLeNet
6. Implement a system to recognize hand written character using CNN
7. Classify images appropriately using CNN
8. Implement LSTM Neural Network for Time Series Prediction

1. Implementation of MP model.

DESCRIPTION:
MP Neuron model:
The McCulloch-Pitts (MP) neuron model is based on the theory put forward by Warren McCulloch and Walter Pitts in 1943. It is loosely modelled on a biological neuron, whose parts (axon, soma, dendrites, etc.) are mapped onto the functioning of the artificial neuron.

The MP neuron model is described below in terms of the "six jars" of machine learning: data, task (classification), model, loss function, learning and evaluation (accuracy).

1) Data-
The MP neuron takes binary inputs and gives a binary output. If the input data is not binary, it must be converted to binary before it is fed to the model.
2) Classification-
The task is binary classification: the output is 0 or 1. The model gives a yes or no answer based on the input and the threshold.

3) Model-
The model is a function with a single parameter, the threshold. The binary inputs are first aggregated by a function g (a simple sum). If the aggregated value is greater than or equal to the threshold, the model gives a positive output (1); otherwise it gives a negative output (0).

Geometrically, the MP neuron draws a line: inputs whose sum lies on or above the line are classified as positive, while inputs below the line are classified as negative.

4) Loss function-

The squared loss is used: the loss is the square of the difference between the predicted value and the actual value.

5) Learning-

Learning in the MP neuron consists of finding the threshold value with the lowest prediction error. Because there is only a single parameter, this can be done by brute-force search.

6) Accuracy-

Accuracy is the standard metric: the number of correct predictions divided by the total number of predictions.

In summary, the MP neuron finds a line that separates the positive examples from the negative ones.

Algorithm Implementation
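A minimal Python sketch of this algorithm is given below; the toy dataset and the helper names mp_neuron and fit_threshold are illustrative assumptions, not a prescribed solution. It aggregates the binary inputs, compares the sum with a threshold b, and searches by brute force for the threshold with the highest accuracy.

import numpy as np

def mp_neuron(x, b):
    # Output 1 if the sum of the binary inputs reaches the threshold b, else 0
    return int(np.sum(x) >= b)

def fit_threshold(X, y):
    # Brute-force search: try every possible threshold and keep the most accurate one
    best_b, best_acc = 0, 0.0
    for b in range(X.shape[1] + 1):
        y_pred = np.array([mp_neuron(x, b) for x in X])
        acc = np.mean(y_pred == y)          # accuracy = correct / total predictions
        if acc > best_acc:
            best_b, best_acc = b, acc
    return best_b, best_acc

# Toy binary dataset (assumed for illustration): 4 binary features per sample
X = np.array([[1, 1, 0, 1], [0, 0, 1, 0], [1, 1, 1, 1], [0, 1, 0, 0]])
y = np.array([1, 0, 1, 0])
b, acc = fit_threshold(X, y)
print("Best threshold:", b, "training accuracy:", acc)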
2. Feed-forward neural network implementation

DESCRIPTION:

Implement a feed-forward neural network in Python.

Import libraries

# numpy contains a variety of mathematical functions, including random number
# generators, linear algebra procedures, Fourier transforms, and more
import numpy as num
from sklearn import datasets

Create sample weights

Weights describe the strength of a neural connection. In this example they will be initialized to small random values (and the biases to zero).
num.random.seed(0)
X, y = datasets.make_moons(200, noise=0.20)# Generating and plotting
the dataset
# nodes
inputlayer_dimensionality = 4
outputlayer_dimensionality = 3
hiddenlayer_dimensionality = 6

In the above code:

 We generated and plotted the dataset.
 We defined our neural network architecture in terms of three layers: input, hidden and output.
 Each layer was given a dimensionality. These dimensions are used later to build the weight matrices that compute the weighted sums of neurons.

Include weights
a1 = num.random.randn(inputlayer_dimensionality,
hiddenlayer_dimensionality)# weights for layer 1
c1 = num.zeros((1, hiddenlayer_dimensionality))# bias for layer 1
a2= num.random.randn(hiddenlayer_dimensionality,
hiddenlayer_dimensionality)# weights for layer 2
c2 = num.zeros((1, hiddenlayer_dimensionality))# bias for layer 2

a3= num.random.randn(hiddenlayer_dimensionality,
outputlayer_dimensionality)# weights for layer 3
c3 = num.zeros((1, outputlayer_dimensionality))# bias for layer 3

Note that the weighted sum of a neuron is the dot product of its inputs with the weights, plus the bias element.

In the above code:

 The weights a1, a2 and a3 (with the biases c1, c2 and c3) are used to calculate the weighted sums of the neurons in the first, second and output layers respectively.
 A softmax function is applied in the last layer. The list of numbers sent to this function is transformed into a list of probabilities proportional to the numbers in the list.
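The forward pass that actually uses these weights and the softmax output is not shown above. A minimal sketch is given below; the forward function name and the tanh hidden activation are assumptions, and note that make_moons produces only 2 features and 2 classes, so inputlayer_dimensionality and outputlayer_dimensionality would both need to be set to 2 for the matrix shapes to agree.

def forward(X):
    z1 = X.dot(a1) + c1            # weighted sum for layer 1
    h1 = num.tanh(z1)              # tanh activation (an illustrative choice)
    z2 = h1.dot(a2) + c2           # weighted sum for layer 2
    h2 = num.tanh(z2)
    z3 = h2.dot(a3) + c3           # weighted sum for the output layer
    exp_scores = num.exp(z3 - z3.max(axis=1, keepdims=True))
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)   # softmax probabilities

probs = forward(X)                 # one row of class probabilities per sample
print(probs[:3])                   # each row sums to 1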
3. Implementation of back propagation neural network

DESCRIPTION:

What is Backpropagation Neural Network(BPN)?

BPN was introduced by Rumelhart, Hinton and Williams in 1986. The core concept of BPN is to backpropagate, or spread, the error from the output layer back to the internal hidden layers in order to tune the weights and reduce the error rate. It can be seen as fine-tuning the weights of the neural network in each iteration. Proper tuning of the weights minimises the loss and yields a more robust, generalizable trained network.

# Import Libraries
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

Load Dataset
Let's first load the Iris dataset using the load_iris() function of the scikit-learn library and separate it into features and target labels. This dataset has three classes: Iris-setosa, Iris-versicolor and Iris-virginica.
# Load dataset
data = load_iris()
# Get features and target
X=data.data
y=data.target

Prepare Dataset
Create dummy variables for class labels using get_dummies() function
# Get dummy variable
y = pd.get_dummies(y).values

y[:3]

array([[1, 0, 0],

[1, 0, 0],

[1, 0, 0]], dtype=uint8)

Split train and test set


To understand model performance, dividing the dataset into a training set and a test set is a good strategy.
Let's split the dataset using the train_test_split() function. You need to pass three main parameters: features, target, and test set size. Additionally, you can set random_state to obtain the same train and test split every time.
#Split data into train and test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=20,
random_state=4)
Initialize Hyperparameters and Weights
Let's initialize the hyperparameters: the learning rate, the number of iterations, the input size, the number of hidden neurons, and the number of output neurons.
# Initialize variables
learning_rate = 0.1
iterations = 5000
N = y_train.size

# number of input features
input_size = 4

# number of neurons in the hidden layer
hidden_size = 2

# number of neurons at the output layer
output_size = 3

results = pd.DataFrame(columns=["mse", "accuracy"])

Let's initialize the weights for the hidden and output layers with random values.
# Initialize weights
np.random.seed(10)

# initializing weights for the hidden layer
W1 = np.random.normal(scale=0.5, size=(input_size, hidden_size))

# initializing weights for the output layer
W2 = np.random.normal(scale=0.5, size=(hidden_size, output_size))

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def mean_squared_error(y_pred, y_true):
    return ((y_pred - y_true)**2).sum() / (2 * y_pred.size)

def accuracy(y_pred, y_true):
    acc = y_pred.argmax(axis=1) == y_true.argmax(axis=1)
    return acc.mean()
for itr in range(iterations):
    # feedforward propagation
    # on hidden layer
    Z1 = np.dot(X_train, W1)
    A1 = sigmoid(Z1)

    # on output layer
    Z2 = np.dot(A1, W2)
    A2 = sigmoid(Z2)

    # Calculating error
    mse = mean_squared_error(A2, y_train)
    acc = accuracy(A2, y_train)
    # DataFrame.append is removed in newer pandas versions; use concat instead
    results = pd.concat([results, pd.DataFrame([{"mse": mse, "accuracy": acc}])],
                        ignore_index=True)

    # backpropagation
    E1 = A2 - y_train                  # error at the output layer
    dW1 = E1 * A2 * (1 - A2)           # delta for the output layer

    E2 = np.dot(dW1, W2.T)             # error propagated back to the hidden layer
    dW2 = E2 * A1 * (1 - A1)           # delta for the hidden layer

    # weight updates
    W2_update = np.dot(A1.T, dW1) / N
    W1_update = np.dot(X_train.T, dW2) / N

    W2 = W2 - learning_rate * W2_update
    W1 = W1 - learning_rate * W1_update

results.mse.plot(title="Mean Squared Error")

Output: a plot of the mean squared error over the training iterations.
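To complete the experiment, the trained weights can also be checked on the held-out test set produced by train_test_split. A short sketch (not part of the original listing) is:

# Forward pass on the test set with the trained weights
Z1 = np.dot(X_test, W1)
A1 = sigmoid(Z1)
Z2 = np.dot(A1, W2)
A2 = sigmoid(Z2)

test_acc = accuracy(A2, y_test)
print("Test accuracy: {:.2f}".format(test_acc))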
4. Implement all activation function of neural network for any pattern
recognition application

DESCRIPTION:

Activation functions are a critical part of the design of a neural network.


The choice of activation function in the hidden layer will control how well the
network model learns the training dataset. The choice of activation function in
the output layer will define the type of predictions the model can make.



Activation Functions
An activation function in a neural network defines how the weighted sum of the
input is transformed into an output from a node or nodes in a layer of the
network.
Sometimes the activation function is called a “transfer function.” If the output
range of the activation function is limited, then it may be called a “squashing
function.” Many activation functions are nonlinear and may be referred to as the
“nonlinearity” in the layer or the network design.
The choice of activation function has a large impact on the capability and
performance of the neural network, and different activation functions may be
used in different parts of the model.

Technically, the activation function is used within or after the internal processing
of each node in the network, although networks are designed to use the same
activation function for all nodes in a layer.
# example plot for the relu activation function
from matplotlib import pyplot

# rectified linear function
def rectified(x):
    return max(0.0, x)

# define input data
inputs = [x for x in range(-10, 10)]
# calculate outputs
outputs = [rectified(x) for x in inputs]
# plot inputs vs outputs
pyplot.plot(inputs, outputs)
pyplot.show()
Running the example calculates the outputs for a range of values and creates a
plot of inputs versus outputs.

We can see the familiar kink shape of the ReLU activation function.
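The experiment asks for all of the common activation functions, not only ReLU. A sketch that plots sigmoid, tanh, ReLU and leaky ReLU on one figure is given below; the grid of subplots and the chosen set of functions are illustrative choices.

import numpy as np
from matplotlib import pyplot

x = np.linspace(-10, 10, 200)
activations = {
    "sigmoid": 1 / (1 + np.exp(-x)),
    "tanh": np.tanh(x),
    "relu": np.maximum(0.0, x),
    "leaky relu": np.where(x > 0, x, 0.01 * x),
}

fig, axes = pyplot.subplots(2, 2, figsize=(8, 6))
for ax, (name, y) in zip(axes.ravel(), activations.items()):
    ax.plot(x, y)          # plot each activation over the same input range
    ax.set_title(name)
pyplot.tight_layout()
pyplot.show()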
5. Implement any one of ImageNet or GoogLeNet

Deep Learning

Deep Learning is a machine learning field concerned with utilising Artificial Neural Networks (ANNs) to solve computer vision tasks such as image classification, object detection, and pose estimation.

Various configurations of ANNs, such as convolutional neural networks (CNN), recurrent neural networks (RNN) and deep neural networks (DNN), can extract features from various data formats such as text, images and videos.

GoogLeNet

The GoogLeNet architecture consists of 22 layers (27 layers including pooling layers), and among these layers are a total of 9 inception modules.

The conventional GoogLeNet architecture is summarised in a configuration table with one row per layer or component. The characteristics and columns of this table are described below.

Characteristics and features of the GoogLeNet configuration table:

 The input layer of the GoogLeNet architecture takes in an image of dimension 224 x 224.

 Type: Refers to the name of the current layer or component within the architecture.

 Patch Size: Refers to the size of the sweeping window used across conv and pooling layers. Sweeping windows have equal height and width.

 Stride: Defines the amount of shift the filter/sliding window takes over the input image.

 Output Size: The resulting output dimensions (height, width, number of feature maps) of the current architecture component after the input is passed through the layer.

 Depth: Refers to the number of levels/layers within an architecture component.

 #1x1 #3x3 #5x5: Refers to the various convolution filters used within the inception module.

 #3x3 reduce #5x5 reduce: Refers to the number of 1x1 filters used before the corresponding convolutions.

 Pool Proj: The number of 1x1 filters used after pooling within an inception module.

 Params: Refers to the number of weights within the current architecture component.

 Ops: Refers to the number of mathematical operations carried out within the component.
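No code listing is given above, so a minimal Keras sketch of a single inception module is shown here. The filter counts follow the "inception (3a)" row of the configuration table; treat this as an illustrative building block, not a full GoogLeNet implementation.

import tensorflow as tf
from tensorflow.keras import layers

def inception_module(x, f1, f3_reduce, f3, f5_reduce, f5, pool_proj):
    # 1x1 convolution branch
    branch1 = layers.Conv2D(f1, (1, 1), padding="same", activation="relu")(x)
    # 1x1 reduction followed by a 3x3 convolution
    branch3 = layers.Conv2D(f3_reduce, (1, 1), padding="same", activation="relu")(x)
    branch3 = layers.Conv2D(f3, (3, 3), padding="same", activation="relu")(branch3)
    # 1x1 reduction followed by a 5x5 convolution
    branch5 = layers.Conv2D(f5_reduce, (1, 1), padding="same", activation="relu")(x)
    branch5 = layers.Conv2D(f5, (5, 5), padding="same", activation="relu")(branch5)
    # 3x3 max pooling followed by a 1x1 projection (pool proj)
    pool = layers.MaxPooling2D((3, 3), strides=(1, 1), padding="same")(x)
    pool = layers.Conv2D(pool_proj, (1, 1), padding="same", activation="relu")(pool)
    # Concatenate all branches along the channel axis
    return layers.concatenate([branch1, branch3, branch5, pool])

inputs = tf.keras.Input(shape=(224, 224, 3))
x = inception_module(inputs, 64, 96, 128, 16, 32, 32)   # inception (3a) filter counts
model = tf.keras.Model(inputs, x)
model.summary()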
6. Implement a system to recognize hand written character using CNN
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from mnist import MNIST

# Point the loader at the extracted EMNIST files and select the "balanced" split
mndata = MNIST('./emnist_data')
mndata.select_emnist('balanced')

# Load the training images and labels
images, labels = mndata.load_training()
n_labels = np.array(labels)
n_images = np.array(images)

# Each image is a flat vector of 784 pixels; reshape one and display it
single_image = n_images[1345].reshape(28, 28)
plt.imshow(single_image)
plt.show()
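The listing above only loads and displays a single EMNIST image. A possible continuation that actually trains a character-recognition CNN is sketched below; the layer sizes and training settings are assumptions, and the EMNIST "balanced" split has 47 character classes.

# Reshape the flat 784-pixel vectors into 28x28 images and normalise the pixels
X = n_images.reshape(-1, 28, 28, 1).astype("float32") / 255.0
y = tf.keras.utils.to_categorical(n_labels, num_classes=47)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(47, activation="softmax"),   # one output per character class
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=128, validation_split=0.1)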

7. Classify images appropriately using CNN


Image Classification using Convolutional Neural
Network with Python
One of the most popular deep neural network architectures is the Convolutional Neural Network (CNN).

A convolutional neural network (CNN) is a type of Artificial Neural Network (ANN) used in image recognition and processing; it is specifically designed for processing pixel data.


Before moving further, we need to understand what a neural network is.

Neural Network:
A neural network is constructed from several interconnected nodes
called “neurons”. Neurons are arranged into the input layer, hidden layer,
and output layer. The input layer corresponds to our predictors/features and
the Output layer to our response variable/s.


Multi-Layer Perceptron (MLP):
A neural network with an input layer, one or more hidden layers, and one output layer is called a multi-layer perceptron (MLP). The underlying perceptron was invented by Frank Rosenblatt in 1957. A typical MLP might have, for example, 5 input nodes, two hidden layers of 5 nodes each, and one output node.

How does this Neural Network work?

– Input layer neurons receive incoming information from the data, which they process and distribute to the hidden layers.

– That information, in turn, is processed by the hidden layers and passed to the output neurons.

– The information in this artificial neural network (ANN) is processed through an activation function, which imitates the behaviour of biological neurons.

– Each neuron has an activation function and a threshold value.

– The threshold value is the minimum value the input must reach for the neuron to be activated.

– The task of each neuron is to compute a weighted sum of all its input signals and apply the activation function to that sum before passing it to the next (hidden or output) layer.

Now we will move forward to see a case study of CNN.

1) Here we import the necessary libraries required for performing the CNN tasks.

import numpy as np
%matplotlib inline
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import tensorflow as tf
tf.compat.v1.set_random_seed(2019)

2) Here is the code required to build the CNN model.

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation="relu", input_shape=(180, 180, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(550, activation="relu"),   # Adding the hidden layers
    tf.keras.layers.Dropout(0.1, seed=2019),
    tf.keras.layers.Dense(400, activation="relu"),
    tf.keras.layers.Dropout(0.3, seed=2019),
    tf.keras.layers.Dense(300, activation="relu"),
    tf.keras.layers.Dropout(0.4, seed=2019),
    tf.keras.layers.Dense(200, activation="relu"),
    tf.keras.layers.Dropout(0.2, seed=2019),
    tf.keras.layers.Dense(5, activation="softmax")   # Adding the output layer
])
A convoluted image can be too large and so it is reduced without
losing features or patterns, so pooling is done.

Here, creating the neural network starts by initializing it with the Sequential model from Keras.

Flatten()- Flattening transforms a two-dimensional matrix of


features into a vector of features.

3) Now let’s see a summary of the CNN model

model.summary()

It will print the following output

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 178, 178, 16)      448
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 89, 89, 16)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 87, 87, 32)        4640
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 43, 43, 32)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 41, 41, 64)        18496
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 20, 20, 64)        0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 18, 18, 128)       73856
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 9, 9, 128)         0
_________________________________________________________________
flatten (Flatten)            (None, 10368)             0
_________________________________________________________________
dense (Dense)                (None, 550)               5702950
_________________________________________________________________
dropout (Dropout)            (None, 550)               0
_________________________________________________________________
dense_1 (Dense)              (None, 400)               220400
_________________________________________________________________
dropout_1 (Dropout)          (None, 400)               0
_________________________________________________________________
dense_2 (Dense)              (None, 300)               120300
_________________________________________________________________
dropout_2 (Dropout)          (None, 300)               0
_________________________________________________________________
dense_3 (Dense)              (None, 200)               60200
_________________________________________________________________
dropout_3 (Dropout)          (None, 200)               0
_________________________________________________________________
dense_4 (Dense)              (None, 5)                 1005
=================================================================
Total params: 6,202,295
Trainable params: 6,202,295
Non-trainable params: 0

4) Now we need to specify the optimizer.

from tensorflow.keras.optimizers import RMSprop, SGD, Adam

adam = Adam(learning_rate=0.001)
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['acc'])

The optimizer is used to reduce the cost calculated by cross-entropy.

The loss function is used to calculate the error.

The metrics term is used to measure the efficiency of the model.

5) In this step, we will see how to set the data directories and generate the image data.

bs = 30   # Setting batch size

train_dir = "D:/Data Science/Image Datasets/FastFood/train/"       # Training directory
validation_dir = "D:/Data Science/Image Datasets/FastFood/test/"   # Testing directory

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# All images will be rescaled by 1./255.
train_datagen = ImageDataGenerator(rescale=1.0/255.)
test_datagen = ImageDataGenerator(rescale=1.0/255.)

# Flow training images in batches using the train_datagen generator.
# flow_from_directory lets the classifier read the labels directly from the
# names of the directories the images lie in.
train_generator = train_datagen.flow_from_directory(train_dir,
                                                    batch_size=bs,
                                                    class_mode='categorical',
                                                    target_size=(180, 180))

# Flow validation images in batches using the test_datagen generator
validation_generator = test_datagen.flow_from_directory(validation_dir,
                                                        batch_size=bs,
                                                        class_mode='categorical',
                                                        target_size=(180, 180))

The output will be:

Found 1465 images belonging to 5 classes.


Found 893 images belonging to 5 classes.

6) Final step: fitting the model.

history = model.fit(train_generator,
validation_data=validation_generator,
steps_per_epoch=150 // bs,
epochs=30,
validation_steps=50 // bs,
verbose=2)

The output will be:

Epoch 1/30
5/5 - 4s - loss: 0.8625 - acc: 0.6933 - val_loss: 1.1741 -
val_acc: 0.5000
Epoch 2/30
5/5 - 3s - loss: 0.7539 - acc: 0.7467 - val_loss: 1.2036 -
val_acc: 0.5333
Epoch 3/30
5/5 - 3s - loss: 0.7829 - acc: 0.7400 - val_loss: 1.2483 -
val_acc: 0.5667
Epoch 4/30
5/5 - 3s - loss: 0.6823 - acc: 0.7867 - val_loss: 1.3290 -
val_acc: 0.4333
Epoch 5/30
5/5 - 3s - loss: 0.6892 - acc: 0.7800 - val_loss: 1.6482 -
val_acc: 0.4333
Epoch 6/30
5/5 - 3s - loss: 0.7903 - acc: 0.7467 - val_loss: 1.0440 -
val_acc: 0.6333
Epoch 7/30
5/5 - 3s - loss: 0.5731 - acc: 0.8267 - val_loss: 1.5226 -
val_acc: 0.5000
Epoch 8/30
5/5 - 3s - loss: 0.5949 - acc: 0.8333 - val_loss: 0.9984 -
val_acc: 0.6667
Epoch 9/30
5/5 - 3s - loss: 0.6162 - acc: 0.8069 - val_loss: 1.1490 -
val_acc: 0.5667
Epoch 10/30
5/5 - 3s - loss: 0.7509 - acc: 0.7600 - val_loss: 1.3168 -
val_acc: 0.5000
Epoch 11/30
5/5 - 4s - loss: 0.6180 - acc: 0.7862 - val_loss: 1.1918 -
val_acc: 0.7000
Epoch 12/30
5/5 - 3s - loss: 0.4936 - acc: 0.8467 - val_loss: 1.0488 -
val_acc: 0.6333
Epoch 13/30
5/5 - 3s - loss: 0.4290 - acc: 0.8400 - val_loss: 0.9400 -
val_acc: 0.6667
Epoch 14/30
5/5 - 3s - loss: 0.4205 - acc: 0.8533 - val_loss: 1.0716 -
val_acc: 0.7000
Epoch 15/30
5/5 - 4s - loss: 0.5750 - acc: 0.8067 - val_loss: 1.2055 -
val_acc: 0.6000
Epoch 16/30
5/5 - 4s - loss: 0.4080 - acc: 0.8533 - val_loss: 1.5014 -
val_acc: 0.6667
Epoch 17/30
5/5 - 3s - loss: 0.3686 - acc: 0.8467 - val_loss: 1.0441 -
val_acc: 0.5667
Epoch 18/30
5/5 - 3s - loss: 0.5474 - acc: 0.8067 - val_loss: 0.9662 -
val_acc: 0.7333
Epoch 19/30
5/5 - 3s - loss: 0.5646 - acc: 0.8138 - val_loss: 0.9151 -
val_acc: 0.7000
Epoch 20/30
5/5 - 4s - loss: 0.3579 - acc: 0.8800 - val_loss: 1.4184 -
val_acc: 0.5667
Epoch 21/30
5/5 - 3s - loss: 0.3714 - acc: 0.8800 - val_loss: 2.0762 -
val_acc: 0.6333
Epoch 22/30
5/5 - 3s - loss: 0.3654 - acc: 0.8933 - val_loss: 1.8273 -
val_acc: 0.5667
Epoch 23/30
5/5 - 3s - loss: 0.3845 - acc: 0.8933 - val_loss: 1.0199 -
val_acc: 0.7333
Epoch 24/30
5/5 - 3s - loss: 0.3356 - acc: 0.9000 - val_loss: 0.5168 -
val_acc: 0.8333
Epoch 25/30
5/5 - 3s - loss: 0.3612 - acc: 0.8667 - val_loss: 1.7924 -
val_acc: 0.5667
Epoch 26/30
5/5 - 3s - loss: 0.3075 - acc: 0.8867 - val_loss: 1.0720 -
val_acc: 0.6667
Epoch 27/30
5/5 - 3s - loss: 0.2820 - acc: 0.9400 - val_loss: 2.2798 -
val_acc: 0.5667
Epoch 28/30
5/5 - 3s - loss: 0.3606 - acc: 0.8621 - val_loss: 1.2423 -
val_acc: 0.8000
Epoch 29/30
5/5 - 3s - loss: 0.2630 - acc: 0.9000 - val_loss: 1.4235 -
val_acc: 0.6333
Epoch 30/30
5/5 - 3s - loss: 0.3790 - acc: 0.9000 - val_loss: 0.6173 -
val_acc: 0.8000

The above call trains the neural network on the training set and evaluates its performance on the validation set. For each epoch it reports two metrics, 'acc' and 'val_acc', which are the prediction accuracy obtained on the training set and the accuracy attained on the validation (test) set respectively.

8. Implement LSTM Neural Network for Time Series Prediction

Why Deep Learning

We know that statistical models work for forecasting time series. However, those methods have some limitations:

1. They need complete data for training. Missing values can cause a very poorly performing model, and even though there are ways to handle missing values, doing it well is hard.

2. They usually deal with univariate datasets and are challenging to apply to multivariate datasets.

3. They are sensitive to missing values.


Deep learning methods are able to deal with the challenges above:

1. Not sensitive to missing values

2. Ease of incorporating exogenous variables (applies to both univariate and multivariate datasets)

3. Capture non-linear feature interactions

4. Automatic feature extraction

I'll briefly explain the key components and concepts of neural network methods and show how to apply them step by step with Keras in Python. For each model, I will follow five steps to show how to use Keras to build a basic neural network to forecast the time series:

1. Preprocessing

2. Define neural network shape and Model Compilation

3. Fit Model

4. Evaluation

5. Visualize prediction

Now, let’s code!

LSTM — one member of the RNN family

LSTM (Long Short-Term Memory) is constructed from four main components: the input gate, the output gate, the memory cell and the forget gate.

 Input Gate: controls the addition of information to the cell state. In other words, the input gate decides which information should be added to the cell state, so that important information is added rather than redundant information or noise.

 Memory Cell: (1) controls whether a value is removed or refreshed; (2) holds values that may need to be preserved as additional information for many later time steps.

 Output Gate: controls selecting useful learned information from the current cell state as an output.

 Forget Gate: controls removing information that is no longer required for the LSTM to learn, or that is less important, from the cell state. This helps optimize the performance of the LSTM network.

(Figure: LSTM cell diagram, modified from the original image at https://en.wikipedia.org/wiki/Long_short-term_memory)

Step1: Data Preprocessing:

LSTMs are sensitive to the scale of the input data. During the
Preprocessing step, I applied MinMaxScaler preprocessing class from
the scikit-learn module to normalize/rescale dataset.

– Import a helper function to create the input matrix.
– Rescale the dataset to the range 0–1.
– Split the dataset into training and test sets, and create the 3-D input shape required by the LSTM (see the sketch below).
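A minimal preprocessing sketch for the steps listed above is shown here; the variable names series (the raw 1-D array of observations), look_back and create_dataset are assumptions.

import numpy as np
from sklearn.preprocessing import MinMaxScaler

def create_dataset(data, look_back=1):
    # Turn a scaled 1-D series into supervised (X, y) pairs of length look_back
    X, y = [], []
    for i in range(len(data) - look_back):
        X.append(data[i:i + look_back, 0])
        y.append(data[i + look_back, 0])
    return np.array(X), np.array(y)

look_back = 7
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(series.reshape(-1, 1))   # rescale the raw series to 0-1

train_size = int(len(scaled) * 0.8)
train, test = scaled[:train_size], scaled[train_size:]
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)

# The LSTM expects a 3-D input of shape (samples, time steps, features)
trainX = trainX.reshape(trainX.shape[0], look_back, 1)
testX = testX.reshape(testX.shape[0], look_back, 1)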

Step 2: Define neural network shape and compile model

Here we use a very simple stacked two-layer LSTM without additional hidden dense layers; a sketch of the model_lstm function used in Step 3 is shown below.
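One possible definition of the model_lstm function used in Step 3 (the layer sizes are assumptions) is:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def model_lstm(look_back):
    model = Sequential()
    model.add(LSTM(64, input_shape=(look_back, 1), return_sequences=True))
    model.add(LSTM(32))
    model.add(Dense(1))                      # one-step-ahead forecast
    model.compile(loss="mean_squared_error", optimizer="adam")
    return model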

Step 3: Fit model

from tensorflow.keras.callbacks import EarlyStopping

model = model_lstm(look_back)

history = model.fit(trainX, trainY, epochs=100, batch_size=30,
                    validation_data=(testX, testY),
                    callbacks=[EarlyStopping(monitor='val_loss', patience=10)],
                    verbose=1, shuffle=False)

A small tip here: usually, it would be nice to try GRU (Gated Recurrent
Units), a “simplified version” of LSTM when LSTM shows overfitting.

Step 4: Model Evaluation

Print out error metrics and generate model loss plot.
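Continuing the sketches above, the evaluation step might look like this: undo the 0–1 scaling, compute MAE and RMSE, and plot the training and validation loss captured in history.

import numpy as np
import matplotlib.pyplot as plt

pred = scaler.inverse_transform(model.predict(testX))       # back to original units
actual = scaler.inverse_transform(testY.reshape(-1, 1))

mae = np.mean(np.abs(pred - actual))
rmse = np.sqrt(np.mean((pred - actual) ** 2))
print("MAE: {:.2f}  RMSE: {:.2f}".format(mae, rmse))

plt.plot(history.history["loss"], label="train loss")
plt.plot(history.history["val_loss"], label="validation loss")
plt.title("Model loss")
plt.legend()
plt.show()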


From the graph above, obviously we overfitted our model since the model
almost didn’t do anything after 20 epochs.

By comparing the error metrics across models in this article, we got even
better results (LSTM shows smaller error metrics). So far, LSTM
outperforms other two models. We forecast about 2 months (65 days, to
be specific) with MAE equals 3744. In other words, the 2 months of
predictive advertising spend would be off about $3744 against the
actually spend. For an average monthly spend equals $2.5MM+. This is a
very good forecast! Also, it’s a big improvement from stats model results
in my another article.

Step 5. Visualizing Prediction


Even though it is unable to perfectly capture every peak and trough, the LSTM does a noticeably better forecasting job than the DNN or RNN models in this use case.
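A simple comparison plot, reusing the pred and actual arrays from the evaluation sketch above, could be:

plt.plot(actual, label="actual")
plt.plot(pred, label="LSTM forecast")
plt.title("LSTM forecast vs. actual")
plt.legend()
plt.show()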
