Deep Learning Lab Manual

Experiment-1
Deep Learning Concepts

1. What is Activation Function? Explain about various types of activation functions that are
used to train neural networks with examples.
In neural networks, an activation function is a mathematical function applied to the weighted

sum of the inputs of a neuron, typically followed by a non-linear transformation. Activation
functions introduce non-linearity to the network, enabling it to learn complex patterns and make
accurate predictions. The choice of activation function plays a crucial role in the performance of a
neural network.
Here are some commonly used activation functions in neural networks along with their
mathematical formulas and examples:
1. Sigmoid Function:
The sigmoid function, also known as the logistic function, maps the input to a value between 0 and
1, providing a smooth, continuous output. It is expressed as:
f(x) = 1 / (1 + e^(-x))
Example: In binary classification problems, the sigmoid activation function is often used in the
output layer to produce a probability value indicating the likelihood of the input belonging to a
particular class.
2. Rectified Linear Unit (ReLU):
ReLU is a popular activation function that introduces non-linearity by outputting 0 for negative
inputs and the input value for positive inputs. Mathematically, it is defined as:
f(x) = max(0, x)
Example: ReLU is commonly used in hidden layers of deep neural networks due to its simplicity
and ability to alleviate the vanishing gradient problem.
3. Leaky ReLU:
Leaky ReLU is a modification of the ReLU function that addresses the "dying ReLU" problem. It
introduces a small slope for negative inputs, allowing for non-zero outputs even when the input is
negative. It is expressed as:
f(x) = max(ax, x) (a is a small positive constant, typically 0.01)
Example: Leaky ReLU can be beneficial when dealing with negative inputs, preventing the
corresponding neurons from being completely deactivated.
4. Hyperbolic Tangent (tanh):
The tanh function is similar to the sigmoid function but ranges between -1 and 1. It provides
stronger non-linearity and allows negative values in the output. Mathematically, it is defined as:
f(x) = (e^(x) - e^(-x)) / (e^(x) + e^(-x))
Example: Tanh activation function is commonly used in recurrent neural networks (RNNs) and
when the output range needs to be centered around 0.
5. Softmax:
The softmax function is primarily used in the output layer of a neural network for multi-class
classification problems. It converts a vector of real numbers into a probability distribution where the
sum of all the probabilities equals 1. Mathematically, it is defined as:
f(x_i) = e^(x_i) / sum(e^(x_j)) (for each element x_i in the input vector)
Example: Softmax activation is useful when you want the neural network to classify inputs into
mutually exclusive classes.
These are just a few examples of activation functions commonly used in neural networks. Each
activation function has its own characteristics and can be chosen based on the specific requirements
of the problem at hand.
2. What is Learning? Explain various types of Learning Rules.
Learning, in the context of neural networks, refers to the process of adjusting the parameters
(weights and biases) of the network based on the input data and desired outputs. The goal of
learning is to enable the network to make accurate predictions or classifications by minimizing
the difference between the predicted output and the desired output.
There are several types of learning rules used in neural networks. Here are some commonly used
learning rules:
1. Gradient Descent:
Gradient descent is the most popular and widely used learning rule in neural networks.
It adjusts the parameters of the network by iteratively updating them in the direction of steepest
descent of the loss function. The steps involved in gradient descent are as follows:
- Compute the gradient of the loss function with respect to the network parameters.
- Update the parameters by subtracting a small fraction of the gradient, multiplied by a learning
rate, from their current values.
- Repeat these steps until the network converges or a predefined stopping criterion is met.
There are variations of gradient descent, such as stochastic gradient descent (SGD) and
mini-batch gradient descent, which update the parameters using a subset of the training data
rather than the entire dataset at each iteration.
2. Back propagation:
Back propagation is a specific algorithm used to compute the gradients of the loss
function with respect to the weights and biases in a neural network. It is often used in
conjunction with gradient descent to update the parameters. Back propagation works by
propagating the errors from the output layer back to the earlier layers of the network, allowing
the gradients to be calculated and used for parameter updates.
3. Hebbian Learning:
Hebbian learning is a type of unsupervised learning that is based on the Hebbian
principle, which states that "cells that fire together wire together." It is used to strengthen the
connections between neurons that have correlated activity. In Hebbian learning, the weight
between two neurons is increased if they are both active at the same time and decreased if they
are not.
4. Reinforcement Learning:
Reinforcement learning is a type of learning where an agent learns to take actions in an
environment to maximize a cumulative reward signal. It involves the agent interacting with the
environment, receiving feedback in the form of rewards or penalties, and adjusting its actions
based on this feedback. Reinforcement learning algorithms use techniques like Q-learning and
policy gradients to update the network parameters based on the rewards received.
5. Unsupervised Learning:
Unsupervised learning refers to learning from unlabeled data, where the network learns
to extract meaningful representations or patterns from the input data without explicit target
outputs. Common techniques in unsupervised learning include autoencoders, clustering
algorithms, and generative models like variational autoencoders (VAEs) and generative
adversarial networks (GANs).
These are just a few examples of learning rules used in neural networks. The choice of learning
rule depends on the specific problem, the type of data, and the desired outcomes. Different
learning rules have different characteristics and are suited for different types of tasks, whether
it's supervised learning, unsupervised learning, or reinforcement learning.
3 .What is Perceptron and explain about Perceptron Networks

A perceptron is a type of artificial neuron that forms the fundamental building block of a
perceptron network, also known as a single-layer neural network. The perceptron was
introduced by Frank Rosenblatt in 1957 and is a simplified model of a biological neuron.
A perceptron takes a set of input values, applies weights to those inputs, computes the
weighted sum, and passes the result through an activation function to produce an output. The
weights assigned to the inputs determine the influence of each input on the final output. The
activation function introduces non-linearity and determines the output based on the computed
sum.
Mathematically, the output of a perceptron can be represented as follows:
output = activation_function(weighted_sum)
Where the weighted sum is the sum of the products of the inputs and their corresponding
weights.
Perceptron networks consist of a single layer of perceptrons, where each perceptron is
connected to all inputs. The outputs of the perceptrons are then combined to produce the overall
output of the network. Perceptron networks are also referred to as single-layer neural networks
because they lack hidden layers.
The learning process of a perceptron network involves adjusting the weights of the
perceptrons based on the input data and the desired outputs. This adjustment is done through a
learning algorithm, typically using a form of gradient descent, which aims to minimize the error
between the predicted outputs and the target outputs.
Perceptron networks are primarily used for binary classification tasks, where the output
is a binary decision (e.g., classifying an input into one of two categories). The perceptron
network learns to separate the input space into two regions using a hyperplane. The weights
assigned to the inputs control the position and orientation of the hyperplane, enabling the
network to learn decision boundaries.
However, perceptron networks have limitations. They can only learn linearly separable
patterns, which mean they cannot handle problems that require non-linear decision boundaries.
To overcome this limitation, multi-layer neural networks, such as feedforward neural networks
with hidden layers, were introduced. These networks are capable of learning complex non-linear
relationships and have significantly enhanced the capabilities of neural networks.
In summary, a perceptron is a basic building block of a perceptron network, which is a
single-layer neural network used for binary classification tasks. It takes inputs, applies weights,
computes a weighted sum, passes it through an activation function, and produces an output.
While perceptron networks have limitations, they paved the way for the development of more
powerful and flexible multi-layer neural networks.
4 . What is Adaline Network and Madaline Networks?
An Adaline (Adaptive Linear Neuron) network, also known as a single-layer neural

network or a linear threshold unit (LTU), is a type of artificial neural network that resembles the
structure of a perceptron network. It was introduced by Bernard Widrow and Ted Hoff in 1960
as an extension of the perceptron model.
The Adaline network is designed to perform linear regression and pattern classification
tasks. Unlike the perceptron, which uses a step function as its activation function, Adaline uses
a linear activation function. This allows it to produce continuous outputs instead of binary
outputs, making it suitable for regression tasks.
Here are the main components and characteristics of an Adaline network:

1. Input Layer: The input layer receives the input features or values and passes them directly to
the output layer.
2. Weighted Sum: Each input value is multiplied by its corresponding weight, and the weighted
sums are computed. These weighted sums are then summed together.
3. Linear Activation Function: The output of the Adaline network is the linear combination of
the weighted sums, which is not limited to binary values. The output is simply the weighted
sum itself without any threshold or non-linear transformation.
4. Weight Update: The learning process in Adaline involves adjusting the weights to minimize
the error between the network's output and the desired output. This is achieved using a variant
of the Widrow-Hoff learning rule, also known as the delta rule or the Least Mean Squares
(LMS) algorithm. The weights are updated incrementally based on the difference between the
network output and the target output.
5. Convergence: The learning process continues iteratively until the network converges or a
stopping criterion is met. Convergence is achieved when the weights reach a stable
configuration that minimizes the error and produces accurate predictions.
Adaline networks are primarily used for regression tasks, where the goal is to predict
continuous output values based on input features. They can also be used for pattern
classification by interpreting the continuous output as a decision boundary. However, Adaline
networks are limited to linearly separable patterns and cannot handle non-linearly separable
patterns without the use of additional techniques like feature engineering or kernel methods.
In summary, an Adaline network is a type of single-layer neural network that uses a

linear activation function and performs linear regression and pattern classification tasks. It
differs from the perceptron by using a continuous linear output instead of binary output. While
limited to linearly separable patterns, Adaline networks have been foundational in the
development of neural networks and learning algorithms.
The Madaline Network, also known as the multi-layer Adaline network, is a type of artificial
neural network that extends the capabilities of the Adaline network by introducing multiple
layers of Adaline neurons. It was introduced by Bernard Widrow and Marcian Hoff in 1962 as
an advancement over the single-layer Adaline model.
Here are the main characteristics and components of a Madaline network:
1. Layers: The Madaline network consists of multiple layers of Adaline neurons. Typically, it
includes an input layer, one or more hidden layers, and an output layer. The hidden layers
provide the network with additional processing power and the ability to learn complex patterns
and relationships.
2. Adaline Neurons: Each neuron within the Madaline network is an Adaline neuron, which is a
linear regression model with a linear activation function. The Adaline neurons in the network
operate similarly to the single-layer Adaline model, performing weighted sum calculations and
producing continuous outputs.
3. Weight Update: The Madaline network employs a learning algorithm, similar to the Adaline
network, to adjust the weights of the Adaline neurons and minimize the error between the
network's output and the desired output. This learning algorithm, often based on the Widrow-
Hoff learning rule or variants of it, updates the weights iteratively to improve the network's
performance.
4. Activation Function: The Madaline network typically uses a threshold activation function at
the output layer to produce binary outputs or make class predictions. The activation function
can be a step function or a sigmoid function, depending on the problem being solved.
The main advantage of the Madaline network over the single-layer Adaline network is its ability
to handle non-linearly separable patterns by introducing hidden layers. The hidden layers
provide additional levels of abstraction and allow the network to learn more complex decision
boundaries.
However, like the Adaline network, the Madaline network has limitations. It is still a shallow
network and may struggle with highly complex patterns that require deeper architectures such
as modern deep neural networks. Additionally, the Madaline network may suffer from the
vanishing gradient problem and may not converge well with deeper structures.
In summary, the Madaline network is a multi-layer neural network that extends the Adaline
model by incorporating multiple layers of Adaline neurons. It provides the ability to learn non-
linear patterns and perform more complex tasks compared to the single-layer Adaline model.
However, it is a relatively simpler neural network architecture compared to more modern deep
neural networks.
5. What is Pattern Matching? Explain about the rules used for the same
Pattern matching is a fundamental concept in computer science and refers to the process
of finding specific patterns or structures within a given data set or sequence. It involves
searching for instances of a particular pattern or determining whether a given input matches a
specific pattern.
Pattern matching techniques can vary depending on the context and requirements of the
specific application. It can involve simple matching of exact patterns, approximate matching
considering variations or errors, or more complex matching based on probabilistic or statistical
models.
1. Regular Expressions: Regular expressions (regex) are a powerful rule-based approach

used for pattern matching in text processing tasks. They allow you to define patterns
using a combination of characters, metacharacters, and quantifiers. Regular expressions
can be used to match specific strings, extract certain patterns, or find occurrences of
particular patterns within a text.
2. Template Matching: Template matching is a technique commonly used in computer

vision tasks. It involves comparing a template (a small pattern or image) to different
regions of a larger image to find instances of the template. Template matching relies on
rules such as pixel-level comparison or correlation measures to determine the similarity
between the template and the image regions.
3. Feature Extraction: In pattern matching tasks, feature extraction techniques can be used
to extract relevant information from the data. For example, in image analysis,
techniques like edge detection, corner detection, or scale-invariant feature transform
(SIFT) can be employed to extract discriminative features from images. These extracted
features can then be used for pattern matching purposes.
4. Sequence Alignment: Sequence alignment techniques are commonly used in pattern

matching tasks involving sequential data, such as DNA sequences or time series data.
These techniques, including dynamic programming algorithms like Needleman-Wunsch
or Smith-Waterman, help align and compare sequences to identify patterns or
similarities.
5. Statistical Methods: Statistical methods can be applied to pattern matching tasks to

quantify the similarity or likelihood of a pattern occurring. For instance, techniques like
Hidden Markov Models (HMM) or probabilistic graphical models can be employed to
model patterns and make probabilistic inferences based on the data.
6. Machine Learning-based Techniques: While deep learning falls under the umbrella of
machine learning, there are other machine learning methods that can be used for pattern
matching tasks. These include decision trees, support vector machines (SVM), or k-
nearest neighbors (KNN), which can learn rules or decision boundaries based on
training data and classify or match patterns accordingly.
The selection of the appropriate rule or technique for pattern matching depends on the specific
problem domain, the nature of the data, and the complexity of the patterns being sought.
Combining these rule-based approaches with deep learning techniques can often enhance the
performance and accuracy of pattern matching tasks.
6. What is Time Series Data? Explain about various machine learning techniques used for
analyzing time series data.
Time series data is a sequence of data points collected at successive points in time, and
analyzing it can provide insights into trends, patterns, and future predictions. There are various
types of time series data, each requiring different algorithms and techniques for analysis. Here
are some common types of time series data and the corresponding algorithms used to analyze
them:
1. Univariate Time Series:
- Univariate time series consists of a single variable observed over time.

- Algorithms: Moving Average, Exponential Smoothing (e.g., Holt-Winters), ARIMA
(AutoRegressive Integrated Moving Average), SARIMA (Seasonal ARIMA).
2. Multivariate Time Series:
- Multivariate time series involve multiple variables observed over time, often with potential
interdependencies.
- Algorithms: Vector Autoregression (VAR), Vector Error Correction Model (VECM),

Multivariate Exponential Smoothing, Dynamic Factor Models.
3. Longitudinal Data:
- Longitudinal data involves repeated measurements on the same individuals or entities over
time.
- Algorithms: Linear Mixed Effects Models, Generalized Estimating Equations (GEE),

Random Effects Models.
4. Panel Data:
- Panel data combines time series and cross-sectional data, with observations on multiple
entities at different time points.
- Algorithms: Fixed Effects Models, Random Effects Models, Pooled OLS Regression, Panel
ARIMA.
5. Functional Time Series:
- Functional time series represent curves or functions observed over time.
- Algorithms: Functional Data Analysis (FDA), Functional Principal Component Analysis

(FPCA), Functional AutoRegressive Integrated Moving Average (FARIMA).
6. Spatial Time Series:
- Spatial time series involve data collected across different spatial locations over time.
- Algorithms: Spatial AutoRegressive (SAR) Models, Spatial Durbin Models, Space-Time

Autoregressive (STAR) Models.
7. High-Frequency Time Series:
- High-frequency time series involve data collected at very fine time intervals, often in
financial or trading contexts.
- Algorithms: High-Frequency Trading Algorithms, Tick Data Analysis, Volatility Models

(e.g., GARCH), Market Microstructure Models.
8. Event Time Series:
- Event time series focus on occurrences of events over time.
- Algorithms: Event Detection Algorithms, Point Process Models, Hidden Markov Models
(HMM) for Events.
9. Irregular Time Series:
- Irregular time series have unevenly spaced time intervals between observations.
- Algorithms: Continuous-Time Models, Kernel Smoothing, Local Polynomial Regression.
10. Seasonal Time Series:
- Seasonal time series exhibit regular patterns that repeat over fixed intervals.
- Algorithms: Seasonal Decomposition, Seasonal Exponential Smoothing, Seasonal ARIMA.
Case Study: Weather Forecasting
Time series data refers to a sequence of data points collected or recorded over a period of time,
where each data point is associated with a specific timestamp or time index. Time series data
captures observations or measurements taken at regular intervals, such as hourly, daily, weekly,
or monthly.
Weather forecasting is a complex task that involves analyzing time series data to predict future
weather conditions. Machine learning techniques are commonly used to analyze weather data
and make accurate forecasts. Here are various machine learning techniques specifically applied
to weather forecasting:
1. Autoregressive Integrated Moving Average (ARIMA):
ARIMA models capture trends, seasonality, and noise in time series data. They are widely used
for short-term weather forecasting tasks. ARIMA models consider the historical weather
observations to predict future weather conditions based on the autoregressive (AR) and moving
average (MA) components.
2. Seasonal Autoregressive Integrated Moving Average (SARIMA):
SARIMA extends ARIMA models by incorporating seasonality. It considers seasonal patterns

and variations in weather data, making it suitable for capturing long-term trends and seasonal
changes. SARIMA models are commonly applied to weather forecasting tasks that exhibit
distinct seasonal patterns.
3. Long Short-Term Memory (LSTM) Networks:

LSTM is a type of recurrent neural network (RNN) that effectively models sequential data with
long-term dependencies. LSTM models have shown promising results in weather forecasting
tasks as they can capture complex patterns and temporal dependencies. They consider historical
weather data and make predictions based on the learned patterns.
4. Convolutional Neural Networks (CNNs):
CNNs, known for their success in image analysis, can also be utilized for weather forecasting.
In the case of weather data, CNNs can be applied to analyze meteorological images or satellite
data. By leveraging convolutional layers, CNNs can extract spatial patterns and make
predictions based on weather images or spatial data.
5. Ensemble Methods:
Ensemble methods combine multiple models to improve forecasting accuracy. They leverage
the strengths of individual models and reduce the impact of biases or errors. Ensemble methods,
such as Random Forests or Gradient Boosting, can be applied to weather forecasting by
aggregating predictions from multiple models or incorporating different features.
6. Deep Neural Networks (DNNs):
DNNs, including feedforward neural networks or hybrid models, can be used for weather
forecasting tasks. By stacking multiple layers, DNNs can learn complex patterns and non-linear
relationships in weather data. These models can be customized with specific architectures,
activation functions, and regularization techniques to suit the characteristics of weather data.
7. Gaussian Processes (GPs):
Gaussian processes are probabilistic models that can capture the uncertainty and non-linear
relationships in time series data. GPs have been applied to weather forecasting by modeling the
weather observations as a stochastic process and making predictions based on the learned
distributions.
It is important to note that weather forecasting is a highly complex domain, and machine
learning techniques often complement physical models and numerical weather prediction
(NWP) methods. These techniques integrate multiple data sources, such as satellite data, radar
data, or atmospheric models, to improve the accuracy of weather forecasts.
The selection of the most appropriate technique depends on factors such as the availability and
quality of historical weather data, the forecast horizon (short-term or long-term), and the
specific weather phenomena being predicted. It is common to experiment with multiple
techniques, compare their performance, and fine-tune the models based on the requirements and
constraints of weather forecasting applications.
7. What is the Role of Threshold Values and Activation Functions in Neural Networks?
Threshold values and activation functions play crucial roles in shaping the behavior and
learning of artificial neural networks. They are integral components of individual neurons
(also called nodes or units) within neural network layers. Let's explore their roles:
1. Activation Functions:
An activation function defines the output of a neuron based on its input. It introduces non-
linearity into the network, allowing it to capture complex relationships between inputs and
outputs. Without activation functions, neural networks would be limited to representing
linear transformations, which severely restricts their capability to learn and model intricate
patterns in data.
Common activation functions include:
- Sigmoid: Maps input values to the range (0, 1). It's often used in the past but is less
favored now due to vanishing gradient problems.
- Hyperbolic Tangent (tanh): Similar to sigmoid but maps input values to the range (-1, 1).
- Rectified Linear Unit (ReLU): Replaces all negative input values with zero and leaves
positive values unchanged. It's the most widely used activation function due to its simplicity
and effectiveness.
- Leaky ReLU: Similar to ReLU, but allows a small gradient for negative values, addressing
the "dying ReLU" problem.
- Parametric ReLU (PReLU): A variant of Leaky ReLU with a learnable parameter that
determines the slope of the negative side.
- Exponential Linear Unit (ELU): Smooth approximation of ReLU for negative inputs, with
some benefits in learning speed.
- Swish: A recently introduced activation function that performs well in some scenarios.
The choice of activation function depends on the problem at hand, network architecture, and
potential vanishing/exploding gradient issues. Activation functions introduce non-linearities
that enable the network to learn complex mappings and make them suitable for a wide range
of tasks, from image recognition to natural language processing.
2. Threshold Values:
In a biological neuron, there's a certain threshold that the combined input signals must
surpass for the neuron to "fire" and transmit an output signal. While artificial neurons don't
directly mimic biological neurons, the concept of a threshold value is related to how
activation functions operate.
In most artificial neural networks, neurons apply the activation function to the weighted
sum of their inputs. The threshold value corresponds to the point at which the activation
function starts generating non-zero outputs. For example, in the case of ReLU, the threshold
value is zero. Inputs below the threshold (negative values for ReLU) result in zero outputs.
The threshold value isn't usually a separate learnable parameter, as it is in some older
models like perceptrons. Instead, it's incorporated into the bias term of the neuron. The bias
shifts the activation function horizontally, effectively setting a threshold for when the
neuron should become active. This allows the network to learn when to respond to certain
features or patterns in the data.
In summary, activation functions introduce non-linearity, enabling neural networks to

capture complex relationships in data. Threshold values, embedded in the bias term, help
determine when neurons activate in response to their inputs, allowing the network to adapt
to different patterns and features in the data during training.
8. What is a Neural Network? Explain about various Neural Networks.
a. Feed Forward
b. Recurrent
c. Back Propagation
d. Convolution Neural Network
A neural network, often referred to as an artificial neural network (ANN) or simply a neural net,
is a computational model inspired by the structure and functioning of the human brain's
interconnected neurons. It's a machine learning algorithm designed to recognize patterns, learn
from data, and make predictions or decisions based on that learned information. Neural
networks excel at tasks involving complex, non-linear relationships within data.
A neural network is composed of layers of interconnected nodes, also known as neurons or

units. Each neuron receives input, processes it, and produces an output. The structure of a
neural network consists of three main types of layers:
1. Input Layer: This layer receives the initial data, which could be images, text, numerical
values, or any other form of input. Each neuron in the input layer corresponds to a feature or
dimension of the input data.
2. Hidden Layers: These layers sit between the input and output layers and are responsible for
learning and representing complex relationships in the data. Each neuron in a hidden layer takes
inputs from the previous layer, performs computations on them using weights and biases, and
then applies an activation function to produce an output. The number of hidden layers and the
number of neurons in each layer are design choices that impact the network's capacity to learn.
3. Output Layer: The final layer produces the network's output. The number of neurons in the
output layer depends on the type of problem the network is solving. For example, in a binary
classification task, there might be one neuron in the output layer that gives the probability of
belonging to the positive class. In a multi-class classification task, the output layer could have a
neuron for each class.
The connections between neurons are defined by weights, which are adjustable parameters that
determine the strength of the connection. During training, the network adjusts these weights
based on the error between its predictions and the actual target values. The objective is to
minimize this error by iteratively updating the weights using optimization algorithms like
gradient descent.
Neural networks can be used for a wide range of tasks, including image and speech recognition,
natural language processing, recommendation systems, game playing, medical diagnosis, and
more. They have shown remarkable success in various domains, often outperforming traditional
machine learning algorithms by automatically learning features and representations from raw
data.
There are various types of neural network architectures, including feedforward neural networks
(the simplest type), convolutional neural networks (CNNs) for image processing, recurrent
neural networks (RNNs) for sequence data, and more advanced architectures like transformers
and GANs (Generative Adversarial Networks). Each architecture is tailored to specific types of
data and tasks.
1. Feedforward Neural Network (FFNN):
A feedforward neural network is a basic architecture where information flows

unidirectionally, from input to output, without feedback loops. It comprises input, hidden, and
output layers. Each neuron's output is determined by weighted inputs and an activation
function.
1. Recurrent Neural Network (RNN):
A recurrent neural network is designed for sequence data, allowing information to loop back
through connections. It maintains a hidden state that carries memory of past inputs, enabling it
to capture temporal dependencies. RNNs are suitable for tasks like language modeling and time
series analysis.
3. Backpropagation:
Backpropagation is a training algorithm for neural networks. It involves two main steps:
forward pass, where inputs generate predictions; and backward pass, where prediction errors are
propagated backward to adjust weights using gradients. This process iterates to minimize
prediction errors and improve the model's performance.
4. Convolutional Neural Network (CNN):
A convolutional neural network is tailored for image and grid-like data. It employs
convolutional layers to extract spatial features, pooling layers to reduce dimensionality, and
fully connected layers for classification. CNNs excel in tasks like image recognition, object
detection, and image generation.
Experiment-2
Implement different types of Plots using Data Visualization in Python and write about
packages used.
Packages used
There are several popular Python packages used for plotting and data visualization. Each of
these packages has its strengths and weaknesses, and the choice of package depends on your
specific needs and preferences. Here are some of the main plotting libraries in Python:
1. Matplotlib: Matplotlib is one of the most widely used and versatile plotting libraries in
Python. It provides a wide range of plotting functions, supports various plot types (line plots,
scatter plots, bar plots, histograms, etc.), and allows for extensive customization. It serves as the
foundation for many other visualization libraries.
2. Seaborn: Seaborn is built on top of Matplotlib and provides a higher-level interface for
creating attractive statistical visualizations. It simplifies the process of creating complex plots
like violin plots, box plots, pair plots, and more. Seaborn also has good support for working
with Pandas DataFrames.
3. Plotly: Plotly is a powerful interactive plotting library that allows you to create interactive,
web-based visualizations. It supports a wide range of plot types and is particularly useful for
creating interactive charts, dashboards, and 3D visualizations.
4. Pandas Visualization: Pandas, a popular data manipulation library, also offers basic plotting
functionality using the `.plot()` method. It leverages Matplotlib behind the scenes and is handy
for quick exploratory visualizations directly from DataFrames.
5. Bokeh: Bokeh is another interactive visualization library, which is suitable for creating web-
based interactive plots. It can handle big datasets and is often used in data dashboards.
6. Altair: Altair is a declarative statistical visualization library that allows you to create concise,
expressive visualizations by specifying the data transformations and mappings in a compact and
easy-to-read format.
7. Ggplot: Ggplot is a Python implementation of the popular ggplot2 library from R. It follows
the grammar of graphics philosophy and provides a straightforward way to create complex
visualizations with a consistent syntax.
8. Holoviews: Holoviews provides a high-level interface for data visualization, allowing you to
create interactive plots with minimal code. It integrates with various plotting libraries like
Matplotlib, Bokeh, and Plotly.
9. Geopandas: Geopandas is a specialized library for working with geospatial data. It allows
you to create maps and plot geospatial data easily.
Line Plot
import matplotlib.pyplot as plt
def create_line_plot(x_values, y_values, title, x_label, y_label):
plt.figure(figsize=(8, 6))
ax = plt.axes()
ax.plot(x_values, y_values, marker='o', linestyle='-')
ax.set_title(title)
ax.set_xlabel(x_label)
ax.set_ylabel(y_label)
plt.show()
x_values = [1, 2, 3, 4, 5]
y_values = [10, 15, 5, 20, 12]
create_line_plot(x_values, y_values, "Line Plot ", "X-axis", "Y-axis")
Bar Chart
categories = ['Category 1', 'Category 2', 'Category 3', 'Category 4', 'Category 5']
values = [25, 18, 40, 30, 15]
plt.bar(categories, values)
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Bar Chart')
plt.show()
Histogram

data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150]
plt.hist(data, bins=10, color='skyblue', edgecolor='black')
plt.xlabel('Values')
plt.ylabel('Frequency')
plt.title('Histogram ')
plt.show()
Pie Chart
def create_pie_chart(labels, sizes):
plt.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=140)
plt.axis('equal')
plt.show()
labels = ['Category 1', 'Category 2', 'Category 3 ', 'Category 4 ']
sizes = [30, 25, 20, 25]
create_pie_chart(labels, sizes)
Scatter Plot

x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 3, 5, 6]
plt.scatter(x_values ,y_values)
plt.xlabel('X-axis Label')
plt.ylabel('Y-axis Label')
plt.title('Scatter Plot')
plt.legend()
plt.show()
Area Plot

years = [2010, 2011, 2012, 2013, 2014, 2015, 2016]
category_a = [10, 12, 8, 15, 11, 9, 14]
category_b = [5, 8, 6, 9, 7, 12, 10]
plt.stackplot(years, category_a, category_b, labels=['Category A', 'Category B'], alpha=0.7)
plt.legend(loc='upper left')
plt.xlabel('Years')
plt.ylabel('Values')
plt.title('Area Plot')
plt.grid(True)
plt.tight_layout()
plt.show()
Experiment-3
Implementation of Data Pre-Processing in Python and its types and also write about
packages used in them.
Data Cleaning also known as data cleansing or data scrubbing, is the process of identifying and
correcting errors, inconsistencies, and inaccuracies in datasets to improve their quality and
reliability. Data cleaning is a crucial step in the data pre-processing pipeline, as accurate and
reliable data is essential for making informed decisions and obtaining meaningful insights from
the data.
Various methods and techniques are used for data cleaning. Here are some common ones:
1. Handling Missing Values:
- Deletion: Removing rows or columns with missing values. This can lead to loss of
information.
- Imputation: Filling in missing values using methods like mean, median, mode, or more
advanced techniques like regression or machine learning models.
2. Handling Outliers:
- Detection: Identifying outliers using statistical methods or visualization tools.
- Treatment: Depending on the nature of the data and the analysis, outliers can be corrected,
transformed, or left as-is.
3. Data Type Conversion:
- Ensuring that data types are consistent and appropriate for analysis.
- Converting categorical variables into numerical representations using techniques like one-
hot encoding.
4. Deduplication:
- Identifying and removing duplicate records from the dataset.
5. Normalization and Standardization:
- Scaling numerical features to similar ranges (normalization) or standardizing them to have

zero mean and unit variance (standardization).
6. Handling Inconsistent Data:

- Correcting inconsistencies such as typos, inconsistent naming conventions, and formatting
errors.
7. Feature Engineering:
- Creating new features or transforming existing ones to better represent the underlying
patterns in the data.
8. Handling Irrelevant Data:
- Removing features that don't contribute meaningful information to the analysis.
Python provides several packages and libraries that are commonly used to implement data
cleaning:
1. Pandas:
- Pandas is a powerful library for data manipulation and analysis.
- It provides functions for handling missing values (`fillna()`), removing duplicates

(`drop_duplicates()`), and converting data types (`astype()`).
2. NumPy:
- NumPy is used for numerical operations and array manipulation.
- It is useful for handling numerical data, performing computations, and working with
matrices.
3. Scikit-learn:
- Scikit-learn offers tools for machine learning and preprocessing.
- It provides methods for normalization (`MinMaxScaler`, `StandardScaler`), imputation

(`SimpleImputer`), and feature extraction (`FeatureHasher`, `PCA`).
4. Dedupe:
- Dedupe is a library specifically designed for deduplication of data.
- It helps identify and merge duplicate records within datasets.
5. OpenRefine:
- OpenRefine is an open-source tool for data cleaning and transformation.
- It provides a user-friendly interface for cleaning and manipulating data.
6. Regex (re module):

- Regular expressions are powerful tools for pattern matching and manipulation.
- They can be used in Python's `re` module to identify and correct inconsistencies.
These packages, among others, offer a wide range of tools and functions to help you clean and
pre-process your data effectively before analysis or modeling. The choice of methods and
packages depends on the specific nature of your data and the goals of your analysis.
Removing Missing Values
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(5, 3), index=['a', 'c', 'e', 'f',
'h'],columns=['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
print (df)
Output
one two three
a 0.728041 0.807730 -1.094090
b NaN NaN NaN
c 0.068344 1.361973 0.019359
d NaN NaN NaN
e 1.504725 2.063960 1.528923
f 0.949028 -0.450736 1.198248
g NaN NaN NaN
h 0.486872 0.867129 0.997725
Checking Missing Values
import pandas as pd
import numpy as np

print (df)['one'].isnull()
Output
one two three
a -0.595792 1.363672 0.410404
b NaN NaN NaN
c 0.753777 0.043714 -1.311293
d NaN NaN NaN
e -0.281143 -0.015217 0.825725
f -0.802016 -1.098865 0.947955
g NaN NaN NaN
h -1.109293 0.595085 2.015506
Replace Missing Values
import pandas as pd
import numpy as np
df = pd.DataFrame({'one':[10,20,30,40,50,2000],
'two':[1000,0,30,40,50,60]})
print(df).replace({1000:10,2000:60})
Output
one two
0 10 1000
1 20 0
2 30 30
3 40 40
4 50 50
5 2000 60
Drop Missing Values
import pandas as pd
import numpy as np
print (df).drop(na)
Output
one two three
a -0.931894 0.451869 -0.925939
b NaN NaN NaN
c 1.489571 1.083932 1.090170
d NaN NaN NaN
e 1.298125 -0.008270 0.220347
f -0.165098 2.090805 -1.197678
g NaNNaNNaN
h 0.916180 1.019796 0.022656
Experiment-4
Implementation of different types of Activation Functions in Python.
Binary Step Function
import numpy as np
import numpy as np
def binaryStep(x):
return np.heaviside(x,1)
x = np.linspace(-10, 10)
plt.plot(x, binaryStep(x))
plt.axis('tight')
plt.title('Activation Function :BinaryStep')
plt.show()
Sigmoid Function
def sigmoid(x):
return 1/(1+np.exp(-x))
plt.plot(x, sigmoid(x))
plt.axis('tight')
plt.title('Activation Function :Sigmoid')

plt.show()
Linear Function
def linear(x):
return x
plt.plot(x, linear(x))
plt.axis('tight')
plt.title('Activation Function :Linear')
plt.show()
Tan h Function
def tanh(x):
''' It returns the value (1-exp(-2x))/(1+exp(-2x)) and the value
returned will be lies in between -1 to 1.'''
return np.tanh(x)
plt.plot(x, tanh(x))
plt.axis('tight')
plt.title('Activation Function :Tanh')
plt.show()
ReLu Function
import numpy as np
def RELU(x):
x1 = []
for i in x:
if i< 0:
x1.append(0)
else:
x1.append(i)
return x1
plt.plot(x, RELU(x))
plt.axis('tight')
plt.title('Activation Function: ReLU')
plt.show()
Softmax Function
def softmax(x):
return np.exp(x) / np.sum(np.exp(x), axis=0)
plt.plot(x, softmax(x))
plt.axis('tight')
plt.title('Activation Function :Softmax')
plt.show()
Leaky ReLu Activation Function

import numpy as np
def leaky_relu(x, alpha=0.01):
return np.maximum(alpha * x, x)
x = np.linspace(-10, 10, 400)
y = leaky_relu(x)
plt.plot(x, y, label="Leaky ReLU: alpha=0.01")
plt.title("Leaky ReLU Activation Function")
plt.xlabel("Input")
plt.ylabel("Output")
plt.grid()
plt.legend()
plt.show()
Experiment-5
Implementation of Time Series Data in Python with different types of Algorithms.
Univariate Time Series Data
1. ARIMA
#univariate time series
import pandas as pd
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.arima.model import ARIMA
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/a10.csv',
index_col='date', parse_dates=True)
print(df.columns)
column_name = 'value'
result = adfuller(df[column_name])
print("ADF Statistic:", result[0])
print("p-value:", result[1])
print("Critical Values:", result[4])
model = ARIMA(df[column_name], order=(2, 1, 2))
model_fit = model.fit()
predictions = model_fit.predict(start=len(df), end=len(df) + 10)
df['Prediction'] = predictions
df.plot()
Output
Multi – Variate Time Series
1. Multi Variate Exponential Smoothning
import pandas as pd
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv'
df = pd.read_csv(url, header=0, parse_dates=[0], index_col=0, names=['Month',

'Passengers'])
print(df.columns)
column_name = 'Passengers'
def multivariate_exponential_smoothing(df, order, seasonal_order, trend='c'):
model = SARIMAX(df, order=order, seasonal_order=seasonal_order, trend=trend)
results = model.fit()
return results
order = (1, 1, 1)
seasonal_order = (1, 1, 1, 12)
results = multivariate_exponential_smoothing(df[[column_name]], order, seasonal_order)
forecast_periods = 10
forecast = results.forecast(steps=forecast_periods)
plt.plot(df.index, df[column_name], label='Actual')
plt.plot(forecast.index, forecast, label='Forecast', color='red')
plt.xlabel('Month')
plt.ylabel(column_name)
plt.legend()
plt.title('Multivariate Exponential Smoothing Forecast')

plt.grid(True)
plt.show()
Output
2. State Space Model

import numpy as np
import pandas as pd
import statsmodels.api as sm
url = 'https://raw.githubusercontent.com/selva86/datasets/master/a10.csv'
data = pd.read_csv(url, index_col='date', parse_dates=True)
num_variables = 1
models = [sm.tsa.UnobservedComponents(data['value'], 'local level')]
results = [model.fit() for model in models]
forecast_steps = 10
forecasts = [result.get_forecast(steps=forecast_steps) for result in results]
for var in range(num_variables):
forecast_values = forecasts[var].predicted_mean
print(f'Forecasted values for Variable {var + 1}:\n{forecast_values}\n')
Output
/usr/local/lib/python3.10/dist-packages/statsmodels/tsa/base/tsa_model.py:473:
ValueWarning: No frequency information was provided, so inferred frequency MS will
be used.
self._init_dates(dates, freq)
Forecasted values for Variable 1:
2008-07-01 21.488012
2008-08-01 21.488012
2008-09-01 21.488012
2008-10-01 21.488012
2008-11-01 21.488012
2008-12-01 21.488012
2009-01-01 21.488012
2009-02-01 21.488012
2009-03-01 21.488012
2009-04-01 21.488012
Freq: MS, Name: predicted_mean, dtype: float64
Longitudinal Time Series

1. Linear Mixed Effective Model
import numpy as np
import pandas as pd
data['subject'] = 1
model = sm.MixedLM.from_formula('value ~ value', data=data,
groups=data['subject'])
print(results.summary())
Output
Mixed Linear Model Regression Results
====================================================
Model: MixedLM Dependent Variable: value
No. Observations: 204 Method: REML
No. Groups: 1 Scale: 0.0000
Min. group size: 204 Log-Likelihood: inf
Max. group size: 204 Converged: No
Mean group size: 204.0
-----------------------------------------------------
Coef. Std.Err. z P>|z| [0.025 0.975]
-----------------------------------------------------
Intercept 0.000
value 1.000
Group Var 0.000
2. Hierarchial Linear Model

import pandas as pd
import numpy as np
data['group'] = 1
formula = 'value ~ 1'
model = sm.MixedLM.from_formula(formula, data=data, groups=data['group'])
Output
Mixed Linear Model Regression Results
=======================================================
Model: MixedLM Dependent Variable: value
No. Observations: 204 Method: REML
No. Groups: 1 Scale: 35.4858
Min. group size: 204 Log-Likelihood: -652.9706
Max. group size: 204 Converged: Yes
Mean group size: 204.0
--------------------------------------------------------
Coef. Std.Err. z P>|z| [0.025 0.975]
--------------------------------------------------------
Intercept 10.694 0.000 0.000 10.694 10.694
Group Var 35.486
Panel Time Series Data

1. Fixed Effects and Random Effects Model
import pandas as pd
data['group'] = 1
data = data.set_index(['group', data.groupby('group').cumcount()])
print(data)
Output
Value group
1 0 3.526591
1 3.180891
2 3.252221
3 3.611003
4 3.565869
... ...
199 21.654285
200 18.264945
201 23.107677
202 22.912510
203 19.431740
2. Pooled OLS
import pandas as pd
data['time'] = range(1, len(data) + 1)
data['group'] = 1
X = data[['time', 'group']]
X = sm.add_constant(X)
y = data['value']
model = sm.OLS(y, X)
Output
OLS Regression Results
==============================================================
Dep. Variable: value R-squared: 0.855
Model: OLS Adj. R-squared: 0.854
Method: Least Squares F-statistic: 1191.
Date: Thu, 24 Aug 2023 Prob (F-statistic): 1.23e-86
Time: 09:14:28 Log-Likelihood: -456.07
No. Observations: 204 AIC: 916.1
Df Residuals: 202 BIC: 922.8
Df Model: 1
Covariance Type: nonrobust
==============================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
time 0.0933 0.003 34.509 0.000 0.088 0.099
group 1.1307 0.320 3.538 0.001 0.500 1.761
=============================================================
Omnibus: 44.147 Durbin-Watson: 0.972
Prob(Omnibus): 0.000 Jarque-Bera (JB): 90.037
Skew: 1.031 Prob(JB): 2.81e-20
Kurtosis: 5.518 Cond. No. 237.
Irregular Time Series Data

1. Gaussian Process Regression
import pandas as pd
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C
from sklearn.model_selection import train_test_split
X = np.arange(len(data)).reshape(-1, 1)
y = data['value'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
kernel = C(1.0, (1e-3, 1e3)) * RBF(length_scale=1.0, length_scale_bounds=(1e-2,
1e2))
gpr = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=10)
gpr.fit(X_train, y_train)
y_pred, sigma = gpr.predict(X_test, return_std=True)
plt.plot(data.index, data['value'], label='Actual', color='blue')
plt.plot(data.index[-len(y_test):], y_pred, label='Predicted', color='red')
plt.fill_between(data.index[-len(y_test):], y_pred - 1.96 * sigma, y_pred + 1.96 *
sigma, color='pink', alpha=0.3)
plt.title('Gaussian Process Regression')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.show()
Output
2. Long Short Term Memory
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
data = np.array([i for i in range(100)])
target = np.array([i for i in range(1, 101)])
data = data.reshape((data.shape[0], 1, 1))

model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(data.shape[1], data.shape[2])))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(data, target, epochs=2, verbose=1)
test_input = np.array([i for i in range(100, 110)])
test_input = test_input.reshape((test_input.shape[0], 1, 1))
predictions = model.predict(test_input)
print(predictions)
Output
Epoch 1/2
4/4 [==============================] - 4s 7ms/step - loss: 2887.2913
Epoch 2/2
4/4 [==============================] - 0s 6ms/step - loss: 2818.5085
1/1 [==============================] - 0s 278ms/step
[[10.547165 ]
[10.675017 ]
[10.802973 ]
[10.931027 ]
[11.0591755]
[11.187414 ]
[11.315742 ]
[11.444154 ]
[11.572646 ]
[11.701216 ]]
Experiment-6
Implementation of different types of Neural Networks in Python

Feed Forward Neural Network
import numpy as np
def feed_forward(x):
weights = np.array([[1, 2], [3, 4]])
biases = np.array([5, 6])
z = np.dot(x, weights) + biases
a = np.tanh(z)
return a
x = np.array([1, 2])
y = feed_forward(x)
print(y)
Output
[1. 1.]
Recurrent Neural Network
#Recurrent Neural Network
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import SimpleRNN, Dense
sequence_length = 1
num_samples = 10
X_train = np.random.rand(num_samples, sequence_length, 1)
y_train = np.sum(X_train, axis=1)
model = keras.Sequential([
SimpleRNN(units=32, activation='relu', input_shape=(sequence_length, 1)),

Dense(1)
])
model.compile(optimizer='adam', loss='mse')
model.summary()
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)
X_test = np.random.rand(10, sequence_length, 1)
predictions = model.predict(X_test)
print(predictions)
Output
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
==============================================================
simple_rnn_2 (SimpleRNN) (None, 32) 1088
dense_2 (Dense) (None, 1) 33
==============================================================
Total params: 1,121
Trainable params: 1,121
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
1/1 [==============================] - 1s 1s/step - loss: 0.2235 - val_loss: 0.5792
Epoch 2/10
1/1 [==============================] - 0s 50ms/step - loss: 0.2177 - val_loss: 0.5655
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
1/1 [==============================] - 0s 219ms/step
[[-0.00139036]
[-0.01395435]
[-0.03636114]
[-0.00224465]
[ 0.02789616]
[-0.00750315]
[ 0.0128946 ]
[ 0.02804753]
[-0.02991994]
[ 0.02489439]]
Back Propagation Neural Network
import numpy as np
def backpropagation(x, y, weights, biases, learning_rate):
z = np.dot(x, weights) + biases
a = np.tanh(z)
error = y - a
da = error * (1 - a ** 2)
dw = np.dot(x.T, da)
db = np.sum(da, axis=0)
weights = weights - learning_rate * dw
biases = biases - learning_rate * db
return weights, biases
x = np.array([1, 2])
y = np.array([3, 4])
weights = np.array([[1, 2], [3, 4]])
biases = np.array([5, 6])
learning_rate = 0.1
for i in range(100):
weights, biases = backpropagation(x, y, weights, biases, learning_rate)
print(weights)
print(biases)
Output
[[1. 2.]
[3. 4.]]
[5. 6.]
Convolution Neural Network
from tensorflow.keras import layers
from tensorflow.keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
layers.Input(shape=(28, 28, 1)),
layers.Conv2D(32, (3, 3), activation='relu'),
layers.MaxPooling2D((2, 2)),
layers.Flatten(),
layers.Dense(64, activation='relu'),
layers.Dense(10)
])
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=2)
print(f'Test accuracy: {test_accuracy:.4f}')
Output
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz

11490434/11490434 [==============================] - 0s 0us/step
Epoch 1/5
1875/1875 [==============================] - 73s 38ms/step - loss: 0.1453 -

accuracy: 0.9551 - val_loss: 0.0467 - val_accuracy: 0.9850
Epoch 2/5
1875/1875 [==============================] - 66s 35ms/step - loss: 0.0468 -

Epoch 3/5
1875/1875 [==============================] - 69s 37ms/step - loss: 0.0337 -

Epoch 4/5
1875/1875 [==============================] - 71s 38ms/step - loss: 0.0256 -

Epoch 5/5
1875/1875 [==============================] - 69s 37ms/step - loss: 0.0201 -

313/313 - 3s - loss: 0.0303 - accuracy: 0.9920 - 3s/epoch - 9ms/step
Test accuracy: 0.9920
Neural Networks Implementation with data sets:
Classification Data Set in Feed Forward Neural Network
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from keras.models import Sequential
from keras.layers import Dense

from keras.utils import to_categorical
iris = datasets.load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
model.add(Dense(8, input_dim=4, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=10, verbose=1)
loss, accuracy = model.evaluate(X_test, y_test)
print(f"Test Loss: {loss}, Test Accuracy: {accuracy}")
Output
Epoch 1/5
12/12 [===========================] - 2s 3ms/step - loss: 1.1775 - accuracy: 0.3083
Epoch 2/5
Epoch 3/5
12/12 [==========================] - 0s 3ms/step - loss: 1.0588 - accuracy: 0.5000
Epoch 4/5

Epoch 5/5
1/1 [============================] - 0s 203ms/step - loss: 0.9488 - accuracy: 0.5000
Test Loss: 0.9488111138343811, Test Accuracy: 0.5
Regression Data Set in Feed Forward Neural Network
import numpy as np
np.random.seed(0)
X = np.linspace(0, 10, 100)
y = 2 * X + 1 + np.random.randn(100)
model = tf.keras.Sequential([
tf.keras.layers.Dense(units=1, input_shape=(1,))
])
model.compile(optimizer='sgd', loss='mean_squared_error')
history = model.fit(X_train, y_train, epochs=100, verbose=0)
loss = model.evaluate(X_test, y_test, verbose=0)
print(f"Mean Squared Error on Test Data: {loss:.4f}")
y_pred = model.predict(X_test)
plt.scatter(X_test, y_test, label='True Data')
plt.scatter(X_test, y_pred, label='Predicted Data', marker='x')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
Output
# Health Care Data Set In Feed Forward Neural Network
import numpy as np
from sklearn.metrics import accuracy_score
np.random.seed(0)
X = np.random.randn(1000, 10)
y = np.random.randint(2, size=1000)
model = tf.keras.Sequential([
tf.keras.layers.Dense(64, activation='relu', input_shape=(10,)),
tf.keras.layers.Dense(32, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2, verbose=2)

y_pred = model.predict(X_test)
y_pred_binary = (y_pred> 0.5).astype(int)
accuracy = accuracy_score(y_test, y_pred_binary)
print(f"Test accuracy: {accuracy * 100:.2f}%")
Output
Epoch 1/10
20/20 - 1s - loss: 0.6991 - accuracy: 0.4891 - val_loss: 0.6960 - val_accuracy: 0.5063 -

984ms/epoch - 49ms/step
Epoch 2/10

Epoch 3/10

Epoch 4/10

Epoch 5/10

Epoch 6/10

Epoch 7/10

Epoch 8/10
Epoch 9/10

Epoch 10/10

7/7 [==============================] - 0s 3ms/step
Test accuracy: 53.00%
#Speech Data Set
import numpy as np
from tensorflow.keras.layers import SimpleRNN, Dense, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
num_samples = 1000
sequence_length = 50
input_dim = 13
output_dim = 10
X = np.random.rand(num_samples, sequence_length, input_dim)
y = np.random.randint(output_dim, size=(num_samples,))
input_layer = Input(shape=(sequence_length, input_dim))
rnn_layer = SimpleRNN(64)(input_layer)
output_layer = Dense(output_dim, activation='softmax')(rnn_layer)

model = Model(inputs=input_layer, outputs=output_layer)
model.compile(loss='sparse_categorical_crossentropy',
optimizer=Adam(learning_rate=0.001),
model.fit(X, y, epochs=5, batch_size=32)
Output
Epoch 1/5
32/32 [==============================] - 1s 11ms/step - loss: 2.3589 - accuracy:

0.0950
Epoch 2/5

0.1220
Epoch 3/5

0.1590
Epoch 4/5

0.1620
Epoch 5/5

0.2010
#Video Analysis Data Set
import numpy as np
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed

def generate_data():
X = np.random.randn(100, 10, 3)
Y = np.sin(np.sum(X, axis=2))
return X, Y
X_train, y_train = generate_data()
model.add(LSTM(50, input_shape=(10, 3), return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.fit(X_train, y_train, epochs=2, batch_size=1)
X_test, _ = generate_data()
print("Predictions:")
print(predictions)
Output
Epoch 1/2
100/100 [==============================] - 2s 4ms/step - loss: 0.4326
Epoch 2/2
100/100 [==============================] - 1s 5ms/step - loss: 0.3673
4/4 [==============================] - 1s 6ms/step
Predictions:
[[[ 2.18788102e-01]
[ 2.71870680e-02]
[ 1.69337727e-02]
[-2.41994888e-01]
[-2.99985141e-01]
[ 1.29971102e-01]
[-1.18486717e-01]
[ 2.03054249e-01]
[-2.30462939e-01]
[-4.99120951e-01]]
#Training Data Set
import numpy as np
from tensorflow.keras.layers import SimpleRNN, Dense
seq_length = 10
num_samples = 1000
X = np.random.randn(num_samples, seq_length, 1)
y = np.sin(np.sum(X, axis=1))
split_ratio = 0.8
split_index = int(num_samples * split_ratio)
X_train, X_val = X[:split_index], X[split_index:]
y_train, y_val = y[:split_index], y[split_index:]
model = Sequential([
SimpleRNN(units=64, activation='relu', input_shape=(seq_length, 1)),
Dense(units=1)
])
batch_size = 32
epochs = 5
model.fit(X_train, y_train, validation_data=(X_val, y_val), batch_size=batch_size,

epochs=epochs)
new_data = np.random.randn(10, seq_length, 1)
predictions = model.predict(new_data)
print(predictions)
Output
Epoch 1/5
25/25 [==============================] - 1s 16ms/step - loss: 0.5166 - val_loss:

0.5045
Epoch 2/5

0.5009
Epoch 3/5

0.5006
Epoch 4/5

0.4992
Epoch 5/5

0.5031
WARNING:tensorflow:5 out of the last 14 calls to <function

Model.make_predict_function.<locals>.predict_function at 0x7cc80d7db130> triggered
tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to
(1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3)
passing Python objects instead of tensors. For (1), please define your @tf.function outside of
the loop. For (2), @tf.function has reduce_retracing=True option that can avoid unnecessary
retracing. For (3), please refer to
https://www.tensorflow.org/guide/function#controlling_retracing and
https://www.tensorflow.org/api_docs/python/tf/function for more details.
1/1 [==============================] - 0s 151ms/step
Predictions:
[[ 0.03846954]
[-0.06677599]
[ 0.02927302]
[ 0.00162729]
[-0.00160487]
[-0.12391292]
[ 0.05950115]
[-0.13238075]
[-0.03390695]
[-0.01337759]]
#Validation Data Set
import numpy as np
num_samples = 1000
sequence_length = 10
input_dim = 1
X = np.random.randn(num_samples, sequence_length, input_dim)
y = np.sum(X, axis=1)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
keras.layers.SimpleRNN(units=64, activation='relu', input_shape=(sequence_length,

input_dim)),
keras.layers.Dense(units=1)
])
model.compile(optimizer='adam', loss='mse') # Mean squared error as the loss function
history = model.fit(X_train, y_train, epochs=5, batch_size=32, validation_data=(X_val, y_val))
test_loss = model.evaluate(X_val, y_val)
print(f'Test Loss: {test_loss}')
Output
Epoch 1/5

4.4260
Epoch 2/5

0.9332
Epoch 3/5

0.2181
Epoch 4/5

0.1080
Epoch 5/5

0.0672
7/7 [==============================] - 0s 3ms/step - loss: 0.0672
Test Loss: 0.06718305498361588
#Image Classification Data Set
import numpy as np
from tensorflow.keras.layers import Input, SimpleRNN, Dense
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train / 255.0
x_test = x_test / 255.0
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)
Input(shape=(28, 28)),
SimpleRNN(units=64, activation='relu'),
Dense(units=10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=64, validation_split=0.2)
test_loss, test_accuracy = model.evaluate(x_test, y_test)
print(f'Test Accuracy: {test_accuracy * 100:.2f}%')
Output
Epoch 1/5

0.7525 - val_loss: 0.3585 - val_accuracy: 0.8913
Epoch 2/5

Epoch 3/5
Epoch 4/5

Epoch 5/5


0.9337
Test Accuracy: 93.37%
#image classification
from tensorflow.keras import layers
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((-1, 28, 28, 1))
test_images = test_images.reshape((-1, 28, 28, 1))
train_images, test_images = train_images / 255.0, test_images / 255.0
layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),

layers.Flatten(),
layers.Dense(64, activation='relu'),
layers.Dense(10)
])
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
model.fit(train_images, train_labels, epochs=2)
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print(f'Test accuracy: {test_acc}')
predictions = model.predict(test_images)
Output
Epoch 1/2
1875/1875 [==============================] - 97s 51ms/step - loss: 0.1440 -

accuracy: 0.9566
Epoch 2/2
1875/1875 [==============================] - 63s 34ms/step - loss: 0.0473 -

accuracy: 0.9854
313/313 - 5s - loss: 0.0332 - accuracy: 0.9900 - 5s/epoch - 15ms/step
Test accuracy: 0.9900000095367432
313/313 [==============================] - 5s 16ms/step
#Video Analysis Data Set
import numpy as np
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed
def generate_data():
X = np.random.randn(100, 10, 3)
Y = np.sin(np.sum(X, axis=2))
return X, Y
X_train, y_train = generate_data()
model.add(LSTM(50, input_shape=(10, 3), return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.fit(X_train, y_train, epochs=2, batch_size=1)
X_test, _ = generate_data()
print(predictions)
Output:
Epoch 1/2
100/100 [==============================] - 2s 5ms/step - loss: 0.4172
Epoch 2/2
100/100 [==============================] - 0s 5ms/step - loss: 0.3387
4/4 [==============================] - 0s 4ms/step
Predictions:
[[[ 2.00504497e-01]
[-2.02019125e-01]
[-3.98066401e-01]
[-6.29033446e-01]
[ 3.75485979e-02]
[ 1.73215717e-02]
[ 1.17083795e-01]
[ 6.31665707e-01]
[-2.20511571e-01]
[-5.30349433e-01]]
Experiment-7
Implementation of Adaline and Madaline Networks in python

Adaline Networks
import numpy as np
def Adaline(Input, Target, lr=0.2, stop=0.001):
weight = np.random.random(Input.shape[1])
bias = np.random.random(1)
Error = [stop + 1]
while Error[-1] > stop or Error[-1] - Error[-2] > 0.0001:
error = []
for i in range(Input.shape[0]):
Y_input = sum(weight * Input[i]) + bias
for j in range(Input.shape[1]):
weight[j] = weight[j] + lr * (Target[i] - Y_input) * Input[i][j]
bias = bias + lr * (Target[i] - Y_input)
error.append((Target[i] - Y_input) ** 2)
Error.append(sum(error))
print('Error:', Error[-1])
return weight, bias
x = np.array([[1.0, 1.0, 1.0],
[1.0, -1.0, 1.0],
[-1.0, 1.0, 1.0],
[-1.0, -1.0, -1.0]])
t = np.array([1, 1, 1, -1])
w, b = Adaline(x, t, lr=0.2, stop=0.001)
print('Weight:', w)
print('Bias:', b)
Output
Error: [3.04097366]
Error: [1.82660775]
Error: [1.25568368]
Error: [0.86871029]
Error: [0.60121077]
Error: [0.41609032]
Error: [0.28797116]
Error: [0.19930143]
Error: [0.13793416]
Error: [0.0954626]
Error: [0.06606853]
Error: [0.04572525]
Error: [0.0316459]
Error: [0.02190176]
Error: [0.01515795]
Error: [0.01049063]
Error: [0.00726044]
Error: [0.00502487]
Error: [0.00347765]
Error: [0.00240684]
Error: [0.00166575]
Error: [0.00115284]
Error: [0.00079787]
Weight: [0.00977986 0.00977986 0.98802216]
Bias: [0.00977986]
Madaline Networks
import numpy as np
def activation_fn(z):
if z >= 0:
return 1
else:
return -1
def madaline(Input, Target, lr, epoch):
num_inputs = Input.shape[1]
weight = np.random.random(num_inputs)
bias = np.random.random()
k=0
while k < epoch:
error = 0
for i in range(Input.shape[0]):
y_input = sum(weight * Input[i]) + bias
y = activation_fn(y_input)
if y != Target[i]:
weight = weight + lr * (Target[i] - y) * Input[i]
bias = bias + lr * (Target[i] - y)
error += (Target[i] - y) ** 2
print(k, '>> Error:', error)
k += 1
return weight, bias
# Input dataset
x = np.array([[1.0, 1.0, 1.0],
[1.0, -1.0, 1.0],
[-1.0, 1.0, 1.0],
[-1.0, -1.0, -1.0]])
# Target values
t = np.array([1, 1, 1, -1])
w, b = madaline(x, t, 0.1, 10)
print('Weight:', w)
print('Bias:', b)
Output
0 >> Error: 0
1 >> Error: 0
2 >> Error: 0
3 >> Error: 0
4 >> Error: 0
5 >> Error: 0
6 >> Error: 0
7 >> Error: 0
8 >> Error: 0
9 >> Error: 0
Weight: [0.87271172 0.94767273 0.18015286]
Bias: 0.9207387586434277
Experiment-8
Implementation of Hand Written digits Recognition using different tools available in

Python.
from matplotlib import testing
import pygame, sys
import numpy as np
from keras.models import load_model
from pygame.locals import *
import cv2
WINDOWSIZEX = 640
WINDOWSIZEY = 480
BOUNDARYINC = 5
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
RED = (255, 0, 0)
IMAGESAVE = False
MODEL = load_model("bestmodel.h5")
LABELS = {0: "Zero", 1: "One", 2: "Two", 3: "Three", 4: "Four", 5: "Five", 6: "Six", 7:
"Seven", 8: "Eight", 9: "Nine"}
pygame.init()
DISPLAYSURF = pygame.display.set_mode((WINDOWSIZEX, WINDOWSIZEY))
pygame.display.set_caption("Digit Board")
# Create a Pygame Font object
FONT = pygame.font.Font(None, 36)
iswriting = False
number_xcord = []
number_ycord = []
image_cnt = 1 # image count
PREDICT = True
while True:
for event in pygame.event.get():
if event.type == QUIT:
pygame.quit()
sys.exit()
if event.type == MOUSEMOTION and iswriting:

xcord, ycord = event.pos
pygame.draw.circle(DISPLAYSURF, WHITE, (xcord, ycord), 4, 0)
number_xcord.append(xcord)
number_ycord.append(ycord)
if event.type == MOUSEBUTTONDOWN:
iswriting = True
if event.type == MOUSEBUTTONUP:
iswriting = False
number_xcord = sorted(number_xcord)
number_ycord = sorted(number_ycord)
rect_min_x, rect_max_x = max(number_xcord[0] - BOUNDARYINC, 0),
min(WINDOWSIZEX, number_xcord[-1] + BOUNDARYINC)
rect_min_Y, rect_max_Y = max(number_ycord[0] - BOUNDARYINC, 0),
min(WINDOWSIZEY, number_ycord[-1] + BOUNDARYINC)
number_xcord = []
number_ycord = []
img_arr = np.array(pygame.PixelArray(DISPLAYSURF))[rect_min_x:rect_max_x,
rect_min_Y:rect_max_Y].T.astype(np.float32)
if IMAGESAVE:
cv2.imwrite("image.png", img_arr)
image_cnt += 1
if PREDICT:
img_arr = cv2.resize(img_arr, (28, 28))
img_arr = np.pad(img_arr, ((10, 10), (10, 10)), 'constant', constant_values=0)
img_arr = cv2.resize(img_arr, (28, 28)) / 255.0
label = str(LABELS[np.argmax(MODEL.predict(img_arr.reshape(1, 28, 28, 1)))])
textsurface = FONT.render(label, True, RED, WHITE)
textRecObj = textsurface.get_rect()
textRecObj.left, textRecObj.top = rect_min_x, rect_min_Y
DISPLAYSURF.blit(textsurface, textRecObj)
if event.type == KEYDOWN:
if event.unicode == "n":
DISPLAYSURF.fill(BLACK)
pygame.display.update()
Output
Implementation of Facial Recognition Application using different tools available in

Python.
import cv2
# Load the pre-trained Haar Cascade classifier for face detection
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades +
'haarcascade_frontalface_default.xml')
# Initialize the video capture from the default camera (0)
video_capture = cv2.VideoCapture(0)
while True:
# Capture frame-by-frame
ret, frame = video_capture.read()
# Convert the frame to grayscale for face detection
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
# Detect faces in the frame

faces = face_cascade.detectMultiScale(
gray,
scaleFactor=1.1,
minNeighbors=5,
minSize=(30, 30)
)
# Draw a rectangle around each detected face
for (x, y, w, h) in faces:
cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
# Display the resulting frame
cv2.imshow('Face Detection', frame)
# Exit the loop if 'q' is pressed
if cv2.waitKey(1) & 0xFF == ord('q'):
break
# Release the video capture object and close all OpenCV windows
video_capture.release()
cv2.destroyAllWindows()
Output

Deep Learning Lab Manual

Uploaded by

Deep Learning Lab Manual

Uploaded by

Experiment-1

Deep Learning Concepts

In neural networks, an activation function is a mathematical function applied to the weighted

2. Rectified Linear Unit (ReLU):

f(x) = max(ax, x) (a is a small positive constant, typically 0.01)

4. Hyperbolic Tangent (tanh):

f(x) = (e^(x) - e^(-x)) / (e^(x) + e^(-x))

3 .What is Perceptron and explain about Perceptron Networks

4 . What is Adaline Network and Madaline Networks?

An Adaline (Adaptive Linear Neuron) network, also known as a single-layer neural

Here are the main components and characteristics of an Adaline network:

In summary, an Adaline network is a type of single-layer neural network that uses a

Here are the main characteristics and components of a Madaline network:

1. Regular Expressions: Regular expressions (regex) are a powerful rule-based approach

2. Template Matching: Template matching is a technique commonly used in computer

4. Sequence Alignment: Sequence alignment techniques are commonly used in pattern

5. Statistical Methods: Statistical methods can be applied to pattern matching tasks to

1. Univariate Time Series:

- Univariate time series consists of a single variable observed over time.

2. Multivariate Time Series:

- Algorithms: Vector Autoregression (VAR), Vector Error Correction Model (VECM),

- Algorithms: Linear Mixed Effects Models, Generalized Estimating Equations (GEE),

5. Functional Time Series:

- Functional time series represent curves or functions observed over time.

- Algorithms: Functional Data Analysis (FDA), Functional Principal Component Analysis

6. Spatial Time Series:

- Algorithms: Spatial AutoRegressive (SAR) Models, Spatial Durbin Models, Space-Time

7. High-Frequency Time Series:

- Algorithms: High-Frequency Trading Algorithms, Tick Data Analysis, Volatility Models

- Event time series focus on occurrences of events over time.

9. Irregular Time Series:

- Algorithms: Continuous-Time Models, Kernel Smoothing, Local Polynomial Regression.

10. Seasonal Time Series:

- Algorithms: Seasonal Decomposition, Seasonal Exponential Smoothing, Seasonal ARIMA.

Case Study: Weather Forecasting

1. Autoregressive Integrated Moving Average (ARIMA):

2. Seasonal Autoregressive Integrated Moving Average (SARIMA):

SARIMA extends ARIMA models by incorporating seasonality. It considers seasonal patterns

3. Long Short-Term Memory (LSTM) Networks:

4. Convolutional Neural Networks (CNNs):

6. Deep Neural Networks (DNNs):

7. Gaussian Processes (GPs):

Common activation functions include:

In summary, activation functions introduce non-linearity, enabling neural networks to

A neural network is composed of layers of interconnected nodes, also known as neurons or

1. Feedforward Neural Network (FFNN):

A feedforward neural network is a basic architecture where information flows

1. Recurrent Neural Network (RNN):

4. Convolutional Neural Network (CNN):

import matplotlib.pyplot as plt

def create_line_plot(x_values, y_values, title, x_label, y_label):

ax.plot(x_values, y_values, marker='o', linestyle='-')

y_values = [10, 15, 5, 20, 12]

create_line_plot(x_values, y_values, "Line Plot ", "X-axis", "Y-axis")

import matplotlib.pyplot as plt

values = [25, 18, 40, 30, 15]

import matplotlib.pyplot as plt

import matplotlib.pyplot as plt

def create_pie_chart(labels, sizes):

plt.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=140)

labels = ['Category 1', 'Category 2', 'Category 3 ', 'Category 4 ']

sizes = [30, 25, 20, 25]

import matplotlib.pyplot as plt

import matplotlib.pyplot as plt

1. Handling Missing Values:

- Detection: Identifying outliers using statistical methods or visualization tools.

3. Data Type Conversion:

- Identifying and removing duplicate records from the dataset.

5. Normalization and Standardization:

- Scaling numerical features to similar ranges (normalization) or standardizing them to have

6. Handling Inconsistent Data:

8. Handling Irrelevant Data:

- Removing features that don't contribute meaningful information to the analysis.

- Pandas is a powerful library for data manipulation and analysis.

- It provides functions for handling missing values (`fillna()`), removing duplicates

- NumPy is used for numerical operations and array manipulation.