Deep Learning Lab Manual
Here are some commonly used activation functions in neural networks along with their
mathematical formulas and examples:
1. Sigmoid Function:
The sigmoid function, also known as the logistic function, maps the input to a value between 0 and
1, providing a smooth, continuous output. It is expressed as:
f(x) = 1 / (1 + e^(-x))
Example: In binary classification problems, the sigmoid activation function is often used in the
output layer to produce a probability value indicating the likelihood of the input belonging to a
particular class.
2. ReLU (Rectified Linear Unit):
ReLU is a popular activation function that introduces non-linearity by outputting 0 for negative
inputs and the input value for positive inputs. Mathematically, it is defined as:
f(x) = max(0, x)
Example: ReLU is commonly used in hidden layers of deep neural networks due to its simplicity
and ability to alleviate the vanishing gradient problem.
3. Leaky ReLU:
Leaky ReLU is a modification of the ReLU function that addresses the "dying ReLU" problem. It
introduces a small slope for negative inputs, allowing for non-zero outputs even when the input is
negative. It is expressed as:
f(x) = max(αx, x), where α is a small constant such as 0.01
Example: Leaky ReLU can be beneficial when dealing with negative inputs, preventing the
corresponding neurons from being completely deactivated.
4. Tanh (Hyperbolic Tangent):
The tanh function is similar to the sigmoid function but ranges between -1 and 1. It provides
stronger non-linearity and allows negative values in the output. Mathematically, it is defined as:
f(x) = (e^x - e^(-x)) / (e^x + e^(-x))
Example: Tanh activation function is commonly used in recurrent neural networks (RNNs) and
when the output range needs to be centered around 0.
5. Softmax:
The softmax function is primarily used in the output layer of a neural network for multi-class
classification problems. It converts a vector of real numbers into a probability distribution where the
sum of all the probabilities equals 1. Mathematically, it is defined as:
f(x_i) = e^(x_i) / sum(e^(x_j)) (for each element x_i in the input vector)
Example: Softmax activation is useful when you want the neural network to classify inputs into
mutually exclusive classes.
These are just a few examples of activation functions commonly used in neural networks. Each
activation function has its own characteristics and can be chosen based on the specific requirements
of the problem at hand.
2. What is Learning? Explain various types of Learning Rules.
Learning, in the context of neural networks, refers to the process of adjusting the parameters
(weights and biases) of the network based on the input data and desired outputs. The goal of
learning is to enable the network to make accurate predictions or classifications by minimizing
the difference between the predicted output and the desired output.
There are several types of learning rules used in neural networks. Here are some commonly used
learning rules:
1. Gradient Descent:
Gradient descent is the most popular and widely used learning rule in neural networks.
It adjusts the parameters of the network by iteratively updating them in the direction of steepest
descent of the loss function. The steps involved in gradient descent are as follows:
- Compute the gradient of the loss function with respect to the network parameters.
- Update the parameters by subtracting a small fraction of the gradient, multiplied by a learning
rate, from their current values.
- Repeat these steps until the network converges or a predefined stopping criterion is met.
There are variations of gradient descent, such as stochastic gradient descent (SGD) and
mini-batch gradient descent, which update the parameters using a subset of the training data
rather than the entire dataset at each iteration.
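To make the update step concrete, here is a minimal NumPy sketch of batch gradient descent on a
least-squares loss; the synthetic data, learning rate, and number of steps are illustrative
assumptions, not part of the manual:
import numpy as np
# synthetic linear-regression data (assumed for illustration)
X = np.random.randn(100, 3)
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * np.random.randn(100)
w = np.zeros(3)
b = 0.0
learning_rate = 0.1
for step in range(200):
    pred = X @ w + b
    error = pred - y
    grad_w = X.T @ error / len(y)      # gradient of the MSE loss w.r.t. the weights
    grad_b = error.mean()              # gradient of the MSE loss w.r.t. the bias
    w -= learning_rate * grad_w        # step in the direction of steepest descent
    b -= learning_rate * grad_b
print(w, b)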
2. Back propagation:
Back propagation is a specific algorithm used to compute the gradients of the loss
function with respect to the weights and biases in a neural network. It is often used in
conjunction with gradient descent to update the parameters. Back propagation works by
propagating the errors from the output layer back to the earlier layers of the network, allowing
the gradients to be calculated and used for parameter updates.
3. Hebbian Learning:
Hebbian learning is a type of unsupervised learning that is based on the Hebbian
principle, which states that "cells that fire together wire together." It is used to strengthen the
connections between neurons that have correlated activity. In Hebbian learning, the weight
between two neurons is increased if they are both active at the same time and decreased if they
are not.
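A minimal sketch of the Hebbian weight update (delta_w = learning_rate * x * y), using assumed
toy activity values:
import numpy as np
eta = 0.1
x = np.array([1.0, 0.0, 1.0])   # pre-synaptic activities (assumed)
y = 1.0                         # post-synaptic activity: the neuron is firing
w = np.zeros(3)
w += eta * x * y                # connections between co-active units strengthen
print(w)                        # weights grow only where x is active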
4. Reinforcement Learning:
Reinforcement learning is a type of learning where an agent learns to take actions in an
environment to maximize a cumulative reward signal. It involves the agent interacting with the
environment, receiving feedback in the form of rewards or penalties, and adjusting its actions
based on this feedback. Reinforcement learning algorithms use techniques like Q-learning and
policy gradients to update the network parameters based on the rewards received.
5. Unsupervised Learning:
Unsupervised learning refers to learning from unlabeled data, where the network learns
to extract meaningful representations or patterns from the input data without explicit target
outputs. Common techniques in unsupervised learning include autoencoders, clustering
algorithms, and generative models like variational autoencoders (VAEs) and generative
adversarial networks (GANs).
These are just a few examples of learning rules used in neural networks. The choice of learning
rule depends on the specific problem, the type of data, and the desired outcomes. Different
learning rules have different characteristics and are suited for different types of tasks, whether
it's supervised learning, unsupervised learning, or reinforcement learning.
The Adaline network is designed to perform linear regression and pattern classification
tasks. Unlike the perceptron, which uses a step function as its activation function, Adaline uses
a linear activation function. This allows it to produce continuous outputs instead of binary
outputs, making it suitable for regression tasks.
1. Inputs and Weights: The network receives a set of input values, each associated with an
adjustable weight.
2. Weighted Sum: Each input value is multiplied by its corresponding weight, and the weighted
products are summed together.
3. Linear Activation Function: The output of the Adaline network is the linear combination of
the weighted sums, which is not limited to binary values. The output is simply the weighted
sum itself without any threshold or non-linear transformation.
4. Weight Update: The learning process in Adaline involves adjusting the weights to minimize
the error between the network's output and the desired output. This is achieved using a variant
of the Widrow-Hoff learning rule, also known as the delta rule or the Least Mean Squares
(LMS) algorithm. The weights are updated incrementally based on the difference between the
network output and the target output.
5. Convergence: The learning process continues iteratively until the network converges or a
stopping criterion is met. Convergence is achieved when the weights reach a stable
configuration that minimizes the error and produces accurate predictions.
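As an illustration of the Widrow-Hoff (LMS) update described in step 4, here is a one-step sketch
with assumed example values:
import numpy as np
x = np.array([1.0, -1.0])        # input pattern (assumed)
w = np.array([0.2, 0.4])         # current weights (assumed)
b = 0.0
target = 1.0
lr = 0.1
y = np.dot(w, x) + b             # linear activation, no threshold
error = target - y
w = w + lr * error * x           # Widrow-Hoff / LMS weight update
b = b + lr * error
print(w, b)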
Adaline networks are primarily used for regression tasks, where the goal is to predict
continuous output values based on input features. They can also be used for pattern
classification by interpreting the continuous output as a decision boundary. However, Adaline
networks are limited to linearly separable patterns and cannot handle non-linearly separable
patterns without the use of additional techniques like feature engineering or kernel methods.
1. Layers: The Madaline network consists of multiple layers of Adaline neurons. Typically, it
includes an input layer, one or more hidden layers, and an output layer. The hidden layers
provide the network with additional processing power and the ability to learn complex patterns
and relationships.
2. Adaline Neurons: Each neuron within the Madaline network is an Adaline neuron, which is a
linear regression model with a linear activation function. The Adaline neurons in the network
operate similarly to the single-layer Adaline model, performing weighted sum calculations and
producing continuous outputs.
3. Weight Update: The Madaline network employs a learning algorithm, similar to the Adaline
network, to adjust the weights of the Adaline neurons and minimize the error between the
network's output and the desired output. This learning algorithm, often based on the Widrow-
Hoff learning rule or variants of it, updates the weights iteratively to improve the network's
performance.
4. Activation Function: The Madaline network typically uses a threshold activation function at
the output layer to produce binary outputs or make class predictions. The activation function
can be a step function or a sigmoid function, depending on the problem being solved.
The main advantage of the Madaline network over the single-layer Adaline network is its ability
to handle non-linearly separable patterns by introducing hidden layers. The hidden layers
provide additional levels of abstraction and allow the network to learn more complex decision
boundaries.
However, like the Adaline network, the Madaline network has limitations. It is still a shallow
network and may struggle with highly complex patterns that require deeper architectures such
as modern deep neural networks. Additionally, the Madaline network may suffer from the
vanishing gradient problem and may not converge well with deeper structures.
In summary, the Madaline network is a multi-layer neural network that extends the Adaline
model by incorporating multiple layers of Adaline neurons. It provides the ability to learn non-
linear patterns and perform more complex tasks compared to the single-layer Adaline model.
However, it is a relatively simpler neural network architecture compared to more modern deep
neural networks.
5. What is Pattern Matching? Explain about the rules used for the same
Pattern matching is a fundamental concept in computer science and refers to the process
of finding specific patterns or structures within a given data set or sequence. It involves
searching for instances of a particular pattern or determining whether a given input matches a
specific pattern.
Pattern matching techniques can vary depending on the context and requirements of the
specific application. It can involve simple matching of exact patterns, approximate matching
considering variations or errors, or more complex matching based on probabilistic or statistical
models.
6. Machine Learning-based Techniques: While deep learning falls under the umbrella of
machine learning, there are other machine learning methods that can be used for pattern
matching tasks. These include decision trees, support vector machines (SVM), or k-
nearest neighbors (KNN), which can learn rules or decision boundaries based on
training data and classify or match patterns accordingly.
The selection of the appropriate rule or technique for pattern matching depends on the specific
problem domain, the nature of the data, and the complexity of the patterns being sought.
Combining these rule-based approaches with deep learning techniques can often enhance the
performance and accuracy of pattern matching tasks.
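As a small illustration of the machine learning-based approach, a k-nearest neighbors sketch with
a hypothetical set of stored patterns:
from sklearn.neighbors import KNeighborsClassifier
# toy 2-D patterns and labels (assumed for illustration)
patterns = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = ['A', 'A', 'B', 'B']
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(patterns, labels)
print(knn.predict([[0.9, 0.1]]))   # closest stored pattern is [1, 0] -> 'B'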
6. What is Time Series Data? Explain about various machine learning techniques used for
analyzing time series data.
Time series data is a sequence of data points collected at successive points in time, and
analyzing it can provide insights into trends, patterns, and future predictions. There are various
types of time series data, each requiring different algorithms and techniques for analysis. Here
are some common types of time series data and the corresponding algorithms used to analyze
them:
1. Univariate Time Series:
- Univariate time series involve a single variable observed over time.
2. Multivariate Time Series:
- Multivariate time series involve multiple variables observed over time, often with potential
interdependencies.
3. Longitudinal Data:
- Longitudinal data involves repeated measurements on the same individuals or entities over
time.
4. Panel Data:
- Panel data combines time series and cross-sectional data, with observations on multiple
entities at different time points.
- Algorithms: Fixed Effects Models, Random Effects Models, Pooled OLS Regression, Panel
ARIMA.
- Spatial time series involve data collected across different spatial locations over time.
- High-frequency time series involve data collected at very fine time intervals, often in
financial or trading contexts.
- Algorithms: Event Detection Algorithms, Point Process Models, Hidden Markov Models
(HMM) for Events.
- Irregular time series have unevenly spaced time intervals between observations.
- Seasonal time series exhibit regular patterns that repeat over fixed intervals.
Time series data refers to a sequence of data points collected or recorded over a period of time,
where each data point is associated with a specific timestamp or time index. Time series data
captures observations or measurements taken at regular intervals, such as hourly, daily, weekly,
or monthly.
Weather forecasting is a complex task that involves analyzing time series data to predict future
weather conditions. Machine learning techniques are commonly used to analyze weather data
and make accurate forecasts. Here are various machine learning techniques specifically applied
to weather forecasting:
ARIMA models capture trends, seasonality, and noise in time series data. They are widely used
for short-term weather forecasting tasks. ARIMA models consider the historical weather
observations to predict future weather conditions based on the autoregressive (AR) and moving
average (MA) components.
CNNs, known for their success in image analysis, can also be utilized for weather forecasting.
In the case of weather data, CNNs can be applied to analyze meteorological images or satellite
data. By leveraging convolutional layers, CNNs can extract spatial patterns and make
predictions based on weather images or spatial data.
5. Ensemble Methods:
Ensemble methods combine multiple models to improve forecasting accuracy. They leverage
the strengths of individual models and reduce the impact of biases or errors. Ensemble methods,
such as Random Forests or Gradient Boosting, can be applied to weather forecasting by
aggregating predictions from multiple models or incorporating different features.
DNNs, including feedforward neural networks or hybrid models, can be used for weather
forecasting tasks. By stacking multiple layers, DNNs can learn complex patterns and non-linear
relationships in weather data. These models can be customized with specific architectures,
activation functions, and regularization techniques to suit the characteristics of weather data.
Gaussian processes are probabilistic models that can capture the uncertainty and non-linear
relationships in time series data. GPs have been applied to weather forecasting by modeling the
weather observations as a stochastic process and making predictions based on the learned
distributions.
It is important to note that weather forecasting is a highly complex domain, and machine
learning techniques often complement physical models and numerical weather prediction
(NWP) methods. These techniques integrate multiple data sources, such as satellite data, radar
data, or atmospheric models, to improve the accuracy of weather forecasts.
The selection of the most appropriate technique depends on factors such as the availability and
quality of historical weather data, the forecast horizon (short-term or long-term), and the
specific weather phenomena being predicted. It is common to experiment with multiple
techniques, compare their performance, and fine-tune the models based on the requirements and
constraints of weather forecasting applications.
7. What is the Role of Threshold Values and Activation Functions in Neural Networks?
Threshold values and activation functions play crucial roles in shaping the behavior and
learning of artificial neural networks. They are integral components of individual neurons
(also called nodes or units) within neural network layers. Let's explore their roles:
1. Activation Functions:
An activation function defines the output of a neuron based on its input. It introduces non-
linearity into the network, allowing it to capture complex relationships between inputs and
outputs. Without activation functions, neural networks would be limited to representing
linear transformations, which severely restricts their capability to learn and model intricate
patterns in data.
- Sigmoid: Maps input values to the range (0, 1). It was widely used in the past but is less
favored now due to the vanishing gradient problem.
- Hyperbolic Tangent (tanh): Similar to sigmoid but maps input values to the range (-1, 1).
- Rectified Linear Unit (ReLU): Replaces all negative input values with zero and leaves
positive values unchanged. It's the most widely used activation function due to its simplicity
and effectiveness.
- Leaky ReLU: Similar to ReLU, but allows a small gradient for negative values, addressing
the "dying ReLU" problem.
- Parametric ReLU (PReLU): A variant of Leaky ReLU with a learnable parameter that
determines the slope of the negative side.
- Exponential Linear Unit (ELU): Smooth approximation of ReLU for negative inputs, with
some benefits in learning speed.
- Swish: A recently introduced activation function that performs well in some scenarios.
The choice of activation function depends on the problem at hand, network architecture, and
potential vanishing/exploding gradient issues. Activation functions introduce non-linearities
that enable the network to learn complex mappings and make them suitable for a wide range
of tasks, from image recognition to natural language processing.
2. Threshold Values:
In a biological neuron, there's a certain threshold that the combined input signals must
surpass for the neuron to "fire" and transmit an output signal. While artificial neurons don't
directly mimic biological neurons, the concept of a threshold value is related to how
activation functions operate.
In most artificial neural networks, neurons apply the activation function to the weighted
sum of their inputs. The threshold value corresponds to the point at which the activation
function starts generating non-zero outputs. For example, in the case of ReLU, the threshold
value is zero. Inputs below the threshold (negative values for ReLU) result in zero outputs.
The threshold value isn't usually a separate learnable parameter, as it is in some older
models like perceptrons. Instead, it's incorporated into the bias term of the neuron. The bias
shifts the activation function horizontally, effectively setting a threshold for when the
neuron should become active. This allows the network to learn when to respond to certain
features or patterns in the data.
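A short sketch showing how the bias shifts the effective threshold of a ReLU neuron; the weights
and inputs are arbitrary example values:
import numpy as np
def neuron(x, w, b):
    # ReLU neuron: the bias sets the input level at which it starts producing non-zero output
    return np.maximum(0.0, np.dot(w, x) + b)
w = np.array([1.0, 1.0])
x = np.array([0.3, 0.2])
print(neuron(x, w, b=0.0))    # 0.5 -> active
print(neuron(x, w, b=-1.0))   # 0.0 -> weighted sum falls below the shifted threshold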
8. What is a Neural Network? Explain the following:
a. Feed Forward
b. Recurrent
c. Back Propagation
d. Convolution Neural Network
A neural network, often referred to as an artificial neural network (ANN) or simply a neural net,
is a computational model inspired by the structure and functioning of the human brain's
interconnected neurons. It's a machine learning algorithm designed to recognize patterns, learn
from data, and make predictions or decisions based on that learned information. Neural
networks excel at tasks involving complex, non-linear relationships within data.
1. Input Layer: This layer receives the initial data, which could be images, text, numerical
values, or any other form of input. Each neuron in the input layer corresponds to a feature or
dimension of the input data.
2. Hidden Layers: These layers sit between the input and output layers and are responsible for
learning and representing complex relationships in the data. Each neuron in a hidden layer takes
inputs from the previous layer, performs computations on them using weights and biases, and
then applies an activation function to produce an output. The number of hidden layers and the
number of neurons in each layer are design choices that impact the network's capacity to learn.
3. Output Layer: The final layer produces the network's output. The number of neurons in the
output layer depends on the type of problem the network is solving. For example, in a binary
classification task, there might be one neuron in the output layer that gives the probability of
belonging to the positive class. In a multi-class classification task, the output layer could have a
neuron for each class.
The connections between neurons are defined by weights, which are adjustable parameters that
determine the strength of the connection. During training, the network adjusts these weights
based on the error between its predictions and the actual target values. The objective is to
minimize this error by iteratively updating the weights using optimization algorithms like
gradient descent.
Neural networks can be used for a wide range of tasks, including image and speech recognition,
natural language processing, recommendation systems, game playing, medical diagnosis, and
more. They have shown remarkable success in various domains, often outperforming traditional
machine learning algorithms by automatically learning features and representations from raw
data.
There are various types of neural network architectures, including feedforward neural networks
(the simplest type), convolutional neural networks (CNNs) for image processing, recurrent
neural networks (RNNs) for sequence data, and more advanced architectures like transformers
and GANs (Generative Adversarial Networks). Each architecture is tailored to specific types of
data and tasks.
2. Recurrent Neural Network (RNN):
A recurrent neural network is designed for sequence data, allowing information to loop back
through connections. It maintains a hidden state that carries memory of past inputs, enabling it
to capture temporal dependencies. RNNs are suitable for tasks like language modeling and time
series analysis.
3. Backpropagation:
Backpropagation is a training algorithm for neural networks. It involves two main steps:
forward pass, where inputs generate predictions; and backward pass, where prediction errors are
propagated backward to adjust weights using gradients. This process iterates to minimize
prediction errors and improve the model's performance.
4. Convolutional Neural Network (CNN):
A convolutional neural network is tailored for image and grid-like data. It employs
convolutional layers to extract spatial features, pooling layers to reduce dimensionality, and
fully connected layers for classification. CNNs excel in tasks like image recognition, object
detection, and image generation.
Experiment-2
Implement different types of Plots using Data Visualization in Python and write about
packages used.
Packages used
There are several popular Python packages used for plotting and data visualization. Each of
these packages has its strengths and weaknesses, and the choice of package depends on your
specific needs and preferences. Here are some of the main plotting libraries in Python:
1. Matplotlib: Matplotlib is one of the most widely used and versatile plotting libraries in
Python. It provides a wide range of plotting functions, supports various plot types (line plots,
scatter plots, bar plots, histograms, etc.), and allows for extensive customization. It serves as the
foundation for many other visualization libraries.
2. Seaborn: Seaborn is built on top of Matplotlib and provides a higher-level interface for
creating attractive statistical visualizations. It simplifies the process of creating complex plots
like violin plots, box plots, pair plots, and more. Seaborn also has good support for working
with Pandas DataFrames.
3. Plotly: Plotly is a powerful interactive plotting library that allows you to create interactive,
web-based visualizations. It supports a wide range of plot types and is particularly useful for
creating interactive charts, dashboards, and 3D visualizations.
4. Pandas Visualization: Pandas, a popular data manipulation library, also offers basic plotting
functionality using the `.plot()` method. It leverages Matplotlib behind the scenes and is handy
for quick exploratory visualizations directly from DataFrames.
5. Bokeh: Bokeh is another interactive visualization library, which is suitable for creating web-
based interactive plots. It can handle big datasets and is often used in data dashboards.
6. Altair: Altair is a declarative statistical visualization library that allows you to create concise,
expressive visualizations by specifying the data transformations and mappings in a compact and
easy-to-read format.
7. Ggplot: Ggplot is a Python implementation of the popular ggplot2 library from R. It follows
the grammar of graphics philosophy and provides a straightforward way to create complex
visualizations with a consistent syntax.
8. Holoviews: Holoviews provides a high-level interface for data visualization, allowing you to
create interactive plots with minimal code. It integrates with various plotting libraries like
Matplotlib, Bokeh, and Plotly.
9. Geopandas: Geopandas is a specialized library for working with geospatial data. It allows
you to create maps and plot geospatial data easily.
Line Plot
import matplotlib.pyplot as plt
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 6, 8, 10]   # example data
plt.figure(figsize=(8, 6))
ax = plt.axes()
ax.plot(x_values, y_values)
ax.set_title('Line Plot')
ax.set_xlabel('X values')
ax.set_ylabel('Y values')
plt.show()
Bar Chart
categories = ['Category 1', 'Category 2', 'Category 3', 'Category 4', 'Category 5']
values = [5, 7, 3, 8, 6]   # example data
plt.bar(categories, values)
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Bar Chart')
plt.show()
Histogram
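The histogram code itself is not included in the manual; a minimal sketch using plt.hist on
assumed random data:
import numpy as np
import matplotlib.pyplot as plt
data = np.random.randn(1000)   # assumed sample data
plt.hist(data, bins=30)
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram')
plt.show()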
Pie Chart
labels = ['Category 1', 'Category 2', 'Category 3', 'Category 4']
sizes = [40, 30, 20, 10]   # example data
plt.pie(sizes, labels=labels, autopct='%1.1f%%')
plt.axis('equal')
plt.show()
Scatter Plot
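A minimal scatter plot sketch with assumed random data:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.rand(50)
y = np.random.rand(50)
colors = np.random.rand(50)          # third variable used for point colors
plt.scatter(x, y, c=colors, cmap='viridis')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Scatter Plot')
plt.colorbar()
plt.show()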
Area Plot
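A minimal area (stacked) plot sketch with assumed example series:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y1 = [1, 3, 4, 5, 7]                 # example series
y2 = [2, 2, 3, 4, 5]
plt.stackplot(x, y1, y2, labels=['Series 1', 'Series 2'], alpha=0.6)
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Area Plot')
plt.legend(loc='upper left')
plt.show()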
Experiment-3
Implementation of Data Pre-Processing in Python and its types and also write about
packages used in them.
Data Cleaning, also known as data cleansing or data scrubbing, is the process of identifying and
correcting errors, inconsistencies, and inaccuracies in datasets to improve their quality and
reliability. Data cleaning is a crucial step in the data pre-processing pipeline, as accurate and
reliable data is essential for making informed decisions and obtaining meaningful insights from
the data.
Various methods and techniques are used for data cleaning. Here are some common ones:
1. Handling Missing Values:
- Deletion: Removing rows or columns with missing values. This can lead to loss of
information.
- Imputation: Filling in missing values using methods like mean, median, mode, or more
advanced techniques like regression or machine learning models.
2. Handling Outliers:
- Detection: Identifying values that deviate markedly from the rest of the data, for example
using z-scores or the interquartile range (IQR).
- Treatment: Depending on the nature of the data and the analysis, outliers can be corrected,
transformed, or left as-is.
3. Data Type Conversion:
- Ensuring that data types are consistent and appropriate for analysis.
- Converting categorical variables into numerical representations using techniques like one-
hot encoding.
4. Deduplication:
- Identifying and removing duplicate records so that each observation appears only once.
7. Feature Engineering:
- Creating new features or transforming existing ones to better represent the underlying
patterns in the data.
Python provides several packages and libraries that are commonly used to implement data
cleaning:
1. Pandas:
- It provides DataFrame methods such as isnull(), fillna(), dropna(), replace(), and
drop_duplicates() for handling missing values, replacements, and duplicate records.
2. NumPy:
- It is useful for handling numerical data, performing computations, and working with
matrices.
3. Scikit-learn:
- It offers preprocessing utilities such as SimpleImputer, StandardScaler, and categorical
encoders for imputation, scaling, and encoding.
4. Dedupe:
- It is a library for fuzzy matching and de-duplication of records.
5. OpenRefine:
- It is a standalone tool for exploring and cleaning messy data that can be used alongside
Python workflows.
6. Regular Expressions:
- They can be used with Python's `re` module to identify and correct inconsistencies in text
fields.
These packages, among others, offer a wide range of tools and functions to help you clean and
pre-process your data effectively before analysis or modeling. The choice of methods and
packages depends on the specific nature of your data and the goals of your analysis.
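As a small illustration of several of the cleaning steps above (imputation, one-hot encoding, and
deduplication), a sketch on a hypothetical DataFrame:
import pandas as pd
import numpy as np
# hypothetical raw data with a missing value and a categorical column
df = pd.DataFrame({'age': [25, np.nan, 40], 'city': ['Hyderabad', 'Delhi', 'Hyderabad']})
df['age'] = df['age'].fillna(df['age'].mean())      # imputation with the column mean
df = pd.get_dummies(df, columns=['city'])           # one-hot encoding
df = df.drop_duplicates()                           # deduplication
print(df)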
import pandas as pd
import numpy as np
# example DataFrame with missing values (the original definition is not shown in the manual)
df = pd.DataFrame({'one': [10, 20, np.nan, 40], 'two': [np.nan, 0, 30, 40]})
print(df)
Output
import pandas as pd
import numpy as np
print(df['one'].isnull())
Output
import pandas as pd
import numpy as np
df = pd.DataFrame({'one': [10, 20, 30, 40, 50, 2000],
                   'two': [1000, 0, 30, 40, 50, 60]})
print(df)
print(df.replace({1000: 10, 2000: 60}))
Output
one two
0 10 1000
1 20 0
2 30 30
3 40 40
4 50 50
5 2000 60
import pandas as pd
import numpy as np
# example DataFrame with a row of missing values (original definition not shown in the manual)
df = pd.DataFrame({'one': [10, np.nan, 30], 'two': [1000, np.nan, 30]})
print(df.dropna())
Output
Experiment-4
Implementation of different types of Activation Functions in Python.
Binary Step Function
import numpy as np
import matplotlib.pyplot as plt
def binaryStep(x):
    return np.heaviside(x, 1)
x = np.linspace(-10, 10)
plt.plot(x, binaryStep(x))
plt.axis('tight')
plt.title('Activation Function: Binary Step')
plt.show()
Sigmoid Function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
x = np.linspace(-5, 5)
plt.plot(x, sigmoid(x))
plt.axis('tight')
plt.show()
Linear Function
def linear(x):
    return x
x = np.linspace(-10, 10)
plt.plot(x, linear(x))
plt.axis('tight')
plt.show()
Tanh Function
def tanh(x):
    return np.tanh(x)
x = np.linspace(-10, 10)
plt.plot(x, tanh(x))
plt.axis('tight')
plt.show()
ReLU Function
import numpy as np
import matplotlib.pyplot as plt
def RELU(x):
    x1 = []
    for i in x:
        if i < 0:
            x1.append(0)
        else:
            x1.append(i)
    return x1
x = np.linspace(-10, 10)
plt.plot(x, RELU(x))
plt.axis('tight')
plt.title('Activation Function: ReLU')
plt.show()
Softmax Function
def softmax(x):
    return np.exp(x) / np.sum(np.exp(x), axis=0)
x = np.linspace(-10, 10)
plt.plot(x, softmax(x))
plt.axis('tight')
plt.title('Activation Function: Softmax')
plt.show()
Leaky ReLU Function
def leaky_relu(x, alpha=0.01):
    return np.maximum(alpha * x, x)
x = np.linspace(-10, 10)
y = leaky_relu(x)
plt.figure(figsize=(8, 6))
plt.plot(x, y, label='Leaky ReLU')
plt.title('Activation Function: Leaky ReLU')
plt.xlabel("Input")
plt.ylabel("Output")
plt.grid()
plt.legend()
plt.show()
Experiment-5
Implementation of Time Series Data in Python with different types of Algorithms.
1. ARIMA
#univariate time series
import pandas as pd
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.arima.model import ARIMA
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/a10.csv',
index_col='date', parse_dates=True)
print(df.columns)
column_name = 'value'
result = adfuller(df[column_name])
print("ADF Statistic:", result[0])
print("p-value:", result[1])
print("Critical Values:", result[4])
model = ARIMA(df[column_name], order=(2, 1, 2))
model_fit = model.fit()
predictions = model_fit.predict(start=len(df), end=len(df) + 10)
ax = df[column_name].plot(label='Observed')
predictions.plot(ax=ax, label='Forecast')
ax.legend()
Output
Multivariate Time Series
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv'
df = pd.read_csv(url, index_col='Month', parse_dates=True)
print(df.columns)
column_name = 'Passengers'
order = (1, 1, 1)
model = ARIMA(df[column_name], order=order)
results = model.fit()
forecast_periods = 10
forecast = results.forecast(steps=forecast_periods)
plt.figure(figsize=(12, 6))
plt.plot(df[column_name], label='Observed')
plt.plot(forecast, label='Forecast')
plt.xlabel('Month')
plt.ylabel(column_name)
plt.legend()
plt.show()
Output
/usr/local/lib/python3.10/dist-packages/statsmodels/tsa/base/tsa_model.py:473:
ValueWarning: No frequency information was provided, so inferred frequency MS will
be used.
self._init_dates(dates, freq)
Forecasted values for Variable 1:
2008-07-01 21.488012
2008-08-01 21.488012
2008-09-01 21.488012
2008-10-01 21.488012
2008-11-01 21.488012
2008-12-01 21.488012
2009-01-01 21.488012
2009-02-01 21.488012
2009-03-01 21.488012
2009-04-01 21.488012
Freq: MS, Name: predicted_mean, dtype: float64
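The mixed linear model summaries shown below appear without the code that produced them; a
minimal sketch, assuming the same a10.csv series with a constructed time index and a single
group, might look like this:
import pandas as pd
import statsmodels.formula.api as smf
# longitudinal / panel-style fit on the a10 series (time and group are constructed here)
url = 'https://raw.githubusercontent.com/selva86/datasets/master/a10.csv'
data = pd.read_csv(url, index_col='date', parse_dates=True)
data['time'] = range(1, len(data) + 1)
data['group'] = 1                                   # single group, as in the summaries below
model = smf.mixedlm('value ~ time', data, groups=data['group'])
result = model.fit()
print(result.summary())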
Output
Mixed Linear Model Regression Results
====================================================
Model: MixedLM Dependent Variable: value
No. Observations: 204 Method: REML
No. Groups: 1 Scale: 0.0000
Min. group size: 204 Log-Likelihood: inf
Max. group size: 204 Converged: No
Mean group size: 204.0
-----------------------------------------------------
Coef. Std.Err. z P>|z| [0.025 0.975]
-----------------------------------------------------
Intercept 0.000
value 1.000
Group Var 0.000
Output
Mixed Linear Model Regression Results
=======================================================
Model: MixedLM Dependent Variable: value
No. Observations: 204 Method: REML
No. Groups: 1 Scale: 35.4858
Min. group size: 204 Log-Likelihood: -652.9706
Max. group size: 204 Converged: Yes
Mean group size: 204.0
--------------------------------------------------------
Coef. Std.Err. z P>|z| [0.025 0.975]
--------------------------------------------------------
Intercept 10.694 0.000 0.000 10.694 10.694
Group Var 35.486
Output
Value group
1 0 3.526591
1 3.180891
2 3.252221
3 3.611003
4 3.565869
... ...
199 21.654285
200 18.264945
201 23.107677
202 22.912510
203 19.431740
2. Pooled OLS
import pandas as pd
import statsmodels.api as sm
url = 'https://raw.githubusercontent.com/selva86/datasets/master/a10.csv'
data = pd.read_csv(url, index_col='date', parse_dates=True)
data['time'] = range(1, len(data) + 1)
data['group'] = 1
X = data[['time', 'group']]
X = sm.add_constant(X)
y = data['value']
model = sm.OLS(y, X)
results = model.fit()
print(results.summary())
Output
OLS Regression Results
==============================================================
Dep. Variable: value R-squared: 0.855
Model: OLS Adj. R-squared: 0.854
Method: Least Squares F-statistic: 1191.
Date: Thu, 24 Aug 2023 Prob (F-statistic): 1.23e-86
Time: 09:14:28 Log-Likelihood: -456.07
No. Observations: 204 AIC: 916.1
Df Residuals: 202 BIC: 922.8
Df Model: 1
Covariance Type: nonrobust
==============================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
time 0.0933 0.003 34.509 0.000 0.088 0.099
group 1.1307 0.320 3.538 0.001 0.500 1.761
=============================================================
Omnibus: 44.147 Durbin-Watson: 0.972
Prob(Omnibus): 0.000 Jarque-Bera (JB): 90.037
Skew: 1.031 Prob(JB): 2.81e-20
Kurtosis: 5.518 Cond. No. 237.
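The Gaussian process plotting code below references y_pred, sigma, and y_test, which are not
defined in the manual; a minimal sketch that would produce them, assuming a scikit-learn
GaussianProcessRegressor fit on the same `data` with an 80/20 chronological split:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
X_all = np.arange(len(data)).reshape(-1, 1)          # time index as the regression input
y_all = data['value'].values
split = int(len(data) * 0.8)
X_train, X_test = X_all[:split], X_all[split:]
y_train, y_test = y_all[:split], y_all[split:]
gp = GaussianProcessRegressor(kernel=RBF(length_scale=10.0), alpha=1.0)
gp.fit(X_train, y_train)
y_pred, sigma = gp.predict(X_test, return_std=True)  # mean and std used in the plot below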
plt.figure(figsize=(10, 6))
plt.plot(data.index, data['value'], label='Actual', color='blue')
plt.plot(data.index[-len(y_test):], y_pred, label='Predicted', color='red')
plt.fill_between(data.index[-len(y_test):], y_pred - 1.96 * sigma, y_pred + 1.96 *
sigma, color='pink', alpha=0.3)
plt.title('Gaussian Process Regression')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.show()
Output
import numpy as np
from tensorflow import keras
from tensorflow.keras.layers import Dense
# assumes the Sequential model, training data (X_train, y_train) and test_input
# were prepared in earlier steps not shown in the manual
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train, epochs=2)
predictions = model.predict(test_input)
print(predictions)
Output
Epoch 1/2
Epoch 2/2
[[10.547165 ]
[10.675017 ]
[10.802973 ]
[10.931027 ]
[11.0591755]
[11.187414 ]
[11.315742 ]
[11.444154 ]
[11.572646 ]
[11.701216 ]]
Experiment-6
import numpy as np
# fixed example weights and biases for a single feed-forward layer
weights = np.array([[1.0, 2.0], [3.0, 4.0]])
biases = np.array([5.0, 6.0])
def feed_forward(x):
    z = np.dot(x, weights) + biases
    a = np.tanh(z)
    return a
x = np.array([1, 2])
y = feed_forward(x)
print(y)
Output
[1. 1.]
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
sequence_length = 1
num_samples = 10
# toy data: random one-step sequences and targets (assumed; not shown in the manual)
X = np.random.randn(num_samples, sequence_length, 1)
y = np.random.randn(num_samples)
X_test = X
model = keras.Sequential([
    layers.SimpleRNN(8, input_shape=(sequence_length, 1)),
    layers.Dense(1)
])
model.compile(optimizer='adam', loss='mse')
model.summary()
model.fit(X, y, epochs=10, validation_split=0.2)
predictions = model.predict(X_test)
print(predictions)
Output
Model: "sequential_2"
Non-trainable params: 0
Epoch 1/10
Epoch 2/10
Epoch 3/10
1/1 [==============================] - 0s 54ms/step - loss: 0.2121 - val_loss: 0.5521
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
[[-0.00139036]
[-0.01395435]
[-0.03636114]
[-0.00224465]
[ 0.02789616]
[-0.00750315]
[ 0.0128946 ]
[ 0.02804753]
[-0.02991994]
[ 0.02489439]]
import numpy as np
def backpropagation(x, y, weights, biases, learning_rate):
    z = np.dot(x, weights) + biases
    a = np.tanh(z)
    error = y - a
    da = error * (1 - a ** 2)        # gradient through the tanh activation
    dw = np.outer(x, da)             # gradient w.r.t. the weight matrix
    db = np.sum(da, axis=0)
    weights = weights + learning_rate * dw
    biases = biases + learning_rate * db
    return weights, biases
# initial parameters (match the values printed in the output below)
weights = np.array([[1.0, 2.0], [3.0, 4.0]])
biases = np.array([5.0, 6.0])
x = np.array([1, 2])
y = np.array([3, 4])
learning_rate = 0.1
for i in range(100):
    weights, biases = backpropagation(x, y, weights, biases, learning_rate)
print(weights)
print(biases)
Output
[[1. 2.]
[3. 4.]]
[5. 6.]
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# MNIST digits reshaped to 28x28x1 images and scaled to [0, 1]
(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()
train_images = train_images.reshape((-1, 28, 28, 1)) / 255.0
test_images = test_images.reshape((-1, 28, 28, 1)) / 255.0
model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),  # filter sizes assumed
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5)
Output
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
iris = datasets.load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
model = Sequential()
model.add(Dense(16, activation='relu', input_shape=(4,)))   # hidden layer size assumed
model.add(Dense(3, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, validation_data=(X_test, y_test))
Output
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
np.random.seed(0)
# synthetic linear data with Gaussian noise
X = np.random.randn(100)
y = 2 * X + 1 + np.random.randn(100)
X_test = np.linspace(X.min(), X.max(), 100).reshape(-1, 1)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=(1,))
])
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(X.reshape(-1, 1), y, epochs=50, verbose=0)   # number of epochs assumed
y_pred = model.predict(X_test)
plt.scatter(X, y, label='Data')
plt.plot(X_test, y_pred, color='red', label='Fitted line')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
Output
import numpy as np
import tensorflow as tf
np.random.seed(0)
# synthetic binary classification data
X = np.random.randn(1000, 10)
y = np.random.randint(2, size=1000)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=10, batch_size=40, validation_split=0.2, verbose=2)  # batch size assumed
Output
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
20/20 - 0s - loss: 0.6534 - accuracy: 0.6500 - val_loss: 0.7093 - val_accuracy: 0.4875 -
65ms/epoch - 3ms/step
Epoch 9/10
Epoch 10/10
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, SimpleRNN, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
num_samples = 1000
sequence_length = 50
input_dim = 13
output_dim = 10
# synthetic sequence data with integer class labels
X = np.random.randn(num_samples, sequence_length, input_dim)
y = np.random.randint(output_dim, size=(num_samples,))
input_layer = Input(shape=(sequence_length, input_dim))
rnn_layer = SimpleRNN(64)(input_layer)
output_layer = Dense(output_dim, activation='softmax')(rnn_layer)
model = Model(inputs=input_layer, outputs=output_layer)
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=Adam(learning_rate=0.001),
              metrics=['accuracy'])
model.fit(X, y, epochs=5, batch_size=32)
X_test = X[:10]          # test sequences assumed to be a slice of the generated data
predictions = model.predict(X_test)
Output
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, TimeDistributed, Dense
def generate_data():
    X = np.random.randn(100, 10, 3)
    Y = np.sin(np.sum(X, axis=2))
    return X, Y
X_train, Y_train = generate_data()
model = Sequential()
model.add(SimpleRNN(16, return_sequences=True, input_shape=(10, 3)))  # recurrent layer assumed
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, Y_train, epochs=2)
X_test, _ = generate_data()
predictions = model.predict(X_test)
print("Predictions:")
print(predictions)
Output
Epoch 1/2
Epoch 2/2
Predictions:
[[[ 2.18788102e-01]
[ 2.71870680e-02]
[ 1.69337727e-02]
[-2.41994888e-01]
[-2.99985141e-01]
[ 1.29971102e-01]
[-1.18486717e-01]
[ 2.03054249e-01]
[-2.30462939e-01]
[-4.99120951e-01]]
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
seq_length = 10
num_samples = 1000
# synthetic sequences; the target is the sine of the sum of each sequence
X = np.random.randn(num_samples, seq_length, 1)
y = np.sin(np.sum(X, axis=1))
split_ratio = 0.8
split = int(num_samples * split_ratio)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
model = Sequential([
    LSTM(units=32, input_shape=(seq_length, 1)),   # number of units assumed
    Dense(units=1)
])
model.compile(optimizer='adam', loss='mse')
batch_size = 32
epochs = 5
model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs,
          validation_data=(X_test, y_test))
new_data = X_test[:10]
predictions = model.predict(new_data)
print("Predictions:")
print(predictions)
Output
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Predictions:
[[ 0.03846954]
[-0.06677599]
[ 0.02927302]
[ 0.00162729]
[-0.00160487]
[-0.12391292]
[ 0.05950115]
[-0.13238075]
[-0.03390695]
[-0.01337759]]
Back Propagation Neural Network
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
num_samples = 1000
sequence_length = 10
input_dim = 1
X = np.random.randn(num_samples, sequence_length, input_dim)
y = np.sum(X, axis=1)
model = keras.Sequential([
    layers.Flatten(input_shape=(sequence_length, input_dim)),   # hidden layers assumed
    layers.Dense(16, activation='relu'),
    layers.Dense(1)
])
model.compile(optimizer='adam', loss='mse')
model.fit(X, y, epochs=5)
Output
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, SimpleRNN, Dense
# MNIST rows treated as a sequence of 28 time steps with 28 features, scaled to [0, 1]
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = keras.Sequential([
    Input(shape=(28, 28)),
    SimpleRNN(units=64, activation='relu'),
    Dense(units=10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=64, validation_split=0.2)
Output
Epoch 1/5
Epoch 2/5
Epoch 3/5
750/750 [==============================] - 6s 8ms/step - loss: 0.2384 - accuracy:
0.9300 - val_loss: 0.2670 - val_accuracy: 0.9213
Epoch 4/5
Epoch 5/5
#image classification
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((-1, 28, 28, 1)) / 255.0
test_images = test_images.reshape((-1, 28, 28, 1)) / 255.0
model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),  # filter sizes assumed
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=2, validation_data=(test_images, test_labels))
predictions = model.predict(test_images)
Output
Epoch 1/2
Epoch 2/2
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, TimeDistributed, Dense
def generate_data():
    X = np.random.randn(100, 10, 3)
    Y = np.sin(np.sum(X, axis=2))
    return X, Y
X_train, Y_train = generate_data()
model = Sequential()
model.add(SimpleRNN(16, return_sequences=True, input_shape=(10, 3)))  # recurrent layer assumed
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, Y_train, epochs=2)
X_test, _ = generate_data()
predictions = model.predict(X_test)
print("Predictions:")
print(predictions)
Output:
Epoch 1/2
Epoch 2/2
Predictions:
[[[ 2.00504497e-01]
[-2.02019125e-01]
[-3.98066401e-01]
[-6.29033446e-01]
[ 3.75485979e-02]
[ 1.73215717e-02]
[ 1.17083795e-01]
[ 6.31665707e-01]
[-2.20511571e-01]
[-5.30349433e-01]]
Experiment-7
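The Adaline training code that produced the output below is not included in the manual; a
minimal delta-rule (LMS) sketch consistent with the printed error trace, using the same dataset
as the Madaline example further down and assumed hyperparameters:
import numpy as np
def adaline(Input, Target, lr, epochs):
    weight = np.random.random(Input.shape[1])
    bias = np.random.random(1)
    for _ in range(epochs):
        error_sum = 0
        for i in range(Input.shape[0]):
            y = np.dot(weight, Input[i]) + bias              # linear activation
            weight = weight + lr * (Target[i] - y) * Input[i]
            bias = bias + lr * (Target[i] - y)
            error_sum += (Target[i] - y) ** 2
        print('Error:', error_sum)
    return weight, bias
# same input patterns and targets as the Madaline example (learning rate and epochs assumed)
x = np.array([[1.0, 1.0, 1.0],
              [1.0, -1.0, 1.0],
              [-1.0, 1.0, 1.0],
              [-1.0, -1.0, -1.0]])
t = np.array([1, 1, 1, -1])
w, b = adaline(x, t, 0.1, 23)
print('Weight:', w)
print('Bias:', b)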
Output
Error: [3.04097366]
Error: [1.82660775]
Error: [1.25568368]
Error: [0.86871029]
Error: [0.60121077]
Error: [0.41609032]
Error: [0.28797116]
Error: [0.19930143]
Error: [0.13793416]
Error: [0.0954626]
Error: [0.06606853]
Error: [0.04572525]
Error: [0.0316459]
Error: [0.02190176]
Error: [0.01515795]
Error: [0.01049063]
Error: [0.00726044]
Error: [0.00502487]
Error: [0.00347765]
Error: [0.00240684]
Error: [0.00166575]
Error: [0.00115284]
Error: [0.00079787]
Weight: [0.00977986 0.00977986 0.98802216]
Bias: [0.00977986]
Madaline Networks
import numpy as np
def activation_fn(z):
    if z >= 0:
        return 1
    else:
        return -1
def madaline(Input, Target, lr, epoch):
    num_inputs = Input.shape[1]
    weight = np.random.random(num_inputs)
    bias = np.random.random()
    k = 0
    while k < epoch:
        error = 0
        for i in range(Input.shape[0]):
            y_input = sum(weight * Input[i]) + bias
            y = activation_fn(y_input)
            if y != Target[i]:
                weight = weight + lr * (Target[i] - y) * Input[i]
                bias = bias + lr * (Target[i] - y)
                error += (Target[i] - y) ** 2
        print(k, '>> Error:', error)
        k += 1
    return weight, bias
# Input dataset
x = np.array([[1.0, 1.0, 1.0],
              [1.0, -1.0, 1.0],
              [-1.0, 1.0, 1.0],
              [-1.0, -1.0, -1.0]])
# Target values
t = np.array([1, 1, 1, -1])
w, b = madaline(x, t, 0.1, 10)
print('Weight:', w)
print('Bias:', b)
Output
0 >> Error: 0
1 >> Error: 0
2 >> Error: 0
3 >> Error: 0
4 >> Error: 0
5 >> Error: 0
6 >> Error: 0
7 >> Error: 0
8 >> Error: 0
9 >> Error: 0
Weight: [0.87271172 0.94767273 0.18015286]
Bias: 0.9207387586434277
Experiment-8
Output