Business Data Mining Week 12
The architecture of the Self Organizing Map with two clusters and n input features of any
sample is given below:
Consider input data of size (m, n), where m is the number of training examples and n is the
number of features in each example. The algorithm first initializes the weights of size (n, C),
where C is the number of clusters. Then, iterating over the input data, for each training example
it updates the winning vector (the weight vector with the shortest distance, e.g. Euclidean
distance, from the training example). The weight update rule is given by:
$$w_{ij}(t+1) = w_{ij}(t) + \alpha(t)\,\bigl(x_i^{k} - w_{ij}(t)\bigr)$$
where α(t) is the learning rate at time t, j denotes the winning vector, i denotes the ith feature
of the training example, and k denotes the kth training example from the input data. After
training the SOM network, the trained weights are used for clustering new examples. A new
example is assigned to the cluster of its winning vector.
Algorithm
Training:
Step 1: Initialize the weights wij (random values may be assumed). Initialize the learning rate α.
Step 2: For each j, calculate the squared Euclidean distance D(j) between the input and the
weight vector of cluster j.
Step 3: Find the index J for which D(j) is minimum; J is considered the winning index.
Step 4: For each j within a specific neighborhood of J, and for all i, calculate the
new weight.
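import math

class SOM:
    # Return the index (0 or 1) of the weight vector closest to the sample,
    # using the squared Euclidean distance
    def winner(self, weights, sample):
        D0 = 0
        D1 = 0
        for i in range(len(sample)):
            D0 += math.pow(sample[i] - weights[0][i], 2)
            D1 += math.pow(sample[i] - weights[1][i], 2)
        if D0 < D1:
            return 0
        else:
            return 1

    # Move the winning weight vector J toward the sample
    # (the method name `update` is assumed; only its return line survived)
    def update(self, weights, sample, J, alpha):
        for i in range(len(weights[0])):
            weights[J][i] += alpha * (sample[i] - weights[J][i])
        return weights

# Driver code
def main():
    # Training Examples ( m, n )
    T = [[1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 1]]
    m, n = len(T), len(T[0])
    # weight initialization, stored as C rows of n values (one row per cluster)
    weights = [[0.2, 0.6, 0.5, 0.9], [0.8, 0.4, 0.7, 0.3]]
    # training
    ob = SOM()
    epochs = 3
    alpha = 0.5
    for i in range(epochs):
        for j in range(m):
            # training sample
            sample = T[j]
            # find the winning vector, then update it
            J = ob.winner(weights, sample)
            weights = ob.update(weights, sample, J, alpha)
    print("Trained weights:", weights)

if __name__ == "__main__":
    main()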
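The same training loop can be written for any number of clusters, and the trained weights can then be used to place a new example, as described earlier. The following is a minimal NumPy sketch, not the listing's own implementation: the function names `winner` and `train_som` are illustrative, only the winning vector is updated (no neighborhood), and the data and initial weights are the ones from the driver code.

```python
import numpy as np

def winner(weights, sample):
    """Return the index of the weight vector closest to the sample."""
    distances = np.sum((weights - sample) ** 2, axis=1)  # squared Euclidean distance
    return int(np.argmin(distances))

def train_som(samples, weights, alpha=0.5, epochs=3):
    """Winner-only SOM training: move the winning vector toward each sample."""
    weights = weights.astype(float).copy()
    for _ in range(epochs):
        for sample in samples:
            j = winner(weights, sample)
            weights[j] += alpha * (sample - weights[j])
    return weights

# Training examples (m, n) and initial weights (one row per cluster)
T = np.array([[1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 1]])
W = np.array([[0.2, 0.6, 0.5, 0.9], [0.8, 0.4, 0.7, 0.3]])

trained = train_som(T, W)

# Assign a new example to the cluster of its winning vector
new_sample = np.array([0, 0, 0, 1])
cluster = winner(trained, new_sample)
print("Trained weights:\n", trained)
print("New sample falls in cluster", cluster)
```

With this data the sample [0, 0, 0, 1] lands in cluster 0, because the first weight vector has drifted toward the examples whose last feature is 1.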
*********************************
The Convolutional layer applies filters to the input image to extract features, the Pooling
layer downsamples the image to reduce computation, and the fully connected layer makes
the final prediction. The network learns the optimal filters through backpropagation and
gradient descent.
Now let’s talk about a bit of mathematics that is involved in the whole convolution process.
Convolution layers consist of a set of learnable filters (or kernels) having small widths
and heights and the same depth as that of input volume (3 if the input layer is image
input).
For example, suppose we run convolution on an image with dimensions 34x34x3. The possible
filter sizes are a x a x 3, where 'a' can be a value such as 3, 5, or 7, but must be smaller
than the image dimensions.
During the forward pass, we slide each filter across the whole input volume step by step.
The step size is called the stride (it is often 1, and can be 2, 3, or even 4 for
high-dimensional images). At each position we compute the dot product between the kernel
weights and the corresponding patch of the input volume.
As we slide the filters, we get a 2-D output for each filter. Stacking these outputs together
gives an output volume whose depth equals the number of filters. The network learns all
the filters.
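The sliding dot product described above can be sketched directly in NumPy. This is an illustrative, unoptimized version under stated assumptions: no padding, square filters, random values standing in for the image and for learned filters, and hypothetical helper names `conv_output_size` and `conv_forward`.

```python
import numpy as np

def conv_output_size(n, f, stride=1, padding=0):
    """Spatial output size for input size n, filter size f, given stride and padding."""
    return (n - f + 2 * padding) // stride + 1

def conv_forward(volume, filters, stride=1):
    """Convolve an (H, W, D) volume with K filters of shape (f, f, D), no padding.

    Returns an (H_out, W_out, K) output volume: one 2-D map per filter.
    """
    H, W, D = volume.shape
    K, f, _, _ = filters.shape
    H_out = conv_output_size(H, f, stride)
    W_out = conv_output_size(W, f, stride)
    out = np.zeros((H_out, W_out, K))
    for k in range(K):
        for r in range(H_out):
            for c in range(W_out):
                top, left = r * stride, c * stride
                patch = volume[top:top + f, left:left + f, :]
                # dot product between the kernel weights and the input patch
                out[r, c, k] = np.sum(patch * filters[k])
    return out

rng = np.random.default_rng(0)
image = rng.random((34, 34, 3))                # stand-in for a 34x34x3 image
filters = rng.standard_normal((12, 3, 3, 3))   # 12 filters of size 3x3x3
features = conv_forward(image, filters)
print(features.shape)  # (32, 32, 12)
```

The output depth equals the number of filters, and each spatial dimension shrinks from 34 to 34 - 3 + 1 = 32 because no padding is used.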
Convolutional Layers: This is the layer that extracts features from the input dataset.
It applies a set of learnable filters, known as kernels, to the input images.
The filters/kernels are small matrices, usually of shape 2×2, 3×3, or 5×5. Each filter slides
over the input image data and computes the dot product between the kernel weights and the
corresponding input image patch. The output of this layer is referred to as feature maps.
Suppose we use a total of 12 filters for this layer: we get an output volume of dimension
32 x 32 x 12 (assuming 3 x 3 filters, stride 1, and no padding on the 34 x 34 x 3 input).
Activation Layer: By adding an activation function to the output of the preceding layer,
activation layers add nonlinearity to the network. The layer applies an element-wise
activation function to the output of the convolution layer. Some common activation functions
are ReLU: max(0, x), Tanh, Leaky ReLU, etc. The volume remains unchanged, hence the
output volume will have dimensions 32 x 32 x 12.
Pooling layer: This layer is periodically inserted in convnets. Its main function is
to reduce the size of the volume, which speeds up computation, reduces memory use, and also
helps reduce overfitting. Two common types of pooling layers are max pooling and average
pooling. If we use a max pool with 2 x 2 filters and stride 2, the resultant volume will be
of dimension 16x16x12.
Image source: cs231n.stanford.edu
Flattening: The resulting feature maps are flattened into a one-dimensional vector after
the convolution and pooling layers so they can be passed into a fully connected layer
for classification or regression.
Fully Connected Layers: These take the input from the previous layer and perform the
final classification or regression task.
Output Layer: For classification tasks, the output from the fully connected layers is fed
into a logistic function such as sigmoid or softmax, which converts the output for each
class into a probability score.
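To make the shapes in the layer descriptions above concrete, here is a small NumPy sketch that follows a 32 x 32 x 12 feature volume through activation, pooling, flattening, and a softmax output. It is only an illustration: the feature maps and the fully connected weights are random, the three output classes are hypothetical, and the helper names are not taken from any library.

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: max(0, x). The shape is unchanged."""
    return np.maximum(0, x)

def max_pool_2x2(volume):
    """2 x 2 max pooling with stride 2 on an (H, W, D) volume (H and W must be even)."""
    H, W, D = volume.shape
    blocks = volume.reshape(H // 2, 2, W // 2, 2, D)
    return blocks.max(axis=(1, 3))

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    exp = np.exp(scores - np.max(scores))  # subtract the max for numerical stability
    return exp / exp.sum()

rng = np.random.default_rng(0)
feature_maps = rng.standard_normal((32, 32, 12))   # stand-in for the convolution output

activated = relu(feature_maps)      # activation layer: still 32 x 32 x 12
pooled = max_pool_2x2(activated)    # pooling layer: 16 x 16 x 12
flat = pooled.reshape(-1)           # flattening: vector of 16 * 16 * 12 = 3072 values

# Fully connected layer with 3 hypothetical classes, followed by softmax
W_fc = rng.standard_normal((3, flat.size)) * 0.01
probs = softmax(W_fc @ flat)

print(activated.shape, pooled.shape, flat.shape)
print(probs, probs.sum())
```

Only the pooling step changes the spatial size; the activation keeps the volume as it is, and flattening just rearranges the pooled values into one vector.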
Example:
Let's consider an image and apply the convolution layer, activation layer, and pooling layer
operations to extract its features.
Input image: [Figure: input image]
Steps:
Import the necessary libraries.
Set the parameters.
Define the kernel.
Load the image and plot it.
Reformat the image
Apply convolution layer operation and plot the output image.
Apply activation layer operation and plot the output image.
Apply pooling layer operation and plot the output image.
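import tensorflow as tf
import matplotlib.pyplot as plt

# set the param
plt.rc('figure', autolayout=True)
plt.rc('image', cmap='magma')

# define the kernel (3 x 3 edge-detection filter; the first row is assumed by symmetry)
kernel = tf.constant([
    [-1, -1, -1],
    [-1, 8, -1],
    [-1, -1, -1],
])

# load the image as a single-channel (grayscale) tensor
image = tf.io.read_file('Ganesh.jpg')
image = tf.io.decode_jpeg(image, channels=1)
image = tf.image.resize(image, size=[300, 300])

# plot the image
img = tf.squeeze(image).numpy()
plt.figure(figsize=(5, 5))
plt.imshow(img, cmap='gray')
plt.axis('off')

# Reformat: add a batch dimension, and give the kernel the
# (height, width, in_channels, out_channels) float32 layout that conv2d expects
image = tf.image.convert_image_dtype(image, dtype=tf.float32)
image = tf.expand_dims(image, axis=0)
kernel = tf.reshape(kernel, [*kernel.shape, 1, 1])
kernel = tf.cast(kernel, dtype=tf.float32)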
# convolution layer
conv_fn = tf.nn.conv2d
image_filter = conv_fn(
    input=image,
    filters=kernel,
    strides=1,  # or (1, 1)
    padding='SAME',
)
plt.figure(figsize=(15, 5))
plt.subplot(1, 3, 1)
plt.imshow(
    tf.squeeze(image_filter)
)
plt.axis('off')
plt.title('Convolution')
# activation layer
relu_fn = tf.nn.relu
# Image detection
image_detect = relu_fn(image_filter)
plt.subplot(1, 3, 2)
plt.imshow(
    # Reformat for plotting
    tf.squeeze(image_detect)
)
plt.axis('off')
plt.title('Activation')
# Pooling layer
pool = tf.nn.pool
image_condense = pool(
    input=image_detect,
    window_shape=(2, 2),
    pooling_type='MAX',
    strides=(2, 2),
    padding='SAME',
)
plt.subplot(1, 3, 3)
plt.imshow(tf.squeeze(image_condense))
plt.axis('off')
plt.title('Pooling')
plt.show()
Output: [Figure: three panels titled Convolution, Activation, and Pooling]
Today, deep learning has become one of the most popular and visible areas of machine
learning, due to its success in a variety of applications, such as computer vision, natural
language processing, and reinforcement learning.
Deep learning can be used for supervised, unsupervised, and reinforcement machine
learning, and it handles each of these in a different way.
Supervised Machine Learning: Supervised machine learning is the machine learning
technique in which the neural network learns to make predictions or classify data based
on labeled datasets. Here we provide both the input features and the target variables.
The neural network learns from the cost, or error, that comes from the difference between
the predicted and the actual target; this error is propagated back through the network to
adjust its weights, a process known as backpropagation. Deep learning algorithms like
convolutional neural networks and recurrent neural networks are used for many supervised
tasks like image classification and recognition, sentiment analysis, language translation, etc.
Unsupervised Machine Learning: Unsupervised machine learning is the machine
learning technique in which the neural network learns to discover patterns or to cluster
the dataset based on unlabeled data. Here there are no target variables, and the
machine has to determine the hidden patterns or relationships within the datasets by itself.
Deep learning algorithms like autoencoders and generative models are used for
unsupervised tasks like clustering, dimensionality reduction, and anomaly detection.
Reinforcement Machine Learning: Reinforcement machine learning is the machine
learning technique in which an agent learns to make decisions in an environment to
maximize a reward signal. The agent interacts with the environment by taking actions and
observing the resulting rewards. Deep learning can be used to learn policies, or sets of
actions, that maximize the cumulative reward over time. Deep reinforcement learning
algorithms like Deep Q Networks and Deep Deterministic Policy Gradient (DDPG) are
used for reinforcement learning tasks like robotics and game playing.
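The supervised case above, where the model is corrected by the error between its prediction and the labeled target, can be illustrated with the smallest possible model: a single linear unit trained by gradient descent. This is a sketch under simplifying assumptions, not a deep network. The data is hypothetical (targets generated from y = 2x + 1), and the gradients are written out by hand instead of being obtained by backpropagation through several layers.

```python
import numpy as np

# Labeled dataset: input features x with target variables y (hypothetical data)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

w, b = 0.0, 0.0   # parameters of a single linear unit
lr = 0.1          # learning rate
losses = []

for step in range(500):
    pred = w * x + b                      # prediction
    error = pred - y                      # difference from the actual target
    losses.append(np.mean(error ** 2))    # mean squared error (the cost)
    grad_w = 2 * np.mean(error * x)       # gradient of the cost w.r.t. w
    grad_b = 2 * np.mean(error)           # gradient of the cost w.r.t. b
    w -= lr * grad_w                      # gradient descent update
    b -= lr * grad_b

print("w =", round(w, 3), "b =", round(b, 3))
print("first loss:", losses[0], "last loss:", losses[-1])
```

The cost shrinks at every step and the parameters approach w = 2 and b = 1, the values that generated the targets. A deep network applies the same idea to many layers of weights at once.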
Artificial neurons, also known as units, are the building blocks of artificial neural networks.
The whole artificial neural network is composed of these artificial neurons, which are arranged
in a series of layers. A layer may have a dozen units or millions of units, depending on how
complex the network must be to capture the underlying patterns in the dataset.
Commonly, an artificial neural network has an input layer, an output layer, and hidden
layers. The input layer receives data from the outside world which the neural network needs
to analyze or learn about.
In a fully connected artificial neural network, there is an input layer and one or more hidden
layers connected one after the other. Each neuron receives input from the neurons of the
previous layer or from the input layer. The output of one neuron becomes the input to other
neurons in the next layer of the network, and this process continues until the final layer
produces the output of the network. After passing through one or more hidden layers, the data
is transformed into a representation that the output layer can use. Finally, the output layer
provides the network's response to the incoming data.
In most neural networks, units are linked to one another from one layer to the next. Each
of these links has a weight that controls how much one unit influences another. The neural
network learns more and more about the data as it moves from one unit to another, ultimately
producing an output from the output layer.
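The layer-by-layer flow described above can be sketched as a forward pass through a small fully connected network. This is a minimal illustration under assumptions: the layer sizes (3 inputs, 4 hidden units, 2 outputs), the random weights, and the sigmoid activation are all arbitrary choices, and no training takes place.

```python
import numpy as np

def sigmoid(z):
    """Squash each value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Pass input x through a list of (weights, bias) layers, one after the other."""
    a = x
    for W, b in layers:
        # each unit combines the previous layer's outputs using its link weights
        a = sigmoid(W @ a + b)
    return a

rng = np.random.default_rng(42)
layers = [
    (rng.standard_normal((4, 3)), np.zeros(4)),  # input layer (3) -> hidden layer (4)
    (rng.standard_normal((2, 4)), np.zeros(2)),  # hidden layer (4) -> output layer (2)
]

x = np.array([0.5, -1.0, 2.0])   # data received by the input layer
y = forward(x, layers)           # response produced by the output layer
print(y)
```

Each row of a weight matrix holds the link weights feeding one unit, so a larger weight means the corresponding unit in the previous layer has more influence on it.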
1. Computer vision
The first application of deep learning is computer vision. In computer vision, deep learning
models enable machines to identify and understand visual data. Some of the main
applications of deep learning in computer vision include:
Object detection and recognition: Deep learning models can be used to identify and
locate objects within images and videos, supporting applications such as self-driving
cars, surveillance, and robotics.
Image classification: Deep learning models can be used to classify images into
categories such as animals, plants, and buildings. This is used in applications such as
medical imaging, quality control, and image retrieval.
Image segmentation: Deep learning models can be used to segment images into
different regions, making it possible to identify specific features within images.
2. Natural language processing (NLP):
Language translation: Deep learning models can translate text from one language to
another, making it possible to communicate with people from different linguistic
backgrounds.
Sentiment analysis: Deep learning models can analyze the sentiment of a piece of text,
making it possible to determine whether the text is positive, negative, or neutral. This is
used in applications such as customer service, social media monitoring, and political
analysis.
Speech recognition: Deep learning models can recognize and transcribe spoken words,
making it possible to perform tasks such as speech-to-text conversion, voice search, and
voice-controlled devices.
3. Reinforcement learning:
In reinforcement learning, deep learning is used to train agents to take actions in an
environment so as to maximize a reward. Some of the main applications of deep learning in
reinforcement learning include:
Game playing: Deep reinforcement learning models have been able to beat human
experts at games such as Go, Chess, and Atari.
Robotics: Deep reinforcement learning models can be used to train robots to perform
complex tasks such as grasping objects, navigation, and manipulation.
Control systems: Deep reinforcement learning models can be used to control complex
systems such as power grids, traffic management, and supply chain optimization.
Self-Organizing Maps (SOMs):
- **Use-Cases**:
  - Market Segmentation: Identifying distinct customer segments based on purchasing
    behavior, demographics, etc.
  - Anomaly Detection: Detecting unusual patterns in financial transactions or manufacturing
    processes.
  - Product Recommendation: Recommending products to customers based on their behavior
    and preferences.
Convolutional Neural Networks (CNNs):
- **Use-Cases**:
  - Image Classification: Classifying images into predefined categories.
  - Object Detection: Identifying and localizing objects within images.
  - Facial Recognition: Recognizing faces in images or videos.
Deep Learning:
- **Role**: Deep learning encompasses a variety of neural network architectures,
  including CNNs and Recurrent Neural Networks (RNNs). It learns intricate patterns and
  relationships from data, often with multiple layers of abstraction.
- **Use-Cases**:
  - Natural Language Processing (NLP): Analyzing and generating human language text.
  - Time Series Forecasting: Predicting future values based on historical data.
  - Customer Churn Prediction: Identifying customers likely to leave a service or cancel a
    subscription.
Advantages:
- **CNNs**:
  - Excellent performance in image-related tasks.
  - Hierarchical feature learning.
  - Parameter sharing and translation invariance.
- **Deep Learning**:
  - Ability to learn intricate patterns from large datasets.
  - High flexibility and scalability.
  - Can handle unstructured data types.
Drawbacks:
- **SOMs**:
  - Limited interpretability of clusters.
  - Computational complexity for large datasets.
  - Parameter tuning challenges.
- **CNNs**:
  - Computationally intensive, especially with large models.
  - Requires substantial amounts of labeled data for training.
  - Interpretability challenges.
- **Deep Learning**:
  - Prone to overfitting, especially with insufficient data.
  - Black-box nature makes it hard to interpret decisions.
  - High computational resources required for training.
3. **Interpretability Techniques**:
  - Employ visualization methods like t-SNE (t-distributed Stochastic Neighbor
    Embedding) to interpret SOM results.
  - Use model interpretation techniques like SHAP or LIME for CNNs and deep learning
    models to explain predictions to stakeholders.
4. **Incremental Learning**:
  - Implement incremental learning strategies to update models over time with new data.
  - Deploy online learning approaches to adapt models to evolving business environments.
Conclusion:
Self-Organizing Maps, Convolutional Neural Networks, and Deep Learning offer
powerful capabilities for business data mining, each with unique strengths and weaknesses.
By understanding their roles, use-cases, advantages, and drawbacks, businesses can
effectively leverage these techniques to extract actionable insights, optimize processes, and
drive innovation. Effective application of these methods requires careful consideration of the
specific business problem, data characteristics, computational resources, and interpretability
requirements. Through thoughtful integration and adaptation, these advanced analytics
techniques can unlock new opportunities and provide a competitive edge in today's data-
driven business landscape.
In conclusion, the field of Deep Learning represents a transformative leap in artificial
intelligence. By mimicking the human brain’s neural networks, Deep Learning AI algorithms
have revolutionized industries ranging from healthcare to finance, from autonomous vehicles
to natural language processing. As we continue to push the boundaries of computational
power and dataset sizes, the potential applications of Deep Learning are limitless. However,
challenges such as interpretability and ethical considerations remain significant. Yet, with
ongoing research and innovation, Deep Learning promises to reshape our future, ushering in
a new era where machines can learn, adapt, and solve complex problems at a scale and speed
previously unimaginable.