DWDN Lab
Build Data Warehouse/Data Mart (using open source tools like Pentaho Data Integration Tool, Pentaho Business Analytics; or other data warehouse tools like Microsoft SSIS, Informatica, Business Objects, etc.).
Design multi-dimensional data models, namely Star, Snowflake and Fact Constellation schemas, for any one enterprise (e.g. Banking, Insurance, Finance, Healthcare, Manufacturing, Automobiles, Sales, etc.).
Write ETL scripts and implement using data warehouse tools.
Perform various OLAP operations such as slice, dice, roll up, drill down and pivot.
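As a rough illustration of the OLAP operations named above, the following sketch applies slice, dice, roll up and pivot to a tiny in-memory fact table using only the Python standard library. The table, its column names (year, region, product, amount) and the helper functions are hypothetical and only stand in for a real warehouse cube built with one of the tools listed earlier.

```python
from collections import defaultdict

# Hypothetical sales fact table: one row per (year, region, product).
FACTS = [
    {"year": 2023, "region": "North", "product": "Loan", "amount": 100},
    {"year": 2023, "region": "South", "product": "Loan", "amount": 150},
    {"year": 2023, "region": "North", "product": "Card", "amount": 80},
    {"year": 2024, "region": "North", "product": "Loan", "amount": 120},
    {"year": 2024, "region": "South", "product": "Card", "amount": 60},
]

def slice_cube(rows, dim, value):
    """Slice: fix a single dimension to one value."""
    return [r for r in rows if r[dim] == value]

def dice_cube(rows, conditions):
    """Dice: restrict several dimensions to sets of allowed values."""
    return [
        r for r in rows
        if all(r[d] in allowed for d, allowed in conditions.items())
    ]

def roll_up(rows, dims):
    """Roll up: sum the measure, keeping only the dimensions in dims."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in dims)] += r["amount"]
    return dict(totals)

def pivot(rows, row_dim, col_dim):
    """Pivot: lay out totals with row_dim as rows and col_dim as columns."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[row_dim]][r[col_dim]] += r["amount"]
    return {k: dict(v) for k, v in table.items()}

if __name__ == "__main__":
    print(roll_up(FACTS, ["year"]))
    print(pivot(FACTS, "region", "year"))
```

Calling roll_up with a longer list of dimensions, for example ["year", "region"], gives the reverse, more detailed view of the same totals.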
3. Perform data preprocessing tasks and demonstrate association rule mining on data sets
Explore various options available in Weka for preprocessing data and apply unsupervised filters such as Discretize, Resample, etc. to each dataset.
Load the weather.nominal, Iris and Glass datasets into Weka and run the Apriori algorithm with different support and confidence values.
Study the rules generated. Apply different discretization filters on numerical attributes and run the Apriori association rule algorithm. Study the rules generated.
Derive interesting insights and observe the effect of discretization in the rule generation process.
Load each dataset into Weka and run the ID3 and J48 classification algorithms. Study the classifier output.
Compute the entropy values and the Kappa statistic.
Extract if-then rules from the decision tree generated by the classifier. Observe the confusion matrix.
Load each dataset into Weka and perform Naïve Bayes classification and k-Nearest Neighbour classification. Interpret the results obtained. Plot ROC curves.
Compare the classification results of the ID3, J48, Naïve Bayes and k-NN classifiers for each dataset, deduce which classifier performs best and which performs worst on each dataset, and justify your answer.
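The entropy values and the Kappa statistic asked for in the classification tasks above can be checked by hand against Weka's output. The sketch below is one way to do so in plain Python; the function names and the layout of the confusion matrix (rows as actual classes, columns as predicted classes) are assumptions made for illustration.

```python
import math

def entropy(class_counts):
    """Shannon entropy (in bits) of a class distribution given as counts."""
    total = sum(class_counts)
    return -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )

def kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    Assumed layout: rows are actual classes, columns are predicted classes.
    """
    size = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / n
    # Chance agreement: product of matching row and column totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (n * n)
    return (observed - expected) / (1 - expected)

if __name__ == "__main__":
    # 9 "yes" and 5 "no" instances, as in the weather.nominal class attribute.
    print(round(entropy([9, 5]), 3))
    # Illustrative two-class confusion matrix, not taken from a real run.
    print(kappa([[20, 5], [10, 15]]))
```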
Load each dataset into Weka and run the simple k-means clustering algorithm with different values of k (number of desired clusters).
Study the clusters formed. Observe the sum of squared errors and centroids, and derive insights.
Explore other clustering techniques available in Weka.
Explore visualization features of Weka to visualize the clusters. Derive interesting insights and explain.
6. Demonstrate knowledge flow application on data sets
Develop a knowledge flow layout for finding strong association rules by using the Apriori and FP-Growth algorithms.
Set up the knowledge flow to load an ARFF file (batch mode) and perform a cross-validation using the J48 algorithm.
Demonstrate plotting multiple ROC curves in the same plot window by using the J48 and Random Forest classifiers.
7. Demonstrate the ZeroR technique on the Iris dataset (by using necessary preprocessing technique(s)) and
8. Write a Java program to prepare a simulated data set with unique instances.
9. Write a Python program to generate frequent item sets / association rules using the Apriori algorithm.
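For orientation, a minimal sketch of Apriori in plain Python follows. It is not a reference solution: the function names, the transaction format (an iterable of item collections) and the use of relative support are assumptions chosen for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Frequent 1-itemsets.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {c for c in current if support(c) >= min_support}
    frequent = {}
    k = 1
    while current:
        frequent.update({c: support(c) for c in current})
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent

def rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules that qualify."""
    out = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    out.append((antecedent, itemset - antecedent, conf))
    return out

if __name__ == "__main__":
    # Hypothetical market-basket transactions.
    baskets = [{"bread", "milk"}, {"bread", "butter"},
               {"bread", "milk", "butter"}, {"milk"}]
    frequent = apriori(baskets, min_support=0.5)
    for antecedent, consequent, conf in rules(frequent, min_confidence=0.6):
        print(set(antecedent), "->", set(consequent), round(conf, 2))
```

Re-running with different min_support and min_confidence values shows how quickly the number of itemsets and rules changes, which mirrors the Weka exercise on Apriori.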
10. Write a Python program to calculate the chi-square value. Report your observations.
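One possible starting point is the Pearson chi-square statistic for a contingency table of two nominal attributes, sketched below in plain Python. The function name and the example counts are illustrative assumptions; comparing the statistic with a critical value from a chi-square table is left to the observation step.

```python
def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom.

    observed is a contingency table given as a list of rows of counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count if the two attributes were independent.
            e = row_totals[i] * col_totals[j] / n
            stat += (o - e) ** 2 / e
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, dof

if __name__ == "__main__":
    # Illustrative 2x2 table of counts for two nominal attributes.
    table = [[250, 200], [50, 1000]]
    stat, dof = chi_square(table)
    print(round(stat, 2), dof)
```

A large statistic relative to the critical value for the given degrees of freedom suggests the two attributes are correlated rather than independent.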
11. Write a program for Naive Bayesian classification in Python.
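A minimal sketch of such a classifier for categorical attributes is shown below. The class name, the add-one (Laplace) smoothing and the toy weather-style data are assumptions made for illustration; numeric attributes would need discretization or a Gaussian likelihood instead.

```python
from collections import Counter, defaultdict

class NaiveBayes:
    """Categorical Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, X, y):
        self.class_counts = Counter(y)
        self.n = len(y)
        # value_counts[(feature_index, label)][value] = frequency
        self.value_counts = defaultdict(Counter)
        self.values = defaultdict(set)  # distinct values seen per feature
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.value_counts[(j, label)][v] += 1
                self.values[j].add(v)
        return self

    def predict(self, row):
        best_label, best_score = None, -1.0
        for label, count in self.class_counts.items():
            score = count / self.n  # prior P(class)
            for j, v in enumerate(row):
                # Smoothed likelihood P(feature_j = v | class).
                seen = self.value_counts[(j, label)][v]
                score *= (seen + 1) / (count + len(self.values[j]))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

if __name__ == "__main__":
    # Hypothetical training data: (outlook, temperature) -> play.
    X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
         ("rainy", "cool"), ("overcast", "hot")]
    y = ["no", "no", "yes", "yes", "yes"]
    model = NaiveBayes().fit(X, y)
    print(model.predict(("sunny", "hot")))
```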
13. Write a program to cluster data of your choice using the simple k-means algorithm with the JDK.
14. Write a program for cluster analysis using the simple k-means algorithm in Python.
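The sketch below shows one way to write simple k-means (Lloyd's algorithm) with only the standard library. Seeding the centroids with the first k points is an assumption made to keep the run deterministic; random or k-means++ seeding is more usual in practice. The function name and sample points are likewise illustrative.

```python
import math

def kmeans(points, k, max_iter=100):
    """Lloyd's k-means; returns (centroids, labels, sse)."""
    # Assumption: the first k points serve as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(x) / len(members) for x in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep empty cluster as is
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    # Sum of squared errors, comparable to the value Weka reports.
    sse = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return centroids, labels, sse

if __name__ == "__main__":
    data = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
    centroids, labels, sse = kmeans(data, k=2)
    print(centroids, labels, round(sse, 3))
```

Running the function for several values of k and comparing the returned sum of squared errors parallels the Weka clustering experiment.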
15. Write a program to compute/display the dissimilarity matrix (for your own dataset containing at least four
16. Visualize the datasets using matplotlib in Python (histogram, box plot, bar chart, pie chart, etc.).
Lab Name: CN
List of Experiments:
1. Study network devices in detail and connect the computers in a Local Area Network.
2. Write a program to implement the data link layer framing methods such as
4. Write a program for Hamming code generation for error detection and correction.
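As an illustration of the idea, the sketch below implements the Hamming(7,4) code with even parity: four data bits are encoded into seven, and a single flipped bit is located and corrected from the syndrome. The choice of the (7,4) variant, the bit ordering and the function names are assumptions; the experiment may call for a different block length.

```python
def hamming_encode(data):
    """Encode 4 data bits as a 7-bit Hamming codeword with even parity.

    Bit positions are 1..7; positions 1, 2 and 4 hold parity bits.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming_decode(code):
    """Correct at most one flipped bit; return (data_bits, error_position)."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome, read as a binary number, is the 1-based error position.
    error_pos = s1 + 2 * s2 + 4 * s4  # 0 means no error detected
    if error_pos:
        c[error_pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], error_pos

if __name__ == "__main__":
    word = hamming_encode([1, 0, 1, 1])
    corrupted = list(word)
    corrupted[4] ^= 1  # flip the bit at position 5 to simulate channel noise
    print(word, corrupted, hamming_decode(corrupted))
```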
node (take an example subnet graph with weights indicating delay between nodes).
12. Write a program to implement a broadcast tree by taking a subnet of hosts.
13. Wireshark
i. NS2 Simulator-Introduction
Lab Name:
Exercise 1
Module name: Implementation of CICD with Java and open source stack
Configure the web application and version control with Git, using Git commands and version control operations.
Exercise 2
Module name: Implementation of CICD with Java and open source stack
Configure a static code analyzer which will perform static analysis of the web
application code and identify the coding practices that are not appropriate.
Configure the profiles and dashboard of the static code analysis tool.
Exercise 3
Module name: Implementation of CICD with Java and open source stack
Write a build script to build the application using a build automation tool like
Maven. Create a folder structure that will run the build script and invoke the
various software development build stages. This script should invoke the static
analysis tool and unit test cases and deploy the application to a web application
server like Tomcat.
Exercise 4
Module name: Implementation of CICD with Java and open source stack
Create a pipeline view of the Jenkins pipeline used in Exercise 8. Configure it with
user defined messages.
Exercise 5
Module name: Implementation of CICD with Java and open source stack
of code.
Exercise 6
Module name: Implementation of CICD with Java and open source stack
testing.
Exercise 7