Data Mining Syllabus Overview

The document covers various chapters on data mining, including numerical problems and previous year questions related to data preprocessing, mining frequent patterns, classification, clustering, and web mining. Each chapter provides practical exercises and theoretical questions to enhance understanding of data mining concepts and techniques. Key topics include data normalization, classification metrics, clustering algorithms, and web usage mining.

Uploaded by

story.legandery


Chapter 1: Introduction to Data Mining

Numerical Problems:

1. Suppose a dataset contains 10,000 records, and each record has 20 attributes. Estimate
the storage required if each attribute takes 4 bytes.

2. If a data mining system can analyze 500 records per second, estimate how long it will
take to process 2 million records.

3. A company uses data mining to classify customer transactions. Given that 10% of the
transactions are fraudulent, estimate the number of fraudulent transactions from a
dataset of 50,000 transactions.
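
The three estimates above are plain arithmetic. A minimal Python sketch for checking them follows. It assumes the storage figure counts raw attribute data only (no index or header overhead) and that throughput stays constant. The variable names are illustrative.

```python
# Problem 1: raw storage, assuming no per-record or index overhead.
records, attributes, bytes_per_attribute = 10_000, 20, 4
storage_bytes = records * attributes * bytes_per_attribute  # 800,000 bytes

# Problem 2: processing time at a constant 500 records per second.
processing_seconds = 2_000_000 / 500  # 4000.0 seconds
processing_minutes = processing_seconds / 60  # a little under 67 minutes

# Problem 3: expected fraudulent transactions at a 10% fraud rate.
# Integer arithmetic avoids floating-point rounding.
fraudulent = 50_000 * 10 // 100  # 5000

print(storage_bytes, processing_seconds, round(processing_minutes, 1), fraudulent)
```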

Previous Year Questions:

1. Define data mining and explain its different functionalities.

2. Discuss the major issues in data mining with examples.

3. Explain the Knowledge Discovery in Databases (KDD) process.

4. How does a data warehouse support data mining?

Chapter 2: Data Preprocessing

Numerical Problems:

1. Given the dataset: (10, 20, 30, 40, 50, 60, 70, 80, 90, 100)

o Normalize the dataset using Min-Max Normalization between [0,1].

o Perform Z-score normalization using mean and standard deviation.

2. Suppose a dataset contains missing values. If 20% of a 1,000-record dataset has missing
values, how many records need imputation?

3. Given a dataset with 5 categorical attributes, each having 4 possible values, calculate the
number of unique possible records.
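
The problems above can be sketched in Python as follows. This is one possible worked check, not a prescribed solution. The Z-score uses the population standard deviation (`statistics.pstdev`), which is an assumption: if the course expects the sample standard deviation, substitute `statistics.stdev`. The function names `min_max` and `z_score` are illustrative.

```python
from statistics import mean, pstdev

data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so min -> new_min and max -> new_max."""
    lo, hi = min(values), max(values)
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]

def z_score(values):
    """Standardize values using the mean and POPULATION standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

normalized = min_max(data)      # 10 maps to 0.0, 100 maps to 1.0
standardized = z_score(data)    # mean is 55, population std is sqrt(825)

# Problem 2: 20% of 1,000 records have missing values.
records_to_impute = 1_000 * 20 // 100   # 200

# Problem 3: 5 categorical attributes with 4 possible values each.
unique_records = 4 ** 5                 # 1024

print([round(v, 3) for v in normalized])
print([round(v, 3) for v in standardized])
```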

Previous Year Questions:

1. What are the different techniques for handling missing values in a dataset?

2. Explain data discretization and concept hierarchy generation with examples.

3. How is dimensionality reduction helpful in data mining? Explain CUR decomposition.

4. Describe feature selection and feature transformation.

Chapter 3: Mining Frequent Patterns, Associations, and Correlations

Numerical Problems:

1. Given the transactions:

T1: {Milk, Bread, Butter}

T2: {Milk, Bread}

T3: {Milk, Butter}

T4: {Bread, Butter}

o Calculate the support and confidence for the rule {Milk} → {Bread}.

o Find frequent itemsets using Apriori algorithm with a minimum support of 50%.

2. Suppose you have the following association rules and their confidence values:

{A, B} → {C} (Confidence = 70%)

{A} → {C} (Confidence = 50%)

o Compute the Lift Ratio and analyze the association rule strength.
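
A small Python sketch of problem 1 follows, assuming support is measured as the fraction of transactions that contain an itemset. The `apriori` function is a simplified level-wise version written for this four-transaction example, not an optimized implementation. Lift is defined as confidence divided by the support of the consequent. Problem 2 does not state the support of {C}, so the `lift` helper is demonstrated on the {Milk} → {Bread} rule instead.

```python
from itertools import combinations

transactions = [
    {"Milk", "Bread", "Butter"},
    {"Milk", "Bread"},
    {"Milk", "Butter"},
    {"Bread", "Butter"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)

def lift(antecedent, consequent):
    """confidence / support(consequent); a value above 1 suggests positive association."""
    return confidence(antecedent, consequent) / support(consequent)

def apriori(min_support):
    """Level-wise search: grow frequent k-itemsets from frequent (k-1)-itemsets."""
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items if support([i]) >= min_support]
    frequent, k = {}, 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep candidates whose (k-1)-subsets are all frequent,
        # then check the candidate's own support.
        level = [
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
            and support(c) >= min_support
        ]
    return frequent

print(support({"Milk", "Bread"}))       # 2 of 4 transactions
print(confidence({"Milk"}, {"Bread"}))  # 2 of the 3 Milk transactions
print(lift({"Milk"}, {"Bread"}))
for itemset, s in apriori(0.5).items():
    print(sorted(itemset), s)
```

On this data every single item and every pair meets the 50% threshold, while the three-item set appears in only one transaction and is dropped.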

Previous Year Questions:

1. What are the major steps in the Apriori algorithm? Explain with an example.

2. Discuss how association rule mining differs from correlation analysis.

3. Explain different types of association rule techniques and their applications.

4. What are the methods for measuring the quality of association rules?

Chapter 4: Classification and Prediction

Numerical Problems:

1. A dataset has the following classification results for a binary classifier:

True Positives (TP) = 40, False Positives (FP) = 10

True Negatives (TN) = 30, False Negatives (FN) = 20

o Calculate Accuracy, Precision, Recall, and F1-score.

2. Given the following dataset:

Age: (25, 30, 45, 50, 60)

Salary: (40k, 50k, 80k, 90k, 100k)

o Fit a linear regression model to predict salary based on age.

o Estimate the salary for an employee aged 40.

3. A company uses a decision tree-based classifier. The training dataset has 100 records
with 4 attributes. If each attribute can take 3 different values, calculate the number of
possible attribute-value combinations.
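
The three problems above can be checked with a short Python sketch. It assumes ordinary least squares for the regression and treats salaries in thousands (40k is entered as 40). The helper names `classification_metrics` and `fit_line` are illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

# Problem 1: TP = 40, FP = 10, TN = 30, FN = 20.
accuracy, precision, recall, f1 = classification_metrics(40, 10, 30, 20)

# Problem 2: salary (in thousands) as a linear function of age.
ages = [25, 30, 45, 50, 60]
salaries_k = [40, 50, 80, 90, 100]
slope, intercept = fit_line(ages, salaries_k)
salary_at_40 = slope * 40 + intercept  # roughly 68.4, i.e. about 68.4k

# Problem 3: 4 attributes with 3 possible values each.
combinations_count = 3 ** 4  # 81

print(accuracy, precision, round(recall, 4), round(f1, 4))
print(round(slope, 4), round(intercept, 4), round(salary_at_40, 2))
```

The record count (100) does not enter problem 3: the number of attribute-value combinations depends only on the attributes and their domains.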

Previous Year Questions:

1. Explain the Decision Tree algorithm with an example.

2. Differentiate between classification and prediction.

3. What are the different performance evaluation techniques used for classification?

4. Explain the role of logistic regression in prediction problems.

Chapter 5: Cluster Analysis

Numerical Problems:

1. Apply the K-Means clustering algorithm to the dataset:

(2,2), (3,3), (8,8), (9,9)

o Assume K=2 and initial centroids as (2,2) and (8,8).

o Perform one iteration of K-Means and update the centroids.

2. A dataset has 200 points and is divided into 5 clusters. Compute the intra-cluster
distance if the average distance within each cluster is 2.5.

3. A hierarchical clustering algorithm merges two clusters with distances D(A, B) = 5 and
D(B, C) = 7. Compute the new distance using:

o Single linkage method

o Complete linkage method
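
A minimal Python sketch of problems 1 and 3 follows, assuming Euclidean distance for K-Means. Problem 3 is ambiguous as worded, so the sketch adopts one reading and labels it as an assumption: clusters A and C are merged, and the distance from B to the merged cluster is wanted. The function name `kmeans_step` is illustrative.

```python
from math import dist

points = [(2, 2), (3, 3), (8, 8), (9, 9)]
initial_centroids = [(2, 2), (8, 8)]

def kmeans_step(points, centroids):
    """One K-Means iteration: assign each point to its nearest centroid,
    then recompute each centroid as the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[nearest].append(p)
    new_centroids = [
        # Keep the old centroid if a cluster ends up empty.
        tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
        for cluster, old in zip(clusters, centroids)
    ]
    return clusters, new_centroids

clusters, new_centroids = kmeans_step(points, initial_centroids)
print(clusters)        # (2,2) and (3,3) together; (8,8) and (9,9) together
print(new_centroids)   # [(2.5, 2.5), (8.5, 8.5)]

# Problem 3, ASSUMING A and C are merged and D(B, A+C) is required.
d_ab, d_bc = 5, 7
single_linkage = min(d_ab, d_bc)    # closest pair decides: 5
complete_linkage = max(d_ab, d_bc)  # farthest pair decides: 7
print(single_linkage, complete_linkage)
```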

Previous Year Questions:

1. Compare K-Means and Hierarchical clustering techniques.

2. What are the major issues in clustering high-dimensional data?

3. How does outlier detection affect clustering performance?

4. Explain agglomerative and divisive hierarchical clustering techniques.

Chapter 6: Web Mining and Other Data Mining Techniques

Numerical Problems:

1. Given a web log file with 500 entries, where 100 belong to a single user session,
calculate the session duration if the average time spent per entry is 2 minutes.

2. A website has the following clickstream data:

Page A → Page B (50 clicks)

Page B → Page C (30 clicks)

Page C → Page A (20 clicks)

o Compute the transition probability matrix for web usage mining.

3. Given a dataset of multimedia files, where images occupy 40% of the total storage,
videos 50%, and audio 10%, compute the space occupied by each media type if the total
dataset size is 1 TB.
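
The following Python sketch works through the three problems above under stated assumptions. The transition matrix is taken to be row-stochastic: each row is normalized by the total clicks leaving that page. Because the data lists only one outgoing link per page, every observed transition gets probability 1.0 under this reading. The storage split assumes the decimal convention 1 TB = 1000 GB. The function name `transition_matrix` is illustrative.

```python
# Problem 1: 100 entries in the session at an average of 2 minutes each.
session_minutes = 100 * 2  # 200 minutes

# Problem 2: clickstream counts keyed by (source page, destination page).
clicks = {("A", "B"): 50, ("B", "C"): 30, ("C", "A"): 20}
pages = sorted({page for pair in clicks for page in pair})

def transition_matrix(clicks, pages):
    """Row-normalize click counts: P[src][dst] = clicks(src->dst) / clicks leaving src."""
    matrix = {}
    for src in pages:
        outgoing = sum(n for (s, _), n in clicks.items() if s == src)
        matrix[src] = {
            dst: (clicks.get((src, dst), 0) / outgoing if outgoing else 0.0)
            for dst in pages
        }
    return matrix

P = transition_matrix(clicks, pages)
for src in pages:
    print(src, [P[src][dst] for dst in pages])

# Problem 3: split 1 TB by media type, ASSUMING 1 TB = 1000 GB.
total_gb = 1000
shares_percent = {"images": 40, "videos": 50, "audio": 10}
storage_gb = {kind: total_gb * pct // 100 for kind, pct in shares_percent.items()}
print(session_minutes, storage_gb)
```

If the intended answer instead normalizes by all 100 recorded clicks, the entries become 0.5, 0.3 and 0.2, but those are joint frequencies rather than transition probabilities.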

Previous Year Questions:

1. What is web usage mining? Explain its applications.

2. Discuss various types of web mining and their importance.

3. Explain how spatial and temporal data mining differ from traditional data mining
techniques.

4. What are the challenges in multimedia mining?
