Understanding Data Mining

This document outlines a course on predictive analytics. The outline covers a review of business analytics, an introduction to predictive analytics (definitions and uses), getting started with predictive analytics, data preprocessing, and data mining. The document focuses on the data mining section: it defines data mining and discusses its importance and goals. It then describes the types of data mining algorithms, including supervised learning techniques such as classification and regression, and unsupervised learning techniques such as clustering.

BANA 104 (3 Units)

Fundamentals of Predictive Analytics

By: Danilo “Sir Dan” Dumantay, MBA, CPA, CGFM, AIF

Predictive Analytics Outline
A. Review of Business Analytics
B. Understanding Predictive Analytics
B1. Practical Illustrations
B2. Introduction, Definition, Evolution
B3. Uses
C. Getting Started with Predictive Analytics
D. Data Preprocessing
E. Understanding Data Mining
F. Using Predictive Tools
G. Interacting With Predictive Analytics Software
(Power BI)
H. Case Study
Understanding Data Mining

1. Introduction of Data Mining
2. Definition of Data Mining
3. Importance of Data Mining
4. Goal of Data Mining
5. Types of Data Mining Algorithms
– Supervised Learning
– Unsupervised Learning
Introduction of Data Mining
Data mining is one of the “10 emerging
technologies that will change the world” listed by
the MIT Technology Review (Larose).

It is no surprise that many firms embrace data
mining in their operations. An article in
Information Systems Management points out that
“data mining has become a widely accepted
process for organizations to enhance their
organizational performance and gain a competitive
advantage.”
Introduction of Data Mining
What is Data Mining in
Business?

• Decision making
• Marketing
• Detecting Fraud
Data mining technology is popular with many
businesses because it allows them to learn
more about their customers, prevent fraud and
identity theft, and make smart marketing
decisions.
Definitions of Data Mining
Data Mining is the analysis step of
Knowledge Discovery in Databases (KDD).
• It is the core of the KDD process, involving
algorithms that explore the data, develop
the model, and discover previously unknown
patterns.
• Algorithm: a process or set of rules to be
followed in calculations or other
problem-solving operations, especially by a
computer.
Definitions of Data Mining
• Data Mining is the process of discovering
new, hidden or unexpected patterns and
inferring associations in raw data.
• Data Mining is a collection of powerful
techniques intended to analyse large
amounts of data.
• There is no single Data Mining approach.
Data Mining can employ a range of
methods, either individually or in
combination with each other.
Importance of Data Mining
• Data are being generated in enormous
quantities
• Data are being collected over long periods
of time
• Data are being kept for long periods of
time
• Computing power is formidable and
cheap
• A variety of Data Mining software is
available
Goal of Data Mining
• The overall goal of the data mining process is to
extract information from a data set and transform
it into an understandable structure for further use
and action. (Predictive Analytics)
• Discovering meaningful new correlations and
patterns. (Predictive Analytics)
• Discovering trends. (Forecasting)

Types of Data Mining Algorithms
Supervised learning
• Classification
• Regression

Unsupervised learning
• Association Analysis
• Sequential Pattern Analysis
• Clustering
• Text Mining/Social Media Sentiment
Analysis
Supervised Learning
• In supervised learning, the data sets include
the known outputs, which are used to train the
machine to produce the desired outputs.
• There is a given data set, and what the
correct output should look like is already
known.
• The underlying idea is that there is a
relationship between the input and the
output.
Unsupervised Learning
• In unsupervised learning, no output labels
are provided; instead, the data is
grouped into clusters of similar items.
• Unsupervised learning allows us
to approach problems with little or no
idea what our results should look like.
We can derive structure from data
where we don't necessarily know the
effect of the variables.
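
One common way to derive structure from data without labels is clustering. The sketch below is an illustration added here, not taken from the slides: it runs a bare-bones one-dimensional k-means on made-up numbers. The function name `k_means_1d`, the data values, and the starting centers are all assumptions chosen for the example.

```python
def k_means_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center,
    then move each center to the mean of the values assigned to it."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # An empty cluster keeps its previous center.
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical measurements with two visible groups, but no labels.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, clusters = k_means_1d(values, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

No output labels are passed in; the two groups emerge only from how close the values are to one another. Real data usually needs several features and a more careful choice of starting centers.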
Illustration: Supervised vs Unsupervised
Situation:
• A basket full of fresh fruits (apple, banana, cherry,
grape, orange).
• The task is to group fruits of the same type together.

Supervised learning:
• From previous work, you already know the
shape of each fruit, so it is easy to arrange the
same type of fruit in one place.
• Here, your previous work is called training data in
data mining.
• You have already learned from your training data
(i.e. the features of each fruit).
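
The supervised case above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the features (a size in cm and a made-up "redness" score), the training examples, and the helper names `train_centroids` and `predict` are all assumptions. It uses a nearest-centroid rule, one of the simplest possible classifiers.

```python
from collections import defaultdict
from math import dist

# Hypothetical training data: (size_cm, redness 0..1) -> known fruit label.
TRAINING_DATA = [
    ((7.5, 0.9), "apple"),
    ((8.0, 0.8), "apple"),
    ((2.0, 0.9), "cherry"),
    ((2.2, 1.0), "cherry"),
    ((18.0, 0.1), "banana"),
    ((19.0, 0.0), "banana"),
]

def train_centroids(examples):
    """Learn one average feature vector (centroid) per known label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(centroids, features):
    """Assign the label whose centroid is closest to the new item."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

centroids = train_centroids(TRAINING_DATA)
print(predict(centroids, (7.8, 0.85)))  # prints: apple
```

The labels in `TRAINING_DATA` play the role of the "previous work" in the illustration: the model can only name a new fruit because it has already seen labeled examples.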
Illustration: Supervised vs Unsupervised, continued
Unsupervised learning:
• You have no knowledge about the fruits. So how will you arrange
fruits of the same type?
• To arrange them, select any physical characteristic of the
fruits.
• If color is selected, the fruits are arranged by color:
o Red group: apples and cherries.
o Green group: bananas and grapes.
• Then select another physical characteristic, e.g. size:
o Red and big group: apples.
o Red and small group: cherries.
o Green and big group: bananas.
o Green and small group: grapes.
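
The grouping steps above can be sketched as follows. This is only an illustration under stated assumptions: the fruit records, their attribute values, and the helper name `group_by` are hypothetical, and the code mirrors the manual grouping in the illustration rather than a full clustering algorithm.

```python
from collections import defaultdict

# Hypothetical observations. Only physical characteristics are used for
# grouping; the "name" field is kept solely to display the result.
FRUITS = [
    {"name": "apple", "color": "red", "size": "big"},
    {"name": "cherry", "color": "red", "size": "small"},
    {"name": "banana", "color": "green", "size": "big"},
    {"name": "grape", "color": "green", "size": "small"},
]

def group_by(items, *characteristics):
    """Group items that share the same values for the chosen characteristics."""
    groups = defaultdict(list)
    for item in items:
        key = tuple(item[c] for c in characteristics)
        groups[key].append(item["name"])
    return dict(groups)

by_color = group_by(FRUITS, "color")
by_color_and_size = group_by(FRUITS, "color", "size")
print(by_color)
print(by_color_and_size)
```

Grouping by color alone leaves two mixed groups; adding size as a second characteristic separates all four fruits, without ever using a fruit label to form the groups.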
Categories of Supervised Learning
Supervised learning problems are categorized into
“classification” and “regression” problems.

• In a classification problem, we predict
results in a discrete output, i.e.
mapping input variables into discrete
categories.
• In a regression problem, we predict
results within a continuous output, i.e.
mapping input variables to some
continuous function.
Categories of Supervised Learning, continued
Given: Data about the size of houses on the real
estate market.
• If we predict whether the
house “sells for more or less than the
asking price”, it is a classification
problem.
• If we predict the price of the house, it is a
regression problem, because price as a
function of size is a continuous output.
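
The difference between the two problem types lies in the target being predicted, not in the input. The sketch below is an assumption-laden illustration: the house records are made up, and `nearest_neighbor` is a hypothetical helper that copies the target of the most similar known house. The same helper yields a category for the classification target and a number for the regression target.

```python
# Hypothetical records: (size_sqm, asking_price, sale_price); values are made up.
HOUSES = [
    (50, 100_000, 95_000),
    (80, 150_000, 155_000),
    (120, 220_000, 230_000),
    (200, 400_000, 380_000),
]

sizes = [size for size, _, _ in HOUSES]

# Classification target: a discrete category per house
# (a sale exactly at the asking price would count as "less" here).
sold_vs_asking = ["more" if sale > ask else "less" for _, ask, sale in HOUSES]

# Regression target: a continuous number per house.
sale_prices = [float(sale) for _, _, sale in HOUSES]

def nearest_neighbor(train_x, train_y, x):
    """Return the target of the training example whose input is closest to x."""
    closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[closest]

new_size = 90
print(nearest_neighbor(sizes, sold_vs_asking, new_size))  # a category
print(nearest_neighbor(sizes, sale_prices, new_size))     # a number
```

A one-neighbor lookup is a deliberately crude model; it is used here only because it works unchanged for both kinds of target.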
Supervised Learning: Classification
1. Classification
– The data mining task of predicting the
value of a categorical variable by
building a model based on one or
more numerical and/or categorical
variables.
– Classify a data item into one or several
predefined classes.
Supervised Learning: Regression
2. Regression
• The data mining task of predicting the value of a
numerical variable by building a model based
on one or more predictors (numerical and
categorical variables).
• Examples:
o Predicting sales amounts of a new product
based on advertising expenditure.
o Predicting wind velocities as a function of
temperature, humidity, air pressure, etc.
o Time series prediction of stock market
indices.
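
The first example, predicting sales from advertising expenditure, can be sketched with a straight-line fit. This is a minimal illustration, not the course's method: the numbers are invented (and deliberately lie on a perfect line to keep the arithmetic easy to check), and `fit_line` is a hypothetical helper implementing ordinary least squares with a single predictor.

```python
from statistics import mean

# Hypothetical data: advertising spend and resulting sales, in the same units.
ad_spend = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [25.0, 45.0, 65.0, 85.0, 105.0]

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (
        sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        / sum((xi - x_bar) ** 2 for xi in x)
    )
    intercept = y_bar - slope * x_bar
    return intercept, slope

intercept, slope = fit_line(ad_spend, sales)
predicted = intercept + slope * 60.0  # forecast for a new spend level
print(round(intercept, 2), round(slope, 2), round(predicted, 2))
```

The output is a number on a continuous scale, which is what distinguishes regression from classification. Real sales data would be noisy, so the fitted line would only approximate the observations.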
Questions?
