Data Science Introduction
Dr. A. Ramesh
DEPARTMENT OF MANAGEMENT
IIT ROORKEE
Objective of the course
• The principal focus of this course is to build conceptual understanding
using simple and practical examples rather than a repetitive, point-and-click
mentality
• This course should make you comfortable using analytics in your career
and your life
• You will learn how to work with real data. You may have learned many
different methodologies, but choosing the right one is what matters
Learning objectives
1. Define data and its importance
2. Define data analytics and its types
3. Explain why analytics is important in today’s business environment
4. Explain how statistics, analytics and data science are interrelated
5. Why Python?
6. Explain the four different levels of Data:
– Nominal
– Ordinal
– Interval and
– Ratio
1. Define Data and its importance
1.1 Variable, Measurement and Data
1.2 What is generating so much data?
1.3 How does data add value to business?
Data warehouse
Business value
Source: https://datajobs.com/
Data Products
1.4 Why is data important?
2. Define data analytics and its types
• Define data analytics
• Data analysis
2.1. Define data analytics
2.2 Why is analytics important?
2.3 Data analysis
• Data analysis is the process of examining, transforming, and
arranging raw data in a specific way to generate useful
information from it
• Data analysis allows for the evaluation of data through
analytical and logical reasoning to lead to some sort of
outcome or conclusion in some context
• Data analysis is a multi-faceted process that involves a
number of steps, approaches, and diverse techniques
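The examine, transform, and arrange steps described above can be sketched in a few lines of Python. This is only an illustrative sketch: the records, the region names, and the helper functions `clean` and `total_by_region` are hypothetical and not part of the course material.

```python
from collections import defaultdict

# Hypothetical raw sales records: (region, amount as text). Values are made up.
raw_records = [
    ("north", "120.50"),
    ("South", " 80"),
    ("north", ""),        # missing amount
    ("south", "45.25"),
    ("East", "200"),
]


def clean(records):
    """Examine and transform: drop rows with no amount, normalise the text."""
    cleaned = []
    for region, amount in records:
        amount = amount.strip()
        if not amount:  # skip rows without a usable amount
            continue
        cleaned.append((region.strip().title(), float(amount)))
    return cleaned


def total_by_region(records):
    """Arrange: total the amounts per region, largest total first."""
    totals = defaultdict(float)
    for region, amount in records:
        totals[region] += amount
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


summary = total_by_region(clean(raw_records))
for region, total in summary:
    print(f"{region}: {total:.2f}")
```

The raw rows become useful information only after the cleaning and aggregation steps, which is the point the bullets above make.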
2.4 Data analytics vs. Data analysis
Analysis
• Looks at the past
• Explains how and why something happened
Analytics
• Looks at the future
• Qualitative analytics = intuition + analysis
• Quantitative analytics = formulas + algorithms
Analysis can also be qualitative or quantitative
• Quantitative analysis = data + how sales decreased last summer
Analysis ≠ Analytics
Data analysis ≠ Data analytics
2.5 Classification of Data analytics
Based on the phase of workflow and the kind of analysis required, there are
four major types of data analytics.
• Descriptive analytics
• Diagnostic analytics
• Predictive analytics
• Prescriptive analytics
Classification of Data analytics
Source: https://www.governanceanalytics.org/knowledge-base/Main_Tools/Data_classification_and_analysis
Descriptive Analytics
• Descriptive analytics is the conventional form of Business Intelligence and
data analysis
• It seeks to provide a depiction or “summary view” of facts and figures in
an understandable format
• It either informs or prepares data for further analysis
• Descriptive analysis or statistics can summarize raw data and convert it
into a form that can be easily understood by humans
• It can describe in detail an event that has occurred in the past
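A minimal sketch of such a "summary view", using Python's standard `statistics` module. The monthly sales figures are made up for illustration.

```python
import statistics

# Hypothetical monthly sales figures (units sold); the values are made up.
monthly_sales = [120, 135, 128, 150, 142, 138]

# Summarise the raw numbers into a few easily understood figures.
summary = {
    "count": len(monthly_sales),
    "mean": statistics.mean(monthly_sales),
    "median": statistics.median(monthly_sales),
    "min": min(monthly_sales),
    "max": max(monthly_sales),
    "stdev": statistics.stdev(monthly_sales),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```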
Example
Common examples of descriptive analytics are company reports that simply
provide a historical review, such as:
• Data Queries
• Reports
• Descriptive Statistics
• Data Visualization
• Data dashboard
Source: https://www.linkedin.com/learning/478e9692-d13d-338f-907e-d76f0724d773
Diagnostic analytics
Example
1. Data Discovery
2. Data Mining
3. Correlations
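As one illustration of the "Correlations" item, the sketch below computes the Pearson correlation coefficient between two variables in plain Python. The advertising and sales numbers and the helper name `pearson_r` are assumptions made for this example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: advertising spend and sales for five periods.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 41, 62, 79, 102]

r = pearson_r(ad_spend, sales)
print(f"correlation between ad spend and sales: {r:.3f}")
```

A value close to +1 suggests the two variables move together. Correlation alone does not establish that one causes the other.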
Predictive analytics
Source: https://www.logianalytics.com/wp-content/uploads/2017/11/predictive-1.png
Example
• A set of techniques that use models constructed from past data to predict
the future or ascertain the impact of one variable on another:
1. Linear regression
2. Time series analysis and forecasting
3. Data mining
Source: https://bigdata-madesimple.com/5-examples-predictive-analytics-travel-industry/
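The linear regression item can be sketched as a least-squares line fitted to past data and then used to forecast the next period. The monthly figures and the helper name `fit_line` are hypothetical, chosen only to illustrate the idea.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past data: sales observed in months 1 to 6.
months = [1, 2, 3, 4, 5, 6]
sales = [102, 108, 121, 131, 138, 152]

slope, intercept = fit_line(months, sales)

# Use the fitted model to predict month 7, which has not been observed yet.
forecast = slope * 7 + intercept
print(f"forecast for month 7: {forecast:.1f}")
```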
Prescriptive analytics
Prescriptive analytics: Example
• Optimization Model
• Simulation
• Decision Analysis
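A toy optimization model illustrates how prescriptive analytics recommends an action instead of only describing or predicting. Everything here is assumed for the example: the demand curve, the unit cost, and the range of candidate prices.

```python
UNIT_COST = 10  # hypothetical cost of producing one unit


def demand(price):
    """Assumed demand curve: higher prices sell fewer units."""
    return max(0, 100 - 2 * price)


def profit(price):
    """Profit at a given price under the assumed cost and demand."""
    return (price - UNIT_COST) * demand(price)


# Evaluate every candidate price and recommend the most profitable one.
candidate_prices = range(10, 51)
best_price = max(candidate_prices, key=profit)
best_profit = profit(best_price)

print(f"recommended price: {best_price}, expected profit: {best_profit}")
```

Real optimization models usually rely on dedicated solvers. Exhaustive search is used here only because the set of candidates is tiny.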
3. Explain why analytics is important
Figure: search trends for “Data Scientist” compared with “Statistician” and “Operations Researcher”
Source: https://timesofindia.indiatimes.com/india/Data-scientists-earning-more-than-CAs-engineers/articleshow/52171064.cms
3.1 Demand for Data Analytics
Source: http://timesofindia.indiatimes.com/articleshow/52171064.cms?utm_source=contentofinterest&utm_medium=text&utm_campaign=cppst
3.2 Elements of data analytics
4. Data analyst and Data scientist
4.1 The requisite skill set
Data science sits at the intersection of three skill areas:
• Mathematics expertise
• Technology and hacking skills
• Business and strategy acumen
4.2 Difference between Data analyst and Data Scientist
Analyst
• Business administration
• Domain-specific responsibility, for example marketing analyst, financial analyst, etc.
Data Scientist
• Advanced algorithms and machine learning
Source: https://datajobs.com/
5. Why Python?
Features
• Simple and easy to learn
• Free and open source
• Interpreted
• Dynamically typed
• Extensible
• Embeddable
• Extensive library
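Two of these features, dynamic typing and the extensive standard library, can be seen in a short sketch. The variable names are arbitrary.

```python
import math  # part of the standard library, no installation needed

# Dynamically typed: the same name can refer to values of different types,
# and the type is checked at run time rather than declared in advance.
kinds = []
value = 42
kinds.append(type(value).__name__)
value = "forty-two"
kinds.append(type(value).__name__)
value = [4, 2]
kinds.append(type(value).__name__)
print(kinds)

# Extensive library: common numerical routines are available out of the box.
root = math.sqrt(16)
print(root)
```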
5. Why Python?
Usability
• Desktop and web applications
• Database applications
• Networking applications
• Data analysis (Data Science)
• Machine learning
• IoT and AI applications
• Games
Companies using Python
Why Jupyter Notebook?
• Client-server application
• Edit code in a web browser
• Easy documentation
• Easy demonstration
• User-friendly interface
6. Explain the four different levels of Data
• Types of Variables
• Levels of Data Measurement
• Compare the four different levels of Data:
– Nominal
– Ordinal
– Interval and
– Ratio
• Usage Potential of Various Levels of Data
• Data Level, Operations, and Statistical Methods
6.1 Types of Variables
Data
• Categorical (defined categories)
– Examples: Marital Status, Political Party, Eye Color
• Numerical
– Discrete (counted items). Examples: Number of Children, Defects per hour
– Continuous (measured characteristics). Examples: Weight, Voltage
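The three kinds of variable call for different summaries, as the sketch below suggests: categories are counted, discrete values are totalled, and continuous values are averaged. The records and field names are hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical records mixing the three kinds of variable.
records = [
    {"marital_status": "Married", "children": 2, "weight_kg": 70.5},
    {"marital_status": "Single", "children": 0, "weight_kg": 64.2},
    {"marital_status": "Married", "children": 1, "weight_kg": 81.0},
]

# Categorical (defined categories): count how often each category occurs.
status_counts = Counter(r["marital_status"] for r in records)

# Discrete (counted items): whole-number counts can be totalled.
total_children = sum(r["children"] for r in records)

# Continuous (measured characteristics): averages are meaningful.
average_weight = mean(r["weight_kg"] for r in records)

print(status_counts, total_children, round(average_weight, 1))
```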
6.2 Levels of Data Measurement
6.3.1 Nominal scale
6.3.2 Ordinal scale
6.3.3 Interval scale
6.3.4 Ratio scale
6.4 Usage Potential of Various Levels of Data
From highest to lowest usage potential:
• Ratio
• Interval
• Ordinal
• Nominal
6.5 Impact of choice of measurement scale
Data Level | Meaningful Operations | Statistical Methods
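The body of this table is not available here, so the sketch below follows the usual textbook view of which operations are meaningful at each level. Treat that mapping, and all the sample values, as assumptions made for illustration.

```python
import statistics

# Nominal: categories with no order. Counting and the mode are meaningful.
eye_colors = ["brown", "blue", "brown", "green"]
nominal_mode = statistics.mode(eye_colors)

# Ordinal: ordered categories. Ranking and the median are meaningful.
RATING_ORDER = ["poor", "fair", "good", "excellent"]
ratings = ["good", "poor", "excellent", "good", "fair"]
ranks = sorted(RATING_ORDER.index(r) for r in ratings)
ordinal_median = RATING_ORDER[ranks[len(ranks) // 2]]  # odd number of ratings

# Interval: equal spacing but no true zero. Differences and means are
# meaningful, ratios are not (30 °C is not "1.5 times as hot" as 20 °C).
temps_c = [20.0, 25.0, 30.0]
interval_mean = statistics.mean(temps_c)
temp_diff = temps_c[2] - temps_c[0]

# Ratio: true zero. Ratios are meaningful as well.
weights_kg = [40.0, 80.0]
weight_ratio = weights_kg[1] / weights_kg[0]

print(nominal_mode, ordinal_median, interval_mean, temp_diff, weight_ratio)
```

Each level supports the operations of the levels below it, which is why the choice of measurement scale limits the statistical methods that can be applied.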
Thank You