1 - Introduction

The Data Analytics course (DS342) aims to equip students with the skills to analyze data using various software tools and techniques, focusing on real-world applications. Key components include a grading schema with exams, assignments, and quizzes, as well as topics like data representation, business analytics, and decision models. The course emphasizes the importance of data analytics in future job markets and provides foundational knowledge in tools like Excel, Power Query, and statistical methods.

Uploaded by

omar Yousef

Data Analytics

DS342

1-2
Data Analytics
DS342

Course Instructors:
[Link] Sabry

Course TA:
[Link] Ayman

3
Welcome to the Course ☺

4
Grading Schema

• Final exam 60%

• Mid-term exam 20%

• Assignments (4 assignments) 10%

• Quizzes (2 quizzes) 10%

1-5
Textbook Content

1-6
Covered Topics
• Introduction to Business Analytics
• Data Management and Wrangling
• Spreadsheet Modeling
• Summary Measures
• Pivot Tables
• Dashboards
• BI Tools – Power Query
• Power Pivot & Power BI
• Regression Analysis
7
The Essence of the Course
The overall goal of this course is to understand data analytics and be able to apply data analysis to data sets using a variety of software tools and techniques.

This course will provide the tools for you to perform your own data analysis when you encounter problems in the real world.

1-8
Course Objectives
1. Understand data representation formats and techniques and how to use them.
2. Gain experience with a wide range of data analytics tools, including Excel, Power Query, and visualization and reporting software.
3. Develop a computational thinking approach to problem solving and use programs and scripting to solve data tasks.
4. Be able to clearly articulate a problem in a systematic way.

1-9
What is Data Analysis?
Data analysis is the processing of data to yield useful insights or knowledge.
• Data processing involves finding, loading, cleaning, manipulating, transforming, modeling,
and visualizing the data.

• The knowledge may be used for scientific discovery, business decision-making, or a variety
of other applications.

A data analyst is a person who uses tools and applications to transform raw data
into a form that will be useful.
• Data analyst jobs are projected to be one of the top jobs over the next 10 years.
▪ See: [Link]

1-10
Why is Data Analytics Important?
Data analytics is important as society is collecting more and larger data sets all the time:
• Web: All web pages visited and links clicked, searches made, images and posts
• Business: Items purchased by date, supply chain/customers, industrial sensors
• Science: Massive data sets (biological/genomic, astronomy, physics)
• Environmental: Sensors and monitors (temperature, etc.)
and transforming this raw data into useful insights has major value:
• Web: Online advertising driven by understanding customer behavior
• Business: Sales predictions, marketing promotions, manufacturing improvement
• Science: Scientific discoveries, new medical treatments and drugs
• Environmental: Understanding of environmental processes to allow for changing
policies and behaviors

1-11
Data Analytics Tools
✓ A data analyst has expertise in programming, statistics, data collection, and data visualization.
✓ In this course, you will learn industry tools and build competency in each of these skills.
✓ As this is an introductory course, the goal is exposure to the skills and techniques; there will not be time for mastery.
✓ These tools, systems, and techniques will be useful in many jobs, even those not considered data analyst positions.

1-12
Why This Course is Important
➢ Many professional jobs of the future will involve collecting, manipulating, and analyzing data.
➢ People who can understand how data can be used will have better employment opportunities.
Important results:
• Excel Proficiency – Everyone should know how to use Excel as general-purpose data analysis and productivity software.
• Databases – Understand how they work and how to use them.
• Programming and Computational Thinking – The ability to clearly articulate a problem in a systematic way
has applications beyond data analytics.
• Applied Statistics – Using R and other software makes your statistics training useful for real-world
problems.
• Real-world problem solving – Your tools will allow you to tackle real-world data analysis problems and
understand what tool to use and how to proceed.

1-13
Chapter 1
Introduction to Business Analytics

14
Business Analytics

(Business) Analytics is the use of:

• data,
• information technology,
• statistical analysis,
• quantitative methods, and
• mathematical or computer-based models
to help managers gain improved insight about their business
operations and make better, fact-based decisions.

1-15
A Visual Perspective of Business
Analytics

1-16
Overview of Business Analytics
• Business analytics begins with understanding the business context.
• Ask the right questions
• Identify the appropriate analysis
• Communicate information
• Numerical results are not very useful unless they are accompanied by clearly stated, actionable business insights.

17
Scope of Business Analytics

• Descriptive analytics: the use of data to understand past and current business performance and make informed decisions.
• Predictive analytics: predict the future by examining historical data, detecting patterns or relationships in these data, and then extrapolating these relationships forward in time.
• Prescriptive analytics: identify the best alternatives to minimize or maximize some objective.

1-18
Example 1.1: Retail Markdown Decisions

• Most department stores clear seasonal inventory by reducing prices.
• Key question: When to reduce the price and by how much to maximize revenue?
• Potential applications of analytics:
◦ Descriptive analytics: examine historical data for similar products (prices, units sold, advertising, …)
◦ Predictive analytics: predict sales based on price
◦ Prescriptive analytics: find the best sets of pricing and advertising to maximize sales revenue
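
The three kinds of analytics in the markdown example can be sketched in a few lines of Python. This is only an illustration: the price and sales history, the straight-line demand model, and the function names `describe`, `fit_line`, and `best_price` are all assumptions made for the sketch, not part of the textbook example.

```python
# Hypothetical history of (price, units sold) for similar products.
history = [(50.0, 100), (45.0, 120), (40.0, 140), (35.0, 160), (30.0, 180)]


def describe(history):
    """Descriptive: summarize past performance."""
    prices = [p for p, _ in history]
    units = [u for _, u in history]
    return {
        "avg_price": sum(prices) / len(prices),
        "total_units": sum(units),
        "revenue": sum(p * u for p, u in history),
    }


def fit_line(history):
    """Predictive: least-squares line units = a + b * price (assumed model)."""
    n = len(history)
    mean_p = sum(p for p, _ in history) / n
    mean_u = sum(u for _, u in history) / n
    sxy = sum((p - mean_p) * (u - mean_u) for p, u in history)
    sxx = sum((p - mean_p) ** 2 for p, _ in history)
    b = sxy / sxx
    a = mean_u - b * mean_p
    return a, b


def best_price(a, b, candidates):
    """Prescriptive: choose the candidate price with the highest predicted revenue."""
    return max(candidates, key=lambda p: p * max(a + b * p, 0.0))


# Candidate markdown prices from 30.0 to 50.0 in steps of 2.5 (an assumption).
candidates = [30.0 + 2.5 * i for i in range(9)]
a, b = fit_line(history)
print(describe(history))
print(best_price(a, b, candidates))
```

A real markdown decision would also account for advertising, timing, and remaining inventory; the sketch only shows how each type of analytics builds on the previous one.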

1-19
Tools
• Database queries and analysis
• Dashboards to report key performance measures
• Data visualization
• Statistical methods
• Spreadsheets and predictive models
• Scenario and “what-if” analyses
• Simulation
• Forecasting
• Data and text mining
• Optimization
• Social media, web, and text analytics

1-20
Data for Business Analytics

Data: numerical or textual facts and figures that are collected through some type of measurement process.

Information: result of analyzing data; that is, extracting meaning from data to support evaluation and decision making.

1-21
Data Sets and Databases

Data set - a collection of data.
Examples: Marketing survey responses, a table of historical stock prices, and a collection of measurements of dimensions of a manufactured item.

Database - a collection of related files containing records on people, places, or things.
A database file is usually organized in a two-dimensional table, where the columns correspond to each individual element of data (called fields, or attributes), and the rows represent records of related data elements.
1-22
Example 1.2: A Sales Transaction Database File

[Figure: sales transaction table with the labels Records (Observations), Entities (Elements), and Fields or Attributes (Variables)]
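
The table structure described above (rows as records, columns as fields) can be sketched in Python. The field names and values below are hypothetical and are not taken from the sales transaction file in Example 1.2.

```python
from dataclasses import dataclass, fields


@dataclass
class SalesRecord:
    # Each attribute is one field (column) of the table; names are assumed.
    customer_id: int
    region: str
    amount: float


# Each SalesRecord is one record (row) in the database file.
table = [
    SalesRecord(10001, "East", 20.19),
    SalesRecord(10002, "West", 17.85),
    SalesRecord(10003, "North", 23.98),
]

# The fields (attributes) shared by every record.
field_names = [f.name for f in fields(SalesRecord)]

# Reading one field across all records gives a column of the table.
amounts = [record.amount for record in table]

print(field_names)
print(amounts)
```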

1-23
Decision Models
Decision model - a logical or mathematical representation of
a problem or business situation that can be used to
understand, analyze, or facilitate making a decision.

Inputs:
• Data, which are assumed to be constant for purposes of the model.
• Uncontrollable variables, which are quantities that can change but cannot be directly controlled by the decision maker.
• Decision variables, which are controllable and can be selected at the discretion of the decision maker.
1-24
Nature of Decision Models
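
As a rough sketch of how the three kinds of inputs fit together, the function below models profit for a single product. The profit formula, the numbers, and the names `profit` and `best_quantity` are assumptions chosen for illustration only.

```python
def profit(quantity_produced, demand, unit_price=40.0, unit_cost=24.0, fixed_cost=400.0):
    """A simple decision model (assumed form).

    Data (held constant): unit_price, unit_cost, fixed_cost.
    Uncontrollable variable: demand.
    Decision variable: quantity_produced.
    """
    units_sold = min(quantity_produced, demand)
    return unit_price * units_sold - unit_cost * quantity_produced - fixed_cost


def best_quantity(candidates, demand):
    """Pick the production quantity with the highest profit for a given demand."""
    return max(candidates, key=lambda q: profit(q, demand))


# Evaluate a few candidate decisions under one demand scenario.
for q in (60, 80, 100):
    print(q, profit(q, demand=80))
print(best_quantity((60, 80, 100), demand=80))
```

The decision maker can change only `quantity_produced`; rerunning the model with different `demand` values shows how the uncontrollable input affects the outcome.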

1-25
Spreadsheet Models
• Spreadsheet modeling is an alternative to algebraic modeling that
relates various quantities in a spreadsheet with cell formulas.
• Instant feedback is available from spreadsheets, so if a formula is
entered incorrectly, it is often immediately obvious.
• Developing good spreadsheet models is not easy.
• They must be correct, well designed and well documented.

1-26
Spreadsheet Models
• A spreadsheet model for a specific
example of the product mix
problem is shown below.

1-27
Types of Data

Cross-sectional data
• Collected by recording a characteristic of many subjects at the same point in time

Time series data
• Collected over several time periods focusing on certain groups of people, specific events, or objects
• Hourly, daily, weekly, monthly, quarterly, or annual observations

28
Types of Data

29
Types of Data

30
Variables and Scales of Measurement
• A variable is a characteristic of interest that differs in kind or degree among
various observations (records).
• There are two types of variables: categorical and numerical
1. Categorical
◦ Also called qualitative
◦ Represent categories
◦ Labels or names to identify distinguishing characteristics
◦ Arithmetic operations on the labels/values are not meaningful
◦ Coded into numbers for data processing
Example: marital status

2. Numerical
◦ Also called quantitative
◦ Represent meaningful numbers
◦ Arithmetic operations are meaningful
a) Discrete: assumes a countable number of values
Example: number of children in a family
b) Continuous: assumes an uncountable number of values within an interval
Example: investment returns

31
Working Example : Gig
• BalanceGig is a company that matches independent workers for short-term
engagements with businesses in the construction, automotive, and high-tech
industries.
• The ‘gig’ employees work only for a short period of time, often on a particular project
or a specific task.
• A manager at BalanceGig extracts the employee data from their most recent work
engagement, including the following variables:
✓the hourly wage (HourlyWage),
✓the client’s industry (Industry), and
✓the employee’s job classification (Job).

32
Working Example : Gig
The manager would like to find:
1. Number of missing observations for the HourlyWage, Industry, and
Job variables.
2. The number of employees who
✓ worked in the automotive industry,
✓ earned more than $30 per hour, and
✓ worked in the automotive industry and earned more than $30 per hour.
3. The hourly wage of the lowest and the highest-paid employees at the
company as a whole, and
4. The hourly wage of the lowest and the highest-paid accountants who
worked in the automotive and the tech industries.
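
The four tasks above amount to counting missing values, filtering, and finding minimum and maximum values. A minimal Python sketch follows; the seven rows are made up for illustration (the real data set is not reproduced here), missing values are assumed to be stored as `None`, and the helper names are hypothetical.

```python
# Hypothetical rows; None marks a missing value.
records = [
    {"HourlyWage": 32.0, "Industry": "Construction", "Job": "Analyst"},
    {"HourlyWage": 46.0, "Industry": "Automotive", "Job": "Engineer"},
    {"HourlyWage": 28.0, "Industry": "Automotive", "Job": "Accountant"},
    {"HourlyWage": 43.0, "Industry": None, "Job": "Other"},
    {"HourlyWage": 36.0, "Industry": "Tech", "Job": "Accountant"},
    {"HourlyWage": 49.0, "Industry": "Automotive", "Job": None},
    {"HourlyWage": 41.0, "Industry": "Automotive", "Job": "Accountant"},
]


def count_missing(rows, field):
    """Task 1: number of missing observations for one variable."""
    return sum(1 for r in rows if r[field] is None)


def wage_range(rows):
    """Tasks 3 and 4: (lowest, highest) hourly wage among the given rows."""
    wages = [r["HourlyWage"] for r in rows if r["HourlyWage"] is not None]
    return min(wages), max(wages)


# Task 2: counts for each condition and for both conditions together.
automotive = [r for r in records if r["Industry"] == "Automotive"]
over_30 = [r for r in records if r["HourlyWage"] is not None and r["HourlyWage"] > 30]
both = [r for r in automotive if r["HourlyWage"] is not None and r["HourlyWage"] > 30]

# Task 4: accountants within one industry.
auto_accountants = [r for r in automotive if r["Job"] == "Accountant"]

print({f: count_missing(records, f) for f in ("HourlyWage", "Industry", "Job")})
print(len(automotive), len(over_30), len(both))
print(wage_range(records), wage_range(auto_accountants))
```

In Excel, the same steps correspond to counting blanks, filtering, and sorting a column.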

33
Working Example : Gig
1. There are a total of 604 records in the data set.
✓ There are no missing values in the HourlyWage variable.
✓ The Industry and Job variables have 10 and 16 missing values, respectively.
2. 190 employees worked in the automotive industry,
✓ 536 employees earned more than $30 per hour, and
✓ 181 employees worked in the automotive industry and earned more than $30 per hour.
3. The lowest and the highest hourly wages in the data set are $24.28 and $51.00, respectively.
✓ The three employees who had the lowest hourly wage of $24.28 all worked in the construction
industry and were hired as Engineer, Sales Rep, and Accountant, respectively.
✓ Interestingly, the employee with the highest hourly wage of $51.00 also worked in the
construction industry, in a job type classified as Other.

34
Working Example : Gig
4. The lowest- and the highest-paid accountants who worked in
the automotive industry made $28.74 and $49.32 per hour,
respectively.
In the technology industry, the lowest- and the highest-paid
accountants made $36.13 and $49.49 per hour, respectively.
• Note that the lowest hourly wage for an accountant is
considerably higher in the technology industry than in the
automotive industry ($36.13 > $28.74).

35
Transforming Numerical Data
• Binning is the process of transforming numerical variables into
categorical variables by grouping the numerical values into a
small number of groups or bins.
• It is important that the bins are consecutive and nonoverlapping
so that each numerical value falls into only one bin.
• Binning can be an effective way to reduce noise in the data if we
believe that all observations in the same bin tend to behave the
same way.
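
Binning can be sketched with Python's standard `bisect` module. The cut points (30 and 40), the bin labels, and the sample wages are assumptions for illustration; each bin includes its lower edge and excludes its upper edge, so the bins are consecutive and nonoverlapping.

```python
from bisect import bisect_right


def bin_value(value, edges, labels):
    """Return the label of the bin that contains value.

    edges are ascending cut points; labels needs len(edges) + 1 entries.
    A value equal to a cut point goes into the higher bin.
    """
    return labels[bisect_right(edges, value)]


edges = [30.0, 40.0]                 # assumed cut points
labels = ["Low", "Medium", "High"]   # below 30, 30 to under 40, 40 and above

wages = [24.28, 30.0, 39.99, 40.0, 51.0]  # illustrative values
binned = [bin_value(w, edges, labels) for w in wages]
print(binned)
```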

36
Transforming Numerical Data
• Data transformation is an important step in bringing out the information in the
data set, which can then be used for further data analysis.
• Another common approach is to create new variables through mathematical
transformations of existing variables.
• Similarly, in order to analyze trends, we often transform raw data values into
percentages.
• Sometimes, data on variables such as income, firm size, and house prices are
highly skewed.
◦ Extremely high (or low) values of skewed variables significantly inflate (or deflate) the
average for the entire data set.
◦ It is difficult to detect meaningful relationships with skewed variables.
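
Two of the points above can be illustrated with a short Python sketch: converting raw values into percentage changes, and seeing how a single extreme value inflates the average. The numbers and the function name `percent_changes` are hypothetical.

```python
from statistics import mean


def percent_changes(values):
    """Percentage change from each value to the next one."""
    return [100 * (curr - prev) / prev for prev, curr in zip(values, values[1:])]


# Raw sales figures (hypothetical) transformed into period-over-period percentages.
sales = [200, 220, 242]
print(percent_changes(sales))

# One extreme income (hypothetical, in thousands) pulls the average far above
# the values that most observations take.
incomes = [30, 35, 40, 45]
incomes_with_extreme = incomes + [1000]
print(mean(incomes), mean(incomes_with_extreme))
```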

37
Transforming Categorical Data
• An effective strategy for dealing with categorical data is category reduction,
where we collapse some of the categories to create fewer nonoverlapping
categories.
• Determining the appropriate number of categories often depends on the
data, context, and disciplinary norms, but there are a few general guidelines.
• Categories with very few observations may be combined to create the
“Other” category. The rationale behind this approach is that a critical mass
can be created for this “Other” category to help reveal patterns and
relationships in data.
• Categories with a similar impact may be combined.
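
Combining rare categories into “Other” can be sketched as follows. The threshold `min_count` and the job titles are assumptions for illustration; the appropriate cutoff depends on the data and context, as noted above.

```python
from collections import Counter


def reduce_categories(values, min_count):
    """Replace categories seen fewer than min_count times with "Other"."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else "Other" for v in values]


# Hypothetical job classifications.
jobs = ["Engineer"] * 4 + ["Accountant"] * 3 + ["Programmer", "Sales Rep"]
reduced = reduce_categories(jobs, min_count=2)
print(Counter(reduced))
```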

38
Transforming Categorical Data
• Dealing with numerical data is often easier than categorical data because
it avoids the complexities of the semantics pertaining to each category of
the variable.
• A dummy variable, also referred to as an indicator or a binary variable, is
commonly used to describe two categories of a variable.
• It assumes a value of 1 for one of the categories and 0 for the other category,
referred to as the reference or the benchmark category.
• Dummy variables do not suggest any ranking of the categories.
• Oftentimes, a categorical variable is defined by more than two categories.
• Given k categories of a variable, the general rule is to create k − 1 dummy
variables, using the last category as reference.
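
The k − 1 rule can be sketched in Python. The category names and the `is_` prefix on the dummy variable names are assumptions for illustration; the last category in the list serves as the reference and gets no dummy of its own.

```python
def make_dummies(values, categories):
    """Create k - 1 dummy variables, using the last category as the reference."""
    *coded, _reference = categories
    return [{f"is_{c}": int(v == c) for c in coded} for v in values]


categories = ["Automotive", "Construction", "Tech"]  # k = 3, so 2 dummies
industries = ["Tech", "Automotive", "Construction"]

for industry, row in zip(industries, make_dummies(industries, categories)):
    print(industry, row)
```

An observation in the reference category ("Tech" here) has 0 for every dummy variable.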

39
Transforming Categorical Data
• Another common transformation of categorical variables is to create
category scores.
• This approach is most appropriate if the data are ordinal and have
natural, ordered categories.
• This transformation allows the categorical variable to be treated as a
numerical variable in certain analytical models.
• With this transformation, we need not convert a categorical variable
into several dummy variables or reduce its categories.
• For an effective transformation, however, we assume equal
increments between the category scores, which may not be
appropriate in certain situations.

40
Transforming Categorical Data
• Example: In customer satisfaction surveys, we often use ordinal
scales such as very dissatisfied, somewhat dissatisfied, neutral,
somewhat satisfied, and very satisfied to indicate the level of
satisfaction.
• In such cases, we can recode the categories numerically using
numbers 1 through 5 with 1 being very dissatisfied and 5 being
very satisfied.
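
The recoding described above can be sketched as a simple lookup. The survey responses are made up for illustration, and the names `SCALE`, `SCORES`, and `to_scores` are hypothetical.

```python
# Ordered satisfaction scale from the example, lowest to highest.
SCALE = [
    "very dissatisfied",
    "somewhat dissatisfied",
    "neutral",
    "somewhat satisfied",
    "very satisfied",
]

# Category scores 1 through 5, which assumes equal increments between categories.
SCORES = {label: score for score, label in enumerate(SCALE, start=1)}


def to_scores(responses):
    """Convert ordinal survey responses to their numerical category scores."""
    return [SCORES[r.lower()] for r in responses]


responses = ["Neutral", "Very satisfied", "Somewhat dissatisfied"]  # hypothetical
print(to_scores(responses))
```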

41
Thank You ☺
