Unit 3: Data Analytics

MBA/BBA/B.Com/BCA/UGC NET

By
Dr. Anand Vyas
Data Science Project Life Cycle:
• A data science life cycle is the iterative set of steps a team follows to deliver a project or analysis. Because every data science project and team is different, every specific life cycle differs in its details. However, most data science projects tend to flow through the same general sequence of steps.
Exploratory Data Analysis
Business Requirement
• Data requirements definition establishes the process used to identify,
prioritize, precisely formulate, and validate the data needed to achieve
business objectives. When documenting data requirements, data should
be referenced in business language, reusing approved standard business
terms if available. If business terms have not yet been standardized and
approved for the data within scope, the data requirements process
provides the occasion to develop them.
• The data requirements analysis process employs a top-down approach
that emphasizes business-driven needs, so the analysis is conducted to
ensure the identified requirements are relevant and feasible. The process
incorporates data discovery and assessment in the context of explicitly
qualified business data consumer needs.
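To make this concrete, a single data requirement could be captured as a structured record that reuses an approved business term. The sketch below is illustrative only; the field names and the example entry are assumptions, not part of these notes.

```python
from dataclasses import dataclass

@dataclass
class DataRequirement:
    """One documented data requirement, phrased in business language."""
    business_term: str          # approved standard business term, if one exists
    definition: str             # precise formulation of the data needed
    business_objective: str     # the business objective this data supports
    priority: str               # e.g. "high", "medium", "low"
    validated: bool = False     # set True once the business data consumer signs off

# Hypothetical example entry
req = DataRequirement(
    business_term="Monthly Recurring Revenue",
    definition="Sum of active subscription fees billed per calendar month",
    business_objective="Track revenue growth against quarterly targets",
    priority="high",
)
print(req)
```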

Data Acquisition
• Data acquisition is the process of sampling signals that measure real
world physical conditions and converting the resulting samples into
digital numeric values that can be manipulated by a computer. Data
acquisition systems, abbreviated by the initialisms DAS, DAQ, or
DAU, typically convert analog waveforms into digital values for
processing. The components of data acquisition systems include:
• Sensors, to convert physical parameters to electrical signals.
• Signal conditioning circuitry, to convert sensor signals into a form
that can be converted to digital values.
• Analog-to-digital converters, to convert conditioned sensor signals
to digital values
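A minimal software sketch of that sensor → signal conditioning → ADC chain, assuming an 8-bit converter, a ±5 V input range, and a simulated 50 Hz sensor signal (all illustrative values):

```python
import numpy as np

# Simulate the DAQ chain: sensor signal -> conditioning -> analog-to-digital conversion.
fs = 1000                        # sampling rate in Hz (assumed)
t = np.arange(0, 0.05, 1 / fs)   # 50 ms of samples
sensor_mv = 200 * np.sin(2 * np.pi * 50 * t)   # sensor output in millivolts (simulated)

conditioned_v = sensor_mv / 1000 * 10          # signal conditioning: amplify to a +/-2 V swing

# 8-bit analog-to-digital converter over a +/-5 V full-scale range
full_scale, bits = 5.0, 8
levels = 2 ** bits
codes = np.clip(
    np.round((conditioned_v + full_scale) / (2 * full_scale) * (levels - 1)),
    0, levels - 1,
).astype(int)

print(codes[:10])                # digital numeric values a computer can manipulate
```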
Data Preparation
• Data preparation is the process of gathering,
combining, structuring and organizing data so it can be
used in business intelligence (BI), analytics and data
visualization applications. The components of data
preparation include data pre-processing, profiling,
cleansing, validation and transformation; it often also
involves pulling together data from different internal
systems and external sources.
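A minimal sketch of these steps with pandas, assuming two hypothetical internal sources (a CRM extract and a billing extract) and illustrative column names:

```python
import pandas as pd

# Hypothetical internal sources: a CRM extract and a billing extract.
crm = pd.DataFrame({"customer_id": [1, 2, 3], "region": ["North", None, "South"]})
billing = pd.DataFrame({"customer_id": [1, 2, 2], "amount": [120.0, 80.0, 80.0]})

prepared = (
    billing.drop_duplicates()                         # cleansing: remove duplicate rows
           .merge(crm, on="customer_id", how="left")  # combining data from different systems
           .fillna({"region": "Unknown"})             # validation/cleansing of missing values
           .assign(amount_usd=lambda d: d["amount"].round(2))  # transformation
)
print(prepared)
```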

Hypothesis Testing
• Hypothesis testing was introduced by Ronald Fisher, Jerzy Neyman, Karl Pearson and Pearson's son, Egon Pearson. It is a statistical method used to make decisions from experimental data. A hypothesis is essentially an assumption we make about a population parameter; hypothesis testing evaluates whether the data support that assumption.

Important terms
• (i) Null hypothesis: A statistical hypothesis that assumes the observed result is due to chance alone. It is denoted by H0: μ1 = μ2, which states that there is no difference between the two population means.

• (ii) Alternative hypothesis: Contrary to the null hypothesis, the alternative hypothesis states that the observations are the result of a real effect.

• (iii) Level of significance: The probability threshold used to decide whether to reject the null hypothesis. Since 100% certainty is not possible when accepting or rejecting a hypothesis, a level of significance is chosen in advance, usually 5% (0.05).
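A minimal worked sketch of these terms using a two-sample t-test in SciPy, with simulated data and a 5% level of significance (the groups and their means are assumptions for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group1 = rng.normal(loc=50, scale=5, size=30)   # simulated sample from population 1
group2 = rng.normal(loc=53, scale=5, size=30)   # simulated sample from population 2

alpha = 0.05                                    # level of significance (5%)
t_stat, p_value = stats.ttest_ind(group1, group2)   # tests H0: mu1 == mu2

if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject the null hypothesis")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject the null hypothesis")
```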
Importance of Hypothesis testing
• Hypothesis testing is one of the most important concepts in statistics because it is how you decide whether something really happened, whether certain treatments have positive effects, whether groups differ from each other, or whether one variable predicts another. In short, you want to determine whether your results are statistically significant and unlikely to have occurred by chance alone. In essence, a hypothesis test is a test of significance.
Modeling
• Data Modeling is the process of creating a visual representation of
either a whole information system or parts of it to communicate
connections between data points and structures. The goal is to
illustrate the types of data used and stored within the system, the
relationships among these data types, the ways the data can be
grouped and organized and its formats and attributes.

• Data models are built around business needs. Rules and requirements are defined upfront through feedback from business stakeholders so they can be incorporated into the design of a new system or adapted in the iteration of an existing one.
Types of data models
• Conceptual data models. Also referred to as domain models, they offer a big-picture view of what the system will contain, how it will be organized, and which business rules are involved. Conceptual models are usually created as part of gathering initial project requirements. Typically, they include entity classes (defining the types of things that are important for the business to represent in the data model), their characteristics and constraints, the relationships between them, and relevant security and data integrity requirements.
• Logical data models. They are less abstract than conceptual models and provide greater detail about the concepts and relationships in the domain under consideration. One of several formal data modeling notation systems is followed. These indicate data attributes, such as data types and their corresponding lengths, and show the relationships among entities. Logical data models don't specify any technical system requirements. This stage is frequently omitted in agile or DevOps practices, but logical models can be useful in highly procedural implementation environments or for projects that are data-oriented by nature, such as data warehouse design or reporting system development.
• Physical data models. They provide a schema for how the data will be physically stored within a database and are therefore the least abstract of the three. They offer a finalized design that can be implemented as a relational database, including associative tables that illustrate the relationships among entities as well as the primary keys and foreign keys that will be used to maintain those relationships. Physical data models can include database management system (DBMS)-specific properties, including performance tuning.
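As an illustration of the progression from conceptual to physical, the sketch below takes a hypothetical conceptual model (a Customer places Orders) and expresses it as a physical schema with primary and foreign keys in SQLite; the entity and column names are assumptions:

```python
import sqlite3

# Hypothetical conceptual model: a Customer places Orders.
# The physical data model below expresses it as tables with primary and foreign keys.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    placed_at   TEXT NOT NULL
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme Ltd')")
conn.execute("INSERT INTO customer_order VALUES (10, 1, '2024-01-15')")
print(conn.execute("SELECT * FROM customer_order").fetchall())
```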
Evaluation and Interpretation
• Evaluation plans should illustrate how, where, and from what
sources data will be collected. Quantitative (numeric) and
qualitative (narrative or contextual) data should be collected within
a framework that aligns with stakeholder expectations, project
timelines, and program objectives.
• Data interpretation refers to the process of using diverse analytical
methods to review data and arrive at relevant conclusions. The
interpretation of data helps researchers to categorize, manipulate,
and summarize the information in order to answer critical
questions.
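A small sketch of quantitative interpretation, categorizing and summarizing illustrative evaluation data with pandas (the sites and scores are made up for the example):

```python
import pandas as pd

# Illustrative quantitative evaluation data (scores collected by program site)
data = pd.DataFrame({
    "site": ["A", "A", "B", "B", "B"],
    "score": [72, 80, 65, 70, 90],
})

# Categorize, manipulate, and summarize to answer "how did each site perform?"
summary = data.groupby("site")["score"].agg(["count", "mean", "std"])
print(summary)
```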

Deployment
• Deployment in data science refers to applying a model to make predictions on new data. Building a model is generally not the end of the project. Even if the purpose of the model is to increase knowledge of the data, the knowledge gained will need to be organized and presented in a way the customer can use. Depending on the requirements, the deployment phase can be as simple as generating a report or as complex as implementing a repeatable data science process.
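A minimal sketch of the simplest kind of deployment: persisting a trained model and applying it to data that arrives later. It assumes scikit-learn and uses a toy model and toy data:

```python
import pickle
from sklearn.linear_model import LogisticRegression

# Train a toy model (stand-in for the modeling phase)
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0, 0, 1, 1]
model = LogisticRegression().fit(X_train, y_train)

# "Deploy": persist the model so another process can load it and score new data
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

with open("model.pkl", "rb") as f:
    deployed = pickle.load(f)

new_data = [[0.5], [2.5]]              # records arriving after deployment
print(deployed.predict(new_data))      # predictions for the new records
```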
Operations
• Data operations (DataOps) combines the people, processes, and products that enable consistent, automated, and secure data management. It is a delivery system based on joining and analyzing large databases. Because collaboration and teamwork are two keys to a successful business, the term "DataOps" was coined around this idea. Its purpose is to provide a cross-functional way of working across the acquisition, storage, processing, quality monitoring, execution, improvement, and delivery of information to the end user.
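One small piece of that picture, an automated quality-monitoring check of the kind a DataOps pipeline might run before delivering a batch, is sketched below; the rules and column names are assumptions:

```python
import pandas as pd

def quality_report(df: pd.DataFrame) -> dict:
    """Automated checks a pipeline might run before delivery (rules are illustrative)."""
    return {
        "row_count": len(df),
        "null_customer_id": int(df["customer_id"].isna().sum()),
        "negative_amounts": int((df["amount"] < 0).sum()),
        "duplicate_rows": int(df.duplicated().sum()),
    }

# Hypothetical incoming batch with two data quality problems
batch = pd.DataFrame({"customer_id": [1, 2, None], "amount": [10.0, -5.0, 7.5]})
report = quality_report(batch)
print(report)
assert report["null_customer_id"] == 1 and report["negative_amounts"] == 1
```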
Optimization
• Data optimization is the process of preparing the logical schema from the data view schema; it is the counterpart of data de-optimization. Data optimization is an important aspect of database management in particular and of data warehouse management in general. It is most commonly known as a non-specific technique used by several applications when fetching data from a data source, so that the data can be used in data view tools and applications such as those used in statistical reporting.
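One common form of optimization when fetching data from a source is to push filtering and aggregation down to the database rather than pulling every row into the application. A self-contained sketch with SQLite (table and column names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("North", 120.0), ("North", 80.0), ("South", 200.0)])

# Unoptimized: fetch every row, then aggregate in the application
rows = conn.execute("SELECT region, amount FROM sales").fetchall()

# Optimized fetch: let the data source aggregate, returning only what the report needs
report = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"
).fetchall()
print(report)   # e.g. [('North', 200.0), ('South', 200.0)]
```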
