Unit 3 Data Analytics
By
Dr. Anand Vyas
Data Science Project Life Cycle:
• A data science life cycle is the iterative sequence of
steps taken to deliver a project or analysis. Because
projects and teams differ, no two life cycles are
identical. However, most data science projects tend
to flow through the same general stages.
Exploratory Data Analysis
Business Requirement
• Data requirements definition establishes the process used to identify,
prioritize, precisely formulate, and validate the data needed to achieve
business objectives. When documenting data requirements, data should
be referenced in business language, reusing approved standard business
terms if available. If business terms have not yet been standardized and
approved for the data within scope, the data requirements process
provides the occasion to develop them.
• The data requirements analysis process employs a top-down approach
that emphasizes business-driven needs, so the analysis is conducted to
ensure the identified requirements are relevant and feasible. The process
incorporates data discovery and assessment in the context of explicitly
qualified business data consumer needs.
Data Acquisition
• Data acquisition is the process of sampling signals that measure real
world physical conditions and converting the resulting samples into
digital numeric values that can be manipulated by a computer. Data
acquisition systems, abbreviated by the initialisms DAS, DAQ, or
DAU, typically convert analog waveforms into digital values for
processing. The components of data acquisition systems include:
• Sensors, to convert physical parameters to electrical signals.
• Signal conditioning circuitry, to convert sensor signals into a form
that can be converted to digital values.
• Analog-to-digital converters, to convert conditioned sensor signals
to digital values.
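The sensor, conditioning, and conversion chain above can be sketched in a few lines of Python. This is a minimal illustration, not a model of any real DAQ hardware: the sine-wave `sensor`, the ±1 V input range, the 8-bit resolution, and the function names `quantize` and `acquire` are all assumptions chosen for the example.

```python
import math

def quantize(voltage, v_min=-1.0, v_max=1.0, bits=8):
    """Map an analog voltage to an integer code (ideal uniform ADC, assumed range)."""
    levels = 2 ** bits
    # Clamp to the converter's input range, a stand-in for signal conditioning.
    clamped = min(max(voltage, v_min), v_max)
    # Scale to [0, levels - 1] and round to the nearest integer code.
    return round((clamped - v_min) / (v_max - v_min) * (levels - 1))

def acquire(signal, sample_rate_hz, duration_s, bits=8):
    """Sample a continuous-time signal at fixed intervals and digitize each sample."""
    n_samples = round(sample_rate_hz * duration_s)
    return [quantize(signal(n / sample_rate_hz), bits=bits) for n in range(n_samples)]

def sensor(t):
    """Hypothetical sensor output: a 5 Hz sine wave between -1 V and +1 V."""
    return math.sin(2 * math.pi * 5 * t)

samples = acquire(sensor, sample_rate_hz=100, duration_s=0.2)
print(samples)  # 20 integer codes, each between 0 and 255
```

Raising `bits` gives finer voltage resolution, and raising `sample_rate_hz` gives finer time resolution; both increase the amount of data produced.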
Data Preparation
• Data preparation is the process of gathering,
combining, structuring and organizing data so it can be
used in business intelligence (BI), analytics and data
visualization applications. The components of data
preparation include data pre-processing, profiling,
cleansing, validation and transformation; it often also
involves pulling together data from different internal
systems and external sources.
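As a rough illustration of these components, the sketch below cleanses, validates, transforms, and combines records from two sources using only the Python standard library. The data, the field names, the 0 to 120 age range, and the choice to fill missing ages with the mean are all assumptions made for the example; real projects pick such rules to suit the business requirement.

```python
# Hypothetical records from two internal systems (names and values are illustrative).
crm_rows = [
    {"customer_id": "1", "name": " Asha ", "age": "34"},
    {"customer_id": "2", "name": "Ravi", "age": ""},     # missing age
    {"customer_id": "2", "name": "Ravi", "age": ""},     # duplicate row
    {"customer_id": "3", "name": "Meera", "age": "-5"},  # invalid age
]
sales_rows = [
    {"customer_id": "1", "amount": "250.0"},
    {"customer_id": "3", "amount": "100.5"},
    {"customer_id": "1", "amount": "49.5"},
]

def prepare(crm, sales):
    customers = {}
    for row in crm:
        cid = int(row["customer_id"])
        if cid in customers:  # cleansing: skip duplicate customer IDs
            continue
        age = int(row["age"]) if row["age"].strip() else None
        if age is not None and not 0 <= age <= 120:  # validation: assumed range
            age = None
        customers[cid] = {"customer_id": cid, "name": row["name"].strip(), "age": age}

    # Transformation: fill missing ages with the mean of the valid ones (an assumption).
    valid_ages = [c["age"] for c in customers.values() if c["age"] is not None]
    fallback = sum(valid_ages) / len(valid_ages) if valid_ages else None

    # Combining: total sales per customer from the second source.
    totals = {}
    for row in sales:
        cid = int(row["customer_id"])
        totals[cid] = totals.get(cid, 0.0) + float(row["amount"])

    for c in customers.values():
        if c["age"] is None:
            c["age"] = fallback
        c["total_sales"] = totals.get(c["customer_id"], 0.0)
    return list(customers.values())

prepared = prepare(crm_rows, sales_rows)
for record in prepared:
    print(record)
```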
Hypothesis Testing
• The foundations of hypothesis testing were laid by
Ronald Fisher, Jerzy Neyman, Karl Pearson and
Pearson’s son, Egon Pearson. Hypothesis testing is a
statistical method for making decisions from
experimental data. A hypothesis is an assumption
made about a population parameter; the test assesses
whether the sample data are consistent with that
assumption.
Important terms
• (i) Null hypothesis: A statistical hypothesis that assumes any observed
difference is due to chance alone. It is denoted H0. For example,
H0: μ1 = μ2 states that there is no difference between the two
population means.
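One way to test H0: μ1 = μ2 without any external library is a permutation test, sketched below. It is only one of several possible tests (a t-test is the more common textbook choice), and the two samples, the function name `permutation_test`, and the number of permutations are assumptions for illustration. The idea is that if H0 holds, the group labels are interchangeable, so shuffling them shows how large a difference in means chance alone tends to produce.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(sample_a, sample_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for H0: the two population means are equal."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical measurements from two groups.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7]
group_b = [6.9, 7.2, 6.8, 7.5, 7.1, 6.7, 7.3, 7.0]

p_value = permutation_test(group_a, group_b)
print(f"p-value: {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the data suggest the means differ.")
else:
    print("Fail to reject H0: the difference could be due to chance.")
```

A small p-value means a difference at least as large as the observed one rarely arises from chance alone, which counts as evidence against H0. The 0.05 threshold is a common convention, not a rule.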