Ch1-Introduction to Data Analytics & LifeCycle

The document outlines key roles in data analytics projects, emphasizing the importance of various team members such as business users, project sponsors, and data scientists for successful project execution. It also describes the Data Analytics Life Cycle, which includes phases like data discovery, preparation, modeling, and result communication, providing a structured approach to managing data for business goals. Each phase is essential for ensuring data is effectively utilized and analyzed to support decision-making.

Uploaded by

Ashwini
Key Roles for a Data Analytics Project
• Certain key roles are required for a data science team to function
fully and execute analytics projects successfully. There are seven
key roles.
• Each role plays a crucial part in developing a successful analytics
project. There is no hard and fast rule about the seven listed roles:
fewer or more may be used depending on the scope of the project, the
skills of the participants, and the organizational structure.
• Example –
For a small, versatile team, the seven listed roles may be filled by
only three to four people, whereas a large project may require 20 or
more people to fill them.
1. Business User :
1. The business user understands the domain area of the project
and typically benefits from the results.
2. This user advises and consults the team working on the project
about the value of the results obtained and how the outputs will
be put into operation.
3. A business manager, line manager, or deep subject matter expert
in the project domain usually fulfills this role.
2. Project Sponsor :
1. The project sponsor is responsible for initiating the project. The
sponsor provides the actual requirements for the project and
presents the core business problem.
2. The sponsor generally provides the funding and gauges the degree of
value from the final output of the team working on the project.
3. This person sets the priorities for the project and clarifies the
desired output.

3. Project Manager :
1. This person ensures that the key milestones and objectives of the
project are met on time and at the expected quality.
4. Business Intelligence Analyst :
1. The business intelligence analyst provides business domain
expertise based on a detailed and deep understanding of
the data, key performance indicators (KPIs), key metrics, and
business intelligence from a reporting point of view.
2. This person generally creates dashboards and reports and knows
about the data feeds and sources.

5. Database Administrator (DBA) :
1. The DBA provisions and configures the database environment to
support the analytics needs of the team working on a
project.
2. The DBA's responsibilities may include providing access to key
databases or tables and making sure that the appropriate
security levels are in place for the data repositories.
6. Data Engineer :
1. The data engineer has deep technical skills to assist with
tuning SQL queries for data management and data
extraction, and provides support for data ingestion into the
analytic sandbox.
2. The data engineer works closely with the data scientist to
help shape data in the right ways for analysis.

7. Data Scientist :
1. The data scientist provides subject matter expertise for
analytical techniques and data modelling, and applies the
correct analytical techniques to a given business problem.
2. The data scientist ensures that the overall analytical
objectives are met.
3. Data scientists outline and apply analytical methods and
What is the Data Analytics Life Cycle?
In today’s digital world, data is of great
importance. It passes through many stages
during its life, including creation, testing,
processing, consumption, and reuse.
The Data Analytics Life Cycle maps out these
stages for professionals working on data
analytics projects. The phases are
arranged in a circular structure that forms the
Data Analytics Life Cycle. Each step has its
own significance and characteristics.
Why is the Data Analytics Life Cycle Essential?
The Data Analytics Life Cycle is designed to be
used with significant big data projects. To reflect
how real projects actually proceed, the cycle is
iterative. A step-by-step approach is needed to
organize the actions and tasks involved in
gathering, processing, analyzing, and reusing
data, and to address the various requirements for
assessing information from big data. Data analysis is
the process of modifying, processing, and cleaning
raw data to obtain valuable, meaningful information
that supports business decision-making.
Importance of the Life Cycle of Data Analytics
• The life cycle of Data Analytics defines the roadmap of
how data is generated, collected, processed, used, and
analyzed to achieve business goals. It offers a
systematic way to manage data and convert it into
information that can be used to fulfil organizational and
project goals. The process provides the direction and
methods to extract information from the data and
to move steadily towards those goals.
• Data professionals use the lifecycle’s circular form to
proceed with data analytics in either a forward or
backward direction. Based on the newly received
Phase 1: Data Discovery and Formation
• Every good journey begins with a purpose in mind. In this phase, you
will identify your data objectives and how best to attain them
through implementation of the Data Analytics Life Cycle. Evaluations
and assessments should also be undertaken during this initial phase to
develop a basic hypothesis capable of solving the business issue or
problem.
• In this initial step, data is evaluated for its potential uses and
demands, such as where it comes from, what story you want it to
tell, and how this incoming information benefits your business.
• As a data analyst, you will need to explore case studies that use
similar data analytics and, most crucially, examine current company
trends. Then you must evaluate all in-house infrastructure and
resources, as well as time and technology needs, in order to match
them to the previously acquired data.
• Following the completion of the evaluations, the team closes this
stage with hypotheses that will be tested using data later on. This is
the first and most critical step in the life cycle of big data analytics.
Key takeaways:
1.The data science team investigates and learns about
the challenge.
2.Create context and understanding.
3.Learn about the data sources that will be required and
available for the project.
4.The team develops preliminary hypotheses that can
later be tested with data.
Phase 2: Data Preparation and Processing
Data preparation and processing involves gathering, sorting,
processing, and cleaning the collected information to make sure
it can be used by the subsequent steps of analysis. An
important element of this step is making sure all necessary
information is readily accessible before moving ahead with
further processing.
The following are methods of data acquisition:
• Data Collection: Drawing information from external sources.
• Data Entry: Creating new data points within an organization
using either digital technologies or manual input
procedures.
• Signal Reception: Accumulating data from digital devices
such as Internet of Things devices and control systems.
Key Takeaways :
• This phase covers the steps to explore, preprocess, and
condition data prior to modeling and analysis.
• It requires the presence of an analytic
sandbox; the team performs extract, load, and
transform (ELT) to get data into the sandbox.
• Data preparation tasks are likely to be
performed multiple times and not in a
predefined order.
• Several tools commonly used for this phase
are Hadoop, Alpine Miner, OpenRefine,
etc.
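The conditioning step named in the takeaways above can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed method: the record fields (`customer_id`, `age`, `spend`), the sample values, and the `condition` helper are all invented for the example, and it uses plain Python rather than any of the tools listed above.

```python
# Hypothetical raw records; field names and values are invented for illustration.
raw_records = [
    {"customer_id": "101", "age": "34", "spend": "250.50"},
    {"customer_id": "102", "age": "", "spend": "90"},        # missing age
    {"customer_id": "101", "age": "34", "spend": "250.50"},  # duplicate customer
    {"customer_id": "103", "age": "abc", "spend": "40.25"},  # unparseable age
    {"customer_id": "104", "age": "51", "spend": "12"},
]


def condition(records):
    """Drop duplicate and unparseable rows, and convert text fields to numbers."""
    seen_ids = set()
    clean = []
    for row in records:
        if row["customer_id"] in seen_ids:
            continue  # skip customers that were already kept
        try:
            age = int(row["age"])
            spend = float(row["spend"])
        except ValueError:
            continue  # skip rows whose fields cannot be parsed
        seen_ids.add(row["customer_id"])
        clean.append({"customer_id": row["customer_id"], "age": age, "spend": spend})
    return clean


clean_records = condition(raw_records)
print(clean_records)
```

In a real project the rules for what counts as an invalid row would come from the business context established in Phase 1, and the step would typically be rerun as new data issues are discovered.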
Phase 3: Design a Model
After you’ve defined your business goals and gathered a large
amount of data (structured, unstructured, or semi-structured), it’s
time to design a model that uses the data to achieve the goal. Model
planning is the name given to this stage of the data analytics
process.
There are numerous methods for loading data into the system and
starting to analyze it:
• ETL (Extract, Transform, and Load) converts the information before
loading it into a system using a set of business rules.
• ELT (Extract, Load, and Transform) loads raw data into the sandbox
before transforming it.
• ETLT (Extract, Transform, Load, Transform) is a combination of two
layers of transformation.
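The difference between ETL and ELT in the list above is mainly the order of the steps, which a small Python sketch can make concrete. Everything here is an assumption made for illustration: the `transform` rule, the field names, and the use of plain lists to stand in for the target system and the sandbox.

```python
def transform(row):
    """Example business rule (an assumption): tidy names and convert cents to dollars."""
    return {"name": row["name"].strip().title(), "amount": row["amount_cents"] / 100}


def etl(source):
    """Extract, Transform, Load: only transformed rows reach the target."""
    target = []
    for row in source:                 # extract
        target.append(transform(row))  # transform, then load
    return target


def elt(source):
    """Extract, Load, Transform: raw rows land in the sandbox first."""
    sandbox = list(source)                          # extract and load the raw data
    prepared = [transform(row) for row in sandbox]  # transform inside the sandbox
    return sandbox, prepared


source_rows = [
    {"name": "  alice ", "amount_cents": 1250},
    {"name": "BOB", "amount_cents": 300},
]
etl_rows = etl(source_rows)
raw_rows, elt_rows = elt(source_rows)
```

Both routes produce the same prepared rows, but ELT also keeps the raw data in the sandbox, so it can be transformed again later under different rules.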
This step also involves teamwork to identify the approaches,
techniques, and workflow to be used in the succeeding phase to
develop the model. The process of developing a model begins with
finding the relationship between data points to choose the essential
Key Takeaways :
• The team explores the data to learn about the relationships
between variables and subsequently selects the key
variables and the most suitable models.
• In this phase, the data science team develops data sets
for training, testing, and production purposes.
• The team builds and executes models based on the
work done in the model planning phase.
• Several tools commonly used for this phase are
Matlab and STATISTICA.
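Developing separate data sets for training and testing, as mentioned in the takeaways above, is often done with a shuffled split. The sketch below is one simple way to do it in plain Python; the 80/20 ratio, the fixed seed, and the `split_dataset` helper are assumptions chosen for the example and are not taken from the slides.

```python
import random


def split_dataset(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and split it into training and testing sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(10))  # stand-in for ten prepared records
train_rows, test_rows = split_dataset(rows)
print(len(train_rows), len(test_rows))
```

Shuffling before the cut avoids a split that simply follows the order in which the records were collected, and holding the test rows back from training is what lets the team judge the model on data it has not seen.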
Phase 4: Model Building
This stage of the data analytics life cycle involves
creating datasets for testing, training, and
production. The data analytics professionals develop
and operate the model they designed in the previous
stage with proper effort.
They use tools and methods such as decision trees,
regression techniques (for example, logistic regression), and neural
networks to create and run the model. The experts
also put the model through a trial run to see if it
fits the datasets.
It assists them in determining whether the tools they
now have will be enough to execute the model or if a
Key Takeaways:
• The team creates datasets for use in testing,
training, and production.
• The team also examines if its present tools will
serve for running the models or if a more robust
environment is required for model execution.
• R and PL/R, Octave, and WEKA are examples of
free or open-source tools.
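As a rough illustration of building a model and giving it a trial run, the sketch below fits a one-feature logistic regression with plain gradient descent and then checks it on held-out points. It is a toy under stated assumptions: the data (hours of product usage against renewal) is invented, the learning rate and epoch count are arbitrary choices, and a real project would normally use one of the tools named above rather than hand-written training code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit weight w and bias b for P(y = 1 | x) = sigmoid(w * x + b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the average log loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    """Classify as 1 when the predicted probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Toy data (invented): hours of product usage and whether the customer renewed.
train_x, train_y = [1, 2, 3, 4, 6, 7, 8, 9], [0, 0, 0, 0, 1, 1, 1, 1]
test_x, test_y = [2.5, 7.5], [0, 1]

w, b = train_logistic(train_x, train_y)
accuracy = sum(predict(w, b, x) == y for x, y in zip(test_x, test_y)) / len(test_x)
print(f"test accuracy: {accuracy:.2f}")
```

The trial run on `test_x` plays the role described above: it shows whether the fitted model behaves sensibly on data that was not used for training before more effort is invested.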
Phase 5: Result Communication and Publication
Recall the objective you set for your company in
phase 1. Now is the time to see if the tests you ran in
the previous phase matched those criteria.
The communication process begins with cooperation
with key stakeholders to decide whether the project’s
outcomes are successful or not.
The project team is responsible for identifying the
major conclusions of the analysis, calculating the
business value associated with the outcome, and
creating a narrative to summarize and communicate
the results to stakeholders.
Key Takeaways:
• After executing the model, the team needs to compare
the outcomes of modeling to the criteria established
for success and failure.
• The team considers how best to articulate findings
and outcomes to the various team members and
stakeholders, taking into account caveats and
assumptions.
• The team should identify key findings, quantify
the business value, and develop a narrative to
summarize and convey the findings to
stakeholders.
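Comparing modeling outcomes with the success criteria, as the first takeaway above describes, can be as simple as checking each measured metric against its agreed threshold. The sketch below assumes hypothetical metric names, thresholds, and results; none of these numbers come from the slides.

```python
# Hypothetical success criteria agreed in Phase 1 (names and thresholds are invented).
success_criteria = {"accuracy": 0.85, "recall": 0.70}

# Hypothetical results measured on the test set in Phase 4.
model_results = {"accuracy": 0.91, "recall": 0.64}


def evaluate(results, criteria):
    """Return, per metric, whether the result meets its minimum threshold."""
    return {
        metric: results.get(metric, 0.0) >= minimum
        for metric, minimum in criteria.items()
    }


outcome = evaluate(model_results, success_criteria)
project_succeeded = all(outcome.values())

for metric, passed in outcome.items():
    status = "met" if passed else "not met"
    print(f"{metric}: {model_results[metric]:.2f} vs target {success_criteria[metric]:.2f} ({status})")
```

A per-metric breakdown like this gives the team concrete material for the narrative presented to stakeholders, including which targets were missed and by how much.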
Phase 6: Measuring Effectiveness / Operationalize
As your data analytics life cycle comes to an end, the final
stage is to offer stakeholders a complete report that
includes important results, code, briefings, and
technical papers or documents.
Furthermore, to assess the effectiveness of the study, the
data is moved from the sandbox to a live
environment and observed to see if the results match the
desired business aim.
If the findings meet the objectives, the reports and
outcomes are finalized. However, if the conclusion differs
from the purpose stated in phase 1, you can go back
in the data analytics life cycle to any of the previous
phases to adjust your input and get a different result.
Key Takeaways :
• The team delivers the benefits of the project to
a wider audience. It sets up a pilot project that
deploys the work in a controlled manner prior to
expanding the project to the entire enterprise of
users.
• This technique allows the team to gain insight into
the performance and constraints of the
model within a production setting at a small scale
and then make necessary adjustments before full
deployment.
• The team produces the final reports, presentations,
and code.
Thank You
Any Questions?