EDA - Unit-1: Prerequisite of The Subject

Exploratory Data Analysis (EDA) is a foundational subject that equips students with techniques to summarize, visualize, and understand datasets, essential for data science workflows. It requires prerequisites in statistics, programming, and data handling, and has applications across various fields including healthcare, finance, and marketing. EDA prepares students for advanced topics like machine learning and data visualization, emphasizing the importance of data exploration in deriving insights and improving model quality.

Uploaded by pratikkamble

EDA – Unit-1

Prerequisite of the Subject:

• Basic Statistics and Probability: To understand data distributions, variability, and
statistical testing.
• Python Programming: For hands-on implementation of EDA tasks using libraries like
Pandas, Matplotlib, and Seaborn.
• Mathematics (Linear Algebra, Algebra): For understanding data structures and numerical
computations.
• Basic Data Handling Skills: Knowledge of Excel or basic data tools to manipulate and
visualize data.

Future Link with Other Subjects in Curriculum:

EDA serves as a foundation for the following advanced topics:

• Machine Learning (ML): EDA is critical for data preprocessing, feature engineering, and
model diagnostics.
• Data Visualization: Builds upon the visual techniques introduced during EDA.
• Big Data Analytics: EDA introduces basic data handling and exploration techniques that
carry over to larger datasets.
• Data Mining and Pattern Recognition: Uses insights found during EDA for deeper pattern
detection.
• Artificial Intelligence (AI): Clean and well-explored data is crucial for building
intelligent systems.

Applications of the Subject:

EDA is used in nearly every field where data is involved. Some key application areas include:

• Healthcare: Discovering patterns in patient records to predict disease outbreaks.
• Finance: Analyzing stock market trends or customer credit behavior.
• Marketing: Customer segmentation, campaign performance evaluation.
• Manufacturing: Quality control and defect analysis.
• Retail: Inventory and sales trend analysis.
• Social Media: Analyzing user engagement and sentiment trends.

Course Objective and Course Outcomes

Course Objective:

• To introduce students to the techniques of exploratory data analysis using statistical and
computational methods.
• To develop the ability to prepare raw datasets for further analysis or modeling.
• To teach how to extract meaningful insights and visualize them using tools such as Python
or Excel.

Course Outcomes (COs):

1. Understand and explain the importance of EDA in the data science workflow.
2. Apply statistical and visual methods to explore datasets.
3. Identify and handle data quality issues like missing values and outliers.
4. Use Python or Excel to perform different types of EDA.
5. Interpret the output of EDA to inform future analysis or modeling.

Example: After the course, a student should be able to use Python to visualize and interpret
the sales trend of a product using line and bar plots.
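Course Outcome 3 (handling missing values and outliers) can be illustrated with a minimal sketch. It uses only the Python standard library so that it runs without extra installs; in practice a library such as Pandas would usually do this work. The data, the function names, the choice of median imputation, and the 1.5 x IQR rule are all assumptions made for illustration, not methods prescribed by these notes.

```python
from statistics import median, quantiles


def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical daily sales with two missing entries and one suspicious spike.
daily_sales = [120, 135, None, 128, 131, 900, None, 125, 130]

cleaned = fill_missing_with_median(daily_sales)
outliers = iqr_outliers(cleaned)

print("Cleaned:", cleaned)
print("Flagged as outliers:", outliers)
```

Whether a flagged value is an error or a genuine event (for example, a festival-day sale) is a judgment call that depends on domain context; the code only points out where to look.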

Introduction to Exploratory Data Analysis:

EDA is the preliminary step in data analysis. It focuses on summarizing, visualizing, and
understanding the structure of a dataset, and on detecting patterns in it, before any modeling
or inference techniques are applied.

EDA is not just about data cleaning. It is a mindset of interacting with the data curiously and
iteratively to understand it better, much like exploring the data for the first time: you are
trying to see what is going on without jumping to conclusions.
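The "summarizing" part of this definition can be made concrete with a small sketch: a handful of summary statistics for a numeric column and a frequency table for a categorical one. The column names and values below are hypothetical, and the standard library is used only to keep the example dependency-free.

```python
from collections import Counter
from statistics import mean, median, stdev


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


# Hypothetical customer data: one numeric and one categorical column.
ages = [23, 25, 25, 31, 36]
channels = ["retail", "online", "retail", "retail", "online"]

summary = summarize(ages)
frequency = Counter(channels)  # frequency table for the categorical column

for name, value in summary.items():
    print(f"{name}: {value}")
print("Channel counts:", dict(frequency))
```

Comparing the mean with the median is a quick first check for skew: here the mean is above the median, which hints at a few larger values pulling the average up.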

Kinds of Models Built After the EDA Process:

1. Predictive Modeling
To predict future values based on the patterns in the data, typically a continuous
number, e.g. house prices or sales.

2. Classification Modeling
To classify items into categories, e.g. spam vs. not spam, fraud vs. not fraud.

3. Clustering (Unsupervised Learning)
To group similar items when the labels are not known beforehand.

4. Dimensionality Reduction
To simplify the dataset by reducing the number of features while keeping the important
information, e.g. when the data has too many variables or is hard to visualize.

5. Time Series Modeling
To predict values over time, e.g. next month's sales or stock prices.

6. Anomaly Detection
To find unusual patterns or outliers in the data, e.g. for fraud detection or system
failures.

7. Deep Learning Models
To solve more complex problems such as image recognition, NLP, or other tasks involving
unstructured data.
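As one illustration of the first item in this list, the sketch below fits a straight line by ordinary least squares and uses it to predict a continuous value. The house areas and prices are made-up numbers chosen to lie exactly on a line, and `fit_line` is a hypothetical helper; real projects would normally rely on a modeling library and would first use EDA to check that a linear relationship is plausible at all.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: house area (square metres) and price (arbitrary units).
area = [50, 70, 90, 110]
price = [100, 140, 180, 220]

slope, intercept = fit_line(area, price)
predicted = slope * 100 + intercept  # predicted price for a 100 square metre house

print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```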

Advantages of EDA:

Improves Model Quality:- Better feature engineering and data understanding lead to more
accurate models.

Identifies Hidden Trends:- Reveals patterns that are not visible in the raw data.

Saves Cost:- Early detection of data issues reduces costly downstream errors.

COMMAND

Act as an expert.

[Link] level: BTech graduation
[Link] format: descriptive, detail
3. Book reference: "Hands-On Exploratory Data Analysis with Python, Suresh Kumar Mukhiya, Usman
Ahmed, Packt Publication, 2020", "Data Science Fundamentals and Practical Approaches, Gypsy Nandi,
Rupam Sharma, BPB Publications, 2020", and you can collect data from other sources as well.

This is the syllabus: "Introduction to exploratory data analysis. Types of exploratory data
analysis-Descriptive, Inferential, Visual, Quantitative. Phases and steps involved in EDA.
Advantages, limitations, and application areas of EDA."

Under this syllabus, cover the following points:

● What is EDA?
● Importance of EDA in the data science pipeline
● Historical background (John Tukey’s role)
● Comparison with Confirmatory Data Analysis (CDA)
● Real-world motivation/examples
- Descriptive EDA: Summary statistics, frequency tables
- Inferential EDA: Confidence intervals, basic hypothesis testing
- Visual EDA: Use of plots (histogram, boxplot, scatter plot)
- Quantitative EDA: Correlation analysis, data distribution metrics

Phases and steps involved in EDA:

● Data collection and understanding
● Data cleaning (missing values, outliers)
● Data transformation (scaling, encoding)
● Feature analysis and selection
● Pattern discovery and initial insights

● Advantages: Improves model quality, finds hidden trends
● Limitations: Time-consuming, may not reveal deep insights without context
● Applications: Healthcare, finance, manufacturing, etc.
● Case studies or industry scenarios where EDA plays a crucial role
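The "Quantitative EDA: Correlation analysis" point in the outline above can be sketched with a hand-written Pearson correlation coefficient. The function name and the advertising, sales, and returns figures are hypothetical and deliberately perfectly linear so the result is easy to check by eye; this is a sketch of the formula, not a replacement for a statistics library.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical monthly figures.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 6, 8, 10]     # rises in step with ad spend
returns = [10, 8, 6, 4, 2]   # falls as ad spend rises

r_pos = pearson_r(ad_spend, sales)    # close to +1 for this data
r_neg = pearson_r(ad_spend, returns)  # close to -1 for this data

print(f"ad_spend vs sales:   r = {r_pos:.3f}")
print(f"ad_spend vs returns: r = {r_neg:.3f}")
```

A value near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship. Correlation alone does not establish that one variable causes the other.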
