Data Analytics Project Task Description January
Project Overview
In this project, you will engage in a comprehensive data analytics task that
encompasses data cleaning, exploratory data analysis (EDA), data preprocessing, and
the application of supervised learning techniques for both regression and classification.
For the supervised learning task, you have the freedom to choose or derive a target
variable that interests you within the dataset.
Objectives
1. Data Cleaning: Identify and rectify inconsistencies, missing values, and outliers
in the dataset. This step is crucial for ensuring the quality and reliability of your
analysis.
2. Exploratory Data Analysis (EDA): Conduct a thorough exploration of the dataset
to uncover patterns, trends, and insights. Utilize visualizations and summary
statistics to present your findings effectively.
3. Data Preprocessing: Prepare your data for modeling by transforming variables,
encoding categorical features, and normalizing or scaling numerical features as
necessary.
4. Supervised Learning:
   ◦ Regression Task: Choose a continuous target variable and apply
     regression techniques to predict its value based on the features in your
     dataset.
   ◦ Classification Task: Select or derive a categorical target variable and
     implement classification algorithms to categorize the data points.
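The four objectives above can be chained into one workflow. The following is a minimal sketch only, not a prescribed solution: it runs on synthetic stand-in data, and every column name (carrier, distance, dep_delay, arr_delay, is_late), the 15-minute lateness threshold, and the choice of models are assumptions made for illustration. It assumes pandas, NumPy, and scikit-learn are installed.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; the column names are hypothetical,
# not the attributes of the project dataset.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "carrier": rng.choice(["AA", "DL", "UA"], size=n),
    "distance": rng.uniform(100, 2500, size=n),
    "dep_delay": rng.normal(10, 20, size=n),
})
df["arr_delay"] = df["dep_delay"] + rng.normal(0, 5, size=n)
df.loc[::25, "distance"] = np.nan  # inject some missing values

# 1. Cleaning: drop exact duplicates, clip extreme delays with the IQR rule.
df = df.drop_duplicates()
q1, q3 = df["dep_delay"].quantile([0.25, 0.75])
iqr = q3 - q1
df["dep_delay"] = df["dep_delay"].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# 2. EDA: summary statistics as a starting point for plots.
print(df.describe())

# Derived categorical target (assumed threshold: more than 15 minutes late).
df["is_late"] = (df["arr_delay"] > 15).astype(int)

# 3. Preprocessing: impute and scale numeric columns, one-hot encode categories.
numeric = ["distance", "dep_delay"]
categorical = ["carrier"]

def make_preprocessor() -> ColumnTransformer:
    """Build a fresh preprocessing step so each model gets its own copy."""
    numeric_steps = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    return ColumnTransformer([
        ("num", numeric_steps, numeric),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ])

X = df[numeric + categorical]
y = df[["arr_delay", "is_late"]]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 4a. Regression: predict the continuous target.
reg = Pipeline([("prep", make_preprocessor()), ("model", LinearRegression())])
reg.fit(X_train, y_train["arr_delay"])
r2 = reg.score(X_test, y_test["arr_delay"])

# 4b. Classification: predict the derived categorical target.
clf = Pipeline([
    ("prep", make_preprocessor()),
    ("model", RandomForestClassifier(random_state=0)),
])
clf.fit(X_train, y_train["is_late"])
acc = clf.score(X_test, y_test["is_late"])

print(f"R^2 on test split: {r2:.2f}, accuracy on test split: {acc:.2f}")
```

Wrapping preprocessing and model in a single Pipeline means the imputer, scaler, and encoder are fitted on the training split only, which avoids leaking information from the test split into the model.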
Deliverables
You are required to submit the following:
• Commented Code: Provide your code in a Jupyter Notebook or Markdown file.
Ensure that your code is well-commented to explain your thought process and
the steps taken throughout the project.
• Comprehensive Report: Prepare a detailed report (max 20 pages) summarizing
your methodology, findings, and insights from the analysis. Alternatively, you
may create a video presentation (maximum 10 minutes) to communicate your
project effectively.
Team Collaboration
You are allowed to work in teams of up to three participants. Collaboration is
encouraged, but ensure that each team member contributes meaningfully to the
project.
Evaluation Criteria
Your project will be evaluated based on the following criteria:
• Quality and thoroughness of data cleaning and preprocessing.
• Depth and clarity of exploratory data analysis.
• Effectiveness of the regression and classification models applied.
• Clarity and professionalism of the final report or presentation.
• Coherence of the analysis and logical relationship between the analysis steps
performed.
• Collaboration and contribution from all team members.
Submission deadline:
• January 30, 2025, 23:59
• Upload the requested documents to Teams by creating a subfolder in
Student_Submissions, or upload a document containing a link to an external
repository that holds the requested documents to a subfolder in
Student_Submissions.
Dataset:
You can find a CSV file in the folder Data. The data has been obtained from the U.S.
Department of Transportation, Bureau of Transportation Statistics, and contains 32
attributes related to planned flights in the period from January 2019 to August 2023.
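As a possible first step with the dataset, the sketch below loads a CSV with pandas and summarizes each column. The file name is not given here, so the example reads a small in-memory sample instead; the column names FL_DATE, AIRLINE, and DEP_DELAY and the helper first_look are hypothetical and serve only as illustration.

```python
import io

import pandas as pd

def first_look(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, share of missing values, distinct values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing_share": df.isna().mean(),
        "n_unique": df.nunique(),
    })

# In the project, pass the path of the CSV file in the Data folder to
# pd.read_csv instead of this in-memory sample (hypothetical columns).
sample_csv = io.StringIO(
    "FL_DATE,AIRLINE,DEP_DELAY\n"
    "2019-01-01,AA,5\n"
    "2019-01-02,DL,\n"
    "2023-08-31,AA,-3\n"
)
flights = pd.read_csv(sample_csv, parse_dates=["FL_DATE"])

summary = first_look(flights)
print(flights.shape)
print(summary)
```

On the real file, flights.shape would be a quick check that all 32 attributes were read, and the missing_share column indicates where cleaning is likely to be needed.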
This project aims to provide you with hands-on experience in the data analytics
process, enhancing your skills in data manipulation, analysis, and model building. Good
luck, and enjoy the journey of discovery through data!