Exploring Python's NumPy and Pandas Libraries
Group Composition: 3 Members (Max.)
Deliverables
A 10-15 minute presentation summarizing your work, including code snippets
Part 1: NumPy Basics
Task 1: Introduction to NumPy
Each member should individually explore and document the basic functionalities of NumPy (arrays, array
indexing, array operations, etc.).
Consolidate your findings into a single slide.
Task 2: Array Manipulations
Create a NumPy array and perform the following operations:
• Reshape the array.
• Slice and index the array.
• Perform element-wise operations (addition, subtraction, multiplication, division).
• Calculate the mean, median, standard deviation, and variance of the array.
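The operations listed above could be sketched as follows. This is only an illustration: the array contents (the integers 1 to 12) and all variable names are assumptions chosen for the example, not part of the assignment.

```python
import numpy as np

# Illustrative data: the integers 1..12 (an assumption for this sketch).
a = np.arange(1, 13)

# Reshape the 1-D array into 3 rows and 4 columns.
m = a.reshape(3, 4)

# Slicing and indexing.
first_row = m[0]        # first row
second_col = m[:, 1]    # second column
sub = m[1:, 2:]         # last two rows, last two columns

# Element-wise operations against an array of the same shape.
b = np.full_like(m, 2)
added = m + b
subtracted = m - b
multiplied = m * b
divided = m / b         # true division returns floats

# Descriptive statistics over all elements.
# Note: std() and var() use the population formulas (ddof=0) by default.
mean = m.mean()
median = np.median(m)
std = m.std()
var = m.var()

print(m.shape, mean, median, round(std, 3), round(var, 3))
```

Passing `ddof=1` to `std()` or `var()` gives the sample versions instead, which is worth stating on the slide if you compare results with Pandas.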
Task 3: Linear Algebra with NumPy
• Create two 2D arrays and perform matrix multiplication.
• Solve a system of linear equations using [Link].
• Find the eigenvalues and eigenvectors of a matrix.
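A minimal sketch of the three linear algebra steps follows. The matrices and right-hand side are made-up values, and the choice of `np.linalg.solve` and `np.linalg.eig` is an assumption about which NumPy routines the task has in mind.

```python
import numpy as np

# Two small 2D arrays (illustrative values only).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0, 0.0],
              [4.0, 2.0]])

# Matrix multiplication (not element-wise): use the @ operator.
product = A @ B

# Solve the system A x = rhs, i.e.
#   2x +  y = 3
#    x + 3y = 5
rhs = np.array([3.0, 5.0])
x = np.linalg.solve(A, rhs)

# Eigenvalues and eigenvectors of A.
# Column i of `eigenvectors` pairs with eigenvalues[i].
eigenvalues, eigenvectors = np.linalg.eig(A)

print(product)
print(x)
print(eigenvalues)
```

A quick sanity check is to confirm that `A @ x` reproduces `rhs` and that `A @ eigenvectors` equals `eigenvectors * eigenvalues`, both up to floating-point tolerance (for example with `np.allclose`).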
Part 2: Pandas Basics
Task 1: Introduction to Pandas
Each member should individually explore and document the basic functionalities of Pandas (Series,
DataFrame, basic operations).
Consolidate your findings into a single slide.
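As a starting point for that exploration, the sketch below shows a Series, a DataFrame, and a few basic operations. The names and scores are invented for illustration.

```python
import pandas as pd

# A Series is a labelled 1-D array.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
value_b = s["b"]  # label-based access

# A DataFrame is a table of named columns (illustrative data).
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "score": [88, 92, 79],
})

# Basic operations: boolean filtering, a derived column, an aggregate.
high = df[df["score"] > 80]
df["passed"] = df["score"] >= 80
mean_score = df["score"].mean()

print(df)
print(high)
print(mean_score)
```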
Task 2: Data Cleaning and Preparation
• Load a dataset from a CSV file (you can use a publicly available dataset from a source such as Kaggle
or the UCI Machine Learning Repository).
• Perform data cleaning:
o Handle missing values.
o Remove duplicate entries.
o Convert data types where necessary.
o Normalize/standardize numerical columns if needed.
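The cleaning steps above might look like the sketch below. To keep it self-contained, it reads a small inline CSV through `io.StringIO`; with a real dataset you would pass the file path to `pd.read_csv` instead. The column names, the values, and the choice of median/mean imputation are all assumptions made for the example.

```python
import io

import pandas as pd

# Tiny inline CSV standing in for a downloaded file (illustrative data).
csv_text = """id,age,income,joined
1,34,52000,2021-01-15
1,34,52000,2021-01-15
2,,61000,2021-03-02
3,45,,2020-11-20
4,29,48000,2022-07-08
"""
df = pd.read_csv(io.StringIO(csv_text))

# Remove duplicate entries (the second row repeats the first).
df = df.drop_duplicates()

# Handle missing values: one possible strategy is median/mean imputation.
df["age"] = df["age"].fillna(df["age"].median())
df["income"] = df["income"].fillna(df["income"].mean())

# Convert data types where necessary.
df["age"] = df["age"].astype(int)
df["joined"] = pd.to_datetime(df["joined"])

# Standardize a numerical column (z-score).
# Note: pandas std() uses the sample formula (ddof=1) by default.
df["income_z"] = (df["income"] - df["income"].mean()) / df["income"].std()

print(df)
print(df.dtypes)
```

Whether to impute, drop, or flag missing values depends on the dataset, so the report should justify whichever strategy you choose.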
Task 3: Data Analysis
Perform exploratory data analysis (EDA) on the dataset:
• Generate summary statistics.
• Visualize the data using plots (histograms, box plots, scatter plots, etc.).
• Identify and analyze patterns or correlations in the data.
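One way to approach these EDA steps is sketched below. It assumes Matplotlib is installed for plotting and uses synthetic data (hypothetical "hours" and "score" columns generated with a fixed random seed) purely so the example runs on its own; substitute your cleaned dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic, illustrative data: score rises with hours, plus noise.
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, size=100)
score = 50 + 4 * hours + rng.normal(0, 5, size=100)
df = pd.DataFrame({"hours": hours, "score": score})

# Summary statistics (count, mean, std, min, quartiles, max).
summary = df.describe()

# Pairwise Pearson correlations between numerical columns.
corr = df.corr()

# Histogram, box plot, and scatter plot side by side.
fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))
axes[0].hist(df["score"], bins=15)
axes[0].set_title("Histogram of score")
axes[1].boxplot(df["score"])
axes[1].set_title("Box plot of score")
axes[2].scatter(df["hours"], df["score"])
axes[2].set_xlabel("hours")
axes[2].set_ylabel("score")
axes[2].set_title("score vs. hours")
fig.tight_layout()

print(summary)
print(corr)
```

In a script, `fig.savefig(...)` with a file name of your choosing writes the figure to disk for use in the slides; in a notebook the figure is displayed inline.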
Part 3: Real-World Application
Task 1: Dataset Selection and Problem Definition
• Choose a real-world dataset relevant to your interests or studies.
• Define a problem statement or hypothesis that you aim to investigate using this dataset.
Task 2: Data Manipulation and Analysis
• Load and clean the dataset as per the methods explored earlier.
• Perform detailed analysis using both NumPy and Pandas.
o Apply NumPy for any required numerical computations.
o Use Pandas for data manipulation and analysis.
o Visualize the results using appropriate plots and charts.
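The division of labour between the two libraries could look like the following sketch: Pandas handles the tabular manipulation and grouping, while NumPy handles the numerical computation on the underlying arrays. The sales-style columns and values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset for illustration.
df = pd.DataFrame({
    "region": ["north", "north", "south", "south", "south"],
    "units": [10, 14, 7, 9, 11],
    "price": [2.5, 2.5, 3.0, 3.0, 3.0],
})

# Pandas: derive a column and aggregate by group.
df["revenue"] = df["units"] * df["price"]
by_region = df.groupby("region")["revenue"].agg(["sum", "mean"])

# NumPy: numerical computations on the raw values.
units = df["units"].to_numpy()
p90 = np.percentile(units, 90)   # 90th percentile of units sold
df["log_units"] = np.log(units)  # log transform for skewed data

print(by_region)
print(p90)
```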
Task 3: Reporting and Presentation
• Document your entire process, including:
o Introduction to the problem.
o Methodology.
o Analysis and findings.
o Conclusion and future work.
• Prepare a presentation summarizing your work:
o Introduce NumPy and Pandas briefly.
o Explain your problem statement.
o Discuss the steps taken for data cleaning and analysis.
o Present your findings with visual aids.
o Conclude with insights and possible future directions.
Evaluation Criteria
• Understanding and application of NumPy and Pandas functionalities.
• Quality and completeness of data cleaning and analysis.
• Clarity and depth of the written report.
• Effectiveness and clarity of the presentation.
• Team collaboration and equal contribution.