1. Assignment Question
For the assignment, you are asked to explore the application of data analytics techniques to
the provided dataset. You must study data problems related to the dataset, giving
special consideration to the unique properties of the problem domain, and test one or
more techniques on it.
Your analysis needs to be thorough and comprehensive and go beyond the scope of what
has been covered in this course. You should incorporate data exploration, manipulation,
transformation, and visualization concepts with data analysis techniques in your solution. It
is crucial to provide explanations and justifications for the chosen techniques.
You may also need to pre-process your data to get it into an appropriate format. The
assignment should apply a number of techniques, categorized under different criteria, with
a detailed exploration of the commands used in each criterion. Outline the findings,
analyze them, and justify them with appropriate graphs. A supporting document is also
needed to present the graphs and code, using R programming concepts.
This assignment will help you to explore and analyse a set of data and reconstruct it into
meaningful representations for decision making.
3.0 TYPE
Group Assignment (4 members)
This dataset contains bank customers’ credit-related information. As a data analyst in
the banking sector, you are commissioned to conduct an in-depth analysis of customers with
different credit behaviour, using the given dataset, to identify the factors that differentiate
customers’ credit scores and to provide useful recommendations to stakeholders.
Techniques
The dataset provided for this assignment consists of information about customers’
demographics, such as age and occupation, and their credit card usage behaviour, such as
utilization ratio, outstanding debts, monthly balance, etc. In addition to the techniques (data
exploration, manipulation, transformation, and visualization techniques) covered in the
course to analyse the dataset, you might consider exploring and implementing more advanced
concepts to enhance the effectiveness of data retrieval.
6.0 DELIVERABLES:
The complete RScript (source code) and report must be submitted to the APU Learning
Management System (Moodle).
6.1 RScript (Program Code):
• Name the file after your group number.
• Begin the program with comment lines listing all members’ names and TP
numbers. For example:
# Name1, TP000001
# Name2, TP000002
# Name3, TP000003
# Name4, TP000004
o For each objective, provide the student ID and explain what you want to
discover. For example:
Objective 1: To investigate the relationship between revision method and
academic achievement.
Analysis 1-1: Which revision method is more effective?
Analysis 1-2: Is there a significant correlation between revision method and
academic achievement?
Analysis 1-3: What are the external factors that interact with revision method
to influence academic achievement?
o For each extra feature, give an ID and provide an explanation. For example:
# Extra feature 1
# Comments about the extra feature
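Taken together, the points above suggest a script laid out roughly as follows. This is only a minimal sketch: the `study` data frame, its column names, and the choice of `which.max()` as an "extra feature" are hypothetical placeholders for illustration, not part of the assignment dataset or a required approach.

```r
# Name1, TP000001
# Name2, TP000002
# Name3, TP000003
# Name4, TP000004

# Hypothetical toy data standing in for the real dataset (assumption).
study <- data.frame(
  revision_method = c("flashcards", "flashcards", "practice",
                      "practice", "rereading", "rereading"),
  score = c(72, 78, 85, 89, 60, 66)
)

# Objective 1 (Name1, TP000001): To investigate the relationship between
# revision method and academic achievement.

# Analysis 1-1: Which revision method is more effective?
# Compute the mean score for each revision method.
mean_by_method <- aggregate(score ~ revision_method, data = study, FUN = mean)
print(mean_by_method)

# Extra feature 1
# which.max() returns the index of the largest mean, so the best-scoring
# method can be reported directly instead of being read off the table.
best_method <- as.character(
  mean_by_method$revision_method[which.max(mean_by_method$score)]
)
print(best_method)
```

Keeping the objective, analysis, and extra-feature IDs as comments directly above the code they describe makes it easy to match each part of the script to the corresponding section of the report.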
Coursework Title
Intake
Students’ names and IDs
Date Assigned (the date the report was handed out).
Date Completed (the date the report is due to be handed in).
B) Contents:
o Introduction
✓ Data Description
✓ Assumptions (if any)
✓ Hypothesis and Objectives
o Data Preparation
✓ Data import
✓ Cleaning / pre-processing (if necessary)
✓ Data Validation (if necessary)
o Data Analysis
✓ Each objective (along with student name) must start on a separate page
and contain:
▪ Analysis Techniques – e.g. descriptive statistics
▪ Screenshot of source code with output / plot.
▪ Outline the findings based on the results obtained.
✓ The extra feature explanation must be on a separate page and contain:
▪ Screenshot of source code with output/plot.
▪ Explain how adding this extra feature can improve the results.
✓ Interpret the results from each analysis
o Conclusion
✓ Overall discussion on the findings from all objectives
✓ Recommendation
✓ Limitation and future direction
✓ State the word count (at the end of the page)
C) Workload Matrix
D) References
You may source algorithms and information from the Internet or books.
Proper referencing of the resources should be evident in the document.
✓ The font size used in the report must be 12pt and the font is Times New Roman.
▪ Plagiarism is a serious offence and will be dealt with according to APU and De
Montfort University regulations on plagiarism.