Assessment Briefing To Students: School of Computing, Science & Engineering
1. Date on which brief was given to students
2. Date by which assessment is to be submitted
3. Date by which feedback will be made available to students. This must be within 20 working days of the submission date.
The Assessment Task
Overview
This coursework gives you the opportunity to use the techniques covered in this module to organise and analyse a collection of
data that interests you, to draw conclusions from your analysis, and finally to present your results in the form of a report.
Explanation
You must apply Classification, Association Rules Mining, Clustering and Text Mining as your approaches. This means that you will
need to choose a dataset that is amenable to each of these types of data mining -- i.e., to building a model that will determine,
predict, or estimate one of the attributes in the dataset, based on the values of other attributes.
Task 1: Apply classification on a selected dataset using R & SAS Enterprise Miner (e.g., Decision Tree, K-Nearest
Neighbours, Logistic Regression). (25 Marks)
Task 2: Apply Association Rules Mining on a selected dataset using R & SAS Enterprise Miner. (25 Marks)
Task 3: Apply K-Means Clustering on a selected dataset using R & SAS Enterprise Miner. (20 Marks)
Task 4: Apply Sentiment Analysis on 20 selected hotels in the hotel_reviews.csv dataset using R & SAS Enterprise Miner (the
dataset can be downloaded from Blackboard). (20 Marks)
Title
The title should provide an overview of the focus of your problem and the expected solution.
Introduction
This section contains a brief background to the topic and leads to the formulation of the specific question, based on your selected
topic. The research question must be focused and clear.
Datasets
You are welcome to choose any dataset that interests you, provided it has enough data to enable meaningful analysis. In making
your choice, consider what problem or problems you would be able to solve by employing data mining on the dataset. In other
words, ask yourself: How could I use data mining to answer one or more questions about this dataset?
Implementation in R
Implement your proposed approach using package(s) available in R. This section will include:
• A brief description of the R package(s) used.
• The application of data-mining techniques to your selected datasets using R.
• Explanation of the experimental procedure, including the setting and optimisation of model parameters during training.
• Visualisation of the results.
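As an illustration of what "setting and optimisation of model parameters" can look like, the following is a minimal sketch in base R, not a prescribed method. It assumes the built-in iris dataset as a stand-in for your own data, an arbitrary candidate range of 1 to 6 clusters, and the within-cluster sum of squares ("elbow") heuristic as the selection criterion; all variable names are hypothetical.

```r
# Sketch only: choosing the number of clusters k for K-Means with the elbow heuristic.
# Assumptions: iris stands in for your dataset; k from 1 to 6 is an arbitrary range.
set.seed(42)                               # make the random starts reproducible

features <- scale(iris[, 1:4])             # standardise the numeric attributes
k_values <- 1:6                            # candidate numbers of clusters

# Total within-cluster sum of squares for each candidate k
wss <- sapply(k_values, function(k) {
  kmeans(features, centers = k, nstart = 20)$tot.withinss
})

results <- data.frame(k = k_values, wss = wss)
print(results)

# Visualise the trade-off when running interactively
if (interactive()) {
  plot(k_values, wss, type = "b",
       xlab = "Number of clusters (k)",
       ylab = "Total within-cluster sum of squares")
}
```

In a report, the table or plot would be accompanied by a short justification of the chosen k, since the "elbow" is a judgement call rather than a fixed rule.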
Results analysis and discussion
• Explain and justify the performance metric you choose to use to evaluate the model(s).
• Present clearly and compellingly the results that you obtain, both from the data mining and from any other analysis that you
may perform.
• Compare and discuss the results obtained from the R implementation with those obtained in SAS Enterprise Miner.
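To make the first point above concrete, here is a minimal base R sketch of deriving common classification metrics from a confusion matrix. The label vectors are invented purely for illustration, and the choice of accuracy, precision, recall and F1 is an assumption; which metric is appropriate depends on your dataset and question (for example, accuracy can mislead when classes are imbalanced).

```r
# Sketch only: classification metrics from a confusion matrix in base R.
# The actual/predicted labels below are made-up illustration data.
actual    <- factor(c("yes", "no", "yes", "yes", "no", "no", "yes", "no"),
                    levels = c("no", "yes"))
predicted <- factor(c("yes", "no", "no", "yes", "no", "yes", "yes", "no"),
                    levels = c("no", "yes"))

# Rows are predictions, columns are true labels
cm <- table(Predicted = predicted, Actual = actual)
print(cm)

tp <- as.numeric(cm["yes", "yes"])   # true positives
tn <- as.numeric(cm["no",  "no"])    # true negatives
fp <- as.numeric(cm["yes", "no"])    # false positives
fn <- as.numeric(cm["no",  "yes"])   # false negatives

accuracy  <- (tp + tn) / sum(cm)
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f1        <- 2 * precision * recall / (precision + recall)

print(c(accuracy = accuracy, precision = precision, recall = recall, f1 = f1))
```

The same quantities can be read from the model assessment output in SAS Enterprise Miner, which gives a like-for-like basis for the comparison.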
Conclusions
The key points from the assignment must be synthesised within the conclusion. This must relate back to the introduction and the
research question and provide an overall evaluation of the validity of the solution you have proposed.
References
List all publications referenced in the report. You should show evidence of sufficient reading related to your work.
References must follow the Harvard formatting system as in this guide:
http://www.salford.ac.uk/library/help/user-guides/general/Bibliographic-Citations-APA-QuickRef-Apr2015.pdf
Appendices
Appendices may be used to provide relevant supporting evidence for reference but should only be used if necessary. Students may
wish to include in the appendices evidence that confirms the originality of their work or illustrates points of principle set out
in the main text.
Workload
This assessment should require approximately 120 hours of effort.
Marking scheme
The work will be assessed using a marking grid comprising weighted components (provided below). This is indicative of the
standard of work required at different levels within the assignment.