DA5110 Business Statistics
Take-home assignment
Deadline: TBD
• This assignment requires you to select a dataset of your choice which has AT LEAST 100
observations (rows) and 10 variables (columns).
• From this dataset, you should select one variable (Y) as your key variable of interest and
identify AT LEAST 4 other variables that you expect to be related to your selected Y
variable. Ideally, Y should be a numerical variable and the dataset should have a mix of
numerical and categorical variables.
• You may use a dataset taken from your workplace or from any publicly available data source
(e.g. World Bank or IMF databases for international data, Department of Census and Statistics
or Central Bank of Sri Lanka for data on Sri Lanka, etc.).
• You will use this dataset to explore several research questions/hypotheses of your choice
using the techniques covered in this course and prepare a report based on your analysis.
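As a quick sanity check on the dataset requirements above, the following minimal sketch counts rows and columns in CSV text and splits the columns into numerical and categorical. The function name `summarize_shape` and the rule "a column is numerical if every non-empty value parses as a number" are illustrative assumptions, not part of the assignment; any statistical package offers an equivalent check.

```python
import csv
import io


def summarize_shape(csv_text, min_rows=100, min_cols=10):
    """Return row/column counts and a numeric/categorical split for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []

    def is_numeric(col):
        # Assumption: a column is numerical if all non-empty values parse as floats
        values = [r[col] for r in rows if r[col] not in ("", None)]
        if not values:
            return False
        try:
            for v in values:
                float(v)
        except ValueError:
            return False
        return True

    numeric = [c for c in columns if is_numeric(c)]
    categorical = [c for c in columns if c not in numeric]
    return {
        "n_rows": len(rows),
        "n_cols": len(columns),
        "numeric": numeric,
        "categorical": categorical,
        # Defaults mirror the stated minimum of 100 rows and 10 columns
        "meets_minimum": len(rows) >= min_rows and len(columns) >= min_cols,
    }
```

Categorical variables stored as numeric codes (for example 1 = male, 2 = female) would be counted as numerical by this rule, so the split should be reviewed by hand.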
Your report should be structured as follows:
1. Brief introduction: your research area, research questions, description of data source
2. Descriptive analysis of the variables you have selected for your analysis (you should use graphs
and/or summary statistics and comment on the distribution of the variables)
3. Test FOUR (04) hypotheses based on the variables in your dataset (you should use at least
two of the types of tests covered in the course). For each hypothesis:
a. Describe the hypothesis in words
b. Translate the hypothesis into a null and alternative hypothesis
c. Identify which type of test you will use to test the hypothesis
d. Carry out the test and interpret your findings
4. Develop a regression model to explain the factors affecting your key variable of interest (Y)
and interpret the results obtained from estimating the model
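Steps a to d under item 3 can be illustrated with one possible test, a two-sample comparison of means. The sketch below assumes a Welch t-test (unequal variances), which may or may not be among the tests covered in the course; the function name `welch_t_test` is hypothetical. In words, the hypothesis is "the mean of Y differs between two groups"; the null is H0: mean_A = mean_B and the alternative is H1: mean_A ≠ mean_B.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t_test(group_a, group_b):
    """Welch two-sample t statistic, degrees of freedom, approximate p-value."""
    n_a, n_b = len(group_a), len(group_b)
    se2_a = variance(group_a) / n_a  # squared standard error of each group mean
    se2_b = variance(group_b) / n_b
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    # Two-sided p-value from the normal approximation; reasonable only when df is large.
    # For small samples, look up the t distribution with `df` degrees of freedom instead.
    p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, df, p_approx
```

To interpret the result, compare the p-value with the chosen significance level (commonly 0.05): reject H0 if the p-value is smaller, otherwise conclude that the data do not provide sufficient evidence of a difference.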
Guidelines for the regression model
When building the multiple regression model:
1. Before estimating the model, state which explanatory variables you expect to include,
explain why each one belongs in the model, and give the expected direction of its effect
on Y (positive or negative).
2. Interpret every estimated coefficient, including whether it has taken the sign
(positive or negative) that you expected.
3. Interpret the R² and adjusted R² and the overall joint significance of your final model.
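The quantities named in guideline 3 can be computed as in the minimal sketch below, which fits ordinary least squares through the normal equations using only the Python standard library. With n observations and k explanatory variables, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), and the F statistic for overall joint significance is (R²/k) / ((1 − R²)/(n − k − 1)). The function name `fit_ols` is hypothetical, and in practice any statistical package reports these values together with the F-test p-value.

```python
def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of rows of explanatory values without an intercept column;
    the intercept is added here. Returns coefficients and fit statistics.
    """
    n, k = len(X), len(X[0])
    design = [[1.0] + [float(v) for v in row] for row in X]
    p = k + 1  # number of coefficients, including the intercept

    # Build the augmented matrix [X'X | X'y]
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(design, y))]
        for i in range(p)
    ]

    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(p):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    beta = [aug[i][p] / aug[i][i] for i in range(p)]

    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)  # total variation in Y
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
    return {"beta": beta, "r2": r2, "adj_r2": adj_r2, "f_stat": f_stat}
```

R² is the share of the variation in Y explained by the model, while adjusted R² penalises additional explanatory variables and is therefore the better guide when comparing models of different sizes. The F statistic is compared against the F distribution with k and n − k − 1 degrees of freedom; a small p-value indicates that the explanatory variables are jointly significant. The sketch does not handle perfectly collinear variables or a perfect fit, both of which cause a division by zero.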
The content of the report (excluding the Appendix) should be no longer than 7 pages.