This project is designed to show your understanding and application of the concepts covered within
Identify a dataset to produce a linear model which includes one response variable with at least
three possible regressors. You must provide an explanation of the possible relationship between the
variables and the relevance of this relationship. Provide a reference to your data source and a short
description of the original collection method.
Conduct at least three detailed linear regression analyses, using statistical software (R or RStudio), in
order to determine the most appropriate linear model for the data. This requires you to evaluate
each model, including testing the assumptions and a discussion of the results for each model.
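
As a rough illustration of what one round of this workflow might look like in R, the sketch below fits three nested candidate models and runs some common assumption checks. The built-in `mtcars` dataset and the choice of `mpg` as the response are assumptions made purely for demonstration; they are not part of the brief, and your own data and variables will differ.

```r
# Illustration only: mtcars is a built-in R dataset used as a stand-in
# for your own data; the variable choices are hypothetical.
data(mtcars)

# Three candidate linear models for the response mpg
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)
m3 <- lm(mpg ~ wt + hp + disp, data = mtcars)

# Coefficients, t-tests, R-squared and overall F-test for one model
print(summary(m2))

# Compare the nested models with partial F-tests and with AIC
print(anova(m1, m2, m3))
print(AIC(m1, m2, m3))

# Graphical assumption checks: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage
par(mfrow = c(2, 2))
plot(m2)
par(mfrow = c(1, 1))

# A formal test of normality of the residuals (Shapiro-Wilk)
print(shapiro.test(residuals(m2)))
```

The same checks would be repeated for each candidate model, with a comment on what every plot and test result implies for that model's validity.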
Provide a conclusion on the most appropriate model for your situation. Give predictions of the
response for points that are not included in your data set.
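
One possible way to produce such predictions in R is sketched below. It again assumes the built-in `mtcars` data and a hypothetical model `mpg ~ wt + hp`; the two new points are made-up values that do not appear in that dataset but lie inside the observed range of the regressors.

```r
# Illustration only: dataset, model and new points are assumptions.
data(mtcars)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Two new observations that are not rows of the original data
new_points <- data.frame(wt = c(2.5, 3.8), hp = c(100, 200))

# Point predictions with 95% prediction intervals
pred <- predict(fit, newdata = new_points,
                interval = "prediction", level = 0.95)
print(cbind(new_points, pred))
```

Prediction intervals are used here rather than confidence intervals because the goal is to predict individual new responses, not the mean response.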
You can use any data source, but you must correctly cite the origin of the data. If it is original
experimental data (unpublished) then please provide a detailed summary of the collection methods
used. The dataset should contain a minimum of 20 observations and a minimum of 4 variables.
It is recommended to structure your submission in a report format with an introduction, method,
results, discussion and conclusion to fulfil the requirements of the marking schedule below.
Include the data in your submission and remember to explain and give code for each analysis you
perform. Also, do not forget to include some descriptive statistics and initial graphs of your data at
the beginning of your analysis. All the output and code you include must come with
justifications and comments.
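
A minimal sketch of the kind of descriptive statistics and initial graphs meant here is given below. It assumes the built-in `mtcars` dataset and an arbitrary selection of four variables, chosen only for illustration.

```r
# Illustration only: mtcars and the four selected variables are assumptions.
data(mtcars)
vars <- mtcars[, c("mpg", "wt", "hp", "disp")]

# Five-number summaries and means for each variable
print(summary(vars))

# Standard deviations and pairwise correlations
print(sapply(vars, sd))
print(round(cor(vars), 2))

# Scatterplot matrix to inspect pairwise relationships
pairs(vars, main = "Scatterplot matrix of response and regressors")

# Distribution of the response variable
hist(vars$mpg, main = "Histogram of mpg", xlab = "mpg")
```

Each summary and plot should be followed by a short comment on what it suggests about the relationships to be modelled.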
Clear identification of the response and the regressor variables (at least four). 1 mark
Discussion of possible relationships between variables and the relevance of the
relationship in the context of the data.
Descriptive statistics, justifications and comments 3 marks
Relevant graphs of the data, justifications and comments 3 marks
Collection method described, data description, study background information. 3 marks
Correct and sufficient referencing of the data source, the methods and the code used. 3 marks
At least three linear regression analyses and models of the data to evaluate
relationships. Model selection and variable selection approaches.
Tests of assumptions of each method used, graphical and statistical tests,
justifications and comments
Discussion of assumptions, test results and validity of each model. 8 marks
Detailed conclusion on the most appropriate model including context. Justify why
and give at least two examples of model predictions with discussion.
Complete Software Code or Session Information provided as an appendix. 5 marks