Assignment 4 – 5CCM242A

This document details the content of Data Analysis Report 4, which you will need to submit on the KEATS page of the module by 4pm on April 8th, 2021, if you are taking the level 5 version of the module. Please submit up to 3 (three) A4 pages in .pdf, .docx or an equivalent format, with a minimum font size of 12pt. Include figures and R code where necessary. If your submission exceeds three pages, only the first three pages will be marked.

Your submission must be anonymous.
Your submission will be marked on a scale up to 100 with the marking scheme below and it will contribute to 40% of the final mark for the module.

Marking scheme

  • Exercise 1: 30 Marks
  • Exercise 2: 20 Marks
  • Exercise 3: 30 Marks
  • Presentation:

    10 Marks: A well written and well organised report.
    10 Marks: Completeness and readability of the plots (title, labels, position, etc.).

Exercise 1

Download the dataset Darts.csv from the KEATS page of the module and import it into R. The dataset contains data on 91 Archaic dart points recovered during surface surveys at Fort Hood, Texas. These data have been extracted from the R package archdata. The dataset contains the following variables:

  • Name. Dart point type: Darl, Ensor, Pedernales, Travis, Wells
  • Length. Maximum Length (mm)
  • Width. Maximum Width (mm)
  • Thickness. Maximum Thickness (mm)

(a) Using an appropriate model selection strategy and variable transformations if necessary, choose and fit the best linear regression model for Width. This includes checking the model assumptions and fixing obvious issues. Comment on potential issues that you were not able to fix (if any). [25 Marks]

(b) What is the predicted weight for a dart of type Travis, Length=50, Width=23 and Thickness=8? [5 Marks]
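
A minimal sketch of one possible workflow for this exercise. Because Darts.csv is not reproduced here, the code runs on a simulated stand-in data frame (`darts_sim`, hypothetical values) that only borrows the column names listed above. Backward selection by AIC with `step()` is an assumed choice of strategy, not necessarily the intended one, and the prediction is shown for Width, the response modelled in part (a).

```r
# Simulated stand-in for Darts.csv (hypothetical values, NOT the real data).
# With the real file one would use: darts <- read.csv("Darts.csv")
set.seed(42)
types <- c("Darl", "Ensor", "Pedernales", "Travis", "Wells")
n <- 91
darts_sim <- data.frame(
  Name      = factor(sample(types, n, replace = TRUE), levels = types),
  Length    = runif(n, 30, 90),
  Thickness = runif(n, 4, 11)
)
darts_sim$Width <- 10 + 0.15 * darts_sim$Length + 0.8 * darts_sim$Thickness +
  rnorm(n, sd = 2)

# Full model, then backward selection by AIC.
full <- lm(Width ~ Name + Length + Thickness, data = darts_sim)
best <- step(full, direction = "backward", trace = 0)

# Basic assumption check on the selected model: normality of residuals.
res <- residuals(best)
normality <- shapiro.test(res)
# plot(best) would show the usual four diagnostic plots.
# If the residual spread grows with the fitted values, a log transform of
# the response, e.g. lm(log(Width) ~ ...), is one option to try.

# Prediction for a new dart, with a 95% prediction interval.
new_dart <- data.frame(Name = factor("Travis", levels = types),
                       Length = 50, Thickness = 8)
pred <- predict(best, newdata = new_dart, interval = "prediction")
print(pred)
```

With the real data, the diagnostic plots and any transformation would need to be revisited after each change to the model, since selection and assumption checks interact.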

Exercise 2

Download the dataset optimal.csv from the KEATS page of the module and import it into R. This file contains observations of a response variable y and 2 predictors x1 and x2.

(a) By estimating the appropriate model, find out the values of x1 and x2 that minimise the expected response y. [15 Marks]

(b) Provide a 95% confidence interval for the expected response at the values of x1 and x2 chosen in part (a). [5 Marks]
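
One way to approach this exercise is a second-order (quadratic) response surface, sketched below. This is an assumption about what "the appropriate model" means. Since optimal.csv is not reproduced here, the data are simulated (hypothetical values), with a known minimum at x1 = 0.5, x2 = -1 so that the result can be checked.

```r
# Simulated stand-in for optimal.csv (hypothetical values, NOT the real data).
# With the real file one would use: dat <- read.csv("optimal.csv")
set.seed(1)
n <- 200
dat <- data.frame(x1 = runif(n, -2, 2), x2 = runif(n, -2, 2))
dat$y <- 3 + (dat$x1 - 0.5)^2 + 2 * (dat$x2 + 1)^2 + rnorm(n, sd = 0.5)

# Second-order (quadratic) response surface.
fit <- lm(y ~ x1 + x2 + I(x1^2) + I(x2^2) + x1:x2, data = dat)
b <- coef(fit)

# Setting the gradient to zero gives H %*% x = -g, where g holds the linear
# coefficients and H is the Hessian of the fitted surface.
g <- c(b["x1"], b["x2"])
H <- matrix(c(2 * b["I(x1^2)"], b["x1:x2"],
              b["x1:x2"],       2 * b["I(x2^2)"]), nrow = 2, byrow = TRUE)
x_star <- as.numeric(solve(H, -g))

# The stationary point is a minimum only if H is positive definite.
is_minimum <- all(eigen(H, symmetric = TRUE)$values > 0)

# 95% confidence interval for the expected response at the chosen point.
ci <- predict(fit, newdata = data.frame(x1 = x_star[1], x2 = x_star[2]),
              interval = "confidence", level = 0.95)
print(x_star)
print(is_minimum)
print(ci)
```

The interval treats the chosen point as fixed and does not account for the uncertainty in estimating its location. It is also worth checking that the estimated minimum lies inside the observed range of x1 and x2.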

Exercise 3

The dataset warpbreaks in R contains data on the number of warp breaks (breaks) for 2 different types of wool (wool, coded A and B) and 3 different levels of tension (tension, coded L, M and H). You can load the data in R with the following instructions:

library(datasets)
data("warpbreaks")

Now fit the following generalised linear model:

glm1 <- glm(breaks ~ wool * tension, family = poisson, data = warpbreaks)

(a) Explain what model has been fitted, writing down explicitly the model assumptions and the relationship between the expected value of the number of breaks and the predictors. What are the estimates for the parameters of the model? [25 Marks]

(b) What is the expected number of breaks when wool is of type A and the tension is at level M? [5 Marks]
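
The following sketch shows how the fitted model could be inspected and used for prediction in R. It uses only the built-in warpbreaks data and the model given above. The object names `est`, `new_obs`, `mu_AM` and `mu_by_hand` are introduced here for illustration.

```r
library(datasets)
data("warpbreaks")

# Poisson GLM with log link and a wool-by-tension interaction.
glm1 <- glm(breaks ~ wool * tension, family = poisson, data = warpbreaks)

# Parameter estimates on the log scale, with standard errors.
est <- coef(summary(glm1))
print(est)

# Expected number of breaks for wool A at tension M (response scale).
new_obs <- data.frame(
  wool    = factor("A", levels = levels(warpbreaks$wool)),
  tension = factor("M", levels = levels(warpbreaks$tension))
)
mu_AM <- predict(glm1, newdata = new_obs, type = "response")

# Same quantity by hand: wool A and tension L are the reference levels,
# so log(mu) = intercept + tensionM coefficient.
mu_by_hand <- exp(coef(glm1)["(Intercept)"] + coef(glm1)["tensionM"])
print(c(mu_AM, mu_by_hand))
```

Because this model has one parameter for each wool and tension combination, the fitted means coincide with the observed cell means, which gives a quick sanity check on the prediction.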
