This assignment consists of practice exercises on statistical models.

Stats413 Homework 6

Question 1
Suppose we fit the model
E(Y | X) = Xβ.
a) What is Cov(Y, e | X)?
b) What is Cov(Y, ê | X)?
c) What is Cov(Y, Ŷ | X)?
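
For reference, the quantities in parts a) to c) can be written in the usual matrix notation. This is only a sketch of the setup, not a solution: the hat matrix H and the constant-variance condition Var(e | X) = σ²I are assumptions made here and do not appear in the problem statement, so check them against the course notes.

```latex
% Assumed setup (not stated in the problem): errors have mean zero and
% constant variance, conditional on X.
\begin{align*}
Y &= X\beta + e, & E(e \mid X) &= 0, & \operatorname{Var}(e \mid X) &= \sigma^2 I, \\
H &= X (X^\top X)^{-1} X^\top, & \hat{Y} &= H Y, & \hat{e} &= Y - \hat{Y} = (I - H) Y .
\end{align*}
% For fixed matrices A and B, conditional on X:
\[
\operatorname{Cov}(A Y, B Y \mid X) = A \operatorname{Var}(Y \mid X) B^\top .
\]
```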
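Question 2
In lecture, we derived the interpretation of the coefficient when both X and Y are log transformed only
roughly, by combining the two operations we performed when just one of X or Y was log transformed.
Derive this answer. That is, similar to what we did when only X or only Y was log transformed, show using
the conditional mean (you may assume p = 1) that a P percent increase in X is associated with a
predicted mean multiplicative change in Y of (1 + P/100)^β̂1.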
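
A quick numeric sanity check of the target expression may help before attempting the derivation. The sketch below assumes a fitted model of the form log(Y) = b0 + b1 log(X); the values of b0 and b1 are hypothetical and chosen only for illustration. It checks the identity numerically and does not replace the algebra the question asks for.

```python
import math

# Hypothetical fitted coefficients for log(Y) = b0 + b1 * log(X).
# These numbers are made up for illustration; they come from no data set.
b0, b1 = 0.7, 1.8


def predict(x):
    """Back-transformed prediction exp(b0 + b1 * log(x)) from the log-log fit."""
    return math.exp(b0 + b1 * math.log(x))


def multiplier(x, p):
    """Ratio of predictions after a p percent increase in x."""
    return predict(x * (1 + p / 100)) / predict(x)


# The ratio should not depend on the starting value x, only on p and b1.
for x in (2.0, 10.0, 50.0):
    for p in (1, 10, 25):
        assert math.isclose(multiplier(x, p), (1 + p / 100) ** b1)
```

The loop runs over several starting values of x to stress that the multiplier is the same wherever the increase starts, which is what makes the percent-change interpretation usable.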

Question 3
Given the scatterplot below, with the 3 identified points {1,2,3}, address the following questions. The line
drawn in the scatterplot is the OLS estimated line fit on all observations.
a) Rank the three identified points in terms of highest to lowest leverage. (Based upon visual inspection,
no need for calculation.)
b) Rank the three identified points in terms of highest to lowest influence. (Based upon visual inspection,
no need for calculation.)
c) Which points of {1,2,3}, if any, would you be concerned may be an outlier? Justify your answer.
d) Which points of {1,2,3}, if any, would you be concerned may be a problematic outlier? Justify your
answer.
e) Within the range of values found on the plot (X ∈ {≈ 5, ≈ 10.5}, Y ∈ {≈ 23, ≈ 30}), what would be
the coordinates of a point that you would argue would be the most problematic outlier?
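
Although the question asks only for visual inspection, it can help to recall how leverage and influence are computed in simple linear regression. The sketch below uses a small hypothetical data set (not the points in the scatterplot) and measures influence by the change in the fitted slope when one observation is dropped. That is one of several possible influence measures and is an assumption here, not necessarily the one used in the course.

```python
def ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def leverages(xs):
    """Leverage of each point: h_i = 1/n + (x_i - xbar)^2 / Sxx."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    return [1 / n + (x - xbar) ** 2 / sxx for x in xs]


def slope_change_without(xs, ys, i):
    """Absolute change in the fitted slope when observation i is dropped."""
    _, full = ols(xs, ys)
    _, reduced = ols(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    return abs(full - reduced)


# Hypothetical data: five points on a rising trend, plus one point far to the
# right in x that falls well below the trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 2.0]

h = leverages(xs)
changes = [slope_change_without(xs, ys, i) for i in range(len(xs))]

most_leveraged = max(range(len(xs)), key=lambda i: h[i])
most_influential = max(range(len(xs)), key=lambda i: changes[i])
```

Leverage depends only on the x values, while influence also depends on how far the response falls from the trend set by the other points. A point can therefore have high leverage and still have little influence.
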
Question 4
For the data set phbirths in the package “faraway”, fit the following two models and look at their
residual vs fitted plots. You should not submit any R code or R output for this question; just address
the questions below.
E(grams|X) = β0 + β1black + β2smoke
E(grams|X) = β0 + β1black + β2smoke + β3black × smoke
a) The RVF plots show patterns we haven’t seen so far. Explain why they have these patterns.
b) What could you modify or add to the models to break the pattern seen in a)?
c) Why does the first model’s residual plot appear to have three clusters, while the second model’s residual
plot appears to have four clusters? Provide numerical evidence of why this is happening.
d) What is the minimum number of unique predicted values a model could produce? What would that
model be?
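
As a small illustration of how many distinct predicted values such models can produce, the sketch below lists the fitted value for each combination of two 0/1 predictors. The coefficients are hypothetical and are not estimates from phbirths. The function name fitted_values is also introduced here only for illustration.

```python
from itertools import product


def fitted_values(b0, b1, b2, b3=0.0):
    """Fitted value for each (black, smoke) combination, both coded 0/1."""
    return {
        (black, smoke): b0 + b1 * black + b2 * smoke + b3 * black * smoke
        for black, smoke in product((0, 1), repeat=2)
    }


# Hypothetical coefficients, chosen only to show the mechanics.
additive = fitted_values(10.0, 2.0, 3.0)
interaction = fitted_values(10.0, 2.0, 3.0, b3=1.0)

n_additive = len(set(additive.values()))
n_interaction = len(set(interaction.values()))
```

With only indicator predictors, every observation in the same (black, smoke) cell receives the same fitted value, so a residual vs fitted plot can show at most as many vertical strips as there are distinct fitted values.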