Homework 6

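X1: Point-of-sale display expenditures in beauty salons and department stores (× $1000).
X2: Local media advertising expenditures.
X3: Prorated share of national media advertising expenditures.
Y: Sales (× $1000).
1. (5) Test the regression relationship between sales and the three predictor variables. State the hypotheses,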

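H0: all slope parameters equal zero (β1 = β2 = β3 = 0)
Ha: at least one slope parameter is not equal to zero
F test statistic = 38.28, p-value = 7.821e-12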

2. (5) Use the "usual" plots (scatterplot,


An unusual pattern appears in the QQ plot.
3. (5) Prepare a partial regression plot for each predictor variable. Do your plots suggest


4. (5) Are there any outlying Y observations? (Show the diagnostic plots and test based on the studentized

g = 44, alpha = 0.05, p = 4
Bonferroni critical value = 3.450183

Bonferroni critical value = 3.2667

5. (5) Are there any outlying X observations? (Show the diagnostic plots and test based on the hat values.)

6. (5) Are there any influential observations? (Show the Cook's distance plot and test based on Cook's distance


MSE = 3.33,
e6 = 9.34 - (1.0233 + 0.9657 * 6.1 + 0.6292 * 5.8 + 0.6754 * 3.4) = -3.51979 (by hand)
   = -3.52159888 (from R output)
e16 = -2.79435 (from R output)
e30 = -5.42165061 (from R output)
$D_6 = \frac{(-3.52159888)^2}{4(3.33)} \cdot \frac{0.08230536}{(1 - 0.08230536)^2} = 0.09099$
$D_{16} = \frac{(-2.79435)^2}{4(3.33)} \cdot \frac{0.12790933}{(1 - 0.12790933)^2} = 0.09859$
$D_{30} = \frac{(-5.42165061)^2}{4(3.33)} \cdot \frac{0.04798828}{(1 - 0.04798828)^2} = 0.1168$
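
The Cook's distance arithmetic above can be checked with a short script. This is a minimal sketch, assuming the standard form D_i = e_i^2 / (p * MSE) * h_ii / (1 - h_ii)^2 with p = 4 and MSE = 3.33, and assuming the second fraction in each line holds the hat value h_ii for that case. The function and variable names are illustrative only; they do not come from the R output.

```python
def cooks_distance(residual, hat_value, p, mse):
    """Cook's distance: e_i^2 / (p * MSE) * h_ii / (1 - h_ii)^2."""
    return residual ** 2 / (p * mse) * hat_value / (1.0 - hat_value) ** 2


P = 4        # number of regression parameters (intercept + 3 slopes)
MSE = 3.33   # mean squared error quoted in the solution

# case number -> (residual from R output, assumed hat value h_ii)
cases = {
    6: (-3.52159888, 0.08230536),
    16: (-2.79435, 0.12790933),
    30: (-5.42165061, 0.04798828),
}

distances = {i: cooks_distance(e, h, P, MSE) for i, (e, h) in cases.items()}

# Residual for case 6 recomputed from the rounded coefficients in the text.
e6_by_hand = 9.34 - (1.0233 + 0.9657 * 6.1 + 0.6292 * 5.8 + 0.6754 * 3.4)

for case, d in distances.items():
    print(f"D{case} = {d:.5f}")
print(f"e6 by hand = {e6_by_hand:.5f}")
```

The small gap between the hand-computed e6 and the R value comes from using coefficients rounded to four decimals.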

7. Is there a serious multicollinearity problem?
a) (4) Include an appropriate scatterplot and correlation values between the explanatory variables.
There is a multicollinearity problem between x1 and x2 based on the correlation and scatterplot.
b) (4) Judging by VIF, do you think there is a problem with multicollinearity? (Hint: VIF or tolerance)
The VIF for X1 associated with X2 and X3 is 20.07.
The VIF for X2 associated with X1 and X3 is 20.7.
The VIF for X3 associated with X1 and X2 is 1.2.
The VIF for X3 associated with X1 alone is 1.2, and with X2 alone is 1.2.
The VIF for X1 associated with X2 alone is 19.8.
The VIFs for X1 and X2 are far above 10, so there is a multicollinearity problem involving X1 and X2.
The correlation analysis and the VIFs lead to the same conclusion: X1 is highly correlated with X2.
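
To make the VIF values concrete, the sketch below inverts the definition VIF = 1 / (1 - R^2), where R^2 comes from regressing one predictor on the others. It only reuses the VIF numbers quoted above; the function name is an illustrative assumption.

```python
import math


def r_squared_from_vif(vif):
    """Invert VIF = 1 / (1 - R^2) to recover R^2."""
    return 1.0 - 1.0 / vif


# X1 regressed on X2 and X3 (VIF quoted as 20.07).
r2_x1_full = r_squared_from_vif(20.07)
tolerance_x1 = 1.0 / 20.07

# X1 regressed on X2 alone (VIF quoted as 19.8); with a single
# predictor, R^2 is the squared pairwise correlation.
r2_x1_x2 = r_squared_from_vif(19.8)
abs_corr_x1_x2 = math.sqrt(r2_x1_x2)

print(f"R^2 of X1 on X2, X3: {r2_x1_full:.4f}")
print(f"tolerance of X1:     {tolerance_x1:.4f}")
print(f"|corr(X1, X2)|:      {abs_corr_x1_x2:.4f}")
```

A VIF near 20 means roughly 95% of the variation in X1 is explained by the other predictors, which is why the pairwise VIF for X3 of 1.2 is not a concern.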
8. Instead of removing variables, we are going to use ridge regression to determine the parameter values.
a) (5) Make a ridge trace plot. What value of the parameter (k) do you believe is best? Explain your
choice.
Based on the output, k = 1 is the best value.
b) (5) Using the VIF factors, what value of the parameter do you believe should be used? (Hint: Look at both
the graph and the printed numbers.) Explain your choice
Among all k values, VIF values of the parameters are closest to 1 when k = 0.1.

y = 1.333 + 0.7763 x1 + 0.7571 x2 + 0.6565 x3
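
How the VIFs shrink as k grows can be illustrated with a simplified two-predictor case. This is only a sketch, not the three-predictor computation behind the answer above: it assumes the ridge VIFs are the diagonal elements of (R + kI)^-1 R (R + kI)^-1, where R is the predictor correlation matrix, and it uses a hypothetical correlation of 0.974 (roughly what a pairwise VIF of 19.8 implies). All names are illustrative.

```python
def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]


def ridge_vif_two_predictors(r, k):
    """First diagonal element of (R + kI)^-1 R (R + kI)^-1 for R = [[1, r], [r, 1]]."""
    a = 1.0 + k
    det = a * a - r * r
    inv = [[a / det, -r / det], [-r / det, a / det]]  # inverse of R + kI
    corr = [[1.0, r], [r, 1.0]]
    m = matmul2(matmul2(inv, corr), inv)
    return m[0][0]


R_HYPOTHETICAL = 0.974  # assumed correlation, for illustration only

vif_by_k = {k: ridge_vif_two_predictors(R_HYPOTHETICAL, k)
            for k in (0.0, 0.01, 0.05, 0.1)}

for k, vif in vif_by_k.items():
    print(f"k = {k:<4}  VIF = {vif:.3f}")
```

At k = 0 the expression reduces to the ordinary VIF, 1 / (1 - r^2), and it decreases as k increases.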
9.(25) A personnel officer in a governmental agency administered four newly developed aptitude tests
to each of the 25 applicants for entry level clerical positions in the agency. For purpose of study, all 25
applicants were accepted for positions irrespective of their test scores. After a probationary period, each
applicant was rated for proficiency on the job. The scores on the four tests (X1, X2, X3, X4) and the job
proficiency score (Y) for the 25 employees were recorded in proficiency.csv
a). (5) Obtain the scatter plot matrix and the correlation matrix of the X variables. What do the scatter
plots suggest about the nature of the functional relationship between the response variable and each of the
predictor variables?
X3 and X4 may have a multicollinearity problem.
Y appears to be more strongly related to X3 and X4 than to X1 or X2.
b). (5) Fit the multiple regression function containing all four predictors as first-order terms. Does it appear that
all predictor variables should be retained?
Based on the output, the p values are all smaller than 0.05 except for x2. Therefore, not all predictors
should be retained, and x2 can be dropped.
c). (10) Using the proficiency data, select the best subset regression models according to the
adjusted R2, Cp, AICp, SBCp, and PRESS criteria, and discuss your selection.
Model selected by the smallest Cp: y ~ x1 + x3 + x4
Model selected by the largest adjusted R2: y ~ x1 + x3 + x4
The smallest AICp is 73.847; the corresponding model is y ~ x1 + x3 + x4.
The smallest SBCp is 78.723; the corresponding model is y ~ x1 + x3 + x4.
The smallest PRESS is 471.452; the corresponding model is y ~ x1 + x3 + x4.
All five criteria select the same model, so the best subset regression model is Y = β0 + β1 X1 + β3 X3 + β4 X4.
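
As a consistency check on the AICp and SBCp values above, the sketch below assumes the usual definitions AICp = n ln(SSE) - n ln(n) + 2p and SBCp = n ln(SSE) - n ln(n) + p ln(n), with n = 25 applicants and p = 4 parameters (intercept plus x1, x3, x4). SSE is not quoted in the solution, so it is backed out from the reported AICp; the function names are illustrative.

```python
import math


def aic_p(sse, n, p):
    """AIC_p = n ln(SSE) - n ln(n) + 2p."""
    return n * math.log(sse) - n * math.log(n) + 2 * p


def sbc_p(sse, n, p):
    """SBC_p = n ln(SSE) - n ln(n) + p ln(n)."""
    return n * math.log(sse) - n * math.log(n) + p * math.log(n)


N = 25                 # applicants
P = 4                  # intercept + x1, x3, x4
REPORTED_AIC = 73.847
REPORTED_SBC = 78.723

# Solve the AIC_p definition for SSE (an implied value, not quoted in the text).
implied_sse = math.exp((REPORTED_AIC - 2 * P) / N + math.log(N))

# SBC_p recomputed from the implied SSE should agree with the reported value.
recomputed_sbc = sbc_p(implied_sse, N, P)

print(f"implied SSE    = {implied_sse:.2f}")
print(f"recomputed SBC = {recomputed_sbc:.3f} (reported {REPORTED_SBC})")
```

The two criteria differ only in the penalty term, so SBCp - AICp = p (ln n - 2), which matches the gap between the two reported values.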
d). (5) Run a 5-fold cross-validation on the model identified in c).

The model selected in c) has the smaller RMSE of 4.23.
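
The mechanics of 5-fold cross-validation can be sketched as follows. This is a generic illustration, not the R procedure that produced the RMSE above: `fit` and `predict` are hypothetical callables supplied by the caller, the fold assignment is a simple shuffled split, and the RMSE is pooled over all held-out cases (some packages average the per-fold RMSEs instead).

```python
import math
import random


def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_rmse(X, y, fit, predict, k=5, seed=0):
    """Pooled RMSE over all held-out predictions from k-fold cross-validation."""
    n = len(y)
    squared_error = 0.0
    for test_idx in kfold_indices(n, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        # Fit on the training folds only, then score the held-out fold.
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            squared_error += (y[i] - predict(model, X[i])) ** 2
    return math.sqrt(squared_error / n)


# With n = 25 and k = 5, each fold holds out 5 applicants.
folds = kfold_indices(25, 5)
print([len(f) for f in folds])
```

Each case is held out exactly once, so the resulting RMSE reflects out-of-sample prediction error rather than in-sample fit.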
