本次作业代写的主要内容是统计建模相关

截止日期:3月17日,下午5点。
回答所有问题。显示所有工作。

问题1

假设我们拟合模型

a)什么是Cov(Y,e | X)? b)什么是Cov(Yˆ,e | X)? c)什么是Cov(Y,Yˆ | X)?

问题2

E(Y | X)=Xβ。

Stats413家庭作业6

在讲座中,我们仅对X和Y中的一个进行对数转换时,将这两个过程进行了粗略的组合,并获得了对X和Y均进行对数转换时的系数的解释。

来这个答案。换句话说,类似于仅对X或Y执行对数转换时的操作,

条件均值(您可能假设p = 1)表明X的P百分比随

(1 + P)βˆ1的Y的预测平均乘数变化。 100

问题3

给定以下具有3个特定点{1,2,3}的散点图,可以解决以下问题。散点图中绘制的线是所有观测值的OLS估计线拟合。

a)根据最高至最低杠杆来安排三个特定点。 (基于外观检查,无需计算。)

b)从最高影响到最低影响对三个确定的点进行排名。 (基于外观检查,无需计算。)

c)您担心{1,2,3}的哪一点可能是异常值?证明你的答案。

d)您担心{1,2,3}的哪一点可能是有问题的异常值?证明你的答案。

e)在图上找到的值范围内(X∈{≈5,≈10.5},Y∈{≈23,≈30}),您认为最有问题的点的坐标是什么?

问题4

对于“远程”包中的数据,拟合以下两个模型,并查看其残差图和拟合图。您不应为此问题提交任何R或Rouptut,只需解决以下问题即可。

E(g | X)=β0+β1黑色+β2烟E(g | X)=β0+β1黑色+β2烟+β3黑色×烟

a)RVF图显示了到目前为止我们还没有看到的模式。解释为什么他们有这些模式。

b)可以修改或添加哪些模型以破坏a)中所示的模式?

c)为什么第一个模型的残差图似乎具有三个聚类,而第二个模型的残差图似乎具有四个聚类?提供为什么会发生这种情况的数字证据。

d)一个模型可以产生的最小唯一预测数是多少?那是什么型号

2个

ÿ

问题5

密歇根州正试图找到最有效的资金用途,以改善其公民获得适当医疗服务的机会。要进行检查,将为您提供一个数据集,其中每一行都是密歇根州的一个城市,并具有以下变量:

改变中

使用权

医院
教育
运输

描述

没有严重未经治疗的医疗状况的个人所占的百分比

市内医院(10,000美元)市内教育(10,000美元)市内交通(10,000美元)

以下是数据和回归输出的摘要统计信息。

>摘要(访问)
第一区域和第三最大区域的最小中位数。

62.187 69.013 79.221 84.586 89.879 98.970
>摘要(医院)

第一区域和第三最大区域的最小中位数。
12.996 26.764 29.218 30.090 39.770 48.258

>摘要(教育)
第一区域和第三最大区域的最小中位数。

0.000 8.412 19.154 15.114 22.332 37.665
>摘要(运输)

第一区域和第三最大区域的最小中位数。
3.353 5.822 8.715 7.441 12.631 22.005

称呼:
lm(公式=访视〜医院+教育+交通,数据= MIstate)

系数:

(拦截)26.84
医院6.235
教育4.845
运输1.884

1.428 18.80 <2e-16 ***

估计标准误差t值Pr(> | t |)

3.130 1.992
1.697 2.855
0.383 -4.922

0.047 *
0.005 **
1e-06 ***

签名。代码:0’***’0.001’**’0.01’*’0.05’。 ‘0.1’’1

残差标准误差:428自由度上的7.928
多个R平方:0.1720,已调整R平方:0.1131
F统计:3和428 DF上的12.85,p值:4e-08

a)用于拟合模型的数据的样本量是多少?

b)这个模型看起来合适吗?提供至少两个数字证据。

c)如果您要提出建议,您推荐的三种资金用途(医院,教育或交通)中哪一项将是该州的最佳投资?证明您的选择。

d)正式说明您建议的资金使用系数。

3

Due Date: Mar 17, 5pm.
Answer all questions. Show all work.

Question 1

Assume we fit the model

a) What is Cov(Y, e|X)? b) What is Cov(Yˆ , e|X)? c) What is Cov(Y, Yˆ |X)?

Question 2

E(Y |X) = Xβ.

Stats413 Homework 6

In lecture we derived the interpretation of coefficients when both X and Y are log-transformed by crudely combining the two manipulations made when only one of X or Y is log-transformed.

Derive this answer. That is, similar to what we did when only X or only Y are log-transformed, starting

with the conditional mean (you may assume p = 1) show that a P percent increase in X is associated with

a predicted average multiplicate change in Y of (1 + P )βˆ1 . 100

Question 3

Given the scatterplot below, with the 3 identified points {1,2,3}, address the following questions. The line drawn in the scatterplot is the OLS estimated line fit on all observations.

  1. a)  Rank the three identified points in terms of highest to lowest leverage. (Based upon visual inspection, no need for calculation.)
  2. b)  Rank the three identified points in terms of highest to lowest influence. (Based upon visual inspection, no need for calculation.)
  3. c)  Which points of {1,2,3}, if any, would you be concerned may be an outlier? Justify your answer.
  4. d)  Which points of {1,2,3}, if any, would you be concerned may be a problematic outlier? Justify your answer.
  5. e)  Within the range of values found on the plot (X ∈ {≈ 5, 10.5}, Y ∈ {≈ 23, 30}), what would be the coordinates of a point that you would argue would be the most problematic outlier?

1

3

1

2

30

28

26

24

6 7 8 9 10

X

Question 4

For data phbirths in package “faraway”, fit the following two models and look at their residual vs fitted plots. You should not submit any R or R ouptut for this question, just address the questions below.

E(grams|X) = β0 + β1black + β2smoke E(grams|X) = β0 + β1black + β2smoke + β3black × smoke

  1. a)  The RVF plots show patterns we haven’t seen so far. Explain why they have these patterns.
  2. b)  What could you modify or add to the models to break the pattern seen in a)?
  3. c)  Why does the first model’s residual plot appear to have three clusters, while the second model’s residual plot appears to have four clusters? Provide numerical evidence of why this is happening.
  4. d)  What is the minimum number of unique predicted values a model could produce? What would that model be?

2

Y

Question 5

The state of Michigan is trying to figure out the best use of funds in order to improve access to proper medical care amongst its citizens. To examine this, you are given a data set where each row is a city in Michigan, and has the following variables:

Variable

access

hospital
education
transportation

Description

the percent of individuals who do not have outstanding untreated medical conditions

spending on hospitals in the city (in $10,000) spending on education in the city (in $10,000) spending on transportation in the city (in $10,000)

The following are summary statistics of the data and regression output.

> summary(access)
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  62.187  69.013  79.221  84.586  89.879  98.970
> summary(hospital)
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  12.996  26.764  29.218  30.090  39.770  48.258
> summary(education)
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.000  8.412   19.154  15.114  22.332  37.665
> summary(transportation)
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   3.353  5.822    8.715   7.441  12.631  22.005
Call:
lm(formula = access ~ hospital + education + transportation, data = MIstate)
Coefficients:
(Intercept)      26.84
hospital         6.235
education        4.845
transportation  -1.884
---
1.428   18.80  < 2e-16 ***
Estimate Std. Error t value Pr(>|t|)
3.130   1.992
1.697   2.855
0.383  -4.922
0.047 *
0.005 **
1e-06 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.928 on 428 degrees of freedom
Multiple R-squared:  0.1720,    Adjusted R-squared:  0.1131
F-statistic: 12.85 on 3 and 428 DF,  p-value: 4e-08
  1. a)  What is the sample size of the data used to fit this model?
  2. b)  Does it appear that this model is a good fit? Provide at least two pieces of numeric evidence.
  3. c)  If you were to make a recommendation, which of the three uses of funds (hospitals, education or transportation) would you recommend would be the best investment for the state? Justify your choice.
  4. d)  Give a formal interpretation of the coefficient on the use of funds you are recommending.

EasyDue™ 支持PayPal, AliPay, WechatPay, Taobao等各种付款方式!

E-mail: easydue@outlook.com  微信:easydue


EasyDue™是一个服务全球中国留学生的专业代写公司
专注提供稳定可靠的北美、澳洲、英国代写服务
专注提供CS、统计、金融、经济、数学等覆盖100+专业的作业代写服务