Stats413 Homework 4
Q = the observation of i. Consider the following model:
E(Y | Q) = β₀ + β₁Q³
E(Y | X) = β₀ + β₁X,
If X is replaced with X − c, the equation can be rewritten as:
(X − c) = β
E(Y | X) = β₀ + β₁X + β₂X²
E(Y | X) = β₀ + β₁X²
Load the data “Mroz” from the package carData. We’ll focus on the variables inc, which represents the household
income excluding the wife’s income, and k5, which is the number of children under 5 in the household.
a) Consider fitting a model predicting inc based upon k5. Without actually fitting any models (you
can explore the data), would you recommend including k5 as a continuous variable or a categorical
variable? Justify your recommendation briefly.
b) Regardless of your answer to a), fit the model predicting inc based upon a categorical k5. Note that
this does not imply that including k5 as categorical is the right approach or the correct answer to part a).
i) What is the reference category for k5?
ii) Interpret the results to briefly tell the full story regarding all the levels of k5.
iii) Having fit the model, provide evidence that is either for or against including k5 as a categorical
variable. (Your answer here and for part a) may be contradictory – it’s perfectly fine to adjust
your recommendation when you receive new data!)
(Hint: The emmip function from the emmeans package may be very helpful for parts ii) and iii).)
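As a hedged illustration of what a categorical fit reports, the sketch below uses a small made-up data set (not the Mroz data) with a count predictor and a numeric response. It assumes treatment coding with the lowest level as the reference, which mirrors R's default for factors. Under that coding the intercept is the mean of the reference level and each remaining coefficient is a difference in group means. The names `group_means` and `treatment_coefficients` are hypothetical helpers written for this example.

```python
from collections import defaultdict

# Synthetic (made-up) data: a small count predictor and a numeric response.
# These are NOT values from the Mroz data.
k = [0, 0, 0, 1, 1, 1, 2, 2, 3]
y = [20.0, 22.0, 24.0, 18.0, 20.0, 19.0, 15.0, 17.0, 30.0]


def group_means(levels, response):
    """Mean of the response within each level of the predictor."""
    groups = defaultdict(list)
    for level, value in zip(levels, response):
        groups[level].append(value)
    return {level: sum(vals) / len(vals) for level, vals in groups.items()}


def treatment_coefficients(levels, response):
    """Coefficients of a one-predictor categorical model under treatment coding.

    Assumption: the lowest level is the reference. Its mean is the intercept,
    and every other coefficient is that level's mean minus the reference mean.
    """
    means = group_means(levels, response)
    reference = min(means)
    coefs = {"intercept": means[reference]}
    for level in sorted(means):
        if level != reference:
            coefs[f"level_{level}"] = means[level] - means[reference]
    return reference, coefs


reference, coefs = treatment_coefficients(k, y)
print("reference level:", reference)
for name, value in coefs.items():
    print(name, round(value, 3))
```

Note that the highest level in this toy data has a single observation, so its coefficient rests on one point. Sparse levels like that are one thing worth checking when weighing a categorical coding against a continuous one.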
a) Consider the model
E(Y | X) = β + βX.
Note that here the intercept and slope are forced to be equivalent. Derive the least squares estimate
of β for this model.
(Hints: Be careful with signs. Your final answer should resemble other least squares estimates of β’s.
You may use any results we have previously derived.)
b) Verify that your estimate of β is unbiased. (Hint: It may make things more clear to simplify the
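One way to sanity-check a derived estimate for the model E(Y | X) = β + βX is numerically. The sketch below rests on the observation that β + βX = β(1 + X), so the model is a no-intercept regression on Z = 1 + X. The data are made up, the helper names `rss` and `constrained_beta` are hypothetical, and the closed form shown may be arranged differently from the form the assignment expects.

```python
# Synthetic (made-up) data for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]


def rss(beta, xs, ys):
    """Residual sum of squares for the model E(Y | X) = beta + beta * X."""
    return sum((yi - beta * (1.0 + xi)) ** 2 for xi, yi in zip(xs, ys))


def constrained_beta(xs, ys):
    """Minimiser of rss, treating Z = 1 + X as a single no-intercept predictor."""
    z = [1.0 + xi for xi in xs]
    return sum(zi * yi for zi, yi in zip(z, ys)) / sum(zi * zi for zi in z)


beta_hat = constrained_beta(x, y)

# Numerical check: the RSS at beta_hat is no larger than at nearby values.
step = 1e-3
rss_at_hat = rss(beta_hat, x, y)
rss_below = rss(beta_hat - step, x, y)
rss_above = rss(beta_hat + step, x, y)

# Normal-equation check: residuals are orthogonal to Z = 1 + X.
score = sum((1.0 + xi) * (yi - beta_hat * (1.0 + xi)) for xi, yi in zip(x, y))

print("beta_hat:", beta_hat)
print("RSS is locally minimal:", rss_at_hat <= rss_below and rss_at_hat <= rss_above)
```

The same two checks (local minimality of the residual sum of squares and a zero normal equation) can be applied to any candidate formula, which makes sign errors easy to spot.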
You are asked to carry out a regression analysis predicting a respondent’s opinion of Fischer’s Shampoo
(the response), for which their local newspaper recently carried an advertisement. The sample size is 2,901
respondents. The predictor variables you have are:
• age – Age (continuous)
• ses – Socio-economic status (Low income, middle income, high income)
• subscriber – Respondent subscribes to their local newspaper (No, Yes)
• primarypurchaser – Response to “I am the primary purchaser of goods in my household” (continuous,
1-5 scale, 1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree)
Your boss tells you that they suspect the relationship between opinion on Fischer’s Shampoo and
primarypurchaser to be quadratic.
a) Design a linear regression model for this data that uses all available predictors and information above.
b) For your model, what are the dimensions of X (the data) and X (the design matrix)?
c) Your boss asks you to also include a quadratic relationship between subscriber and the respondent’s
opinion. Either modify your model to include this, or explain why you shouldn’t/can’t do that.
d) What would the predicted value from your model be for a 37-year-old middle-class respondent who
agrees that they are the primary purchaser for their household? (Your answer should be a formula
involving predicted coefficients and scalars.)
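The sketch below shows one possible way to lay out a design-matrix row for the predictors listed above. It is not the only valid model. Several choices are assumptions rather than anything the assignment states: "Low income" and "No" are taken as reference levels, the column order is arbitrary, primarypurchaser enters as a linear and a squared term, and the respondent in the usage example is taken to be a non-subscriber because no subscriber status is given. The names `COLUMNS`, `design_row`, and `predict` are hypothetical.

```python
# Hypothetical coding choices (assumptions, not stated in the assignment):
# "Low income" and "No" are reference levels; column order is arbitrary.
COLUMNS = [
    "intercept", "age", "ses_middle", "ses_high",
    "subscriber_yes", "primarypurchaser", "primarypurchaser_sq",
]


def design_row(age, ses, subscriber, primarypurchaser):
    """Return one row of the design matrix for a single respondent."""
    return [
        1.0,                                   # intercept
        float(age),                            # continuous
        1.0 if ses == "middle" else 0.0,       # dummy: middle income
        1.0 if ses == "high" else 0.0,         # dummy: high income
        1.0 if subscriber == "yes" else 0.0,   # dummy: subscribes
        float(primarypurchaser),               # linear term
        float(primarypurchaser) ** 2,          # quadratic term
    ]


def predict(row, coefficients):
    """Predicted value: the sum of coefficient times column value."""
    return sum(b * v for b, v in zip(coefficients, row))


# Assumed respondent: age 37, middle income, "agree" = 4, non-subscriber.
row = design_row(age=37, ses="middle", subscriber="no", primarypurchaser=4)

# Shape of the design matrix under these assumptions: one row per respondent.
n_respondents = 2901
design_shape = (n_respondents, len(COLUMNS))

# Squaring a 0/1 indicator leaves it unchanged, so a squared subscriber
# column would duplicate the existing subscriber column exactly.
squares_match = all(d ** 2 == d for d in (0.0, 1.0))

print(dict(zip(COLUMNS, row)))
print(design_shape, squares_match)
```

With this layout, the predicted value for the example respondent is the intercept plus 37 times the age coefficient, plus the middle-income coefficient, plus 4 and 16 times the linear and quadratic primarypurchaser coefficients. The subscriber coefficient is added only if the respondent subscribes.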