QBUS2810: Statistical Modelling for Business
Assignment Task #3
$r_{stock} - r_f = \alpha_{stock} + \beta_{stock}(r_m - r_f) + u_t$
$r_{stock} - r_f = \alpha_{stock} + \beta_{stock}(r_m - r_f) + \beta_2 SMB + \beta_3 HML + \varepsilon_t$
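Here $r_{stock}$ is the return on the stock, $r_f$ is the risk-free rate of return, and $r_m$ is the return on the overall stock market. The parameter $\alpha_{stock}$ is the stock's "alpha": it measures how much the stock outperforms its "theoretical" expected return under the CAPM. The parameter $\beta_{stock}$ is the stock's "beta": it measures the stock's exposure to the overall market. Different stocks have different parameters.
$r_{BHP}$ = monthly return on BHP stock observed on the Australian Securities Exchange.
$r_m$ = monthly return on the market index, here the All Ordinaries Index (AOI).
SMB = small-minus-big market capitalisation factor.
HML = high-minus-low book-to-market ratio factor.
Assume the monthly risk-free rate is $r_f = 0.005$. Your task is to estimate the Fama-French three-factor model using the given data and to determine whether it explains BHP's stock returns better than the market excess return given by the All Ordinaries Index alone.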
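$r_{stock} - r_f = \beta_0 + \beta_1 (r_m - r_f) + u_t$
(e) Test whether the excess market return explains BHP's stock return at the α = 0.05 level.
(f) Test whether BHP's "beta" is greater than 1 at the α = 0.05 level.
$r_{stock} - r_f = \beta_0 + \beta_1 (r_m - r_f) + \beta_2 SMB + \beta_3 HML + \varepsilon_t$
(h) Set up a general linear hypothesis to test whether the Fama-French 3-factor CAPM model explains stock returns better than the single-factor CAPM model; that is, determine L, β, and c for H0: Lβ = c.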
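To illustrate how a hypothesis of the form H0: Lβ = c compares the two models, the sketch below computes the F statistic in two ways, once from L, β, and c and once from the error sums of squares of the reduced and full regressions, and shows that they agree. It runs on simulated data, not the assignment data, and every number in the simulation (sample size, factor volatilities, coefficients) is made up for illustration. The helper name `ols` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 120  # hypothetical number of monthly observations

# Simulated regressors; these are NOT the assignment data.
mkt = rng.normal(0.005, 0.04, n)  # stands in for r_m - r_f
smb = rng.normal(0.0, 0.03, n)
hml = rng.normal(0.0, 0.03, n)
y = 0.001 + 1.1 * mkt + 0.2 * smb - 0.3 * hml + rng.normal(0, 0.05, n)


def ols(X, y):
    """Return the OLS coefficients and the error sum of squares."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return b, float(resid @ resid)


X_full = np.column_stack([np.ones(n), mkt, smb, hml])  # three-factor model
X_red = X_full[:, :2]                                  # single-factor CAPM

b_full, sse_full = ols(X_full, y)
_, sse_red = ols(X_red, y)

# H0: L beta = c with beta = (beta0, beta1, beta2, beta3)'
L = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
c = np.zeros(2)

q = L.shape[0]                    # number of restrictions
df_err = n - X_full.shape[1]      # error degrees of freedom, full model
mse = sse_full / df_err

# Form 1: quadratic form in (L b - c)
d = L @ b_full - c
XtX_inv = np.linalg.inv(X_full.T @ X_full)
f_wald = float(d @ np.linalg.solve(L @ XtX_inv @ L.T, d)) / (q * mse)

# Form 2: drop in SSE when the restrictions are removed
f_sse = ((sse_red - sse_full) / q) / mse

print(f_wald, f_sse)
```

Under H0 the statistic follows an F distribution with `q` and `df_err` degrees of freedom, so the decision at α = 0.05 comes from comparing it with the corresponding critical value.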
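(j) A financial analyst believes that the effect of book-to-market (HML) on stock returns is twice that of market capitalisation (SMB). State an appropriate hypothesis test and use reparameterisation to convert it into a simple t-test of the analyst's claim. Run the required regression and state your conclusion at the α = 0.05 level.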
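One possible reparameterisation is sketched below. It assumes the analyst's claim is read as "the HML coefficient equals twice the SMB coefficient"; the symbol θ is introduced here for illustration and does not appear in the assignment.

```latex
H_0:\ \beta_3 = 2\beta_2 \qquad \text{vs.} \qquad H_1:\ \beta_3 \neq 2\beta_2

\text{Let } \theta = \beta_3 - 2\beta_2, \text{ so that } \beta_3 = \theta + 2\beta_2. \text{ Substituting:}

r_{stock} - r_f = \beta_0 + \beta_1 (r_m - r_f) + \beta_2\,(SMB + 2\,HML) + \theta\,HML + \varepsilon_t

H_0:\ \theta = 0 \qquad \text{vs.} \qquad H_1:\ \theta \neq 0
```

Under this reading, regressing the excess return on $(r_m - r_f)$, the constructed variable $SMB + 2\,HML$, and $HML$ gives θ as the coefficient on $HML$, and its ordinary t statistic tests the claim.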
Shape     Red       Green     Blue      Yellow
Circle    52, 44    67, 61    36, 44    45, 41
Square    34, 36    56, 58    36, 31    21, 25
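$Y = \beta_0 + \beta_1 C + \beta_2 R + \beta_3 G + \beta_4 B + \beta_5 CR + \beta_6 CG + \beta_7 CB + \varepsilon$
where C = 1 if shape = circle, 0 otherwise; R = 1 if color = red, 0 otherwise; G = 1 if color = green, 0 otherwise; and B = 1 if color = blue, 0 otherwise.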
Shape Red Green Blue Yellow
Circle µ11 = β0 + β1 + β2 + β5
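The indicator coding can be checked numerically. The sketch below builds the design matrix for the eight-parameter model from the ratings data, fits it by least squares, and confirms that β0 + β1 + β2 + β5 reproduces the circle/red cell mean. Variable names are illustrative, and `numpy.linalg.lstsq` is used here as one convenient way to fit the regression.

```python
import numpy as np

# Ratings from the logo data: two observations per shape/color cell.
data = {
    ("Circle", "Red"): [52, 44], ("Circle", "Green"): [67, 61],
    ("Circle", "Blue"): [36, 44], ("Circle", "Yellow"): [45, 41],
    ("Square", "Red"): [34, 36], ("Square", "Green"): [56, 58],
    ("Square", "Blue"): [36, 31], ("Square", "Yellow"): [21, 25],
}

rows, y = [], []
for (shape, color), ratings in data.items():
    # Indicator variables; square and yellow are the baseline levels.
    C = 1.0 if shape == "Circle" else 0.0
    R = 1.0 if color == "Red" else 0.0
    G = 1.0 if color == "Green" else 0.0
    B = 1.0 if color == "Blue" else 0.0
    for rating in ratings:
        rows.append([1.0, C, R, G, B, C * R, C * G, C * B])
        y.append(float(rating))

X = np.array(rows)
y = np.array(y)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Mean response for circle/red under the model.
mu_11 = beta[0] + beta[1] + beta[2] + beta[5]
print(np.round(beta, 4))
print(round(float(mu_11), 4))
```

Because the model has one parameter per cell, each fitted cell mean equals the corresponding sample cell mean, which gives a quick check on the coding.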
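(d) The factor effects model is $Y_{ijk} = \mu_{..} + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$, where $\mu_{..}$ is a constant; $\alpha_i$ are constants subject to the restriction $\sum_i \alpha_i = 0$; $\beta_j$ are constants subject to the restriction $\sum_j \beta_j = 0$; $(\alpha\beta)_{ij}$ are constants subject to the restrictions $\sum_i (\alpha\beta)_{ij} = \sum_j (\alpha\beta)_{ij} = 0$; and $\varepsilon_{ijk}$ are independent $N(0, \sigma^2)$, for $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b$; $k = 1, 2, \ldots, n$.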
Why are the constraints $\sum \alpha_i = \sum \beta_j = \sum (\alpha\beta)_{ij} = 0$ required? What is the advantage of this
(e) Refer to Part (d). Modify the factor effects model to apply to this study with a = 2 and b = 4.
(f) Set up the Y, X, and β matrices for the factor effects regression model.
(g) Refer to part (e). Obtain the fitted regression function.
(h) Plot the residuals against the fitted values and the QQ-plot of the residuals. Use these two
residual plots to check if the assumptions of two-way ANOVA are justifiable. Briefly explain.
(i) Plot an interaction plot. What does this plot suggest?
(j) Fill in the blanks in the following ANOVA table.
Source of Variation     SS     df     MS
Between treatments
  Factor A
  Factor B
  AB Interactions
(k) Test if the two factors interact.
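The sums of squares behind the ANOVA table and the interaction test can be computed directly from the cell, row, and column means. The sketch below does this in plain Python for the logo ratings. Treating shape as Factor A and color as Factor B is an assumption here, chosen to match a = 2 and b = 4 in part (e).

```python
# ratings[shape][color] holds the two replicate ratings for that cell.
ratings = {
    "Circle": {"Red": [52, 44], "Green": [67, 61], "Blue": [36, 44], "Yellow": [45, 41]},
    "Square": {"Red": [34, 36], "Green": [56, 58], "Blue": [36, 31], "Yellow": [21, 25]},
}
shapes = list(ratings)            # Factor A (assumed), a = 2 levels
colors = list(ratings["Circle"])  # Factor B (assumed), b = 4 levels
a, b, n = len(shapes), len(colors), 2


def mean(values):
    values = list(values)
    return sum(values) / len(values)


cell = {(s, c): mean(ratings[s][c]) for s in shapes for c in colors}
shape_mean = {s: mean(cell[s, c] for c in colors) for s in shapes}
color_mean = {c: mean(cell[s, c] for s in shapes) for c in colors}
grand = mean(cell.values())

ss_a = n * b * sum((shape_mean[s] - grand) ** 2 for s in shapes)
ss_b = n * a * sum((color_mean[c] - grand) ** 2 for c in colors)
ss_ab = n * sum(
    (cell[s, c] - shape_mean[s] - color_mean[c] + grand) ** 2
    for s in shapes for c in colors
)
ss_e = sum(
    (obs - cell[s, c]) ** 2
    for s in shapes for c in colors for obs in ratings[s][c]
)

df_ab, df_e = (a - 1) * (b - 1), a * b * (n - 1)
f_interaction = (ss_ab / df_ab) / (ss_e / df_e)  # MSAB / MSE
print(ss_a, ss_b, ss_ab, ss_e, f_interaction)
```

The interaction statistic is compared with the F critical value on `df_ab` and `df_e` degrees of freedom at the chosen significance level.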
(l) Is it meaningful here to test for main factor effects? If so, test if the main effects for color and shape are present.
(m) All pairwise comparisons among the color group level means via Tukey procedure with a 95 percent family confidence coefficient are constructed below:
Treatment pair     Difference    Lower 95% limit    Upper 95% limit
Red    Green         -19.00         -35.8696           -2.1304
Red    Blue            4.75         -12.1196           21.6196
Red    Yellow          8.50          -8.3696           25.3696
Green  Blue           23.75           6.8804           40.6196
Green  Yellow         27.50          10.6304           44.3696
Blue   Yellow          3.75         -13.1196           20.6196
Determine which means differ using Tukey’s multiple comparison test.
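Reading the Tukey intervals follows a simple rule: two color means differ at the 95 percent family level when the interval for their difference excludes zero. A minimal sketch of that rule, using the limits from the table above:

```python
# Tukey intervals: (pair, difference, lower limit, upper limit).
intervals = [
    (("Red", "Green"), -19.00, -35.8696, -2.1304),
    (("Red", "Blue"), 4.75, -12.1196, 21.6196),
    (("Red", "Yellow"), 8.50, -8.3696, 25.3696),
    (("Green", "Blue"), 23.75, 6.8804, 40.6196),
    (("Green", "Yellow"), 27.50, 10.6304, 44.3696),
    (("Blue", "Yellow"), 3.75, -13.1196, 20.6196),
]

# A pair differs when its interval lies entirely above or entirely below 0.
differs = [pair for pair, _, lower, upper in intervals if lower > 0 or upper < 0]

for first, second in differs:
    print(f"{first} vs {second}")
```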
(n) Based on the above analysis, what combination of color and shape should be used for the logo design?
(o) Suppose that in the shape population, 60 percent are circle and 40 percent are square. Construct a 95 percent confidence interval for the mean overall rating in the shape population.
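One way to approach the weighted mean is as a linear combination of the two shape-level means, with weights 0.6 and 0.4, using the two-way ANOVA error mean square for the standard error. The sketch below follows that approach. It assumes the shape means (averaged equally over the four colors) are the quantities of interest, and it hard-codes the t multiplier 2.306, the 0.975 quantile of the t distribution with 8 degrees of freedom from a standard t table.

```python
import math

# Two replicate ratings per color cell, grouped by shape.
cells = {
    "Circle": [[52, 44], [67, 61], [36, 44], [45, 41]],
    "Square": [[34, 36], [56, 58], [36, 31], [21, 25]],
}
weights = {"Circle": 0.6, "Square": 0.4}  # population shares from part (o)

b, n = 4, 2  # colors per shape, replicates per cell
shape_mean = {s: sum(sum(c) for c in cs) / (b * n) for s, cs in cells.items()}

# Error sum of squares: squared deviations from each cell mean.
sse = sum((obs - sum(c) / n) ** 2 for cs in cells.values() for c in cs for obs in c)
df_e = len(cells) * b * (n - 1)
mse = sse / df_e

estimate = sum(weights[s] * shape_mean[s] for s in cells)
std_err = math.sqrt(mse / (b * n) * sum(w ** 2 for w in weights.values()))

T_CRIT = 2.306  # t(0.975; 8), taken from a t table (assumed multiplier)
lower = estimate - T_CRIT * std_err
upper = estimate + T_CRIT * std_err
print(round(estimate, 4), round(lower, 4), round(upper, 4))
```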
3. A person's muscle mass is expected to decrease with age. To explore this relationship in women, a nutritionist randomly selected 4 women from each 10-year age group, beginning with age 40 and ending with age 79. X is age, and Y is a measure of muscle mass.
(a) Below is a scatter plot of the data with muscle mass on the y axis and age on the x axis.
Based on the plot, does it seem reasonable that there are two different (but connected) regression functions – one when age ≤ 60 and one when age > 60?
(b) The nutritionist conjectures that the regression of muscle mass on age follows a two-piece linear relation, with the slope changing at age 60 without discontinuity. State the regression model that applies if the nutritionist’s conjecture is correct.
(c) Refer to part (b). What are respective response functions when age is 60 or less and when age is over 60?
(d) Explain whether or not the model specified in part (b) violates the principle of marginality.
Also, discuss and show whether or not this model is continuous at X = 60. Is continuity or
marginality more important here and why?
(e) Estimate the regression model specified in part (b). Copy and paste the regression output into your answer sheet. Write down the fitted regression equation.
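A common coding for a two-piece linear relation that stays continuous at the knot uses three columns: an intercept, X, and (X − 60) switched on only when X > 60. The sketch below shows that coding on made-up, noiseless data, so the fit recovers the coefficients exactly. The ages, the coefficient values, and the helper name `design` are all hypothetical and are not the assignment data.

```python
import numpy as np

# Hypothetical ages (4 per decade) and a noiseless response, used only to
# show that the coding recovers both slopes; NOT the assignment data.
age = np.array([41, 44, 46, 49, 51, 54, 56, 59,
                61, 64, 66, 69, 71, 74, 76, 79], dtype=float)
KNOT = 60.0
true_beta = np.array([120.0, -0.5, -1.0])  # made-up intercept, slope, slope change


def design(x, knot=KNOT):
    """Columns: intercept, X, and (X - knot) active only above the knot."""
    x = np.asarray(x, dtype=float)
    hinge = np.where(x > knot, x - knot, 0.0)
    return np.column_stack([np.ones_like(x), x, hinge])


y = design(age) @ true_beta
beta_hat, *_ = np.linalg.lstsq(design(age), y, rcond=None)

slope_young = beta_hat[1]               # slope when age <= 60
slope_old = beta_hat[1] + beta_hat[2]   # slope when age > 60

# Both pieces give (essentially) the same value at the knot.
pred_60_left = beta_hat[0] + beta_hat[1] * KNOT
pred_60_right = (design([KNOT + 1e-9]) @ beta_hat)[0]
print(slope_young, slope_old, pred_60_left, pred_60_right)
```

With this coding, a t-test on the third coefficient asks whether the slope changes at the knot at all.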
(f) Test whether a two-piece linear regression function is needed at α = 0.05.
(g) Refer to part (e). What is the estimated regression function for muscle mass when age ≤ 60? When age > 60?
(h) Based on your estimated regression function, what is the predicted muscle mass when age = 50? When age = 70?
(i) Do you get the same prediction for age = 60 regardless of which estimated regression function in part (e) you use?
(j) Modify the regression model in part (b) so that the slope changes at age 60 without requiring continuity at age 60.
(k) Specify the regression model for the case where the slope changes at age 40 and again at age 60 with no discontinuities.
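One common way to write a continuous piecewise linear model with several slope changes uses a hinge term for each change point. The sketch below is written in that form. The notation $(u)_+$ is introduced here for illustration and is not part of the assignment.

```latex
Y_i = \beta_0 + \beta_1 X_i + \beta_2 (X_i - 40)_+ + \beta_3 (X_i - 60)_+ + \varepsilon_i,
\qquad (u)_+ = \max(0, u)
```

In this form the slope is $\beta_1$ below age 40, $\beta_1 + \beta_2$ between ages 40 and 60, and $\beta_1 + \beta_2 + \beta_3$ above age 60. Each hinge term equals zero at its own change point, so the response function has no jumps.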