
STA457/STA2202 – Assignment 2

1. Consider two discrete random variables X, Y with joint probabilities given by the contingency table:
P(X,Y)    Y = -1    Y = 0    Y = +1
X = -1     .05       .10      .15
X = 0      .15       .15      .10
X = +1     .15       .00      .15
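(a) [2 marks] Find the minimum mean square error (MMSE) predictor of Y given X, i.e. the conditional
expectation $g(X) = E[Y \mid X]$, and the MSE it achieves, i.e. $E[(Y - g(X))^2]$.
(b) [2 marks] Find the best linear predictor (BLP) of Y given X, i.e. $\hat{Y} = a + bX$: give the BLP
coefficients a, b and the MSE the predictor achieves.
(Note: this is an example where the MMSE predictor and the BLP are different.)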
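The two predictors in Question 1 can be computed mechanically from any joint pmf. The following is a minimal sketch in Python (used here only for illustration), on a hypothetical 2x2 table that is deliberately not the table from the question; the names `joint`, `expect`, `mmse_predictor`, and `blp_coefficients` are made up for this example. It assumes the standard formulas $b = \mathrm{Cov}(X,Y)/\mathrm{Var}(X)$ and $a = E[Y] - bE[X]$ for the BLP.

```python
# Hypothetical joint pmf (NOT the table from Question 1): keys are (x, y) pairs.
joint = {
    (0, 0): 0.4, (0, 1): 0.1,
    (1, 0): 0.2, (1, 1): 0.3,
}

def expect(f, pmf):
    """E[f(X, Y)] under the joint pmf."""
    return sum(p * f(x, y) for (x, y), p in pmf.items())

def mmse_predictor(pmf):
    """Return g as a dict mapping x to E[Y | X = x]."""
    px, num = {}, {}
    for (x, y), p in pmf.items():
        px[x] = px.get(x, 0.0) + p          # marginal P(X = x)
        num[x] = num.get(x, 0.0) + p * y    # sum over y of y * P(X = x, Y = y)
    return {x: num[x] / px[x] for x in px if px[x] > 0}

def blp_coefficients(pmf):
    """Return (a, b) for the best linear predictor a + b*X."""
    ex = expect(lambda x, y: x, pmf)
    ey = expect(lambda x, y: y, pmf)
    cov_xy = expect(lambda x, y: (x - ex) * (y - ey), pmf)
    var_x = expect(lambda x, y: (x - ex) ** 2, pmf)
    b = cov_xy / var_x
    return ey - b * ex, b

g = mmse_predictor(joint)
a, b = blp_coefficients(joint)
mse_mmse = expect(lambda x, y: (y - g[x]) ** 2, joint)
mse_blp = expect(lambda x, y: (y - (a + b * x)) ** 2, joint)
print(g, (a, b), mse_mmse, mse_blp)
```

In this toy table X takes only two values, so every function of X is linear and the two predictors coincide. With three values of X, as in the question, the conditional expectation need not be linear, and the BLP can have a strictly larger MSE.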
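2. Consider the AR(1) model $X_t = \phi X_{t-1} + W_t$, $W_t \sim \mathrm{WN}(0, \sigma_w^2)$.
(a) [3 marks] Find the covariance between the 1-step-ahead and 2-step-ahead BLP errors, i.e. find
$$\mathrm{Cov}\left( X_{n+1} - X_{n+1}^{n},\; X_{n+2} - X_{n+2}^{n} \right)$$
as a function of $(\phi, \sigma_w^2)$.
(Note: the value should be non-zero; in general, predictions at different horizons will be correlated.)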
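(b) [3 marks] Find the covariance between subsequent 1-step-ahead BLP errors, i.e. find
$$\mathrm{Cov}\left( X_{n} - X_{n}^{n-1},\; X_{n+1} - X_{n+1}^{n} \right)$$
as a function of $(\phi, \sigma_w^2)$.
(Note: with perfect knowledge of the parameters, these are akin to model residuals.)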
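A derived answer to Question 2 can be sanity-checked by simulation. The sketch below, in Python for illustration, is one way to do this under stated assumptions: Gaussian white noise (the question only requires WN), $|\phi| < 1$, and known parameters, in which case the m-step-ahead BLP of a causal AR(1) is $X_{n+m}^{n} = \phi^m X_n$. The function names, the burn-in length, and the parameter values $\phi = 0.6$, $\sigma_w = 1$ are arbitrary choices for the example.

```python
import random

def simulate_blp_errors(phi, sigma_w, n_reps=20000, burn_in=50, seed=0):
    """Simulate independent AR(1) paths and return BLP errors using the TRUE phi.

    Each row is (X_n - phi*X_{n-1}, X_{n+1} - phi*X_n, X_{n+2} - phi^2*X_n).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_reps):
        x_prev, x = 0.0, 0.0
        for _ in range(burn_in):            # run until approximately stationary
            x_prev, x = x, phi * x + rng.gauss(0.0, sigma_w)
        # x now plays the role of X_n and x_prev the role of X_{n-1}
        x_n1 = phi * x + rng.gauss(0.0, sigma_w)
        x_n2 = phi * x_n1 + rng.gauss(0.0, sigma_w)
        rows.append((x - phi * x_prev, x_n1 - phi * x, x_n2 - phi ** 2 * x))
    return rows

def sample_cov(u, v):
    """Unbiased sample covariance of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

errors = simulate_blp_errors(phi=0.6, sigma_w=1.0)
e_prev, e_one, e_two = (list(col) for col in zip(*errors))
cov_a = sample_cov(e_one, e_two)    # part (a): 1-step vs 2-step errors
cov_b = sample_cov(e_prev, e_one)   # part (b): consecutive 1-step errors
print(cov_a, cov_b)
```

The estimates carry Monte Carlo error of roughly $1/\sqrt{\text{n\_reps}}$, so they can confirm a closed-form answer only approximately.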
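3. [5 marks; STA2202 (grad) students ONLY] SS 3.26
(Note: the estimated BLP $\hat{X}_{n+m}^{n}$ based on the fitted parameters $(\hat{\phi}, \hat{\theta}, \hat{\sigma}_w^2)$ is worse than
the theoretical BLP based on the true parameters. This problem shows that, for AR(1) 1-step-ahead
predictions, their difference is of order $1/\sqrt{n}$ in probability.)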
Practice
Description
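It is your first day at work, and your boss, who graduated from the UofT Statistics program in 2013,
has given you the task of forecasting a time series. Your forecasts will be used as input for the company's
budget, so it is crucial that they are accurate. Your boss wants you to provide them with forecasts, along with
a description of how you came up with them.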
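Assignment Structure
You will be given a time series, and must produce forecasts for the next twelve observations. You can
find your time series in the Student Data subfolder of the RStudio Cloud project; the name of the data file
is your student number, and the name of the series you are forecasting is in the first line of the file. Your
submission will consist of two files:
1. A 500-word written report in PDF, with all code in an appendix.
2. A CSV file named XXXXXXXXXX.csv, where XXXXXXXXXX is your student number. This CSV should
contain your forecasts for the next twelve values of the series; the first entry should be your 1-step-ahead
forecast, and the twelfth entry should be your 12-step-ahead forecast (see also the sample file
123456789.csv in the "Example" subfolder of the project).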
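As an illustration of the second file, here is a minimal Python sketch that writes twelve forecasts to a CSV named after a student number. The exact layout is an assumption (one forecast per row, in horizon order, no header); the sample file 123456789.csv remains the authoritative reference for the expected format. The function name `write_forecast_csv`, the student number `1234567890`, and the forecast values are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path

def write_forecast_csv(forecasts, student_number, out_dir="."):
    """Write 12 forecasts, one per row, to <student_number>.csv (assumed layout)."""
    if len(forecasts) != 12:
        raise ValueError("expected exactly 12 forecasts, ordered from 1 to 12 steps ahead")
    path = Path(out_dir) / f"{student_number}.csv"
    with path.open("w", newline="") as fh:
        writer = csv.writer(fh)
        for value in forecasts:
            writer.writerow([value])
    return path

# Demo with made-up forecasts, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_forecasts = [1.0 + 0.1 * h for h in range(1, 13)]
    demo_path = write_forecast_csv(demo_forecasts, "1234567890", tmp)
    demo_rows = demo_path.read_text().splitlines()

print(demo_path.name, len(demo_rows))
```

Checking the length before writing guards against submitting a file with a missing or extra horizon.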

Written Report
Your written report should be understandable to your boss, someone who remembers the main ideas
from a time series course several years ago, but not the finer details. Be sure to clearly explain what you
have done, and if you use any advanced concepts, include a sentence or two to remind your boss what they
are. Your written report must include the following:
1. A discussion of the characteristics of the time series (e.g. trend, seasonality, stationarity)
2. An explanation of any data preprocessing you had to do.
3. The model which you used.
4. A graph of the time series, with your forecasts in a different colour (see graph below for an example)
5. A discussion of your model’s fit (diagnostics) and limitations.
The list above is what your written report must contain, but not an exhaustive list of all that it can
contain. If there are any other topics that are worth discussing related to how you forecasted the data,
please include them.
[Figure: Example Time Series; x-axis: Time, y-axis: Level]
Tips
• This is a report to your boss. Concise & clear is better. They do not want to see single spaced size 6
font with expanded margins. They want to see all important and relevant information neatly organized.
• If you are going to include a code snippet in your written report (this is not required), make sure it is
important enough to warrant your boss’ attention.
• Make sure the model you choose, and how you fit it, makes sense. The data you are working with may
violate some basic time series assumptions.
Assessment (15pts total)
• 1pt Your written report has a clean layout, and includes the requested graph.
• 1pt The text of your report is easy to follow, and conveys ideas effectively.
• 1pt Your CSV file with your predictions is properly formatted.
• 2pt Time Series Characteristics
– 1/2 Some mention of the important time series characteristics.
– 2/2 A clear identification of all important time series characteristics.
• 2pt Data preprocessing
– 1/2 Some vague explanation of how the data has been preprocessed is provided.
– 2/2 A clear explanation of how the data was preprocessed and the justification for why it was
done.
• 2pt Model Explanation
– 1/2 You have included a model description, but little in the way of explanation.
– 2/2 You have concisely and clearly explained your model.
• 2pt Model Fit and Limitations
– 1/2 A vague description of the model’s fit and limitations.
– 2/2 A clear and accurate description of the model’s fit and limitations.
• 4pt Forecast Accuracy
– 1pt Your method beats the naive forecast (the entire forecast is equal to the last datapoint)
– 1pt Your forecast beats the forecast produced by the R code ts_arima_model = auto.arima(x);
forecast(ts_arima_model, h = 12)
– 1pt Your forecast beats the forecast produced by the R code ts_ets_model = ets(x);
forecast(ts_ets_model, h = 12)
– 1pt Your method beats all of the naive, auto.arima(), and ets() methods.
The way your forecasts will be judged is via Mean Absolute Percentage Error (MAPE) on the actual
subsequent 12 values (not given to you, but known to us). Defining $A_t$ as the actual value at time $t$, and
$F_t$ as your corresponding forecasted value in the submitted CSV file, the MAPE for your forecasts will be
calculated as
$$\mathrm{MAPE} = \frac{1}{12} \sum_{t=1}^{12} \left| \frac{A_t - F_t}{A_t} \right|$$
Your forecast beats another forecast if your MAPE is lower.
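The scoring rule can be sketched in a few lines of Python, shown here for illustration only. The helper names `mape` and `naive_forecast` and all numbers are hypothetical; the function averages over however many values it is given, which matches the formula above when there are 12. Note that MAPE is undefined if any actual value is zero.

```python
def mape(actual, forecast):
    """Mean absolute percentage error: mean of |(A_t - F_t) / A_t|."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def naive_forecast(series, h=12):
    """Naive baseline: every forecast equals the last observed value."""
    return [series[-1]] * h

# Made-up example: history ends at 1.5, and the next 12 actual values are all 2.0.
history = [1.0, 1.2, 1.4, 1.5]
actual_next = [2.0] * 12
naive = naive_forecast(history)     # twelve copies of 1.5
candidate = [1.9] * 12              # some competing forecast

naive_mape = mape(actual_next, naive)          # 0.25
candidate_mape = mape(actual_next, candidate)  # about 0.05
print(candidate_mape < naive_mape)             # the candidate beats the naive forecast
```

Because each error is divided by $A_t$, the same absolute miss costs more when the actual value is small.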