STA457/STA2202 – Assignment 2

1. Consider two discrete random variables X, Y with joint probabilities given by the contingency table:

P(X, Y)   Y = -1   Y = 0   Y = +1
X = -1     .05      .10      .15
X = 0      .15      .15      .10
X = +1     .15      .00      .15
(a) [2 marks] Find the minimum mean squared error (MMSE) predictor of Y given X, i.e. the conditional expectation $E[Y \mid X]$.
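(b) [2 marks] Find the best linear predictor (BLP) of Y given X, i.e. $\hat{Y} = a + bX$: give the BLP coefficients a, b and the MSE it achieves.
(Note: this is an example where the MMSE predictor and the BLP differ.)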
2. Consider the AR(1) model $X_t = \phi X_{t-1} + W_t$, $W_t \sim \mathrm{WN}(0, \sigma_w^2)$.
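(a) [3 marks] Find the covariance between the 1-step-ahead and 2-step-ahead BLP errors, i.e. find
$$\mathrm{Cov}\left[ (X_{n+1} - X^{n}_{n+1}),\ (X_{n+2} - X^{n}_{n+2}) \right]$$
in terms of $(\phi, \sigma_w^2)$.
(Note: this value should be non-zero; in general, forecast errors at different leads are correlated.)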
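(b) [3 marks] Find the covariance between successive 1-step-ahead BLP errors, i.e. find
$$\mathrm{Cov}\left[ (X_{n} - X^{n-1}_{n}),\ (X_{n+1} - X^{n}_{n+1}) \right]$$
in terms of $(\phi, \sigma_w^2)$.
(Note: with perfect knowledge of the parameters, these errors are analogous to the model residuals.)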
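3. [5 marks; STA2202 (grad) students ONLY] SS 3.26
(Note: the estimated BLP $\hat{X}^{n}_{n+m}$ based on the fitted parameters $(\hat{\phi}, \hat{\theta}, \hat{\sigma}_w^2)$ is not as good as ... n.)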


1. A 500-word written report as a PDF, with all code in an appendix.
2. A CSV file named XXXXXXXXXXX.csv, where XXXXXXXXXX is your student number. This CSV should ...

Written Report
Your written report should be able to be understood by your boss, someone who remembers the main ideas
from a time series course several years ago, but not the finer details. Be sure to clearly explain what you
have done, and if you are using any advanced concepts a sentence or two to refresh your boss on what they
are is a good idea. Your written report must include the following:
1. A discussion of the characteristics of the time series (e.g. trend, seasonality, stationarity)
2. An explanation of any data preprocessing you had to do.
3. The model which you used.
4. A graph of the time series, with your forecasts in a different colour (see graph below for an example)
5. A discussion of your model’s fit (diagnostics) and limitations.
The list above is what your written report must contain, but not an exhaustive list of all that it can
contain. If there are any other topics that are worth discussing related to how you forecasted the data,
[Figure: "Example Time Series"; x-axis: Time, y-axis: Level]
Tips
• This is a report to your boss. Concise & clear is better. They do not want to see single spaced size 6
font with expanded margins. They want to see all important and relevant information neatly organized.
• If you are going to include a code snippet in your written report (this is not required), make sure it is
important enough to warrant your boss’ attention.
• Make sure the model you choose, and how you fit it, makes sense. The data you are working with may
violate some basic time series assumptions.
Assessment (15pts total)
• 1pt Your written report has a clean layout, and includes the requested graph.
• 1pt The text of your report is easy to follow, and conveys ideas effectively.
• 2pt Time Series Characteristics
– 1/2 Some mention of the important time series characteristics.
– 2/2 A clear identification of all important time series characteristics.
• 2pt Data preprocessing
– 1/2 Some vague explanation of how the data has been preprocessed is provided.
– 2/2 A clear explanation of how the data was preprocessed and the justification for why it was
done.
• 2pt Model Explanation
– 1/2 You have included a model description, but little in the way of explanation.
– 2/2 You have concisely and clearly explained your model.
• 2pt Model Fit and Limitations
– 1/2 A vague description of the model’s fit and limitations is given.
– 2/2 A clear and accurate description of the model’s fit and limitations is given.
• 4pt Forecast Accuracy
– 1pt Your method beats the naive forecast (the entire forecast is equal to the last datapoint)
– 1pt Your forecast beats the forecast produced by the R code ts_arima_model = auto.arima(x);
forecast(ts_arima_model, h = 12)
– 1pt Your forecast beats the forecast produced by the R code ts_ets_model = ets(x);
forecast(ts_ets_model, h = 12)
– 1pt Your method beats all of the naive, auto.arima(), and ets() methods.
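As a point of reference, the naive baseline described above (every forecast equal to the last datapoint) could be sketched as follows. This is a minimal illustration in plain Python, not the grading code; the function name `naive_forecast` and the sample series are hypothetical.

```python
def naive_forecast(series, horizon=12):
    """Return `horizon` forecasts, each equal to the last observed value."""
    if not series:
        raise ValueError("series must contain at least one observation")
    return [series[-1]] * horizon


# Hypothetical observed series, for illustration only.
observed = [0.9, 1.1, 1.3, 1.2, 1.6]
baseline = naive_forecast(observed)  # twelve copies of the last value, 1.6
```

Any submitted forecast has to do better than this flat line to earn the first accuracy point.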
The way your forecasts will be judged is via Mean Absolute Percentage Error (MAPE) on the actual
subsequent 12 values (not given to you, but known to us). Defining $A_t$ as the actual value at time $t$, and
$F_t$ as your corresponding forecasted values in the submitted csv file, the MAPE for your forecasts will be
calculated as
$$\mathrm{MAPE} = \frac{1}{12} \sum_{t=1}^{12} \left| \frac{A_t - F_t}{A_t} \right|$$
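The MAPE calculation above can be checked on your own hold-out data with a short sketch like the one below. It is an illustration under stated assumptions rather than the grading script: the function name `mape` and the sample values are hypothetical, and the average is taken over however many points are supplied (12 in this assignment).

```python
def mape(actual, forecast):
    """Mean absolute percentage error: the average of |A_t - F_t| / |A_t|."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical values, for illustration only.
actual = [2.0, 4.0, 5.0, 10.0]
forecast = [1.0, 5.0, 5.0, 8.0]
score = mape(actual, forecast)  # (0.5 + 0.25 + 0.0 + 0.2) / 4
```

Because each error is divided by the actual value, errors on small actual values weigh more heavily than the same absolute error on large ones.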