MAST90138 S2 2020 Assignment 3

(a) Use standard functions in R to train a QDA classifier and a logistic classifier

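(b) Use the prcomp and plsr (package pls) functions to obtain the PCA and PLS components, respectively

the covariance between X = (X1, …, Xp)^T and Y = 1{G = 1}, the indicator variable

(e.g., Γ for PCA and Φ for PLS, as discussed in class) reported by the function to recompute
the components "by hand", to check that you understand how the components are obtained. [10]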
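
As a minimal sketch of the "by hand" check, the R code below recomputes the PCA scores from the matrix that prcomp reports and compares them with the scores returned by the function. It uses simulated data rather than the assignment's training set, and it assumes that Γ corresponds to the `rotation` element of the prcomp output and Φ to the `projection` element of the plsr fit; both mappings are assumptions, not something the assignment states. The PLS part only runs if the pls package is installed.

```r
# Simulated stand-in for the training predictors (an assumption; the
# assignment's data are not reproduced here).
set.seed(1)
n <- 50
p <- 5
X <- matrix(rnorm(n * p), n, p)

# --- PCA: prcomp centres the columns by default (scale. = FALSE) ---
pca <- prcomp(X)
Gamma <- pca$rotation                       # assumed to play the role of Gamma

# Recompute the components by hand: centre X, then project onto Gamma.
Xc <- scale(X, center = pca$center, scale = FALSE)
pc_manual <- Xc %*% Gamma

# TRUE if the manual scores agree with the scores reported by prcomp.
pca_matches <- isTRUE(all.equal(pc_manual, pca$x, check.attributes = FALSE))
print(pca_matches)

# --- PLS: only attempted when the pls package is available ---
pls_matches <- NA
if (requireNamespace("pls", quietly = TRUE)) {
  y <- rbinom(n, 1, 0.5)                    # simulated 0/1 indicator response
  fit <- pls::plsr(y ~ X, ncomp = 2)
  Phi <- fit$projection                     # assumed to play the role of Phi

  # Centre X with the means stored in the fit, then project onto Phi.
  Xc_pls <- scale(X, center = fit$Xmeans, scale = FALSE)
  pls_manual <- Xc_pls %*% Phi

  pls_matches <- isTRUE(all.equal(pls_manual, unclass(fit$scores),
                                  check.attributes = FALSE))
  print(pls_matches)
}
```

If either comparison fails on real data, the usual culprits are centring and scaling options that differ between the manual computation and the function call.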
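(c) Train a QDA classifier using the PLS components, and another using the PCA components. In each case, choose the number of components to use based on leave-one-out cross-validation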

[20]
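
One possible way to choose the number of components by leave-one-out cross-validation is sketched below for the PCA case, using the `CV = TRUE` option of `qda` in the MASS package. The data are simulated, and the variable names (`scores`, `loo_error`, `best_k`) are illustrative. As a simplification, the components are computed once from all training observations rather than recomputed inside each leave-one-out fold.

```r
library(MASS)  # provides qda()

# Simulated two-class data (an assumption; not the assignment's data).
set.seed(1)
n <- 100
p <- 6
y <- factor(rep(c(0, 1), each = n / 2))
X <- matrix(rnorm(n * p), n, p)
X[y == 1, 1:2] <- X[y == 1, 1:2] + 1.5     # shift class 1 so the classes differ

# PCA components of the simulated predictors.
scores <- prcomp(X)$x

# Leave-one-out error of QDA trained on the first k components.
max_comp <- p
loo_error <- numeric(max_comp)
for (k in seq_len(max_comp)) {
  # CV = TRUE makes qda return leave-one-out predicted classes in $class.
  fit <- qda(scores[, 1:k, drop = FALSE], grouping = y, CV = TRUE)
  loo_error[k] <- mean(fit$class != y)
}

# Smallest k achieving the minimum leave-one-out error.
best_k <- which.min(loo_error)
print(loo_error)
print(best_k)
```

The same loop applies to PLS components by replacing `scores` with the score matrix of a plsr fit.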
(d) For each of the QDA and logistic classifiers, which version (PCA or PLS) would you prefer to use?

(e) Apply the classifiers trained in (c) to the test set, and report the resulting classification

Problem 2 [30 marks]:
In this problem you will train random forest (RF) classifiers to predict the class labels (0 or
1) in the test set.
(a) Using the randomForest package in R, construct a random forest classifier using all p
predictor variables in the training set. When training the classifier, use the default value
of m (the number of random candidate variables for each split), but justify your choice
for the number of trees B using the out-of-bag (OOB) classification error. Plot a graph
showing the OOB error against the number of trees used. [15]
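
A minimal sketch of the OOB-error plot in (a) follows. It uses simulated data in place of the training set, and 500 trees is only a starting value to inspect, not a recommended answer. The OOB error after each tree is read from the `err.rate` element of the fitted object.

```r
library(randomForest)

# Simulated two-class training data (an assumption; not the rainfall data).
set.seed(1)
n <- 200
p <- 10
X <- as.data.frame(matrix(rnorm(n * p), n, p))
y <- factor(as.integer(X$V1 + X$V2 + rnorm(n) > 0))

# Default m (mtry); grow a generous number of trees, then inspect the curve.
rf <- randomForest(x = X, y = y, ntree = 500)

# Column "OOB" of err.rate holds the OOB error after 1, 2, ..., ntree trees.
oob <- rf$err.rate[, "OOB"]
plot(seq_along(oob), oob, type = "l",
     xlab = "Number of trees B", ylab = "OOB classification error")
```

A reasonable B is one beyond which the curve has flattened out; if it is still moving at the right edge, refit with a larger `ntree`.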
(b) Show two graphs that illustrate the importance of the Xj variables, for both decrease in
OOB prediction accuracy and decrease in node impurities measured by Gini index. Is
there an explanation of why those particular Xj's are the most important for classification
in this rainfall example? [5]
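
The two importance graphs in (b) could be produced along the following lines. This is a sketch on simulated data; the forest must be fitted with `importance = TRUE` for the accuracy-based measure to be available. In `varImpPlot`, `type = 1` selects the mean decrease in accuracy and `type = 2` the mean decrease in Gini impurity.

```r
library(randomForest)

# Simulated two-class training data (an assumption; not the rainfall data).
set.seed(1)
n <- 200
p <- 10
X <- as.data.frame(matrix(rnorm(n * p), n, p))
y <- factor(as.integer(X$V1 + X$V2 + rnorm(n) > 0))

# importance = TRUE is needed for the permutation (accuracy) measure.
rf <- randomForest(x = X, y = y, ntree = 500, importance = TRUE)

# Matrix with columns MeanDecreaseAccuracy and MeanDecreaseGini (among others).
imp <- importance(rf)
print(imp[, c("MeanDecreaseAccuracy", "MeanDecreaseGini")])

# One graph per importance measure.
varImpPlot(rf, type = 1, main = "Mean decrease in OOB accuracy")
varImpPlot(rf, type = 2, main = "Mean decrease in Gini impurity")
```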
(c) Apply the resulting trained classifier to the test data Xtest, and compute the resulting
classification error. Try training your RF multiple times. Do you always get the same
classification error? If yes, why? If not, why and what can you do to make the forest
more stable and why? [10]
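
To explore the question in (c), the sketch below retrains a forest several times with different seeds and records the test error each time, once with few trees and once with many. The data, the helper names `simulate` and `test_error`, and the tree counts 25 and 1000 are all illustrative assumptions. A forest is random because each tree uses a bootstrap sample and a random subset of candidate variables at each split, so repeated fits generally differ unless the seed is fixed.

```r
library(randomForest)

# Hypothetical data generator standing in for the training and test sets.
simulate <- function(n, p = 10) {
  X <- as.data.frame(matrix(rnorm(n * p), n, p))
  y <- factor(as.integer(X$V1 + X$V2 + rnorm(n) > 0), levels = c(0, 1))
  list(X = X, y = y)
}

set.seed(1)
train <- simulate(200)
test <- simulate(200)

# Train one forest with the given seed and return its test error.
test_error <- function(ntree, seed) {
  set.seed(seed)
  rf <- randomForest(x = train$X, y = train$y, ntree = ntree)
  mean(predict(rf, newdata = test$X) != test$y)
}

# Ten repeated fits with a small and a large number of trees.
err_small <- sapply(1:10, function(s) test_error(25, s))
err_large <- sapply(1:10, function(s) test_error(1000, s))

# Compare how much the test error varies across repeated fits.
print(range(err_small))
print(range(err_large))
```

Comparing the two ranges gives a rough sense of how the number of trees affects run-to-run variability; calling `set.seed` before fitting makes any single run reproducible.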
