This assignment implements machine learning and big-data analysis for finance in R.

Assignment 3
Machine Learning and Big Data for Economics and Finance

Consider the two variables in the dataset Assign3.csv. We are interested in predicting the second variable Y given the first variable X.

1. Fit a linear regression model to the data. Show the data scatter plot on the same figure with the values predicted by the linear model.
2. Fit a quadratic regression model to the data. Show the data scatter plot on the same figure with the values predicted by the quadratic model.
3. We are interested in constructing a step function learner as follows: First draw a random number U uniformly on the interval spanned by the minimum and maximum values of the inputs $(x_1, \dots, x_n)$ and then use it to construct the following function, whose purpose is to give the prediction of Y given X = x:

   $f(x) = \beta_1 I(U \le x) + \beta_2 I(U > x),$

   where $\beta_1$ and $\beta_2$ are just unknown constants to be learned. It goes without saying that I(some statement) is the indicator function that equals 1 when the statement is true and 0 otherwise.
   a. Use two different methods to compute the estimate $\hat{f}(x) = \hat{\beta}_1 I(U \le x) + \hat{\beta}_2 I(U > x)$. Is $\hat{f}$ a strong learner?
   b. Use one of the previous two methods to write an R function that takes as input x and the data $(x_1, \dots, x_n; y_1, \dots, y_n)$ and gives as output $\hat{f}(x)$. Make sure the function is capable of dealing with the case where x contains more than one number.
   c. Using three different runs of the previous function, create three different plots where, on each, $\hat{f}$ is shown together with the scatter plot of the data.
4. Write an R function that applies boosting to the previous step function learner. That R function should take as inputs: the data, B the number of boosting iterations, the learning rate $\lambda$ and an optional argument indicating the size of the test subsample in case a validation set approach is needed. As output the function should give: $\hat{f}_{\text{boost}}$, the boosted learner evaluated at the training data, and the training mean squared error evaluated for each iteration $b = 1, \dots, B$ of the boosting algorithm. Also, in case the size of the test subsample is greater than zero, the function should output: $\hat{f}_{\text{boost}}$ evaluated at the test sample and the test MSE evaluated for each iteration $b = 1, \dots, B$.
   a. Use that function to plot $\hat{f}_{\text{boost}}$ on top of the data scatter plot for $\lambda = 0.01$ and for B = 10000. Show the same with different values of B.
   b. Plot the training MSE vs. the number of iterations.
   c. Was there overfitting when B = 10000?
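As a rough illustration of the step function learner in task 3, the sketch below estimates the two constants in two ways: as the means of y on each side of U, and as least-squares coefficients from `lm` on the two indicator columns. The function names (`fit_step`, `fit_step_lm`, `predict_step`, `step_learner`) and the simulated data are assumptions made for this example; Assign3.csv is not reproduced here.

```r
# Sketch of f_hat(x) = b1 * I(U <= x) + b2 * I(U > x).
# All function names here are illustrative, not part of the assignment.

# Method 1: the least-squares constants are the means of y on each side of U.
fit_step <- function(x_train, y_train, U = NULL) {
  # Draw the split point uniformly on [min(x), max(x)] unless one is supplied.
  if (is.null(U)) U <- runif(1, min(x_train), max(x_train))
  right <- U <= x_train                 # I(U <= x_i)
  b1 <- mean(y_train[right])
  b2 <- mean(y_train[!right])
  # Assumption: an empty side (possible for a user-supplied U) predicts 0.
  if (is.nan(b1)) b1 <- 0
  if (is.nan(b2)) b2 <- 0
  list(U = U, b1 = b1, b2 = b2)
}

# Method 2: regress y on the two indicator columns without an intercept.
fit_step_lm <- function(x_train, y_train, U) {
  right <- as.numeric(U <= x_train)
  left <- 1 - right
  coefs <- coef(lm(y_train ~ 0 + right + left))
  list(U = U, b1 = unname(coefs["right"]), b2 = unname(coefs["left"]))
}

# Vectorised prediction: works when x holds more than one number.
predict_step <- function(x, fit) {
  ifelse(fit$U <= x, fit$b1, fit$b2)
}

# Task 3b shape: takes x and the data, returns f_hat(x) for a fresh random U.
step_learner <- function(x, x_train, y_train) {
  predict_step(x, fit_step(x_train, y_train))
}

# Simulated stand-in for Assign3.csv (an assumption for this example).
set.seed(1)
x_train <- runif(200, 0, 10)
y_train <- sin(x_train) + rnorm(200, sd = 0.3)

fit_means <- fit_step(x_train, y_train)
fit_ls <- fit_step_lm(x_train, y_train, U = fit_means$U)
preds <- step_learner(c(1, 5, 9), x_train, y_train)
```

For a shared U the two methods should agree up to floating-point error, which `all.equal(fit_means$b1, fit_ls$b1)` can confirm. Because each call to `step_learner` draws a new U, three separate calls give the three different fits that task 3c asks to plot.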

Note: Even though the algorithm is described in detail in both the slides and textbook, for the sake of making the implementation easier, its special case pertaining to the questions in the assignment is presented here.

Boosting algorithm:
1. Inputs:
   A sample of covariates (i.e. inputs) $x_1, \dots, x_n$ and responses (i.e. outputs) $y_1, \dots, y_n$.
   A (weak) learner $\hat{f}$.
   A learning rate $\lambda > 0$.
2. Initialize:
   Set $\hat{f}_{\text{boost}}(x) \leftarrow 0$.
   Compute the first learner $\hat{f}_0(x) = \hat{\beta}_1 I(U \le x) + \hat{\beta}_2 I(U > x)$ on the original data.
   Set $r_i \leftarrow y_i - \hat{f}_0(x_i)$ for $i = 1, \dots, n$.
3. Do the following for $b = 1, \dots, B$:
   a. Given $x_1, \dots, x_n$ as covariates and $r_1, \dots, r_n$ as responses, fit a learner $\hat{f}_b$ by first sampling U and then estimating $\hat{f}_b(x) = \hat{\beta}_1 I(U \le x) + \hat{\beta}_2 I(U > x)$.
   b. Set $\hat{f}_{\text{boost}}(x) \leftarrow \hat{f}_{\text{boost}}(x) + \lambda \hat{f}_b(x)$.
   c. Set $r_i \leftarrow r_i - \lambda \hat{f}_b(x_i)$.
4. Output: $\hat{f}_{\text{boost}}(x)$.
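The boosting steps above might be sketched in R as follows. This is a minimal sketch under stated assumptions, not a reference solution: the function name `boost_step`, its argument names (`lambda` for the learning rate, `n_test` for the test subsample size) and the simulated data are all made up for illustration. It also starts from residuals equal to y and treats the note's initial learner as the first pass of the loop, which is a simplification of the Initialize step.

```r
# Sketch: boosting a random-split step learner (hypothetical names throughout).
boost_step <- function(x, y, B, lambda, n_test = 0) {
  n <- length(x)
  # Optional validation split: hold out n_test randomly chosen observations.
  test_idx <- if (n_test > 0) sample(n, n_test) else integer(0)
  train_idx <- setdiff(seq_len(n), test_idx)
  x_tr <- x[train_idx]; y_tr <- y[train_idx]
  x_te <- x[test_idx];  y_te <- y[test_idx]

  # One weak learner: sample U, then use the side means of the residuals r.
  fit_one <- function(r) {
    U <- runif(1, min(x_tr), max(x_tr))
    right <- U <= x_tr
    b1 <- mean(r[right])
    b2 <- mean(r[!right])
    if (is.nan(b1)) b1 <- 0   # assumption: an empty side predicts 0
    if (is.nan(b2)) b2 <- 0
    function(x_new) ifelse(U <= x_new, b1, b2)
  }

  f_tr <- numeric(length(x_tr))     # boosted fit on the training data
  f_te <- numeric(length(x_te))     # boosted fit on the test data
  r <- y_tr                         # assumption: residuals start at y
  mse_tr <- numeric(B)
  mse_te <- rep(NA_real_, B)

  for (b in seq_len(B)) {
    f_b <- fit_one(r)
    step_tr <- lambda * f_b(x_tr)
    f_tr <- f_tr + step_tr          # f_boost <- f_boost + lambda * f_b
    r <- r - step_tr                # r_i <- r_i - lambda * f_b(x_i)
    mse_tr[b] <- mean((y_tr - f_tr)^2)
    if (n_test > 0) {
      f_te <- f_te + lambda * f_b(x_te)
      mse_te[b] <- mean((y_te - f_te)^2)
    }
  }

  list(x_train = x_tr, y_train = y_tr, f_train = f_tr, mse_train = mse_tr,
       x_test = x_te, y_test = y_te, f_test = f_te, mse_test = mse_te)
}

# Simulated stand-in for Assign3.csv (an assumption for this example).
set.seed(1)
x <- runif(200, 0, 10)
y <- sin(x) + rnorm(200, sd = 0.3)
fit <- boost_step(x, y, B = 2000, lambda = 0.01, n_test = 50)
```

Plotting `fit$mse_train` and `fit$mse_test` against the iteration index is one way to look for overfitting: the training curve cannot increase with this update, so a test curve that turns upward while the training curve keeps falling would be the sign to watch for.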


R programming help | Assignment 3 Machine Learning

Published 2019-10-28 by easydue
