
Date: 2019-10-20 09:30

Assignment 3

Machine Learning and Big Data for Economics and Finance

Consider the two variables in the dataset Assign3.csv. We are interested in predicting the second variable Y given the first variable X.

1. Fit a linear regression model to the data. Show the data scatter plot on the same figure with the values predicted by the linear model.

2. Fit a quadratic regression model to the data. Show the data scatter plot on the same figure with the values predicted by the quadratic model.

3. We are interested in constructing a step function learner as follows:

First draw a random number U uniformly on the interval spanned by the minimum and maximum values of the inputs $(x_1, \ldots, x_n)$ and then use it to construct the following function, whose purpose is to give the prediction of Y given X = x:

$f(x) = \beta_1 I(U \le x) + \beta_2 I(U > x),$

where $\beta_1$ and $\beta_2$ are unknown constants to be learned. Here I(statement) is the indicator function, which equals 1 when the statement is true and 0 otherwise.

   a. Use two different methods to compute the estimate $\hat f(x) = \hat\beta_1 I(U \le x) + \hat\beta_2 I(U > x)$. Is $\hat f$ a strong learner?

   b. Use one of the previous two methods to write an R function that takes as input x and the data $(x_1, \ldots, x_n; y_1, \ldots, y_n)$ and gives as output $\hat f(x)$. Make sure the function is capable of dealing with the case where x contains more than one number.

   c. Using three different runs of the previous function, create three different plots where, on each, $\hat f$ is shown together with the scatter plot of the data.

4. Write an R function that applies boosting to the previous step function learner. That R function should take as inputs: the data, B the number of boosting iterations, the learning rate $\lambda$, and an optional argument indicating the size of the test subsample in case a validation set approach is needed. As output the function should give $\hat f_{\text{boost}}$, the boosted learner evaluated at the training data, and the training mean squared error (MSE) evaluated for each iteration $b = 1, \ldots, B$ of the boosting algorithm. Also, in case the size of the test subsample is greater than zero, the function should output $\hat f_{\text{boost}}$ evaluated at the test sample and the test MSE evaluated for each iteration $b = 1, \ldots, B$.

   a. Use that function to plot $\hat f_{\text{boost}}$ on top of the data scatter plot for $\lambda = 0.01$ and for B = 10000. Show the same with different values of B.

   b. Plot the training MSE vs. the number of iterations.

   c. Was there overfitting when B = 10000?
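The step function learner of question 3 can be sketched as follows. This is only an illustration, written in Python rather than the R the assignment asks for. The names `fit_step`, `predict_step`, `beta1` and `beta2` are assumptions made for this sketch, as is the fallback used when one side of U is empty. For a fixed U, minimizing the sum of squared errors over the two constants gives group means: the first constant is the mean of the y_i with U <= x_i and the second is the mean of the y_i with U > x_i. Regressing y on the two indicator columns without an intercept yields the same two numbers, which is one way to read the "two different methods" of part a.

```python
import random


def mean(values):
    return sum(values) / len(values)


def fit_step(x, y, rng=random):
    """Draw U uniformly on [min(x), max(x)] and estimate the two constants.

    beta1 is the mean of y over points with U <= x_i,
    beta2 is the mean of y over points with U > x_i.
    """
    u = rng.uniform(min(x), max(x))
    right = [yi for xi, yi in zip(x, y) if u <= xi]
    left = [yi for xi, yi in zip(x, y) if u > xi]
    overall = mean(y)
    # Assumption: fall back to the overall mean if one side is empty.
    beta1 = mean(right) if right else overall
    beta2 = mean(left) if left else overall
    return u, beta1, beta2


def predict_step(model, x_new):
    """Evaluate the fitted step function at one number or a list of numbers."""
    u, beta1, beta2 = model
    if isinstance(x_new, (int, float)):
        x_new = [x_new]
    return [beta1 if u <= xi else beta2 for xi in x_new]


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    x = [0.0, 1.0, 2.0, 3.0]
    y = [1.0, 1.0, 5.0, 5.0]
    model = fit_step(x, y, random.Random(0))
    print(model, predict_step(model, [0.5, 2.5]))
```

Because U is redrawn on every call, repeated fits on the same data generally give different step functions, which is what part c asks to visualize.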


Note: Even though the algorithm is described in detail in both the slides and the textbook, its special case pertaining to the questions in this assignment is presented here to make the implementation easier.

Boosting algorithm:

1. Inputs: A sample of covariates (i.e. inputs) $x_1, \ldots, x_n$ and responses (i.e. outputs) $y_1, \ldots, y_n$. A (weak) learner $\hat f$. A learning rate $\lambda > 0$.

2. Initialize: Set $\hat f_{\text{boost}}(x) \leftarrow 0$. Compute the first learner $\hat f_0(x) = \hat\beta_1 I(U \le x) + \hat\beta_2 I(U > x)$ on the original data. Set $r_i \leftarrow y_i - \hat f_0(x_i)$ for $i = 1, \ldots, n$.

3. Do the following for $b = 1, \ldots, B$:

   a. Given $x_1, \ldots, x_n$ as covariates and $r_1, \ldots, r_n$ as responses, fit a learner $\hat f_b$ by first sampling U and then estimating $\hat f_b(x) = \hat\beta_1 I(U \le x) + \hat\beta_2 I(U > x)$.

   b. Set $\hat f_{\text{boost}}(x) \leftarrow \hat f_{\text{boost}}(x) + \lambda \hat f_b(x)$.

   c. Set $r_i \leftarrow r_i - \lambda \hat f_b(x_i)$.

4. Output: $\hat f_{\text{boost}}(x)$.
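The loop above can be sketched in a few lines. This is a hedged illustration in Python, not the R function the assignment asks for. The names `boost`, `fit_step`, `step_value`, `lam`, `n_test` and `seed` are assumptions of this sketch. It also makes one simplifying assumption that differs from the initialization as listed: the residuals start at r_i = y_i with the boosted fit at 0, so that r_i always equals y_i minus the boosted fit at x_i and the training MSE is simply the mean squared residual.

```python
import random


def mean(values):
    return sum(values) / len(values)


def fit_step(x, r, rng):
    """Weak learner: sample U, then use the mean response on each side of U."""
    u = rng.uniform(min(x), max(x))
    right = [ri for xi, ri in zip(x, r) if u <= xi]
    left = [ri for xi, ri in zip(x, r) if u > xi]
    overall = mean(r)
    # Assumption: use the overall mean if one side happens to be empty.
    return u, (mean(right) if right else overall), (mean(left) if left else overall)


def step_value(model, xi):
    u, beta1, beta2 = model
    return beta1 if u <= xi else beta2


def mse(y, fitted):
    return mean([(yi - fi) ** 2 for yi, fi in zip(y, fitted)])


def boost(x, y, B, lam, n_test=0, seed=0):
    """Boost the step learner for B iterations with learning rate lam."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    rng.shuffle(idx)  # random validation split of size n_test
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    x_tr = [x[i] for i in train_idx]
    y_tr = [y[i] for i in train_idx]
    x_te = [x[i] for i in test_idx]
    y_te = [y[i] for i in test_idx]

    f_train = [0.0] * len(x_tr)
    f_test = [0.0] * len(x_te)
    r = list(y_tr)  # assumed initialization: residuals start at y
    train_mse, test_mse = [], []
    for _ in range(B):
        model = fit_step(x_tr, r, rng)  # fit the learner to the residuals
        for i, xi in enumerate(x_tr):
            update = lam * step_value(model, xi)
            f_train[i] += update  # add the shrunken learner to the fit
            r[i] -= update        # and remove it from the residuals
        train_mse.append(mse(y_tr, f_train))
        if x_te:
            for j, xj in enumerate(x_te):
                f_test[j] += lam * step_value(model, xj)
            test_mse.append(mse(y_te, f_test))
    return {"f_train": f_train, "train_mse": train_mse,
            "f_test": f_test, "test_mse": test_mse}


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    x = [i / 10 for i in range(40)]
    y = [xi * xi for xi in x]
    out = boost(x, y, B=200, lam=0.1, n_test=10)
    print(out["train_mse"][0], out["train_mse"][-1])
```

With a learning rate between 0 and 1, each update moves the residuals on both sides of U a fraction of the way toward zero mean, so the training MSE cannot increase from one iteration to the next. The test MSE has no such guarantee, which is why it is tracked separately when judging overfitting.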

