Date: 2023-03-23 09:08

STAT 4620/5620 WINTER 2023

Assignment 4: Due Thursday March 23 2023

1. The following questions explore the fundamentals of nonparametric statistics:

(a) [3] Describe smoothing and give two examples of popular smoothers.

(b) [2] Consider the generalized additive model (GAM) framework. What is the most significant departure from the GLM framework?

(d) [4] Suppose that you find yourself in a situation where both a GLM and a GAM initially seem appropriate for your data. Explain the criteria you would use to determine which of the two methods to recommend.

2. This question re-examines the Hubble data.

(a) [6] Fit the model:

V_i = f(D_i) + ε_i

to the Hubble data, where f is a smooth function and the ε_i are i.i.d. N(0, σ^2). Does a straight-line model appear to be most appropriate? How would you interpret the best-fitting model?

(b) [4] Examine appropriate residual plots and refit the model with more appropriate distributional assumptions. How are your conclusions from part (a) modified?

3. Read and provide a one-page summary of the lme4 documentation. [5]

4. The data frame Gun (library nlme) is from a trial examining methods for firing naval guns. Two firing methods were compared, with each of a number of teams of 3 gunners; the gunners in each team were matched to have similar physique (Slight, Medium, Heavy). The response variable rounds is rounds fired per minute, and there are 3 explanatory factor variables: Physique (levels Slight, Medium and Heavy), Method (levels M1 and M2) and Team with 9 levels. The main interest is in determining which method and/or physique results in the highest firing rate and in quantifying team-to-team variability in firing rate.

(a) [2] Identify which factors should be treated as random and which as fixed, in the analysis of these data.

(b) [4] Write out a suitable mixed model as a starting point for the analysis of these data.

of interest and report your conclusions.

5. The Carseats dataset from the R package ISLR is a simulated dataset of carseat sales at 400 different stores. Full information on the variables in this dataset can be found using help(Carseats) after loading the package.

(a) [4] Create a new factor variable in the Carseats data representing whether or not Sales is greater than 8. Randomly split the dataset into a testing and a training set. On the training set, grow a classification tree using the R rpart package to classify whether a store had high carseat sales or not (Hint: Remove the Sales variable). Report the classification accuracy you got on the testing data set and on the training set.

(b) [4] Prune the tree you grew in part (a). Report the pruned tree's classification accuracy on the testing data set and on the training set. Why might pruning have improved the classification accuracy on the testing set? Why might it have reduced accuracy on the training set?

way you did the tree. Is performance on the testing set better than the classification trees? Why might that be the case?

(d) [4] Briefly outline the similarities and differences between CARTs and random forests.
