Date: 2021-11-06 11:56

STAT 361 (Fall 2021)

Assignment 3

The assignment is due on Nov. 04 (Thursday) at 23:00 (Kingston, Ontario time). Please submit to Crowdmark.

Guidelines for Preparing Solutions

For questions that need R coding, please include only the important R output and the necessary results in the main text of your solutions. Present them in a clear and concise fashion (for example, tabulate models and output).

Give descriptions and discussions for your important exploration and findings.

Put long code and output in an Appendix, at the end of EACH problem.

These Appendix sections will NOT be marked, but will be checked as evidence of your independent work.

Prepare your assignment solutions so that they are easy for the readers (in this case, TAs) to follow, without having to search through lengthy code and output for your answers.

1. Consider the multiple regression model $Y = X\beta + \varepsilon$, where $\varepsilon \sim \mathrm{MVN}_n(0, \sigma^2 I)$. See descriptions of model forms (1) and (2) in Chapter 4.

(a) Show that the residual vector is $r = (I - P)Y$, where $P = X(X^T X)^{-1}X^T$, and show that $I - P$ is also a projection matrix.

(b) Let $U = (\hat{\beta}, r)^T$. Find the joint distribution of the random vector $U$. It may be helpful to notice that.

(c) Show that $\hat{\beta}$ and $r$ are independent.

Hint: For (b) and (c), properties of multivariate normal distribution may be useful.
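For part (a), the properties to check can be written out as a short sketch. It assumes the common definition that a matrix is an (orthogonal) projection matrix when it is symmetric and idempotent; Chapter 4 of the course notes may phrase the definition differently, so treat this as a guide rather than the expected solution.

```latex
P^2 = X(X^T X)^{-1}X^T X (X^T X)^{-1} X^T = X(X^T X)^{-1}X^T = P, \qquad P^T = P,
\\[4pt]
(I - P)^T = I - P^T = I - P, \qquad (I - P)^2 = I - 2P + P^2 = I - P .
```

Both lines rely on $X^T X$ being invertible, which the expression for $P$ in part (a) already presupposes.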

2. Consider the posted “Savings.txt” data, an economic dataset collected in 48 different countries. The variable “sr” is the ratio of savings (aggregate personal saving divided by disposable income). The variables “pop15” and “pop75” are the percentages of the population under 15 and over 75, respectively. The variable “dpi” is disposable income (per capita, in dollars), while the variable “ddpi” is the rate (percent) of change in disposable income (per capita).

(a) Draw a scatter plot matrix for all the variables involved. Comment on the possible relationships between variables, focusing on those that appear interesting to you.

(b) Fit a simple linear regression model with disposable income (“dpi”) as the response and the percentage of the population under 15 as the only covariate. Describe the model clearly. Report and interpret the fitted model: is there a significant association between the variables, and is this what you expect?

(c) Fit a regression model with the ratio of savings ($Y$, “sr”) as the response and all other variables as the covariates. Describe the model clearly, and report and discuss the fit of the model. Interpret the estimated coefficient for the rate of change in disposable income.

(d) Is it reasonable to drop the covariate disposable income (“dpi”) from the model in (c)? Support your answer with a test, describing the test procedure and results clearly; also calculate a confidence interval for the regression coefficient of this covariate.

Added Note: Test at level 0.05, and construct a 95% confidence interval.

(e) Based on the model in (c), obtain a 95% prediction interval for the ratio of savings of a country with $x = (20, 3.2, 2200, 2.1)^T$ for “pop15”, “pop75”, “dpi”, “ddpi”, respectively.
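For part (e), the textbook form of a 95% prediction interval for a new response under the normal linear model is sketched below. It assumes the model in (c) includes an intercept, so that $x_0 = (1, 20, 3.2, 2200, 2.1)^T$ and $k = 5$ coefficients are estimated from $n = 48$ observations. The symbols $x_0$, $\hat{y}_0$ and $\hat{\sigma}$ are notation introduced here, not taken from the course notes.

```latex
\hat{y}_0 \pm t_{n-k,\,0.975}\,\hat{\sigma}\,\sqrt{1 + x_0^T (X^T X)^{-1} x_0},
\qquad \hat{y}_0 = x_0^T \hat{\beta},
\qquad \hat{\sigma}^2 = \frac{r^T r}{n - k}.
```

The leading 1 under the square root accounts for the variability of the new observation itself; dropping it gives a confidence interval for the mean response instead.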

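3. Four objects are weighed 2 at a time on a spring balance. Denote the 4 unknown weights by $\beta_1, \dots, \beta_4$. Six observations are made and are expressed in these forms:

$Y_1 = \beta_1 + \beta_2 + \varepsilon_1$,

$Y_2 = \beta_1 + \beta_3 + \varepsilon_2$,

$Y_3 = \beta_1 + \beta_4 + \varepsilon_3$,

$Y_4 = \beta_2 + \beta_3 + \varepsilon_4$,

$Y_5 = \beta_2 + \beta_4 + \varepsilon_5$,

$Y_6 = \beta_3 + \beta_4 + \varepsilon_6$.

Assume that $\varepsilon_i \overset{iid}{\sim} N(0, \sigma^2)$, $i = 1, \dots, 6$.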

(a) Find expressions for the least squares estimators $\hat{\beta}_1, \dots, \hat{\beta}_4$ (specify the expressions in terms of $Y_1, \dots, Y_6$).

(b) Find an expression for $\mathrm{Cov}(\hat{\beta})$ (specify the matrix entries, which may involve $\sigma^2$).

(c) Find expressions for the residuals (specify the expressions in terms of $Y_1, \dots, Y_6$).

(d) Create a small data set for this study, with $(Y_1, \dots, Y_6) = (5, 8, 6, 7, 10, 9)$. Use the lm() function in R to fit the data. Check the results for (a), (b) and (c). Does the output from the lm() fit agree with the values calculated for this data set from the expressions you derived above?

(e) Explain how you will construct a 95% confidence interval for $\beta_1 + \beta_2$. We can still use the $t_{n-k}$ distribution. Find the confidence interval for the given data.
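The check in part (d) can also be cross-checked outside R. The following is a minimal sketch in plain Python (standard library only) that builds the design matrix implied by the six observation equations and solves the normal equations exactly. The assignment itself asks for lm() in R; the names PAIRS, design_matrix, solve and least_squares are illustrative and not from the course material. No intercept column is included, because none of the six equations has one.

```python
from fractions import Fraction

# Which two of the four objects are on the balance in each observation,
# read off the six equations Y1..Y6 in Problem 3.
PAIRS = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
Y = [5, 8, 6, 7, 10, 9]  # data given in part (d)


def design_matrix(pairs, k=4):
    """Build the zero/one design matrix (one row per weighing, no intercept)."""
    return [[Fraction(1 if j + 1 in pair else 0) for j in range(k)] for pair in pairs]


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination, exactly."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[-1] for row in m]


def least_squares(x, y):
    """Return (beta_hat, residuals, rss) from the normal equations X'X b = X'y."""
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return beta, resid, sum(r * r for r in resid)


X = design_matrix(PAIRS)
beta_hat, residuals, rss = least_squares(X, Y)
sigma2_hat = rss / (len(Y) - len(beta_hat))  # RSS / (n - k), with n = 6, k = 4

print("beta_hat :", [float(b) for b in beta_hat])
print("residuals:", [float(r) for r in residuals])
print("sigma^2 estimate:", float(sigma2_hat))
```

Exact fractions are used so that the comparison with hand-derived expressions is not blurred by rounding. In R, the corresponding fit would also drop the intercept, for example with a formula of the form `y ~ X - 1`.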
