Date: 2019-12-10 10:58

STAT 3312 (Fall, 2019)

Final exam (take-home)

Name (ID):

Instructions

• This take-home exam is due at 3:00 PM, December 17, 2019.

• All of your answers and work must be your own.

• You are NOT allowed to discuss any part of this exam with anyone. If you have any questions, ask me.

• For question #2, R or SAS code along with output must be submitted to support your answer. It would be good if you underline results on the output relevant to your answer.

1. True/False questions (1.5 points each)

(1) The diagnosis of a mental illness (e.g., schizophrenia, neurosis, depression) is an ordinal categorical variable.

True ( ) False ( )

(2) If the odds of success equal 0.5 in a binary response, then the probability of success is 0.25.

True ( ) False ( )

(3) In a logistic regression model, logit[π(x)] = α + βx, e^α equals the odds of success when x = 1.

True ( ) False ( )

(4) In a logit model logit[π(x)] = α+βx, the probability increases at a rate of 0.16β when π(x) = 0.4.

True ( ) False ( )

(5) Fisher’s exact test can be used to test if the odds ratio of a 2 × 2 table equals 1 when the frequency counts are small.

True ( ) False ( )

(6) A classical linear regression model with errors having a normal distribution is a special case of a generalized linear model with the probit link.

True ( ) False ( )

(7) In testing for independence in two-way contingency tables, likelihood ratio tests and Pearson’s χ² tests are equivalent for small sample sizes.

True ( ) False ( )

(8) In a generalized linear model, the link function is used to connect the values of the random component and the systematic component.

True ( ) False ( )

(9) When x1 or x2 is the sole predictor for a binary response y, the likelihood ratio test of the effect has P-value < 0.0001. When both x1 and x2 are in the model, it is possible that the likelihood ratio tests for H0 : β1 = 0 and for H0 : β2 = 0 could both have P-values larger than 0.05.

True ( ) False ( )

(10) For the logistic regression model with the identity link, the estimated probability of any value for predictor x could exceed one.

True ( ) False ( )

2. The following table is based on an epidemiological survey of 3,000 subjects to investigate snoring as a possible factor for heart disease. We use scores (0, 2, 3, 5, 6) for x = snoring level.

                       Heart Disease
Snoring                Yes      No
Never                   24    1355
Sometimes               35     603
More often than not     21     215
Almost always           30     224
Every night             27     230

(a) Use R or SAS to fit the model with three link functions: the logit, probit, and complementary log-log. Write down the estimated equations for all three models. (12 points)
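The exam asks for R or SAS output here. As a hedged cross-check only, the sketch below fits the same three binomial models in plain Python using Fisher scoring (iteratively reweighted least squares). The function name `fit_binomial_glm`, the `LINKS` table, and the fixed iteration count are illustrative choices, not part of the exam, and the estimates should be confirmed against R's `glm` or SAS before being reported.

```python
import math
from statistics import NormalDist

# Snoring scores and counts copied from the table in question 2.
scores = [0, 2, 3, 5, 6]
yes = [24, 35, 21, 30, 27]
no = [1355, 603, 215, 224, 230]

_norm = NormalDist()

# Each link: (g(mu), inverse link g^{-1}(eta), derivative d mu / d eta).
LINKS = {
    "logit": (
        lambda p: math.log(p / (1 - p)),
        lambda e: 1 / (1 + math.exp(-e)),
        lambda e: math.exp(-e) / (1 + math.exp(-e)) ** 2,
    ),
    "probit": (_norm.inv_cdf, _norm.cdf, _norm.pdf),
    "cloglog": (
        lambda p: math.log(-math.log(1 - p)),
        lambda e: 1 - math.exp(-math.exp(e)),
        lambda e: math.exp(e - math.exp(e)),
    ),
}


def fit_binomial_glm(x, successes, failures, link, iterations=50):
    """Fisher scoring for g(pi) = alpha + beta * x with grouped binomial data."""
    g, g_inv, dmu = LINKS[link]
    n = [s + f for s, f in zip(successes, failures)]
    # Start from the intercept-only model.
    alpha, beta = g(sum(successes) / sum(n)), 0.0
    for _ in range(iterations):
        eta = [alpha + beta * xi for xi in x]
        mu = [g_inv(e) for e in eta]
        d = [dmu(e) for e in eta]
        # Working weights and working responses of the IRLS step.
        w = [ni * di ** 2 / (m * (1 - m)) for ni, di, m in zip(n, d, mu)]
        z = [e + (s / ni - m) / di
             for e, s, ni, m, di in zip(eta, successes, n, mu, d)]
        # Closed-form weighted least squares for a straight line.
        sw = sum(w)
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        zbar = sum(wi * zi for wi, zi in zip(w, z)) / sw
        sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
        sxz = sum(wi * (xi - xbar) * (zi - zbar) for wi, xi, zi in zip(w, x, z))
        beta = sxz / sxx
        alpha = zbar - beta * xbar
    return alpha, beta


fits = {name: fit_binomial_glm(scores, yes, no, name) for name in LINKS}
for name, (a, b) in fits.items():
    print(f"{name:8s} alpha = {a:.4f}, beta = {b:.4f}")
```

Each fitted pair (alpha, beta) gives one estimated equation of the form link[π̂(x)] = alpha + beta·x. Fitted proportions at a given score follow by applying the matching inverse link from `LINKS`.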

(b) Find the estimated proportion for the logistic model when the snoring level is 2 and interpret it in terms of the odds. (4 points)

(c) Use the fitted logistic model to calculate an approximate 97% confidence interval for the odds ratio of a person in the “sometimes” category compared to a person in the “every night” category. (5 points)

(d) Find the estimated proportion for the probit model when the snoring level is 3. (4 points)

(e) Find the estimated proportions for the complementary log-log model when the snoring levels are “sometimes” and “almost always”, respectively. Which value is larger? (5 points)

3. Consider the following logistic regression model based on the horseshoe crab data with color and width predictors:

logit[P(Y = 1)] = α + β1c1 + β2c2 + β3c3 + β4x,

where x denotes width and

c1 = 1 for color = medium light, 0 otherwise

c2 = 1 for color = medium, 0 otherwise

c3 = 1 for color = medium dark, 0 otherwise.

Fitting the model yields the following estimated equation:

logit[P̂(Y = 1)] = −13.015 + 1.097c1 + 1.302c2 + 1.254c3 + 0.458x. (1)

Consider this fit for crabs of width x = 21 cm.

(a) Estimate the two probabilities for medium-light crabs and for dark crabs, and then calculate the ratio of these two probabilities. (7 points)

(b) Estimate the odds ratio of having a satellite for medium-light crabs versus dark crabs. Interpret it in terms of the context. (7 points)

(c) Is there a big difference between the ratio of probabilities in (a) and the odds ratio in (b)? If not, why does this happen? (5 points)

(d) Verify the value of the odds ratio in part (b) using the parameter estimates in Equation (1). (5 points)
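The quantities in question 3 can be cross-checked numerically. The sketch below is a minimal illustration, assuming that dark is the reference color (c1 = c2 = c3 = 0), as the indicator definitions imply. The helper names `prob_satellite` and `odds` are made up for this example.

```python
import math

# Estimates copied from Equation (1); dark is taken as the reference color.
ALPHA = -13.015
BETA_COLOR = {"medium light": 1.097, "medium": 1.302, "medium dark": 1.254, "dark": 0.0}
BETA_WIDTH = 0.458


def prob_satellite(color, width):
    """Estimated P(Y = 1) for a crab of the given color and width (cm)."""
    eta = ALPHA + BETA_COLOR[color] + BETA_WIDTH * width
    return 1 / (1 + math.exp(-eta))


def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)


p_ml = prob_satellite("medium light", 21)
p_dark = prob_satellite("dark", 21)

prob_ratio = p_ml / p_dark
odds_ratio = odds(p_ml) / odds(p_dark)

print(f"P(medium light) = {p_ml:.4f}, P(dark) = {p_dark:.4f}")
print(f"ratio of probabilities = {prob_ratio:.3f}")
print(f"odds ratio = {odds_ratio:.3f}, exp(beta1) = {math.exp(1.097):.3f}")
```

Because the width term cancels when the two odds are divided, the odds ratio computed from the probabilities agrees with exp(β̂1) taken directly from the color coefficient.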

4. In order to investigate the effects of AZT in slowing the development of AIDS symptoms, a total of 343 veterans whose immune systems were beginning to falter after infection from the AIDS virus were randomly assigned either to receive AZT immediately or to wait until their T cells showed severe immune weakness. The following table is a 2 × 2 × 2 cross classification of the veterans’ race, whether AZT was given immediately, and whether AIDS symptoms developed during the 3-year study.

                      Symptoms
Race    AZT use   Yes (Fitted)   No (Fitted)   Row total
Black   Yes       14 (A)         90 (B)        104
        No        28 (C)         85 (D)        113
White   Yes       10 (E)         55 (F)         65
        No        14 (G)         47 (H)         61

Let X = AZT treatment (1 for AZT taken, 0 otherwise), Z = race (1 for blacks, 0 for whites), and Y = whether AIDS symptoms developed (1 = yes, 0 = no). The ML fit turned out to be

logit(π̂) = −1.1427 − 0.6537x − 0.0037z. (2)

(a) Use Equation (2) to find the fitted values (A) - (H). (8 points)

(b) Perform a goodness-of-fit test by calculating the Pearson statistic X² based on the observed and fitted values in the table above. Does the model fit decently well? Justify your answer with the P-value. (8 points)
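A minimal sketch of the arithmetic behind question 4 follows: each fitted count is the row total times the fitted probability from Equation (2), and the Pearson statistic sums (observed − fitted)² / fitted over all eight cells. The single degree of freedom (4 binomial rows minus 3 parameters) is an assumption stated here rather than taken from the exam, and the variable names are illustrative.

```python
import math

# Observed counts copied from the table: (race, AZT use) -> (symptoms yes, symptoms no).
observed = {
    ("black", "yes"): (14, 90),
    ("black", "no"): (28, 85),
    ("white", "yes"): (10, 55),
    ("white", "no"): (14, 47),
}


def fitted_prob(x, z):
    """P(symptoms) from Equation (2); x = 1 for AZT, z = 1 for black veterans."""
    eta = -1.1427 - 0.6537 * x - 0.0037 * z
    return 1 / (1 + math.exp(-eta))


fitted = {}
pearson_x2 = 0.0
for (race, azt), (yes, no) in observed.items():
    n = yes + no
    p = fitted_prob(x=1 if azt == "yes" else 0, z=1 if race == "black" else 0)
    fit_yes, fit_no = n * p, n * (1 - p)
    fitted[(race, azt)] = (fit_yes, fit_no)
    pearson_x2 += (yes - fit_yes) ** 2 / fit_yes + (no - fit_no) ** 2 / fit_no

# Assumed degrees of freedom: 4 binomial rows - 3 model parameters = 1.
# For 1 df the chi-squared upper tail probability is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(pearson_x2 / 2))

for key, (fit_yes, fit_no) in fitted.items():
    print(key, f"yes = {fit_yes:.2f}, no = {fit_no:.2f}")
print(f"X^2 = {pearson_x2:.3f}, P-value = {p_value:.3f}")
```

The closed form for the tail probability holds only for one degree of freedom. With a different df, a chi-squared routine from a statistics package would be needed instead.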

5. Does job satisfaction depend on one’s income? The 1991 General Social Survey shows the following results. Note that there are four levels in the job satisfaction categories (dissatisfied, little, moderate, very) and four levels in the income categories (0-5K, 5K-15K, 15K-25K, >25K). The income values are in dollars.

              Job satisfaction
Income     Dissatisfied   Little   Moderate   Very
0-5K            2            4        13        3
5K-15K          2            6        22        4
15K-25K         0            1        15        8
>25K            0            3        13        8

Let Y = job satisfaction and let X = income scores (3K, 10K, 20K, 25K). Consider the baseline-category logit model with “very” as the baseline category:

log(πj/π4) = αj + βjx, j = 1, 2, 3.

The following table shows part of the output regarding the estimated coefficients for the baseline-category logit model.

(Intercept):1   (Intercept):2   (Intercept):3
0.430           0.456           1.704

Income:1        Income:2        Income:3
−0.185          −0.054          −0.037

(a) Write down the three predicted equations, log(π̂j/π̂4) for j = 1, 2, 3. (6 points)

(b) Notice that β̂j < 0 for each logit. Interpret the implications in terms of the context. (4 points)

(c) What is the meaning of e^(−0.185) = 0.83? Explain it rigorously in terms of the context. (4 points)

(d) Find the estimated probability of being in the “Moderate” category when a person’s income is 20K. (4 points)
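For a baseline-category logit model, the category probabilities follow from the fitted logits as π̂j = exp(α̂j + β̂jx) / (1 + Σ exp(α̂k + β̂kx)), with the baseline category contributing the 1 in the denominator. The sketch below applies this to the coefficients listed in the output, assuming income is entered in thousands of dollars (x = 20 for 20K). The function name `category_probs` is illustrative.

```python
import math

# Coefficients copied from the output; the baseline category is "very".
INTERCEPTS = [0.430, 0.456, 1.704]
SLOPES = [-0.185, -0.054, -0.037]
CATEGORIES = ["dissatisfied", "little", "moderate", "very"]


def category_probs(income):
    """Estimated probabilities of the four satisfaction levels.

    income is assumed to be in thousands of dollars, matching the score scale.
    """
    # exp(linear predictor) for j = 1, 2, 3; the baseline logit is 0, so exp(0) = 1.
    terms = [math.exp(a + b * income) for a, b in zip(INTERCEPTS, SLOPES)] + [1.0]
    total = sum(terms)
    return {c: t / total for c, t in zip(CATEGORIES, terms)}


probs = category_probs(20)
for category, p in probs.items():
    print(f"{category:12s} {p:.4f}")

# Multiplicative change in the odds of "dissatisfied" versus "very"
# per one-unit (1K) increase in income.
odds_multiplier = math.exp(SLOPES[0])
print(f"exp(-0.185) = {odds_multiplier:.2f}")
```

Dividing by the sum of all four terms guarantees that the estimated probabilities add up to one at any income value.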
