Posted: 2019-11-18 09:29

MAT 4378 – MAT 5317, Analysis of categorical data, Assignment 3

MAT 4378 – MAT 5317, Analysis of categorical data

Assignment 3

Due date: in class on Monday, November 18, 2019

Remark: You can use R for your computations for Questions 2 to 4. If you use

R, please provide the output. However, the R output is not an answer to a question.

Please provide one or two sentences to properly answer the question.

1. Consider a ratio estimator h(θ̂1, θ̂2) = θ̂1/θ̂2, where the estimated variance-covariance

2. A carefully controlled experiment was conducted to study the effect of the size of

the deposit level on the likelihood that a returnable one-liter soft drink bottle

will be returned. The data to follow show the number of bottles that were

returned (Wi) out of 500 sold (ni) at each of six deposit levels (Xi,

in cents):

Deposit level xi (cents)     2    5   10   20   25   30
Number sold ni             500  500  500  500  500  500
Number returned wi          72  103  170  296  406  449

An analyst believes that a logistic regression model is appropriate for studying

the relation between the size of the deposit and the probability a bottle will be

returned.

(a) Find the maximum likelihood estimates for β0 and β1. Give the estimated

regression model.

(b) Obtain a scatter plot of the sample proportions against the level of the

deposit, and superimpose the estimated logistic response onto the plot.

Does the fitted logistic response function appear to fit well?

(c) Obtain exp(β̂1) and interpret this number.

(d) What is the estimated probability that a bottle will be returned when the

deposit is 15 cents?

(e) Estimate the amount of deposit for which 75% of the bottles are expected

to be returned.

(f) In part (e), we have an estimate x̂ = g(β̂0, β̂1) for the level of the deposit

that corresponds to π = 75% of the bottles being returned. This estimator is

a non-linear function of β̂0 and β̂1. Use the delta method to find an asymptotic

estimated standard error for this estimate. Hint: It will be helpful to

use the function vcov on your glm object. Furthermore, to multiply the

matrices A and B in R, use A %*% B.
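
As a rough sketch of how parts (a), (e) and (f) might be set up in R, the block below enters the data from the table in Question 2, fits the grouped-data logistic model with glm, inverts the fitted logit at π = 0.75, and applies the delta method using vcov and %*% as in the hint. The object names (bottles, fit, x.hat, grad, se.x.hat) are illustrative choices, not part of the assignment, and the output still needs to be interpreted in words.

```r
# Bottle-return data copied from the table in Question 2.
bottles <- data.frame(
  deposit  = c(2, 5, 10, 20, 25, 30),
  sold     = rep(500, 6),
  returned = c(72, 103, 170, 296, 406, 449)
)

# Binomial logistic regression on the grouped counts (successes, failures).
fit <- glm(cbind(returned, sold - returned) ~ deposit,
           family = binomial(link = "logit"), data = bottles)
beta.hat <- coef(fit)

# Inverse prediction: solve logit(pi0) = b0 + b1 * x for x.
pi0   <- 0.75
x.hat <- (qlogis(pi0) - beta.hat[1]) / beta.hat[2]

# Delta method: gradient of g(b0, b1) = (logit(pi0) - b0) / b1
# with respect to (b0, b1), evaluated at the estimates.
grad <- matrix(c(-1 / beta.hat[2],
                 -(qlogis(pi0) - beta.hat[1]) / beta.hat[2]^2), nrow = 1)

# Estimated standard error: sqrt(grad %*% Cov(beta.hat) %*% t(grad)).
se.x.hat <- sqrt(drop(grad %*% vcov(fit) %*% t(grad)))
```

Printing summary(fit), x.hat and se.x.hat gives the quantities the question asks about.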

3. A marketing research firm was engaged by an automobile manufacturer to conduct

a pilot study to examine the feasibility of using logistic regression for

ascertaining the likelihood that a family will purchase a new car during the

next year. A random sample of 33 suburban families was selected. Data on

annual family income (x1, in thousands of dollars) and the current age of the

oldest family automobile (x2, in years) were obtained. A follow-up interview

conducted 12 months later was used to determine whether the family actually

purchased a new car (y = 1) or did not purchase a new car (y = 0) during the

year. The data is found in the file CarPurchase.csv.

(a) Find the maximum likelihood estimates of β0, β1, and β2. State the estimated

logistic regression model.

(b) Obtain exp(β̂1) and exp(β̂2) and interpret these numbers.

(c) What is the estimated probability that a family with annual income of $50

thousand and an oldest car of 3 years will purchase a new car next year?
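
The workflow for Question 3 could look roughly like the sketch below. Because the layout of CarPurchase.csv is not shown here, the block uses a small synthetic data frame as a stand-in. The column names x1, x2 and y and the simulated values are assumptions made only so that the code runs on its own, so its numbers are not answers to the question. With the real file, the first step would be to read it (for example with read.csv) and adapt the column names.

```r
set.seed(1)

# Synthetic stand-in for CarPurchase.csv (33 families); the columns
# x1 (income, $1000s), x2 (age of oldest car, years) and y (0/1) are assumed.
car <- data.frame(x1 = round(runif(33, 20, 90)),
                  x2 = sample(1:8, 33, replace = TRUE))
car$y <- rbinom(33, size = 1,
                prob = plogis(-4 + 0.05 * car$x1 + 0.4 * car$x2))

# Logistic regression with two explanatory variables.
fit3 <- glm(y ~ x1 + x2, family = binomial(link = "logit"), data = car)

# Estimated odds ratios for a one-unit increase in each variable.
odds.ratios <- exp(coef(fit3)[c("x1", "x2")])

# Estimated purchase probability at income = 50 and car age = 3.
p.hat <- predict(fit3, newdata = data.frame(x1 = 50, x2 = 3),
                 type = "response")
```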

4. Rather than finding the probability of success at an explanatory variable value,

it is often of interest to find the value of an explanatory variable given a desired

probability of success. This is referred to as inverse prediction. One application

of inverse prediction involves finding the amount of pesticide or herbicide needed

to have a desired kill rate when applied to pests or plants. The lethal dose level

xπ (commonly called “LDz”, where z = 100π) is defined as

xπ = (cloglog(π) − β0) / β1

for the complementary log-log regression model

cloglog(π) = β0 + β1 x.
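
For reference, the definition of xπ comes from isolating x in this model equation. A short derivation, assuming β1 ≠ 0 and writing cloglog(π) = log(−log(1 − π)) as in the R commands of part (c)(v):

```latex
\begin{align*}
\operatorname{cloglog}(\pi) &= \log\{-\log(1-\pi)\} = \beta_0 + \beta_1 x_\pi \\
\beta_1 x_\pi &= \operatorname{cloglog}(\pi) - \beta_0 \\
x_\pi &= \frac{\operatorname{cloglog}(\pi) - \beta_0}{\beta_1},
\qquad \beta_1 \neq 0 .
\end{align*}
```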

(a) Show how xπ is derived by solving for x in the complementary log-log

regression model.

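(b) We can obtain a 95% confidence interval for xπ as the set of all x such that

|β̂0 + β̂1 x − cloglog(π)| / sqrt(Var(β̂0) + x² Var(β̂1) + 2x Cov(β̂0, β̂1)) < Z0.975,

where Var and Cov denote the estimated variances and covariance of the coefficient

estimates and Z0.975 is the 0.975 quantile of the standard normal distribution.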

Describe how this confidence interval for xπ is derived. (Note that there is

generally no closed-form solution for the confidence interval limits, which

leads to the use of iterative numerical procedures.)

(c) Turner et al. (1992) use logistic regression to estimate the rate at which

picloram, a herbicide, kills tall larkspur, a weed. Their data were collected

by applying four different levels of picloram to separate plots, and the

number of weeds killed out of the number of weeds within the plot was

recorded. The data are in the file picloram.csv. Complete the following:

(i) We will use a cloglog model instead of a logistic regression model. Give

the estimated complementary log-log model.

(ii) Compute exp(β̂1) and interpret this number within the context of the problem.

(iii) Plot the observed proportion of killed weeds and the estimated model.

Describe how well the model fits the data.

Note: Here are some commands that you might find helpful. We are

assuming that the dataframe is called picloram.data and that the

fitted model is called mod.

## Plot the observed proportions versus the picloram level
with(picloram.data, plot(x = picloram, y = kill/total,
  xlab = "Picloram", ylab = "Proportion of weeds killed",
  panel.first = grid(col = "gray", lty = "dotted")))

## Put the estimated response on the plot
curve(expr = predict(object = mod,
  newdata = data.frame(picloram = x), type = "response"),
  col = "red", add = TRUE)

(iv) Estimate the 0.9 kill rate level “LD90” for picloram. Add lines to the

plot in (iii) to illustrate how it is found (the segments() function can

be useful for this purpose).

(v) We are assuming that your fitted model is the glm object mod. Use

the following commands to compute a 95% confidence interval for the

0.9 kill rate. Note: The function uniroot solves for the root of a

function over an interval.

# Coefficient estimates from the fitted cloglog model
b0 = summary(mod)$coefficients[1,1]
b1 = summary(mod)$coefficients[2,1]

# Point estimate of LD90: solve cloglog(0.9) = b0 + b1*x for x
LD.x <- (log(-log(1-0.9)) - b0)/b1

# Absolute standardized distance between the fitted linear predictor at x
# and cloglog(pi0), minus the standard normal critical value
root.func <- function(x, mod.obj, pi0, alpha) {
  beta.hat <- mod.obj$coefficients
  cov.mat <- vcov(mod.obj)
  var.den <- cov.mat[1,1] + x^2*cov.mat[2,2] +
    2*x*cov.mat[1,2]
  abs(beta.hat[1] + beta.hat[2]*x - log(-log(1-pi0)))/
    sqrt(var.den) - qnorm(1-alpha/2)
}

# Search for a root below the point estimate
lower <- uniroot(f = root.func, interval =
  c(min(picloram.data$picloram), LD.x),
  mod.obj = mod, pi0 = 0.9, alpha = 0.05)

# Search for a root above the point estimate
upper <- uniroot(f = root.func, interval =
  c(LD.x, max(picloram.data$picloram)),
  mod.obj = mod, pi0 = 0.9, alpha = 0.05)

lower$root
upper$root

(vi) In part (v), we found a 95% CI for x0.9. Explain in a few sentences

how these commands give us the lower and the upper bound of the

confidence interval.
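
To see the mechanics of uniroot in isolation, the toy sketch below may help. The function f is a hypothetical stand-in with the same shape as root.func: an absolute distance from a centre minus a threshold, so it is negative at the centre and positive far from it. Calling uniroot once on an interval to the left of the centre and once on an interval to the right returns the two points where f crosses zero, mirroring the two calls in part (v). The names f, centre and half.width are illustrative only.

```r
# Toy function with the same "V" shape as root.func: it is negative at the
# centre and crosses zero once on each side of it.
f <- function(x, centre, half.width) abs(x - centre) - half.width

# uniroot needs f to have opposite signs at the two ends of the interval.
# Here f(0) = 2 and f(3) = -1, then f(3) = -1 and f(10) = 6.
lower <- uniroot(f = f, interval = c(0, 3),  centre = 3, half.width = 1)
upper <- uniroot(f = f, interval = c(3, 10), centre = 3, half.width = 1)

# The two roots play the role of the lower and upper confidence limits.
c(lower$root, upper$root)
```

In part (v) the centre corresponds to LD.x, where root.func equals −qnorm(1 − alpha/2), and the search intervals end at the smallest and largest picloram levels in the data. uniroot stops with an error if the function does not change sign inside the supplied interval.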

