联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-23:00
  • 微信:codinghelp

您当前位置:首页 >> Algorithm 算法作业Algorithm 算法作业

日期:2019-07-20 11:06

CIS-PROB-STAT 2019

Problem set #1

Due Tues July 23

Please strive to make your writing as clear and comprehensible as possible, so that whoever is reading your solution

(e.g. a TA or professor) will appreciate its beautiful clarity and will be able (even in a sleep-deprived state) to read

through smoothly once from beginning to end, nodding with approval and easy comprehension the whole time, without

pausing or backtracking or straining to think hard, ending with a complete understanding of your work. Any

calculations should be carried through to obtain explicit numerical answers. You should include just what is necessary

for a clear understanding of your results and how you obtained them; for example, include relevant R commands and

output but not complete unedited transcripts.

Good luck, and I hope you find this interesting and enjoyable! Don’t worry if you don’t know how to do any given

question or even how to get started; they are not supposed to be easy and different students have different amounts

of experience. The important thing for any of these problems is for you to spend at least a little time trying to work

on the problem, and then to get a hint or help if you need it; if you are starting to spend a long time without making

progress, please ask questions to each other, TA’s, and me! Also a problem may not have been stated in a clear way,

and questions of clarification could be really helpful for everyone. I can have some more office hours after classes.

1. [Monty Hall and Bayes] You are on a game show faced with 3 doors. Behind one of the doors is a car, and behind

the other two doors are goats; you prefer the car. Assume the position of the car is randomized to be equally likely

to be behind any door. You choose door #1. The host prolongs the drama by revealing that there is a goat behind

door #3. The host then asks you if you would like to switch your choice to door #2. Is it to your advantage to

switch? Your goal is to answer the host’s question by finding the conditional probability that the car is behind

door #2 given the relevant information. Actually as stated so far, not enough information is given to determine the

relevant probabilities, so consider the following two variations for the details of how the host decides which door to

open:

(a) This is the typical scenario assumed.* If the car is behind door #2, the host opens door #3; if the car is behind

door #3, he opens door #2; if the car is behind door #1, he opens door #2 or door #3 with probability ? each.

Suppose the host opens door #3. Should you switch to door #2?

(b) For this variation, suppose the host tosses a coin to choose which door (#2 or #3) to open, regardless of what is

behind those doors. Suppose it happens that door #3 is chosen, and it happens that you observe a goat behind

door #3. Should you switch to door #2?

[[Hint: This could be set up in different ways; I’ll try to describe one. To be definite about it, let’s not think of our

own choice to open door #1 as random; we know we will choose door #1. Now it’s like a frog about to take two

hops. The first hop determines the door where the car is hidden; we could call these 3 events C1, C2, and C3. These

3 events are assumed to have probability 1

3

each. From there, the second hop leads to the opening of a door, and we

are told that after two hops the frog ended up in a state where door #3 was opened and revealed a goat. Given that,

what is the probability that the frog passed through each of the 3 possible landing places for his first hop?]]

, If you find this question interesting, I think you will enjoy taking a look at this page.

2. Your favorite number is 3, and the idea of three 3’s makes you deliriously happy. You wonder if you roll a die 100

times, what is the probability of getting three consecutive 3’s at least once among those rolls. In this problem we’ll

see a way to calculate probabilities like this. So imagine you are rolling a die, repeatedly. We’ll say the first roll

occurs at “time 1,” the second roll at “time 2,” and so on,.

(a) Define T to be the first time at which we see any number other than 3. Write down the probability mass

function of T. (That is, find P{T = t} for t = 1,2,....)

*The idea here is that the host wants to open one door that is not the door you already chose, that is, he wants to open door 2 or 3. In this standard

variation of the game, the host knows where the car is, and he will not open that door. So, for example, if the car is behind door #2, then the only

choice the host has is to open door #3. The only case where the host has any choice is when the car is behind door #1, and in this scenario the host

tosses a coin to decide between opening door #2 or #3.

CIS-PROB-STAT 2019 PS#1

(b) Let Ak denote the event that there is at least one run of three consecutive 3’s in k rolls of the die, and let ak

denote the probability P(Ak). Find an expression for ak

in terms of ak1, ak2, and ak3. [[Hint: Condition on

T. That is, use the Law of Total Probability?

to write as P(Ak) = ∑∞t=1 P{T = t}P(Ak

|T = t). Another example

of a helpful observation is that P(Ak

| T = 1) = ak?1.]]

(c) Use R and the relationship you found in part (b) to determine the value of a100.

(d) How many times would you need to roll the die in order to have probability greater than 0.5 of getting at least

one run of three consecutive 3’s?

3. Check part of the previous question by programing and running a simulation to approximate the probability of at

least one run of three consecutive 3’s in 100 rolls of a die.

Hints: To save you some thinking and programming effort, I’ve written a function that returns the

length of the longest run of consecutive 3’s in a vector. So hopefully given this function and what we have done so

far, it will not be hard for you to repeat (many times) the experiment of generating a vector of 100 random die rolls,

and for each such vector, simply run the function on that vector and record the resulting value. Then

you can approximate the desired probability by the fraction of repetitions giving a maxRunOf3s value of at least 3.

Here is an example to help you use maxRunOf3s (and understand how it works, if you want):

4. [[Margin of error of polls, using R]] We are familiar with political polls that find a certain percentage of people

favoring a given candidate, with the poll claiming a certain “margin of error.” Suppose we take a random sample of

size n from the population and find that the fraction in the sample who favor the given candidate is 0.56. Letting ?

denote the unknown fraction of the population who favor the candidate, and letting X denote the number of people

in our sample who favor the candidate, we are imagining that we have just observed X = 0.56n (so the observed

sample fraction is 0.56). Our assumed probability model is X ~ B(n,?). Suppose our prior distribution for ? is

uniform on the set {0, 0.001, .002,..., 0.999, 1}.

This is an important result and we saw it in class but may not have emphasized it as much as it deserves. E.g. you can find it in the notes on the top

of page 24 and, for the Bayesian frog question, near the bottom of page 31.

2

CIS-PROB-STAT 2019 PS#1

(a) Use R to graph the posterior distribution in three cases: when n = 100, n = 400, and n = 1600.

(b) For each of the three cases, find the posterior probability P{ > 0.5 | X}

(c) For each of the three cases, find an interval of values that contains just over 95% of the posterior probability.

[You may find the function useful for this.] For each case, calculate the margin of error (defined to be

half the width of the interval, that is, the “±” value) and describe how the margin of error seems to depend on

the sample size (something like, when the sample size goes up by a factor of 4, the margin of error goes <up

or down?> by a factor of about <what?>).

[[A numerical tip: if you are looking in the notes, you might be led to try to use an expression like, for example,for the likelihood. But this can lead to numerical “underflow” problems because

the answers get so small. The problem can be alleviated by using the  function instead for the likelihood (as

we did in class and in the R script), because that incorporates a large combinatorial proportionality factor, such as


1600

896 

that makes the numbers come out to be probabilities that are not so tiny. For example, as a replacement for

the expression above, you would use.]]

5. [[Examples of maximum likelihood estimators]] For data that comes from a discrete distribution, the likelihood

function is the probability of the data as a function of the unknown parameter. For data that comes from a continuous

distribution, the likelihood function is the probability density function evaluated at the data, as a function of

the unknown parameter, and the maximum likelihood estimator (MLE) is the parameter value that maximizes the

likelihood function. For both of the questions below, write down the likelihood function and find the maximum

likelihood estimator, including a justification that you have found the maximum (this involves something beyond

finding a place where a derivative is 0).

(a) If X ~ Bin(n,), write down the likelihood function and show that the MLE for is Xn.

(b) The exponential distribution with parameter λ (denoted by Exp(λ)) is a continuous distribution having pdf.

Suppose T1,T2,...,Tn are independent random variables with Ti ~ Exp(λ) for all i. Define S = T1 +T2 +···+

Tn. Write down the likelihood function, and show that the MLE for λ is n

S

, the reciprocal of the average of the

Ti’s.

[[To start thinking about part (a) it may help to remember the class when we were doing inference about ? in a

poll of size n = 100 with the observed data X = 56. For that example we calculated and plotted the likelihoods

for = 0,.001,.002,...,.998,.999,1, and it looked like the value that gave the highest likelihood was 0.56. Well,

in that example. Here we are thinking of the likelihood as a function of the continuous variable ?

over the interval [0,1] and showing mathematically that =

X

n maximizes the likelihood. So start by writing down

the likelihood function, that is, writing the binomial probability for getting X successes in n independent trials each

having success probability . Think of this as a function of (in any given example, n and X will be fixed numbers,

like 100 and 56), and use calculus to find the that maximizes this function. You should get the answer

Just as a hint about doing the maximization, you could maximize the likelihood itself, or equivalently you could

maximize the log likelihood (which you may find slightly simpler). If you do not want to take logs, you can still do

the required differentiation using the product rule.]]

3


版权所有:留学生编程辅导网 2020 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp