STATISTICAL LABORATORY
Practical 6: Assessed coursework
ST104
Term 3, 2019
Important Information
This practical session is assessed. The deadline for submission is 11am on Thursday,
9th May.
Your reports should be submitted electronically on Moodle. Here is the link for submission:
https://moodle.warwick.ac.uk/mod/assign/view.php?id=674702
You can also find the submission link on the right-hand side of the module webpage.
Please note that your report must:
- be submitted in PDF,
- be no more than 5 sides of A4 in length (excluding figures),
- be 12pt font for the main body of the report.
You will lose marks if you do not follow these requirements.
Also, please make sure you include your student ID code and lab group number on the
front sheet, and NOT your name!
Exercises
1. Pseudo-random numbers and the inversion method.
(a) For arbitrary choices of initial seeds U_1 and U_2 (in the interval (0, 1]), let
U_{i+2} = [U_{i+1} + U_i] (mod 1), i ∈ N.
Why is this not a good pseudo-random independent U(0, 1] generator? [2 marks]
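To explore the recurrence in part (a) empirically, the sequence can be generated in R. The following is a minimal sketch: the function name additive.generator and the seed values are assumptions made for illustration and are not part of the practical. Note that R's %% operator returns values in [0, 1) rather than (0, 1].

```r
# Hypothetical helper (not part of the practical): iterate
# U_{i+2} = (U_{i+1} + U_i) mod 1 starting from two chosen seeds.
# Assumes n >= 2.
additive.generator <- function(n, u1, u2) {
  u <- numeric(n)
  u[1] <- u1
  u[2] <- u2
  for (i in seq_len(n - 2)) {
    u[i + 2] <- (u[i + 1] + u[i]) %% 1   # %% 1 keeps the fractional part
  }
  u
}

u <- additive.generator(n = 1000, u1 = 0.3141, u2 = 0.2718)  # arbitrary seeds
head(u)   # first few values of the sequence
```

The resulting vector u can then be examined with whatever summaries or plots you consider appropriate.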
(b) During lectures we saw how the inversion method can be used to simulate
from a Bernoulli(p) distribution using U ~ U(0, 1]. The function
generate.bernoulli() below implements this.
generate.bernoulli <- function(n = 1, p = 0.5) {
  sample <- runif(n)
  bernoulli <- as.numeric(sample < p)
  return(bernoulli)
}
Explain in words what each line of code is doing and briefly justify why this
method does what it was intended to do. [3 marks]
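As a quick sanity check, the function can be called and its output summarised as sketched below. The seed, sample size and value of p are arbitrary choices for illustration. The definition is repeated only so that the snippet runs on its own.

```r
# Definition repeated from the practical so that the snippet is self-contained.
generate.bernoulli <- function(n = 1, p = 0.5) {
  sample <- runif(n)
  bernoulli <- as.numeric(sample < p)
  return(bernoulli)
}

set.seed(104)                                 # arbitrary seed, for reproducibility only
x <- generate.bernoulli(n = 10000, p = 0.3)   # illustrative choices of n and p
table(x)                                      # counts of 0s and 1s
mean(x)                                       # sample proportion of 1s, to compare with p
```

With a sample this large, the proportion of 1s should be close to the chosen p.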
(c) How would you simulate from a Geometric(p) distribution:
(i) Using U ~ U(0, 1]?
Hint: Show that F(x) = 1 − (1 − p)^⌊x⌋, where ⌊x⌋ denotes the greatest
integer less than or equal to x. [2.5 marks]
(ii) Using E ~ Exp(1)? [1.5 marks]
Write a function generate.geometric() that, given a sample size n and a
probability p, returns a vector of length n which contains realisations from a
Geometric(p). Your function should use either a sample of size n from the
U(0, 1] distribution or a sample of size n from the Exp(1) distribution. Choose
only one of these two approaches but you should NOT use the built-in rgeom()
function. Investigate (via comparisons you deem appropriate) what happens if
you change the size n or the probability p. Try to experiment with the following
combinations of n and p and include your comments in the report.
    n      p
   10    0.1
10000    0.1
   10    0.9
10000    0.9
[4 marks]
[Total: 13 marks]
2. Bernoulli random variables and their relatives.
(a) Given a source of Bernoulli random variables, it’s relatively easy to write a
function to generate from the Binomial distribution (remember that a Binomial
random variable with parameters n and p is the sum of n independent Bernoulli
random variables of common success probability p). Write a function which, given
n and p, will generate a single Binomial(n, p) random variable. Your function can
make use of generate.bernoulli() as given above (or any other user-defined
function which simulates from a Bernoulli(p) distribution) but you should NOT
use the built-in rbinom() function. [1 mark]
(b) Write another function which, given m, n and p, will generate m realisations
from the Binomial(n, p) distribution. You can use any of the previously defined
functions but you should NOT use the built-in rbinom() function. Use your
function to generate 5000 realisations from a Binomial(10, 0.25) distribution. In
your report, only include the code for the function and the R command you used
to call it (NOT the 5000 realisations). [1.5 marks]
(c) Plot a histogram of those realisations (normalised like a probability density).
You may need to use the argument breaks to get a sensible histogram. In your
report, include both the histogram and the R command you used to obtain the
histogram. [1.5 marks]
(d) Use R to compute the sample mean and sample variance for your realisations. In
your report, include both the R commands you used and your answers. How do
your answers compare to the expectation and variance of a Binomial(10, 0.25)
random variable? [2 marks]
(e) If we wish to plot the graph of a function in R, we can evaluate that function
on a grid of points and use the plotting functions to join the dots. Use seq to
generate a suitable grid of points to add the density of a normal distribution of
the same mean and variance to your histogram. Use the lines function (which
adds lines to an existing graph rather than plotting a new one) to add a blue
line showing this normal density to your histogram. Include both your code and
the corresponding graph in your report. [2 marks]
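The grid-and-lines technique described in part (e) can be sketched on an unrelated example. Everything below is an assumption made for illustration: the sample x is drawn with rnorm() rather than taken from the exercise, and the grid size of 200 points is arbitrary.

```r
set.seed(1)                                   # arbitrary seed, for reproducibility only
x <- rnorm(2000, mean = 5, sd = 2)            # hypothetical sample, not the exercise data

# Histogram on the density scale (total area 1), with a title.
hist(x, freq = FALSE, main = "Histogram of a hypothetical sample", xlab = "x")

# Evaluate a normal density, matched to the sample mean and standard
# deviation, on an evenly spaced grid covering the range of the data.
grid <- seq(min(x), max(x), length.out = 200)
dens <- dnorm(grid, mean = mean(x), sd = sd(x))

# lines() draws on the existing plot instead of starting a new one.
lines(grid, dens, col = "blue")
```

A finer grid gives a smoother curve. Because lines() joins consecutive points with straight segments, a very coarse grid will look visibly angular.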
(f) Repeat steps (b)-(e) for a Binomial(1000, 0.25) distribution. What do you observe?
In your report, only include the corresponding histogram (with the corresponding
normal density in blue) and your comments. [2 marks]
(g) Repeat steps (b)-(e) for a Binomial(10000, 0.0001) distribution. What do you
observe? Try to also add the probability mass function of a Poisson distribution
with the same mean. For the Poisson mass function use a red colour. What
is significant about what you observe here? In your report only include the
corresponding histogram (with the corresponding Normal density in blue and
Poisson mass function in red) and your comments. [3 marks]
[Total: 13 marks]
3. Convolutions.
(a) Use rexp() to obtain a sample of 10,000 Exp(1) random variables. Plot a
histogram of your sample on the same scale as a probability density. What do
you observe and why? In your report, include your code, the histogram and
your comments. [1.5 marks]
(b) Write a function which has one argument, n, and which returns a vector of length
n, each element of which is obtained as the sum of two Exp(1) random variables.
Plot a histogram of the values obtained using this function for n = 10,000. In
your report, include your code and the histogram. [1.5 marks]
(c) Compute the density of the sum E1 + E2, where E1 and E2 are independent
Exp(1) random variables. How does this density relate to
the density of a Gamma(α, β) distribution? Add the density you obtained to the
histogram you produced in part (b) in red. In your report, include the
computations, your comments, the code and the corresponding plot. [4 marks]
(d) Adapt the function you wrote in part (b) to accept 2 arguments, n and k, and to
return a vector of length n, each element of which comprises the sum of k independent
Exp(1) random variables. Include your code in the report. [2 marks]
(e) Plot a histogram of the values you obtain using the function of part (d) for
k = 10 and k = 50, when n = 10,000. In your report, include the code
you used to obtain the vectors (NOT the vectors) and the corresponding histograms.
[2 marks]
(f) Do the histograms you obtain resemble any common probability density? If so,
add the appropriate density function to the plot in red colour. In your report,
include your answer (only brief justification needed and not a proof) as well as
the corresponding histograms with the appropriate density. [3 marks]
[Total: 14 marks]
Note: For full marks, do not forget to add suitable titles to your plots.