Homework 5
For this homework install the updated simts package as well as the expsmooth, quantmod and pageviews R
packages from CRAN.
Question 1
Suppose we collect a time series $X = (X_1, \dots, X_T)$.
(a) Based on the following R output, write down the estimated AR(1) model for the data X. What is
the estimate of the variance of the white noise?
> fit.ML = estimate(AR(1),X, demean = FALSE)
> fit.ML
Fitted model: AR(1)
Estimated parameters:
Call:
arima(x = as.numeric(Xt), order = c(p, intergrated, q), seasonal = list(order = c(P,
seasonal_intergrated, Q), period = s), include.mean = demean, method = meth)
Coefficients:
        ar1
       0.97
s.e.   0.04
sigma^2 estimated as 0.8929: log likelihood = -137.17, aic = 278.34
(b) Give the 95% confidence interval for φ (the 97.5% quantile of the standard normal distribution is 1.96).
Does this confidence interval have any drawback? Is there a method that avoids this issue? Explain in detail.
(c) Now we use AIC and BIC to select a model from a pool of candidates. Model 1 has AIC = 246.34 and
BIC = 268.64. Model 2 has AIC = 256.45 and BIC = 264.26.
If the set of candidate models (i.e. Model 1 and Model 2) does not include the true model, which
information criterion should we use, and which model should we select? If the set of candidate models
does include the true model, which information criterion should we use, and which model should we
select?
(d) After selecting the model, we obtain the R output below.
Based on this estimated model, if we know that $X_{T-4} = 1$, $X_{T-3} = 1.3$, $X_{T-2} = 0.8$, $X_{T-1} = 1$ and $X_T = 0.6$,
give the one-day-ahead prediction $X_{T+1}^{T}$ and the two-day-ahead prediction $X_{T+2}^{T}$.
Show the details of your work.
Fitted model: AR(2)
Estimated parameters:
Call:
arima(x = as.numeric(Xt), order = c(p, intergrated, q), seasonal = list(order = c(P,
seasonal_intergrated, Q), period = s), include.mean = demean, method = meth)
Coefficients:
          ar1     ar2
         -0.3     0.6
s.e.   0.0804  0.0819
sigma^2 estimated as 0.9282: log likelihood = -138.97, aic = 246.34
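As a rough illustration of the arithmetic behind parts (b) and (d), the following base R sketch builds a Wald-type interval from the reported estimate and standard error, and iterates the fitted AR(2) recursion for the forecasts. It assumes a zero-mean model (as suggested by demean = FALSE in part (a); the output for part (d) does not show this) and treats the printed coefficients as exact. The variable names are illustrative only.

```r
# Part (b): Wald-type 95% interval, estimate +/- 1.96 * standard error.
phi_hat <- 0.97
se_phi  <- 0.04
ci <- phi_hat + c(-1, 1) * 1.96 * se_phi
print(ci)
# Compare the upper bound with 1, the boundary of the stationarity region of an AR(1).

# Part (d): iterate X_{T+h} = ar1 * X_{T+h-1} + ar2 * X_{T+h-2},
# with future noise replaced by its mean 0 (zero-mean model assumed).
ar1 <- -0.3
ar2 <- 0.6
x_Tm1 <- 1.0   # X_{T-1}
x_T   <- 0.6   # X_T
pred_1 <- ar1 * x_T + ar2 * x_Tm1      # one step ahead
pred_2 <- ar1 * pred_1 + ar2 * x_T     # two steps ahead, reusing pred_1
print(c(pred_1, pred_2))
```

Only the last two observations enter the AR(2) forecasts; the earlier values given in the question are not needed for the recursion.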
Question 2
The data hospital contains monthly patient counts for 767 hospitals from January 2000 to December 2007.
Focus on the eighth hospital (i.e. hospital[, 8]):
1. Comment on the plot of the time series: does it appear to be stationary?
2. If not stationary, use a linear regression to remove possible trends and/or seasonalities.
3. Perform a diagnostic analysis on the residuals. State the purpose of each of the six graphs, one by one and in
detail. Does there appear to be dependence between lags?
4. If there appears to be dependence in the residuals, propose and estimate a time series model for them.
Justify the model.
Question 3
Using the quantmod library, download the stock prices of Microsoft from January 1st, 2000 (use the symbol
“MSFT” within the getSymbols() function).
1. Comment on the plot of the time series: does it appear to be stationary? If not, suggest how to make it
stationary.
2. To obtain the stock returns we need to take a first difference of the stock prices: does this time series
appear to be stationary?
3. Analyse the ACF and PACF plots of the returns: does there appear to be dependence?
4. Analyse the ACF and PACF plots of the absolute value of the returns: does there appear to be
dependence? Discuss.
5. Propose and estimate a time series model for the returns and for the absolute returns. Justify the
models.
Question 4
Consider the utility data that contains hourly utility demand in the Midwest from January 1st 2003 to
May 7th 2003. Suppose you’re an analyst for the energy company:
1. Apply linear regression (or another method) on time-related variables (e.g. hours, days, months) to
obtain stationary residuals.
2. Remove the last 24 hours (observations) from the residuals:
a) Suggest and estimate a time series model for the first 3000 observations.
b) Deliver point forecasts and 95% confidence intervals for the next 24 hours.
c) Based on the point forecasts you obtained from the previous question, compute the Median
Absolute Prediction Error (MAPE) of your point forecasts, defined as
$\mathrm{MAPE} = \operatorname{median}_{j = 1, \dots, 24} \left| X_{t+j}^{t} - X_{t+j} \right|$,
where $X_{t+j}^{t}$ are the j-ahead point forecasts and $X_{t+j}$ are the j-ahead realizations representing the
removed 24 hours.
d) Based on the confidence intervals computed in part b), compute the empirical coverage of
your confidence intervals (i.e. the percentage of times your confidence intervals contain the actual
corresponding future realization). Is it close to 95%? If not, explain possible reasons why.
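The two evaluation measures in parts c) and d) can be sketched in base R as follows. The function names mape() and coverage() and the toy numbers are hypothetical, chosen only for illustration; they are not part of simts.

```r
# Median absolute prediction error between forecasts and realizations.
mape <- function(forecast, actual) {
  median(abs(forecast - actual))
}

# Share of realizations that fall inside their [lower, upper] interval.
coverage <- function(lower, upper, actual) {
  mean(actual >= lower & actual <= upper)
}

# Toy example with made-up numbers (not the utility data).
actual   <- c(1.0, 2.0, 3.0, 4.0)
forecast <- c(1.5, 1.0, 3.0, 6.0)
lower    <- forecast - 1
upper    <- forecast + 1

print(mape(forecast, actual))
print(coverage(lower, upper, actual))
```

With only 24 held-out hours, the empirical coverage can take just 25 distinct values, so some deviation from 95% is expected even for a well-specified model.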
Question 5
Using the ukcars data representing the quarterly production of cars in the UK (in thousands) from the first
quarter of 1977 to the first quarter of 2005, do the following:
1. Check if the time series is stationary and, if not, perform a linear regression to make the residuals
stationary (use time and quarters as covariates).
2. Estimate an AR(8) model for the residuals using the MLE and give the parameter estimates.
a) Give the parameter confidence intervals using their asymptotic distribution.
b) Give the parameter confidence intervals using parametric bootstrap (B = 500).
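The parametric bootstrap in part b) can be sketched as follows: simulate B series from the fitted model, re-estimate the model on each, and take empirical quantiles of the re-estimates. To keep the run short, this sketch uses a simulated AR(1) with base R's arima() and arima.sim() rather than the ukcars residuals, an AR(8) and simts. The seed, sample size and B = 200 are arbitrary assumptions, and the percentile interval is only one of several bootstrap interval constructions.

```r
set.seed(1)
n <- 200
x <- arima.sim(model = list(ar = 0.5), n = n)   # stand-in for the regression residuals

# Maximum likelihood fit of a zero-mean AR(1).
fit <- arima(x, order = c(1, 0, 0), include.mean = FALSE, method = "ML")
phi_hat    <- coef(fit)[["ar1"]]
sigma2_hat <- fit$sigma2

# Asymptotic (Wald) interval from the estimated standard error.
se_phi  <- sqrt(vcov(fit)[1, 1])
ci_asym <- phi_hat + c(-1, 1) * qnorm(0.975) * se_phi

# Parametric bootstrap: simulate from the fitted model and re-estimate.
B <- 200
phi_star <- numeric(B)
for (b in seq_len(B)) {
  x_star <- arima.sim(model = list(ar = phi_hat), n = n, sd = sqrt(sigma2_hat))
  fit_b  <- arima(x_star, order = c(1, 0, 0), include.mean = FALSE, method = "ML")
  phi_star[b] <- coef(fit_b)[["ar1"]]
}
ci_boot <- unname(quantile(phi_star, c(0.025, 0.975)))

print(rbind(asymptotic = ci_asym, bootstrap = ci_boot))
```

For the AR(8) of the exercise, the same loop would store a vector of eight coefficients per replicate and take quantiles column by column.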
Question 6
Consider the following code:
library(simts)  # provides gen_gts() and AR()
set.seed(2)
Xt = gen_gts(n = 200, AR(phi = c(0.75, 0.2), sigma2 = 1))
B = 5000
mat = matrix(0,B,52)
mat[,1] = rep(Xt[199], B)
mat[,2] = rep(Xt[200], B)
for (i in 1:B){
for (j in 3:52){
mat[i,j] = 0.75*mat[i,(j-1)] + 0.2*mat[i,(j-2)] + rnorm(1)
}
}
1. Explain what this code is doing.
2. Change the code to obtain only point forecasts for the next 10 observations (i.e. 201 to 210).
3. Use this code to deliver 95% confidence intervals for the next 10 observations (i.e. 201 to 210).
4. Modify the code in order to obtain point forecasts and 95% confidence intervals for the next 20
observations of the following time series:
Yt = gen_gts(n = 100, AR(phi = c(0.8), sigma2 = 0.5))
Question 7
Using the article_pageviews() function in the pageviews package, download the number of views for the
article on “Cheese” (Cheese) from August 1st to September 30th, 2018.
1. Comment on the plot of this time series.
2. Comment on the ACF and PACF plots of the time series.
3. Use the robacf() function in the robcor package: do you notice a difference with the standard ACF?
Discuss.
4. Use the estimate() function to estimate the φ parameter of an AR(1) model using the MLE and the
RGMWM: comment on the two estimates of φ.
5. Adapting the code from Question 6, use the RGMWM estimates to deliver point forecasts and confidence
intervals for the next 28 days (October 1st to October 28th):
a) Compute the MAPE for your point forecasts.
b) Compute the empirical coverage of your confidence intervals.
Question 8
Consider the Ljung-Box statistic defined as:
$Q_h = T(T + 2) \sum_{j=1}^{h} \frac{\hat{\rho}_j^2}{T - j}$,
where $\hat{\rho}_j$ is the estimated autocorrelation at lag j of a given time series. Show that, as T → ∞,
$Q_h \xrightarrow{D} \chi^2_h$,
meaning that the statistic tends towards a Chi-square distribution with h degrees of freedom.
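For a numerical feel of the statistic, the sketch below computes the Ljung-Box statistic by hand from the sample autocorrelations and compares it with base R's Box.test(..., type = "Ljung-Box"). The simulated Gaussian white noise, the seed and the choice h = 10 are assumptions made only for this illustration.

```r
set.seed(123)
T_len <- 500
h     <- 10
x     <- rnorm(T_len)                       # Gaussian white noise

# Sample autocorrelations at lags 1..h (drop lag 0).
rho_hat <- acf(x, lag.max = h, plot = FALSE)$acf[-1]

# Ljung-Box statistic: T (T + 2) * sum over j of rho_hat_j^2 / (T - j).
Q_h <- T_len * (T_len + 2) * sum(rho_hat^2 / (T_len - seq_len(h)))

# Built-in version, and the chi-square(h) p-value for comparison.
bt <- Box.test(x, lag = h, type = "Ljung-Box")
print(c(by_hand = Q_h, Box.test = unname(bt$statistic)))
print(pchisq(Q_h, df = h, lower.tail = FALSE))
```

Repeating the simulation many times and comparing the histogram of the statistic with the chi-square density with h degrees of freedom gives an informal check of the limit.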
Question 9
Check whether the following models satisfy the conditions for causal and invertible ARMA models (check for
parameter redundancy first).
1. $X_t - 1.5 X_{t-1} + 0.5 X_{t-2} = W_t - 1.8 W_{t-1} + 0.8 W_{t-2}$
2. $X_t - 1.1 X_{t-1} + 0.28 X_{t-2} = W_t - 0.7 W_{t-1}$
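One way to check such models numerically is to compute the roots of the AR and MA polynomials with base R's polyroot(), which takes coefficients in increasing order of power. Roots shared by both polynomials point to redundant factors that should be cancelled; the remaining roots must lie outside the unit circle for causality (AR side) and invertibility (MA side). The helper name poly_roots() is hypothetical.

```r
# Roots of 1 + c1 z + c2 z^2 + ..., sorted by modulus.
poly_roots <- function(coefs) {
  r <- polyroot(c(1, coefs))
  r[order(Mod(r))]
}

# Model 1: phi(z) = 1 - 1.5 z + 0.5 z^2, theta(z) = 1 - 1.8 z + 0.8 z^2
ar_roots_1 <- poly_roots(c(-1.5, 0.5))
ma_roots_1 <- poly_roots(c(-1.8, 0.8))

# Model 2: phi(z) = 1 - 1.1 z + 0.28 z^2, theta(z) = 1 - 0.7 z
ar_roots_2 <- poly_roots(c(-1.1, 0.28))
ma_roots_2 <- poly_roots(-0.7)

# Moduli of the roots; compare AR and MA roots to spot common factors.
print(round(Mod(ar_roots_1), 4)); print(round(Mod(ma_roots_1), 4))
print(round(Mod(ar_roots_2), 4)); print(round(Mod(ma_roots_2), 4))
```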
Question 10
Consider the causal model:
$X_t = \phi X_{t-1} + W_t, \quad t = 1, \dots, T,$
where the $W_t$ are i.i.d. $N(0, 1)$.
1. Derive the conditional MLE for $\phi$, say $\hat{\phi}$.
2. Find the theoretical PACF values of $X_t$.
3. From the R output, we have $\hat{\phi} = 0.94$ and the standard error of $\hat{\phi}$ is 0.05. Give the 95%
confidence interval for $\phi$. Is this result reasonable or not? If we have $X_8 = 10$, give the best linear
prediction of $X_{10}$.
4. Suppose we make no model assumptions on a series of observations $Y_1, \dots, Y_N$ and use the AIC
principle to select a model. Model 1 has AIC = 275 and Model 2 has AIC = 276. Which model should
we use? Explain why.
5. Why is the property of stationarity important when working with a time series?
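For item 1 of this question, a derived estimator can be sanity-checked numerically. The sketch below maximises the Gaussian log-likelihood conditional on the first observation with base R's optimize() and compares the result with the least-squares ratio that the conditional likelihood leads to when the noise variance is known. The simulated series, the seed and the true value 0.6 are assumptions for illustration only.

```r
set.seed(42)
n <- 300
x <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))

# Conditional log-likelihood given the first observation, with W_t ~ N(0, 1).
cond_loglik <- function(phi) {
  sum(dnorm(x[-1], mean = phi * x[-n], sd = 1, log = TRUE))
}

# Numerical maximiser over (a slightly shrunk version of) the causal region (-1, 1).
phi_num <- optimize(cond_loglik, interval = c(-0.999, 0.999),
                    maximum = TRUE, tol = 1e-10)$maximum

# Least-squares ratio for comparison.
phi_ls <- sum(x[-1] * x[-n]) / sum(x[-n]^2)

print(c(numerical = phi_num, least_squares = phi_ls))
```

The two values should agree up to the optimiser's tolerance; the check does not replace the derivation asked for in the question.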
Question 11
Consider the theoretical ACF and PACF presented in the figure below. Using the figure:
1. Propose a reasonable model for this time series. Justify your answer.
2. Propose a value of the model’s parameters. Justify your answer.
[Figure: two panels. "Theoretical ACF plot": ACF against Lags (0 to 20), vertical axis from −0.5 to 1.0. "Theoretical PACF plot": PACF against Lags (up to 20), vertical axis from −0.5 to 0.0.]
Question 12
(Adapted from Cryer & Chan, Exercise 7.28) The data file named deere3 contains 57 consecutive values from
a complex machine tool at Deere & Co. The values given are deviations from a target value in units of ten
millionths of an inch. The process employs a control mechanism that resets some of the parameters of the
machine tool depending on the magnitude of deviation from target of the last item produced. Load the TSA
package and then use data(deere3) to load the data.
1. Plot the data. Does it appear stationary?
2. Plot the ACF and PACF for the data. Which values for ARMA(p, q) are suggested?
3. Estimate the parameters of an AR(1) model for this series using maximum likelihood. Repeat this for
an AR(2) model. Report the estimates, their standard errors, and the AIC values.
4. Simulate from both fitted models using the estimated parameters, with n = 57. Plot the simulated
data for both models, and compare them to the original data.
5. Using your observations from parts (3) and (4), which of AR(1) or AR(2) would you prefer and why?
Question 13
The data in sheep.dat are the sheep population (in millions) for England and Wales from 1867-1939. It can
be read into R with the command
sheep = ts(scan("sheep.dat"), start = 1867)
1. Plot the time series.
2. Perform any transforms, take differences, and/or remove mean structure via regression to produce a
stationary series.
3. Plot diagnostics from removing mean structure.
4. Produce plots of the ACF and PACF of the stationary series.
5. Justify a preliminary order of ARMA model.
6. Fit the model, simulate from it, and compare with the data.
7. Perform model diagnostics: calculate the residuals, check them for normality, and their ACF for
remaining time series structure. Apply the Ljung-Box test.
8. Make any adjustments to your model suggested by residuals diagnostics.
9. State your final model, and include 1-2 sentences of justification.
Question 14
Calculate the autocorrelation function ρ(h) of $X_t$, where $X_t$ comes from the following model:
$X_t = 1.4 X_{t-1} - 0.48 X_{t-2} + W_t - 0.2 W_{t-1} - 0.48 W_{t-2}$,
where $W_t$ is a Gaussian white noise with variance $\sigma^2$.
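A hand-derived ρ(h) can be checked numerically with base R's ARMAacf(), which returns the theoretical ACF of an ARMA model from its AR and MA coefficients. R writes the MA part with plus signs, so the MA coefficients of this model enter as −0.2 and −0.48. The choice lag.max = 5 is arbitrary.

```r
# Theoretical ACF of
#   X_t = 1.4 X_{t-1} - 0.48 X_{t-2} + W_t - 0.2 W_{t-1} - 0.48 W_{t-2}.
# The ACF does not depend on the white noise variance.
rho <- ARMAacf(ar = c(1.4, -0.48), ma = c(-0.2, -0.48), lag.max = 5)
print(round(rho, 4))
```

If the AR and MA polynomials share a factor, the same values should come out of ARMAacf() applied to the reduced model, which is a useful cross-check on the hand calculation.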
Question 15
In this question, after fitting a linear model, the model diagnostic plots are presented in Figure 1.
Based on these diagnostic plots, what kind of model would you suggest trying? State your reasons for
selecting this model.
Figure 1: Diagnostic of the model