For all problems below, please use x= 5043
1. (30 points) Nearly 100,000 observations are available on air temperature and specific humidity.
From these observations, scientists have estimated that the temperature T is approximately
Normally distributed with mean 280 K and standard deviation 6 K. The specific
humidity h has been found to vary approximately as:
h = (C/1000) * exp(C*T/500) (1)
where C also varies and is itself Normally distributed with mean x and
standard deviation x/10.
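A minimal sketch of drawing such a sample in R, offered as one possible setup and not as the expected answer. The seed and the names T_K, log_h and sim are assumptions made here for illustration. With x = 5043 the exponent C*T/500 is on the order of 2800, so exp() overflows in double precision; the sketch therefore evaluates equation (1) on the log scale.

```r
# Sketch only: draw T and C as described and evaluate equation (1) on the log scale.
set.seed(1)                                # arbitrary seed (assumption), for reproducibility
x   <- 5043
n   <- 30
T_K <- rnorm(n, mean = 280, sd = 6)        # temperature in K
C   <- rnorm(n, mean = x,   sd = x / 10)   # coefficient C

# log(h) = log(C/1000) + C*T/500, computed directly to avoid overflow in exp()
log_h <- log(C / 1000) + C * T_K / 500
h     <- exp(log_h)                        # overflows to Inf once log_h exceeds about 709.78

sim <- data.frame(T_K, C, log_h, h)
summary(sim$log_h)
```

Because log is monotone, the event h > h75 is the same as log(h) exceeding the 75th percentile of log(h), so the log-scale values can stand in for h when an exceedance indicator is needed.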
a) Generate a sample of size 30 for temperature and humidity given this information.
b) Using your sample, assess whether there is a trend (of any kind) in the temperature data.
c) Estimate a relationship that can predict the probability of h>h75 given T using your sample.
Here, h75 is the 75th percentile of the h data from your sample. Present the regression
diagnostics, and justify your model choice – linear, nonlinear, GLM, etc.
d) Now consider a relationship between h and T. Given equation 1 above and the description
of the probability distributions of T and C, what would be a good form for the model relating
h and T? You are welcome to consider transforms or local regression or any other method
you would like to apply. DO NOT FIT THIS REGRESSION MODEL. Assuming that the model
you have formulated is a linear model between some predictor and some response variable,
predict the value of the response variable corresponding to the lowest temperature in your
data set by constructing an appropriate weighted average of the response variable.
e) Would your approach and answer to d) change if equation (1) included a random error term
on the right-hand side? How and why? Do not solve.
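Note: the error e_i may multiply the right-hand side of equation (1) or be added to it. There is no need to consider the specific e_i, only how it enters the equation.
(1) -> h = (C/1000) * exp(C*T/500) * e_i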
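For part d), one reading, assumed here and not confirmed by the problem statement, is that taking logs of equation (1) gives log(h) = log(C/1000) + (C/500)*T, which is linear in T with a random slope. For a simple linear model, the least-squares prediction at a point t0 is itself a weighted average of the observed responses, with weights w_i = 1/n + (t0 - mean(T)) * (T_i - mean(T)) / Sxx. The sketch below illustrates that identity on a simulated sample; the seed and all variable names are hypothetical, and lm() is called only to check the weights.

```r
# Sketch only: a linear-model prediction written as a weighted average of responses.
set.seed(1)                                   # arbitrary seed (assumption)
x_par <- 5043
T_K   <- rnorm(30, 280, 6)                    # temperature in K
C     <- rnorm(30, x_par, x_par / 10)
y     <- log(C / 1000) + C * T_K / 500        # assumed response: log(h)

t0  <- min(T_K)                               # lowest temperature in the sample
Sxx <- sum((T_K - mean(T_K))^2)
w   <- 1 / length(T_K) + (t0 - mean(T_K)) * (T_K - mean(T_K)) / Sxx
y0  <- sum(w * y)                             # weighted average of the responses

# Check only: the weights reproduce the least-squares prediction at t0
fit <- lm(y ~ T_K)
all.equal(y0, unname(predict(fit, data.frame(T_K = t0))))
```

The weights sum to one but are not all positive: observations far on the other side of the mean temperature receive negative weight. A kernel or nearest-neighbour average would be a different choice, with non-negative weights.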
2. (30 points) Twenty groundwater wells are located in a rectangular region. The region is exactly 10 km by 10
km. The wells are located randomly with uniform sampling in the x and the y location
coordinates. Water level data have been recorded at each of the wells for 30 years. The data can be
obtained by executing the code below:
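S=runif(1)                                   # common mean level in year 1
loc=matrix(runif(40,0,10),ncol=2,nrow=20)    # well coordinates (km)
plot(loc)
d=dist(loc,diag=TRUE, upper=TRUE)            # pairwise distances between wells
c=exp(-d/max(d))                             # exponential spatial covariance
c=as.matrix(c)
diag(c)=rep(1,20)                            # unit variance at each well
library(MASS)                                # provides mvrnorm
data=matrix(ncol=20,nrow=30)                 # rows = years, columns = wells
data[1,]=mvrnorm(n=1, mu=rep(S,20), Sigma=c) # year 1: spatially correlated draw
# years 2 to 30: AR(1) in time at each well, with independent innovations
for ( i in 2:30){for (j in 1:20)data[i,j]=0.95*data[i-1,j]+rnorm(1,0,sqrt(1-0.95^2))}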
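One way to look for shared variability across the wells is principal component analysis on the years-by-wells matrix. The sketch below is a hedged illustration, not a prescribed method: it regenerates a data set with the same recipe so that it runs on its own, and the seed and the names Sig, pca and var_share are assumptions introduced here.

```r
# Sketch only: PCA across wells on a data set simulated with the same recipe.
library(MASS)                                  # for mvrnorm
set.seed(42)                                   # arbitrary seed (assumption)
S    <- runif(1)
loc  <- matrix(runif(40, 0, 10), ncol = 2, nrow = 20)
d    <- as.matrix(dist(loc))                   # 20 x 20 distance matrix, zero diagonal
Sig  <- exp(-d / max(d))                       # same covariance as above, diagonal = 1
data <- matrix(NA_real_, nrow = 30, ncol = 20)
data[1, ] <- mvrnorm(n = 1, mu = rep(S, 20), Sigma = Sig)
for (i in 2:30) {
  data[i, ] <- 0.95 * data[i - 1, ] + rnorm(20, 0, sqrt(1 - 0.95^2))
}

pca       <- prcomp(data, center = TRUE, scale. = FALSE)  # rows = years, columns = wells
var_share <- pca$sdev^2 / sum(pca$sdev^2)
round(cumsum(var_share)[1:5], 2)               # cumulative variance share of leading PCs
```

A few leading components carrying most of the variance would suggest common patterns; the loadings in pca$rotation can then be plotted against well location to see whether they are spatially organized.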
a. Is there any evidence of common patterns in this data set? What are some methods you could
use to explore this? Apply one of those methods; explain why you chose it and report the
results.
Note: do parts b and c first, then part a.
The common pattern of interest is in the time series at the 20 wells.
b. What is your estimate of the water level in year 15 at a location whose coordinates are (5,5)?
Clearly explain the procedure you used to develop this estimate, including a brief discussion of
competing methods you may have considered; why you chose the one you did; the assumptions
of that method, and whether they were satisfied when you applied that method. What is the
uncertainty of estimation for this estimate?
Note: see the previous homework.
c. Now consider the estimation of the water level in year 31 at the same location. Do not attempt
to compute this estimate. Sketch out two possible algorithms that you may use to develop this
estimate, and comment very briefly on what may be the possible advantage of one over the
other?
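For part b, ordinary kriging is one candidate among several (inverse-distance weighting and nearest-neighbour averaging are others). The sketch below is a hedged illustration under stated assumptions, not the expected solution. It assumes the covariance model is known from the generating code instead of being estimated from a variogram. Under the AR(1) recursion the well-to-well covariance in year t is a*Sigma + (1 - a)*I with a = 0.95^(2(t - 1)), so by year 15 most of the variance behaves like a nugget. It further assumes that a hypothetical well at (5,5) follows the same process. The seed and all names other than those in the original code are introduced here for illustration.

```r
# Sketch only: ordinary kriging of the year-15 level at (5,5) with an assumed covariance.
library(MASS)                                  # for mvrnorm
set.seed(42)                                   # arbitrary seed (assumption)
S    <- runif(1)
loc  <- matrix(runif(40, 0, 10), ncol = 2, nrow = 20)
d    <- as.matrix(dist(loc))
Sig  <- exp(-d / max(d))
data <- matrix(NA_real_, nrow = 30, ncol = 20)
data[1, ] <- mvrnorm(n = 1, mu = rep(S, 20), Sigma = Sig)
for (i in 2:30) {
  data[i, ] <- 0.95 * data[i - 1, ] + rnorm(20, 0, sqrt(1 - 0.95^2))
}

z  <- data[15, ]                               # year-15 levels at the 20 wells
r  <- max(d)                                   # range parameter used by the generator
a  <- 0.95^(2 * (15 - 1))                      # share of the year-1 covariance left in year 15
K  <- a * exp(-d / r) + (1 - a) * diag(20)     # assumed well-to-well covariance
d0 <- sqrt((loc[, 1] - 5)^2 + (loc[, 2] - 5)^2)  # distances from wells to (5,5)
k0 <- a * exp(-d0 / r)                         # assumed covariance with the target point

# Ordinary kriging system: K %*% lambda + mu = k0, subject to sum(lambda) = 1
A      <- rbind(cbind(K, 1), c(rep(1, 20), 0))
sol    <- solve(A, c(k0, 1))
lambda <- sol[1:20]
mu     <- sol[21]

z_hat  <- sum(lambda * z)                      # kriging estimate at (5,5)
ok_var <- 1 - sum(lambda * k0) - mu            # kriging variance, taking C(0) = 1
c(estimate = z_hat, variance = ok_var)
```

With an estimated variogram in place of the assumed covariance, the same linear system applies; the kriging variance then also inherits the uncertainty of the variogram fit, which this sketch ignores.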
(30 points)