Econ 3818
Spring 2019
R exercise 5
Due Wednesday, May 1st in class. If you are unable to attend class that day, please email it to me
before class and submit a hard copy the following class. Only hard copies will be graded.
Submit the R code used to answer all questions as part of the document. Set the R code off with
three asterisks (***) above and below it.
You may work in groups of three (but no more than three!). Please put the names of all group
members at the top of the text file.
1. For this exercise we will run a regression using Swiss demographic data from around
1888. The sample is a cross-section of French-speaking counties in Switzerland.
These data come with the R package datasets. The first step is to load the package into
your current environment by typing the command library(datasets) into the R console.[1]
This loads a number of datasets including one called swiss. Type help(swiss) in the
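console for additional details. The basic variable definitions, as given in help(swiss), are as follows:
Fertility: common standardized fertility measure (Ig)
Agriculture: % of males involved in agriculture as occupation
Examination: % of draftees receiving the highest mark on the army examination
Education: % of draftees with education beyond primary school
Catholic: % Catholic (as opposed to Protestant)
Infant.Mortality: live births who live less than 1 year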
Use the summary() command to report the mean and median for the variables Fertility,
Education, and Catholic.
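As a minimal sketch of the syntax (one possible approach, not the only one), summary() can be applied to one column at a time or to several columns at once. The object names fertility_summary, education_summary, and catholic_summary are hypothetical names chosen for illustration.

```r
library(datasets)  # normally attached by default; calling it again is harmless

# summary() on a numeric column reports min, quartiles, median, mean, and max
fertility_summary <- summary(swiss$Fertility)
education_summary <- summary(swiss$Education)
catholic_summary  <- summary(swiss$Catholic)

print(fertility_summary)
print(education_summary)
print(catholic_summary)

# Alternative: summarize the three columns in one call by selecting them by name
print(summary(swiss[, c("Fertility", "Education", "Catholic")]))
```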
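2. We want to estimate the expected Fertility level in a Swiss county conditional on the
county’s education level. We assume the relationship is linear, so we are interested in
$$E[\text{Fertility}_i \mid \text{Education}_i] = \alpha + \beta \, \text{Education}_i.$$
If we use Ordinary Least Squares to estimate $\alpha$ and $\beta$, we have the following formulas:
$$\hat{\beta} = r_{xy} \, \frac{s_y}{s_x}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta} \, \bar{x},$$
where $y$ is the left-hand-side variable, $x$ is the right-hand-side variable, the bar denotes
the mean, $s$ is the standard deviation, and $r$ is the correlation.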
[1] If you encounter an error, you might have to install the package onto your computer using the
install.packages("datasets") command.
a. Find the correlation between Education and Fertility using the cor() function, as
well as the sample standard deviation for each variable using the sd() function.
Report these numbers and the code used.
b. Use the cor() and sd() functions to get an estimate for beta in the equation relating
Fertility to Education. Keep this value stored in a scalar called beta_hat. Report
this number.
c. Now use the estimate beta_hat, along with the function mean() to get an estimate
for alpha. Keep this value stored as a scalar called alpha_hat. Report this number.
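The steps in parts a through c can be sketched as follows, assuming the standard simple-regression identities: the slope equals the correlation times the ratio of the standard deviations (s_y / s_x), and the intercept equals the mean of y minus the slope times the mean of x. The names r_xy, s_x, and s_y are hypothetical; beta_hat and alpha_hat follow the naming asked for above.

```r
library(datasets)

# Part a: sample correlation and sample standard deviations
r_xy <- cor(swiss$Education, swiss$Fertility)
s_x  <- sd(swiss$Education)   # right-hand-side variable
s_y  <- sd(swiss$Fertility)   # left-hand-side variable

# Part b: slope = correlation * (sd of y / sd of x)
beta_hat <- r_xy * s_y / s_x

# Part c: intercept = mean of y - slope * mean of x
alpha_hat <- mean(swiss$Fertility) - beta_hat * mean(swiss$Education)

print(c(r_xy = r_xy, s_x = s_x, s_y = s_y))
print(c(alpha_hat = alpha_hat, beta_hat = beta_hat))
```

Note the order of the ratio: the standard deviation of the left-hand-side variable goes in the numerator.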
3. Use alpha_hat and beta_hat to predict the average fertility rate in a county where 40% of
the population is educated.
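A hedged sketch of the prediction step: the fitted line is evaluated at the chosen education level. It assumes Education is recorded in percentage points (values between 0 and 100, as in help(swiss)), so 40% enters as 40 rather than 0.40. The names education_level and predicted_fertility are hypothetical.

```r
library(datasets)

# Recompute the estimates so this snippet runs on its own
beta_hat  <- cor(swiss$Education, swiss$Fertility) *
  sd(swiss$Fertility) / sd(swiss$Education)
alpha_hat <- mean(swiss$Fertility) - beta_hat * mean(swiss$Education)

# Assumption: Education is measured in percent, so 40% is entered as 40
education_level <- 40

# Predicted value on the fitted line: alpha_hat + beta_hat * x
predicted_fertility <- alpha_hat + beta_hat * education_level
print(predicted_fertility)
```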
4. Plot the relationship between Fertility and Education using the plot() function, with
Education on the horizontal axis. Make sure to label your axes!
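One way to write the call is sketched below. The first argument to plot() goes on the horizontal axis, and xlab, ylab, and main set the labels. The label text is an illustrative choice, not a required wording.

```r
library(datasets)

# Label text is illustrative; any clear description of the variables works
x_label <- "Education (% of draftees beyond primary school)"
y_label <- "Fertility (standardized fertility measure)"

# First argument is the horizontal axis, second is the vertical axis
plot(swiss$Education, swiss$Fertility,
     xlab = x_label,
     ylab = y_label,
     main = "Fertility and Education, Switzerland c. 1888")
```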
5. Now estimate the model relating Fertility Rate to Education using the lm()
function in base R. Typically, if you want to estimate $y_i = \alpha + \beta x_i + \varepsilon_i$ you use
the syntax
lm(yvar ~ xvar, dataframe).
a. Store the results from estimating the model in a list called model_1.
This list includes a number of details, including the estimated parameters and
all of the residuals from the model. Others, such as the coefficient of
determination (R-squared), are reported by summary(model_1).
b. Use the command summary(model_1) to report the summary of the ordinary least
squares estimation and paste the results in the Word document. Do you have the
same estimates for alpha and beta as in Question 2, parts b and c?
c. What is the R-squared from this regression? Interpret it in a meaningful way.
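A minimal sketch of the lm() workflow for this question follows. The names model_1_summary and r_squared are hypothetical. Note that R-squared is stored in the object returned by summary(), not in the fitted model object itself.

```r
library(datasets)

# Fit Fertility on Education by ordinary least squares
model_1 <- lm(Fertility ~ Education, data = swiss)

# Components stored in the fitted object (coefficients, residuals, and more)
print(names(model_1))

# Full regression summary: estimates, standard errors, t-tests, R-squared
model_1_summary <- summary(model_1)
print(model_1_summary)

# R-squared is a component of the summary object
r_squared <- model_1_summary$r.squared
print(r_squared)
```

In a regression with a single right-hand-side variable, R-squared equals the squared correlation between the two variables, which gives a quick check against the cor() value from Question 2.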
6. For each one of the estimated parameters reported in Question 5:
a. Interpret the coefficient in a meaningful way.
b. Report the results from testing the null hypothesis that the true parameter value is
zero.
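The test results can be read off the coefficient table in the regression summary. The sketch below extracts that table. The name coef_table is hypothetical, and the 5% significance level is an assumed convention that the exercise does not specify.

```r
library(datasets)

model_1 <- lm(Fertility ~ Education, data = swiss)

# One row per parameter; columns: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(summary(model_1))
print(coef_table)

# The t value is Estimate / Std. Error and tests H0: true parameter = 0
p_values <- coef_table[, "Pr(>|t|)"]

# Assumption: 5% significance level (not stated in the exercise)
print(p_values < 0.05)
```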
7. Recreate the figure in Question 4, and then add the line of best fit using the abline()
function with the coefficients from model_1, model_1$coefficients.
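A sketch of the overlay: abline() draws on the plot that is currently open, so it must be called after plot(). Passing the two-element coefficient vector through the coef argument supplies the intercept and slope in that order. The axis labels are illustrative.

```r
library(datasets)

model_1 <- lm(Fertility ~ Education, data = swiss)

# Scatter plot first; abline() adds to the existing plot
plot(swiss$Education, swiss$Fertility,
     xlab = "Education (%)",
     ylab = "Fertility")

# coef expects c(intercept, slope), which is the order lm() stores them in
abline(coef = model_1$coefficients)
```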
8. Plot the residuals associated with the model, model_1$residuals, against Education. Paste the
plot in the document. Do the residuals show any pattern?
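A possible residual plot is sketched below, with Education on the horizontal axis and the residuals on the vertical axis. The dashed reference line at zero is an optional addition, not something the question asks for.

```r
library(datasets)

model_1 <- lm(Fertility ~ Education, data = swiss)

# One residual per observation, in the same row order as the data
plot(swiss$Education, model_1$residuals,
     xlab = "Education (%)",
     ylab = "Residuals from model_1")

# Optional: horizontal reference line at zero to make patterns easier to see
abline(h = 0, lty = 2)
```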
9. Use the mean() command to show that the average of the residuals associated with
model_1 is zero. Report the code used and the average of the residuals.
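A short sketch of the check: because of floating-point arithmetic, the computed average is typically a very small number rather than exactly 0, so a tolerance-based comparison is more reliable than testing equality. The name residual_mean and the tolerance 1e-10 are illustrative choices.

```r
library(datasets)

model_1 <- lm(Fertility ~ Education, data = swiss)

# Average residual; OLS with an intercept forces this to zero in theory
residual_mean <- mean(model_1$residuals)
print(residual_mean)

# Compare against zero with a small tolerance instead of using ==
print(abs(residual_mean) < 1e-10)
```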
10. Now estimate the parameters in the following model
using the summary() and lm() commands.
a. Interpret the point estimate for beta.
b. Is the estimate statistically significant?
c. How does the R-squared compare to that of model_1, estimated in Question 5?