Date: 2020-07-16 11:29

Problem Set 2

Time Series (and a bit of Causality)

EC 421: Introduction to Econometrics

Due before noon (11:59am) on July 16th, 2020 (on Canvas)

DUE Upload your answer on Canvas before noon on July 16th, 2020.

IMPORTANT You must submit two files:

1. your typed responses/answers to the questions

2. the R script you used to generate your answers. Each student must turn in their own answers.

If you are using RMarkdown, you can turn in one file, but it must be an HTML or PDF that includes your responses and R code.

OBJECTIVE This problem set has three purposes: (1) reinforce the topics of time series and statistical inference;

(2) build your R toolset; (3) start building your intuition about causality within econometrics/regression.

INTEGRITY If you are suspected of cheating, then you will receive a zero. I may report you to the dean. Everything

you turn in must be in your own words.


Conceptual Questions

1. Remember that we've discussed three types of time-series models: (1) static models, (2) dynamic models with

lagged explanatory variables, (3) dynamic models with lagged outcome variables.

1a. If the disturbance $u_t$ is not autocorrelated, for which of the 3 types of models is OLS unbiased?

If any of the models are biased, explain why.

1b. If the disturbance $u_t$ is not autocorrelated, for which of the 3 types of models is OLS consistent?

If any of the models are inconsistent, explain why.

1c. If the disturbance $u_t$ is autocorrelated, for which of the 3 types of models is OLS unbiased?

If any of the models are biased, explain why.

1d. If the disturbance $u_t$ is autocorrelated, for which of the 3 types of models is OLS consistent?

If any of the models are inconsistent, explain why.

2. In our time-series lecture, we discussed how static time-series models are a pretty restrictive and simplistic

way to model time-series data.

2a. Explain why static time-series models are generally restrictive and simplistic.

2b. Give an example of a reasonable static time-series model. By reasonable we mean that it would be

reasonable to model the relationship as a static relationship. Explain why it is reasonable to model the

relationship as static rather than dynamic—and make sure you tell us what (t) would represent (e.g., days,

months, years).

Note: The model should look something like: $\text{Births}_t = \beta_0 + \beta_1 \text{Income}_t + u_t$

2c. Give an example of a reasonable dynamic time-series model. By reasonable we mean that it would be reasonable to model the relationship as a dynamic relationship. Explain why this relationship should be modeled as a dynamic relationship. Make sure you tell us what (t) would represent (e.g., days, months, years).

Note: The model should look something like: $\text{Births}_t = \beta_0 + \beta_1 \text{Income}_t + \beta_2 \text{Income}_{t-1} + u_t$

3. Time-series models frequently include the lag of a variable, e.g., $x_{t-1}$. Explain why we usually do not use lags in cross-sectional models, e.g., $x_{i-1}$.


Some Real Data

Now we're going to work with some real data. The data come from the Environmental Protection Agency (EPA). Specifically, the data describe electricity generation in the United States at a monthly level: the amount of electricity generated, associated emissions, the number of retirements, etc.

For more information on the dataset, see the table on the last page of this problem set.

Why? Electricity generation is obviously important for day-to-day life: it runs our heating and air conditioning, it

allows us to have computers/phones/internet/refrigerators/etc., and it supports many businesses and critical

parts of our health systems and economy.

Emissions are important because burning fossil fuels (e.g., coal and natural gas) produces toxic gases that are released above the plant. These gases (emissions) have been traced to a bunch of negative outcomes for people, animals, plants, and the general environment (e.g., acid rain). Economics is about thinking on the margin: Where do the marginal benefits from something equal the marginal costs? We know we need electricity, so we do not want to make it too expensive for electricity generators to operate, but if we do not regulate electricity generation, then the power plants may poison our air and water. Thus, one job of economists (especially environmental and energy economists) is figuring out how regulations affect health, environment, and energy costs.

4. Load packages and your dataset (hw_2_data.csv).

5. Which dates does the dataset cover (what are the start and end dates)? How many months?

6. How many plants retired during this sample?
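As a rough illustration of the loading and summarizing steps in 4 through 6, the sketch below uses base R on a tiny made-up CSV. The three rows and their values are hypothetical stand-ins, not the contents of hw_2_data.csv, and the sketch assumes one row per month and that summing n_retirements is what "how many plants retired" is asking for.

```r
# Hypothetical stand-in for hw_2_data.csv, written to a temporary file.
# The values below are invented for illustration only.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "t,month,n_retirements",
  "1,2015-01-01,0",
  "2,2015-02-01,2",
  "3,2015-03-01,1"
), tmp)

# Load the data and convert the month column to a Date
dat <- read.csv(tmp)
dat$month <- as.Date(dat$month)

# Start and end dates, number of months (assuming one row per month),
# and the total number of retirements in the sample
date_range    <- range(dat$month)
n_months      <- nrow(dat)
total_retired <- sum(dat$n_retirements)

print(date_range)
print(n_months)
print(total_retired)
```

With the real file, the same calls would apply after replacing the temporary path with the path to hw_2_data.csv.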

7. Create (and include) three figures: (1) the time series of total monthly generation (generation_gwh), (2) the time series of NOx (Nitrogen Oxide) emissions (emissions_nox), and (3) the time series for the number of electricity generators who retired in the given month (n_retirements).

Hint: A time-series graph has time on the x axis and a variable on the y axis. Your x axis can have either time t (time relative to the beginning of the sample) or date (month).

8. For each of the three time-series graphs in 7, explain whether the variable appears to be positively autocorrelated, negatively autocorrelated, or not autocorrelated. Make sure you explain your reasoning.

9. Estimate a static time-series model where monthly NOx emissions (emissions_nox) are the outcome variable and our two explanatory variables are the number of retirements in the month (n_retirements) and the amount of electricity generation in the month (generation_gwh).

Report your coefficient estimates and their statistical significance.
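A minimal sketch of what estimating a static model like the one in 9 could look like in base R. The data here are simulated, and the coefficients used to generate them (50, -3, 0.8) are arbitrary assumptions, so the output says nothing about the EPA data.

```r
# Simulated (hypothetical) monthly data that reuses the problem set's column names
set.seed(421)
n <- 120
sim <- data.frame(
  n_retirements  = rpois(n, lambda = 2),
  generation_gwh = rnorm(n, mean = 300, sd = 20)
)
sim$emissions_nox <- 50 - 3 * sim$n_retirements +
  0.8 * sim$generation_gwh + rnorm(n, sd = 5)

# Static model: the outcome in month t depends only on regressors in month t
static_fit <- lm(emissions_nox ~ n_retirements + generation_gwh, data = sim)

# Coefficient estimates with standard errors, t statistics, and p-values
coef_table <- summary(static_fit)$coefficients
print(coef_table)
```

The `Pr(>|t|)` column of the printed table holds the p-values used to judge statistical significance.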


10. Now estimate a dynamic model in which you include the first lag for each of your explanatory variables (number of retirements and amount of electricity generation). Note: You still want the non-lagged version of the variables too, i.e., include $x_t$ and $x_{t-1}$. Interpret the coefficient on the lagged number of retirements.

11. Why might it make sense to include lags of the variable number of retirements? In other words: Why might we want a dynamic model with lagged explanatory variables in this setting?

12. If the disturbance is autocorrelated, what problems does it cause for OLS regression estimates in 10?

Answer: If 10 has an autocorrelated disturbance, then OLS is inefficient and has biased standard-error estimates.

13. Use the residuals from the regression in 10 to test for first-order autocorrelation in your disturbance. Report the results from the hypothesis test.

Hint: Don't forget about the missing values due to lags (see lecture notes).
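The following sketch shows one way the steps in 10 and 13 might fit together: build a first lag, fit the dynamic model, then regress the residuals on their own lag. It uses simulated data with an AR(1) disturbance whose parameter (0.6) is an arbitrary assumption, and `lag1()` is a helper defined here, not a function from the course materials.

```r
set.seed(421)
n <- 200

# Helper (defined for this sketch): shift a vector back one period
lag1 <- function(v) c(NA, head(v, -1))

# Simulated regressor and an AR(1) disturbance with a hypothetical rho of 0.6
x <- rnorm(n)
shock <- rnorm(n)
u <- numeric(n)
u[1] <- shock[1]
for (t in 2:n) u[t] <- 0.6 * u[t - 1] + shock[t]

d <- data.frame(x = x, x_lag = lag1(x))
d$y <- 1 + 2 * d$x + 1 * d$x_lag + u   # first row is NA because of the lag

# Dynamic model with a lagged explanatory variable; lm() drops the NA row
fit <- lm(y ~ x + x_lag, data = d)

# Residuals have n - 1 elements (the lag cost one observation)
e <- residuals(fit)
e_lag <- lag1(e)

# Regress residuals on their first lag; the slope estimates rho
ar_fit  <- lm(e ~ e_lag)
rho_hat <- coef(ar_fit)[["e_lag"]]
p_value <- summary(ar_fit)$coefficients["e_lag", "Pr(>|t|)"]

print(c(rho_hat = rho_hat, p_value = p_value))
```

Because the first observation is lost to the lag, the residual vector is one element shorter than the data, which is the bookkeeping issue the hint points at.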


14. Now estimate a dynamic model (still with NOx emissions as the outcome variable) with 0, 1, 2, and 3 lags of the number of retirements and also the current month's electricity generation (no lags). Interpret the coefficient on the third lag of the number of retirements.

15. Based upon your estimates in 14, what is the total effect of a retirement on NOx emissions?

Note: This estimate essentially assumes that the effect is gone after four months, which is not likely.
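One way to read the "total effect" in 15 is as the sum of the coefficients on the current and lagged retirement terms. The sketch below illustrates that arithmetic on simulated data. The generating coefficients (-4, -3, -2, -1) are hypothetical, and `lag_k()` is a helper written for this example.

```r
set.seed(421)
n <- 300

# Helper (defined for this sketch): shift a vector back k periods
lag_k <- function(v, k) c(rep(NA, k), head(v, -k))

r <- rpois(n, lambda = 2)            # hypothetical retirements per month
g <- rnorm(n, mean = 300, sd = 20)   # hypothetical generation

d <- data.frame(
  r0 = r, r1 = lag_k(r, 1), r2 = lag_k(r, 2), r3 = lag_k(r, 3), g = g
)
d$y <- 100 - 4 * d$r0 - 3 * d$r1 - 2 * d$r2 - 1 * d$r3 +
  0.5 * d$g + rnorm(n, sd = 2)

# Lags 0 through 3 of retirements plus current generation
fit <- lm(y ~ r0 + r1 + r2 + r3 + g, data = d)

# Total effect of one retirement: sum of the four retirement coefficients
total_effect <- sum(coef(fit)[c("r0", "r1", "r2", "r3")])
print(total_effect)
```

In this simulation the true total effect is -10 by construction, so the printed estimate should land close to that value.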

16. Now estimate an ADL(1,1) model with NOx emissions as the outcome and with number of retirements and electricity generation as the explanatory variables. Report/interpret the coefficient on the lag of NOx emissions.

Hint: Your regression should have an intercept plus five more terms.

The coefficient on the lag of NOx emissions tells us that a one-ton increase in NOx emissions in the previous month is associated with a 0.925-ton increase in NOx emissions in the current month. This relationship is very statistically significant.

The relationship says that our outcome is strongly correlated with itself in time.
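To make the "intercept plus five more terms" structure concrete, here is a sketch of an ADL(1,1) regression on simulated data. The persistence parameter (0.5) and the other generating values are hypothetical and unrelated to the 0.925 estimate quoted above. `lag1()` is a helper defined for this example.

```r
set.seed(421)
n <- 300

# Helper (defined for this sketch): shift a vector back one period
lag1 <- function(v) c(NA, head(v, -1))

r <- rpois(n, lambda = 2)            # hypothetical retirements
g <- rnorm(n, mean = 300, sd = 20)   # hypothetical generation

# Simulate an outcome that depends on its own first lag
y <- numeric(n)
y[1] <- 100
for (t in 2:n) {
  y[t] <- 10 + 0.5 * y[t - 1] - 3 * r[t] - 1 * r[t - 1] +
    0.2 * g[t] + 0.1 * g[t - 1] + rnorm(1, sd = 2)
}

d <- data.frame(y, y_lag = lag1(y), r, r_lag = lag1(r), g, g_lag = lag1(g))

# ADL(1,1): one lag of the outcome, plus current and lagged regressors
adl_fit <- lm(y ~ y_lag + r + r_lag + g + g_lag, data = d)
print(summary(adl_fit)$coefficients)
```

The fitted model has six coefficients: the intercept, the lagged outcome, and the current and lagged values of each explanatory variable.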

17. Does it make sense to regress current NOx emissions on the previous month's emissions? Explain your answer.

18. If the disturbance is autocorrelated, then OLS is not consistent for the coefficients in 16. Explain how you could test for an autocorrelated disturbance using the model from 16.

Note: You do not actually need to run this test.


Causality

Imagine that we are interested in analyzing a government program. We consider individuals as treated if they participated in the program (and untreated if they did not). Following the notation of the Rubin causal model, imagine that we observe the following sample (which would be impossible to observe in real life):

Table: Imaginary dataset

i   Trt.   y1   y0
1   0      17   8
2   0      7    5
3   0      10   4
4   1      5    1
5   1      0    0
6   1      1    4

19a. Calculate and report the treatment effect for each individual (i.e., $\tau_i$).

19b. Is the treatment effect heterogeneous or homogeneous? Briefly explain your answer.

19c. Calculate and interpret the average treatment effect for the sample.

19d. What does it mean if $\tau_i < 0$ for one individual and $\tau_j > 0$ for another individual?

19e. Estimate the average treatment effect by comparing the mean of the treatment group to the mean of the control group. Report your estimate.

19f. Should we expect our estimator in 19e to provide unbiased estimates? Explain.

19g. Why would it be impossible to actually observe all of the data in the table (in real life)? Specifically: Which parts of the dataset would we not observe in real life? Think about the fundamental problem of causal inference.

19h. Define and explain selection bias.

19i. Calculate (and report) the selection bias in this sample.
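The quantities in question 19 can be illustrated on a small made-up sample. The sketch below deliberately uses hypothetical numbers rather than the table above. It takes "selection bias" to mean the gap in untreated outcomes between the treated and control groups, which is an assumption about the definition intended here.

```r
# Hypothetical potential outcomes (NOT the table from this problem set)
d  <- c(0, 0, 1, 1)   # treatment indicator
y1 <- c(6, 4, 9, 7)   # outcome if treated
y0 <- c(5, 1, 4, 6)   # outcome if untreated

# Individual treatment effects and the sample average treatment effect
tau <- y1 - y0
ate <- mean(tau)

# What we would actually observe: y1 for the treated, y0 for the untreated
y_obs <- ifelse(d == 1, y1, y0)

# Difference in observed group means
diff_means <- mean(y_obs[d == 1]) - mean(y_obs[d == 0])

# Average effect among the treated, and the gap in untreated outcomes
att      <- mean(tau[d == 1])
sel_bias <- mean(y0[d == 1]) - mean(y0[d == 0])

# The difference in means splits exactly into these two pieces
stopifnot(isTRUE(all.equal(diff_means, att + sel_bias)))

print(c(ate = ate, diff_means = diff_means, att = att, sel_bias = sel_bias))
```

In this made-up sample the difference in means is larger than the average treatment effect, because the treated group would have had higher outcomes even without treatment.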


Extra Credit: IV

The purpose of this question is to illustrate why you cannot estimate demand and supply equations with OLS.

You may receive up to 4 extra credit homework points on this question.

Suppose we are interested in estimating demand elasticities. We think supply and demand relationships are given by:

$$\log(Q_{it}) = \kappa_d + \alpha_1 \log(P_{it}) + u_{it} \quad \text{(demand)}$$

$$\log(Q_{it}) = \kappa_s + \alpha_2 \log(P_{it}) + \epsilon_{it} \quad \text{(supply)}$$

This next piece of information will be crucial for a later part of the problem, so make note of it. You may assume $E[u_{it}] = E[\epsilon_{it}] = 0$ and $u_{it} \perp \epsilon_{it}$. The first assumption says the unconditional means of the demand and supply shocks are zero. The second assumption says that the demand and supply shocks are independent of one another (e.g., their covariance is zero). You may assume that the variance of the shocks (disturbances) is homoskedastic. Note that these assumptions, taken together, imply that and .

Unfortunately, in our data we only observe equilibrium prices and quantities (that is: and ). This will lead to endogeneity of $\log(P_{it})$ in both equations. To see this endogeneity, we can impose the equilibrium condition and solve for $\log(P_{it})$:

$$\log(P_{it}) = \frac{\kappa_s - \kappa_d}{\alpha_1 - \alpha_2} + \frac{\epsilon_{it} - u_{it}}{\alpha_1 - \alpha_2}$$

Clearly, $\text{Cov}(\log(P_{it}), u_{it}) \neq 0$ and $\text{Cov}(\log(P_{it}), \epsilon_{it}) \neq 0$. This is where the endogeneity is coming from. More intuitively, the equilibrium price (our data) is impacted by both demand and supply shocks, $u_{it}$ and $\epsilon_{it}$.

20a. Ignoring the endogeneity for a moment, why can we interpret $\alpha_1$ and $\alpha_2$ as elasticities?

20b. Compute the equilibrium quantity. Hint: Just use the equilibrium price equation I gave you. There is only one step here (plus maybe a few extra if you simplify the expression).

20c. Calculate the covariance between and . Remember, the covariance of two random variables, $X$ and $Y$, is given by:

$$\text{Cov}(X, Y) = E\big[(X - E[X])(Y - E[Y])\big]$$

20d. Recall: . Use your answer to 20c to compute this probability limit.

20e. We will attempt to estimate with instrumental variables. As an example, suppose the demand and supply equations we are estimating are for cigarettes. Our instrument is the general sales tax per pack in each state, . What are the two conditions for the instrument to be valid? If you can test either of them, write out how you would (specifically, give a regression equation if it is possible).

20f. Now I want you to argue for and against the exogeneity condition. Specifically, write out, at most, two sentences arguing why I should believe the instrument is exogenous, and two as to why I shouldn't believe it is exogenous. This is tough to get "right", so spend a bit of time thinking about it.

Variable                Description
t                       Time, relative to the first month of the sample (1, 2, ...)
month                   Month of the sample (e.g., 2015-12-01)
generation_gwh          Total monthly electricity generation (Gigawatt hours, GWh)
emissions_so2           Total monthly emissions of SO2 (in tons)
emissions_nox           Total monthly emissions of NOx (in tons)
n_plants                Number of unique electricity-generating units (EGUs) operating in the month
n_retirements           Number of retired electricity-generating units in the month
cumulative_retirements  Cumulative number of retirements (through the given month)
i_cair                  Binary indicator for months during the Clean Air Interstate Rule (CAIR)
i_csapr                 Binary indicator for months during the Cross-State Air Pollution Rule (CSAPR)
