This is a Canadian assignment on data analysis in R.

Part 1: Bootstrap inference

For this part, use the dataset CPS1985 from the AER package. Then, select observations randomly using the following code with your student ID in the set.seed function. The idea is to have a small enough sample to make it beneficial to use bootstrap methods.

data(CPS1985, package = "AER")
set.seed(112233)
## You will all have a different n
n <- sample(60:120, 1)
ind <- sample(534, size = n, replace = FALSE)
dat <- CPS1985[ind, ]

1. Consider the following model

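$$\log(\text{wage}) = \beta_0 + \beta_1\,\text{females} + \beta_2\,\text{union} + \beta_3\,\text{age} + \beta_4\,\text{age}^2 + \beta_5\,(\text{females} \times \text{union}) + u,$$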

with females equal to 1 for female workers and union equal to 1 for workers who have a union job. We want to construct confidence intervals for the percentage gender gap among workers in union jobs and among workers in non-union jobs. We approximate the percentage gap G12 between two groups (1 and 2) using the following:

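$$G_{12} \equiv \frac{E(\text{wage} \mid \text{group 2}) - E(\text{wage} \mid \text{group 1})}{E(\text{wage} \mid \text{group 1})} \times 100 \approx \left[ e^{E(\log(\text{wage}) \mid \text{group 2}) - E(\log(\text{wage}) \mid \text{group 1})} - 1 \right] \times 100.$$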

For example, if you have the regression result $\log(\text{wage}) = 6 + 0.10\,\text{males} + 3\,\text{educ}$, the percentage gap is $\left[ e^{(6 + 0.10 + 3\,\text{educ}) - (6 + 0 + 3\,\text{educ})} - 1 \right] \times 100 = 10.52\%$.
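The arithmetic in this example can be checked in a couple of lines of R. This is only an illustration of the approximation formula; the variable names `b_males` and `gap_pct` are made up for the sketch.

```r
# Coefficient on the group dummy, taken from the example regression above
b_males <- 0.10

# Everything except the dummy cancels in the difference of expected log wages,
# so the approximate percentage gap is (exp(coefficient) - 1) * 100
gap_pct <- (exp(b_males) - 1) * 100
round(gap_pct, 2)
```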

Compare the normal and percentile bootstrap confidence intervals using the pairs and wild bootstrap. Interpret your results.
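One possible way to organize the computation is sketched below in base R. It runs on simulated data rather than on CPS1985, so the data frame `sim`, the helper names `gap_nonunion` and `ci`, the seed, and the number of replications `B` are all assumptions made for illustration, not part of the assignment.

```r
set.seed(1)                        # arbitrary seed for the simulated data
n <- 100
sim <- data.frame(females = rbinom(n, 1, 0.5),
                  union   = rbinom(n, 1, 0.3),
                  age     = sample(18:64, n, replace = TRUE))
sim$lwage <- 1 - 0.2 * sim$females + 0.2 * sim$union + 0.05 * sim$age -
  0.0005 * sim$age^2 + 0.1 * sim$females * sim$union + rnorm(n, sd = 0.4)

form <- lwage ~ females * union + age + I(age^2)

# Percentage gender gap for non-union workers: (exp(beta_females) - 1) * 100
gap_nonunion <- function(fit) (exp(coef(fit)[["females"]]) - 1) * 100

fit <- lm(form, data = sim)
est <- gap_nonunion(fit)
B   <- 499                         # number of bootstrap replications (assumed)

# Pairs bootstrap: resample whole rows, so y and x are drawn together
pairs_draws <- replicate(B, {
  idx <- sample(n, n, replace = TRUE)
  gap_nonunion(lm(form, data = sim[idx, ]))
})

# Wild bootstrap: keep x fixed and flip residual signs with Rademacher weights
wild_draws <- replicate(B, {
  boot <- sim
  boot$lwage <- fitted(fit) + resid(fit) * sample(c(-1, 1), n, replace = TRUE)
  gap_nonunion(lm(form, data = boot))
})

# Normal interval uses the bootstrap standard deviation;
# percentile interval uses the empirical quantiles of the draws
ci <- function(draws, level = 0.95) {
  a <- (1 - level) / 2
  rbind(normal     = est + c(-1, 1) * qnorm(1 - a) * sd(draws),
        percentile = unname(quantile(draws, c(a, 1 - a))))
}
ci(pairs_draws)
ci(wild_draws)
```

With this formula, the gap for union workers would instead use the sum of the coefficients named `females` and `females:union` inside the exponential.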

2. Compare the intervals from the previous question with the one obtained using the Delta method. Interpret the difference.

3. Consider the following model:

wage =β0 +β1education+β2females+β3experience+β4union +β5(females × education)+β6(females × union)+u.

Test the null hypothesis that, holding experience and education constant, the gender gap is the same for unionized and non-unionized workers, against the alternative that it is not the same, at the 5% level. For each of the following tests, compute the p-value and interpret your result:

• T-test with asymptotic distribution

• Bootstrap T-test using pairs bootstrap

• Bootstrap T-test using the restricted wild bootstrap with the Rademacher distribution.
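The restricted wild bootstrap in the last bullet can be sketched as follows. The example uses simulated data and assumes that the null hypothesis amounts to a zero coefficient on the females × union interaction, which is one reading of the question and should be checked. It also uses the default OLS standard errors from `summary()` for brevity, whereas heteroskedasticity-robust standard errors are often paired with the wild bootstrap. All object names are hypothetical.

```r
set.seed(2)                        # arbitrary seed for the simulated data
n <- 100
sim <- data.frame(education  = sample(8:18, n, replace = TRUE),
                  females    = rbinom(n, 1, 0.5),
                  experience = sample(0:40, n, replace = TRUE),
                  union      = rbinom(n, 1, 0.3))
sim$wage <- 2 + 0.8 * sim$education - 2 * sim$females +
  0.1 * sim$experience + 1.5 * sim$union + rnorm(n, sd = 3)

f_unres <- wage ~ education + females + experience + union +
  females:education + females:union
f_res   <- update(f_unres, . ~ . - females:union)   # model with the null imposed

# t statistic for the females x union coefficient (default OLS standard error)
t_stat <- function(fit) summary(fit)$coefficients["females:union", "t value"]

fit_u <- lm(f_unres, data = sim)
fit_r <- lm(f_res,   data = sim)
t_obs <- t_stat(fit_u)

B <- 499                           # number of bootstrap replications (assumed)
t_boot <- replicate(B, {
  boot <- sim
  v <- sample(c(-1, 1), n, replace = TRUE)          # Rademacher weights
  # Generate y* from the restricted fit so that the null holds in the bootstrap world
  boot$wage <- fitted(fit_r) + resid(fit_r) * v
  t_stat(lm(f_unres, data = boot))
})

p_asym <- 2 * pnorm(-abs(t_obs))                    # asymptotic N(0, 1) p-value
p_boot <- mean(abs(t_boot) >= abs(t_obs))           # symmetric bootstrap p-value
c(asymptotic = p_asym, wild_bootstrap = p_boot)
```

For the pairs bootstrap the null is not imposed on the resampled data, so the bootstrap t statistic is usually centred at the original estimate, that is, computed as (bootstrap estimate − original estimate) / bootstrap standard error.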

4. Consider the following model:

wage =β0 +β1education+β2experience+β3females +β4union+β5married+u.

We want to test the joint hypothesis H0: β1 = 1, β3 + β4 = 0, β5 = 0 at the 5% level. For each of the following tests, compute the p-value and interpret your result:

• Wald test using the asymptotic distribution

• Bootstrap Wald test using pairs bootstrap

• Bootstrap Wald test using the restricted wild bootstrap with the Rademacher distribution.

• Bootstrap LM test using pairs bootstrap

• Bootstrap LM test using the restricted wild bootstrap with the Rademacher distribution.

For each question, when it is possible, you are allowed to use any package. Just make sure the package does what you expect.
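As an illustration of the Wald tests in question 4, the joint hypothesis can be written as Rβ = r and tested as below. This is a sketch on simulated data, using the default (homoskedastic) covariance from `vcov()` for simplicity. The names `Rmat`, `rvec`, `wald_stat` and the number of replications are assumptions, not something prescribed by the assignment.

```r
set.seed(3)                        # arbitrary seed for the simulated data
n <- 100
sim <- data.frame(education  = sample(8:18, n, replace = TRUE),
                  experience = sample(0:40, n, replace = TRUE),
                  females    = rbinom(n, 1, 0.5),
                  union      = rbinom(n, 1, 0.3),
                  married    = rbinom(n, 1, 0.6))
sim$wage <- 1 + 1 * sim$education + 0.1 * sim$experience -
  1.5 * sim$females + 1.5 * sim$union + rnorm(n, sd = 3)

fit <- lm(wage ~ education + experience + females + union + married, data = sim)
X   <- model.matrix(fit)

# Rows pick out beta1, beta3 + beta4 and beta5
# (columns: intercept, education, experience, females, union, married)
Rmat <- rbind(c(0, 1, 0, 0, 0, 0),
              c(0, 0, 0, 1, 1, 0),
              c(0, 0, 0, 0, 0, 1))
rvec <- c(1, 0, 0)

# Wald statistic: (Rb - r)' [R V R']^{-1} (Rb - r)
wald_stat <- function(fit) {
  d <- Rmat %*% coef(fit) - rvec
  drop(t(d) %*% solve(Rmat %*% vcov(fit) %*% t(Rmat)) %*% d)
}
W_obs  <- wald_stat(fit)
p_asym <- pchisq(W_obs, df = nrow(Rmat), lower.tail = FALSE)

# Restricted least squares estimate, used to impose H0 in the wild bootstrap
XtXi  <- solve(crossprod(X))
b_r   <- drop(coef(fit) - XtXi %*% t(Rmat) %*%
              solve(Rmat %*% XtXi %*% t(Rmat)) %*% (Rmat %*% coef(fit) - rvec))
fit_r <- drop(X %*% b_r)           # restricted fitted values
res_r <- sim$wage - fit_r          # restricted residuals

B <- 499                           # number of bootstrap replications (assumed)
W_boot <- replicate(B, {
  boot <- sim
  boot$wage <- fit_r + res_r * sample(c(-1, 1), n, replace = TRUE)
  wald_stat(lm(formula(fit), data = boot))
})
p_boot <- mean(W_boot >= W_obs)
c(asymptotic = p_asym, wild_bootstrap = p_boot)
```

In a pairs bootstrap the resampled data do not satisfy the null, so the bootstrap statistic is typically recentred by testing Rβ = Rb, where b is the original estimate, in each replication.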

Part 2: One of Many Return to Education Studies (from a Nobel Memorial Prize Winner)

For this assignment, you may want to read the article Card (1993) to help you. You will find it on Learn. You will also need the file Card.rda, which is on Learn as well. The file contains a subset of the data used by Card (1993). The main goal is to test whether the return to education is the same for white and black workers. You will not obtain the same results as the article because you will be using a subset of the original data. The whole dataset is available on Card’s homepage if you are interested. The description of the variables is in the file code_bk.txt. I want each of you to have a different dataset. Once you have loaded the data, run the following code, with your student ID in the set.seed function:

load("Card.rda")
set.seed(112233)
n <- sample(800:1200, 1)
ind <- sample(nrow(dat), n, replace = FALSE)
dat <- dat[ind, ]

We want to estimate the following model:

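$$\log(\text{wage76}) = \beta_0 + \beta_1\,\text{ed76} + \beta_2\,\text{black} + \beta_3\,(\text{ed76} \times \text{black}) + \beta_4\,\text{exp76} + \beta_5\,\text{exp76}^2 + \beta_6\,\text{reg76r} + \beta_7\,\text{smsa76r} + u$$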

You will see that exp76 (years of experience) is not in the dataset. You have to compute it using the formula exp76 = age76 − ed76 − 6. It is not the actual experience but the potential experience. This is a commonly used measure of experience in the labour economics literature.
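The potential-experience formula is a one-line column computation. The sketch below applies it to a small made-up data frame called `toy`; the three rows are invented values, not observations from Card.rda.

```r
# Hypothetical ages and years of education for three workers
toy <- data.frame(age76 = c(25, 30, 28),
                  ed76  = c(12, 16, 10))

# Potential experience: age minus years of schooling minus 6
toy$exp76 <- toy$age76 - toy$ed76 - 6
toy$exp76
```

Assuming the columns in the real data carry the names used in the model above, the squared term can be entered directly in an R formula as `I(exp76^2)`.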

1. Explain the issue with OLS when the objective is to estimate the return to education. Then explain why the solution proposed by David Card can potentially solve the problem. You can use the article to answer the question, but you have to explain it in your own words. Can you think of a possible reason for rejecting the validity of the instrument?

2. Estimate the model by OLS and interpret the result. Is there a different return to education for black and non-black workers?

3. For this question only, consider the model log(wage76)= β0 +β1ed76+u.

• Using nearc4 as the instrument, show that the Wald estimator is the same as the IV estimator.

• Explain the intuition behind the Wald estimator in this particular example.

• Can you explain why controlling for nearc4 is not the same as using it as an instrument? What is the difference?
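The equality between the Wald and IV estimators with a binary instrument can be checked numerically before proving it algebraically. The sketch below uses simulated data in which `z` stands in for nearc4, `ed` for ed76 and `lw` for log(wage76). The data-generating values and the unobserved `ability` variable are assumptions chosen only to create endogeneity.

```r
set.seed(4)                        # arbitrary seed for the simulated data
n <- 1000
z       <- rbinom(n, 1, 0.6)       # binary instrument (stand-in for nearc4)
ability <- rnorm(n)                # unobserved, affects both schooling and wages
ed      <- 12 + 1 * z + 1.5 * ability + rnorm(n)
lw      <- 5 + 0.08 * ed + 0.2 * ability + rnorm(n, sd = 0.3)

# Wald estimator: difference in mean outcome over difference in mean schooling,
# comparing the z = 1 group with the z = 0 group
wald <- (mean(lw[z == 1]) - mean(lw[z == 0])) /
        (mean(ed[z == 1]) - mean(ed[z == 0]))

# Simple IV estimator with one regressor and one instrument
iv <- cov(z, lw) / cov(z, ed)

# OLS slope for comparison; it is contaminated by the omitted ability term
ols <- unname(coef(lm(lw ~ ed))["ed"])

c(wald = wald, iv = iv, ols = ols)
all.equal(wald, iv)
```

The two estimators coincide because, for a binary z, the sample covariance of z with any variable is proportional to the difference in that variable's group means, and the proportionality factor cancels in the ratio.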