This US assignment-help project is an econometrics assignment.

Part 1

The following questions are from Wooldridge’s Introductory Econometrics – 7e

Question 1

Using data from 1988 for houses sold in Andover, Massachusetts, from Kiel and McClain (1995), the following equation relates housing price (price) to the distance from a recently built garbage incinerator (dist):

log(price)^ = 9.40 + 0.312 log(dist)

n = 135, R² = 0.162

Q1-1 Interpret the coefficient on log(dist). Is the sign of this estimate what you expect it to be?

Q1-2 Do you think simple linear regression provides an unbiased estimator of the ceteris paribus elasticity of price with respect to dist? (Think about the city’s decision on where to put the incinerator.)

Q1-3 What other factors about a house might affect its price? Might these be correlated with distance from the incinerator?

Question 2

Consider the savings function

sav = β0 + β1 inc + u

u = √(inc)·ε

where ε is a random variable with E[ε] = 0 and V[ε] = σ². Assume that ε is independent of inc.

Q2-1 Show that E[ε|inc] = 0 and hence E[u|inc] = 0, so that the key zero conditional mean assumption (A3) is satisfied.

Q2-2 Show that V[u|inc] = σ²·inc, so that the homoscedasticity assumption (A4) is violated. In particular, the variance of sav increases with inc.
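For reference, the claims in Q2-1 and Q2-2 follow directly from the independence of ε and inc, since conditional on inc the factor √(inc) is a constant. A sketch of the algebra, written in LaTeX notation using only the definitions above:

\[
E[u \mid inc] = \sqrt{inc}\; E[\varepsilon \mid inc] = \sqrt{inc}\; E[\varepsilon] = 0,
\qquad
V[u \mid inc] = inc \cdot V[\varepsilon \mid inc] = inc \cdot V[\varepsilon] = \sigma^{2}\, inc.
\]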

Q2-3 Provide a discussion that supports the assumption that the variance of savings increases with family income.

Question 3

We are interested in the birth weight (bwght, measured in ounces) of infants and the number of cigarettes the mother smoked per day during pregnancy (cigs). The following simple regression was estimated using data on n = 1388 births:

bwght^ = 119.77 - 0.514 cigs

Q3-1 What is the predicted birth weight when cigs = 0? What about when cigs = 20 (one pack a day)? Comment on the difference.

Q3-2 Does this simple regression necessarily capture a causal relationship between the child’s birth weight and the mother’s smoking habits? Explain.

Q3-3 To predict a birth weight of 125 ounces, what would cigs have to be? Comment.

Q3-4 The proportion of women in the sample who did not smoke while pregnant is about 0.85. Does this help reconcile your finding from Q3-3?

Part 2

The following questions use R.

Question 4

In this question we will compare the finite sample and large sample properties of OLS. Let’s say the population regression is

Yi = β0 + β1 Xi + εi

where

  • β0 = 3
  • β1 = 5
  • Xi ~ N(2, 1)
  • εi ~ N(0, 1)

Q4-1 Simulate {(Yi, Xi)} for i = 1, …, 5000 (i.e., 5000 data points), save it as a data frame, and plot the histograms of Yi and Xi. Properly label your graphs (you will lose points if you don’t – you can add + xlab("appropriate label for X") + ylab("appropriate label for Y") to your line of code).
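For reference, one possible way to set up Q4-1 is sketched below. The object names (df, X, Y) are illustrative assumptions, and ggplot2 is assumed since the + xlab()/+ ylab() hint uses its syntax.

# A minimal sketch for Q4-1 (names df, X, Y are assumptions, not requirements)
library(ggplot2)

set.seed(123)                         # any seed, for reproducibility
X   <- rnorm(5000, mean = 2, sd = 1)  # Xi ~ N(2, 1)
eps <- rnorm(5000, mean = 0, sd = 1)  # epsilon_i ~ N(0, 1)
Y   <- 3 + 5 * X + eps                # beta0 = 3, beta1 = 5
df  <- data.frame(X = X, Y = Y)

# Histograms of X and Y with labelled axes
ggplot(df, aes(x = X)) + geom_histogram(bins = 30) +
  xlab("Simulated X") + ylab("Count")
ggplot(df, aes(x = Y)) + geom_histogram(bins = 30) +
  xlab("Simulated Y") + ylab("Count")

Base R hist(df$X, xlab = "Simulated X") would also satisfy the labelling requirement if you prefer not to use ggplot2.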

Q4-2 Now let’s show the unbiasedness of β^.

Do the following steps in R.

  1. Create a function that will calculate β^0 and β^1 from a sample of size N.
  • Initiate your function using
 regOLS <- function(N){


 }
  • Now inside your function, use samp <- df[sample(nrow(df), N), ] to draw a sample of size N from your data frame and save it as samp
  • Calculate the OLS β^1 and β^0 based on your samp data
  • Have your function return data.frame(b0 = __, b1 = __) (one possible way to fill in regOLS is sketched at the end of this question)
  2. Create 4 empty data frames to store your values of β^1 and β^0:

val1 <- data.frame(b0 = double(), b1 = double())
val2 <- data.frame(b0 = double(), b1 = double())
val3 <- data.frame(b0 = double(), b1 = double())
val4 <- data.frame(b0 = double(), b1 = double())
  3. Using a for loop, run your regOLS function 100, 500, 1000, and 5000 times, saving β^1 and β^0 each time into your val1, val2, val3, val4 data frames, respectively. Use N = 5 for your sample size, so that you are running the regression on a sample of size 5 each time. val1 should end up with 100 rows, val2 with 500, and so on (the loops are included in the sketch at the end of this question).

  4. Report the average of β^1 and β^0 for each of your val data frames by running the following code as is:

results = data.frame(n= double(), beta0_avg = double(), beta1_avg = double())
results[1:4,'n'] = c(100,500,1000,5000)
results[1,2:3] = colMeans(val1)
results[2,2:3] = colMeans(val2)
results[3,2:3] = colMeans(val3)
results[4,2:3] = colMeans(val4)

print(results)

Show the output of print(results) for credit.
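For reference, one possible way to fill in regOLS and the replication loops is sketched below. It assumes the simulated data frame from Q4-1 is named df with columns X and Y (an assumption, not a requirement), and it uses lm() for the fit; computing β^1 and β^0 from the closed-form OLS formulas would work equally well.

# A sketch of regOLS; assumes df from Q4-1 with columns X and Y
regOLS <- function(N){
  samp <- df[sample(nrow(df), N), ]        # draw a random sample of size N
  fit  <- lm(Y ~ X, data = samp)           # OLS on the sampled rows
  data.frame(b0 = unname(coef(fit)[1]),    # intercept estimate
             b1 = unname(coef(fit)[2]))    # slope estimate
}

# Fill val1, ..., val4; each replication runs the regression on a sample of size 5
for (i in 1:100)  val1[i, ] <- regOLS(5)
for (i in 1:500)  val2[i, ] <- regOLS(5)
for (i in 1:1000) val3[i, ] <- regOLS(5)
for (i in 1:5000) val4[i, ] <- regOLS(5)

After the loops, val1 has 100 rows, val2 has 500, and so on, so the colMeans calls in the results code above average over the corresponding number of replications.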

Q4-3 Interpret your results. Does having a small sample size of 5 matter in terms of expected values?

Q4-4 Having simulated the unbiasedness of the OLS estimator, let’s now simulate its consistency. Using the same regOLS function from before, run the function four times, once each with N = 10, 50, 500, 5000. You can just run the code below.

results = data.frame(n= double(), beta0_avg = double(), beta1_avg = double())
results[1:4,'n'] =c(10,50,500,5000)
results[1,2:3] = colMeans(regOLS(10))
results[2,2:3] = colMeans(regOLS(50))
results[3,2:3] = colMeans(regOLS(500))
results[4,2:3] = colMeans(regOLS(5000))

print(results)

Show the output of print(results) for credit.

Q4-5 Interpret your results. What happens to your estimators as N increases?

Q4-6 Explain the difference between unbiasedness (finite sample property) and consistency (large sample / asymptotic property).