## Part 1

The following questions are from Wooldridge’s Introductory Econometrics – 7e

### Question 1

Using data from 1988 for houses sold in Andover, Massachusetts, from Kiel and McClain (1995), the following equation relates housing price ($price$) to the distance from a recently built garbage incinerator ($dist$):

$\widehat{\log(price)} = 9.40 + 0.312\,\log(dist)$

$n = 135, \quad R^2 = 0.162$
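As a point of reference, a constant-elasticity (log-log) equation like this is estimated in R with `lm()`. The sketch below uses simulated stand-in data, since the Kiel and McClain (1995) data set is not included here; the data frame name `housing` is hypothetical, and its variables `price` and `dist` match the equation.

```r
# Sketch only: the real data are not provided, so we simulate stand-in data
# that roughly mimics the published fit (n = 135, slope 0.312).
set.seed(1)
housing <- data.frame(dist = exp(rnorm(135, mean = 9, sd = 0.5)))
housing$price <- exp(9.40 + 0.312 * log(housing$dist) + rnorm(135, sd = 0.4))

# In a log-log model, the slope coefficient is the elasticity of price
# with respect to dist.
fit <- lm(log(price) ~ log(dist), data = housing)
coef(fit)
```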

Q1-1 Interpret the coefficient on $\log(dist)$. Is the sign of this estimate what you expect it to be?

Q1-2 Do you think simple linear regression provides an unbiased estimator of the ceteris paribus elasticity of $price$ with respect to $dist$? (Think about the city’s decision on where to put the incinerator.)

Q1-3 What other factors about a house might affect its price? Might these be correlated with distance from the incinerator?

### Question 2

Consider the savings function

$sav = \beta_0 + \beta_1 inc + u, \qquad u = \sqrt{inc} \cdot \epsilon$

where $\epsilon$ is a random variable with $E[\epsilon] = 0$ and $V[\epsilon] = \sigma^2$. Assume that $\epsilon$ is independent of $inc$.

Q2-1 Show that $E[u \mid inc] = 0$, so that the key zero conditional mean assumption (A3) is satisfied.

Q2-2 Show that $V[u \mid inc] = \sigma^2 inc$, so that the homoscedasticity assumption (A4) is violated. In particular, the variance of $sav$ increases with $inc$.

Q2-3 Provide a discussion that supports the assumption that the variance of savings increases with family income.
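The heteroskedasticity pattern in Q2-2 can also be seen numerically. The following sketch uses arbitrary illustrative choices (incomes uniform on [10, 100] and $\sigma = 2$) that are not part of the question:

```r
# Simulate u = sqrt(inc) * eps with eps independent of inc, then compare the
# sample variance of u across income bands: V[u | inc] = sigma^2 * inc should
# make the high-income band much more variable than the low-income band.
set.seed(42)
n <- 1e6
inc <- runif(n, min = 10, max = 100)  # arbitrary income distribution
eps <- rnorm(n, mean = 0, sd = 2)     # sigma = 2, independent of inc
u <- sqrt(inc) * eps

var_low  <- var(u[inc < 20])  # roughly sigma^2 * E[inc | inc < 20] = 4 * 15 = 60
var_high <- var(u[inc > 90])  # roughly sigma^2 * E[inc | inc > 90] = 4 * 95 = 380
c(var_low, var_high)
```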

### Question 3

We are interested in the birth weight ($bwght$) of infants and the number of cigarettes the mother smoked per day during pregnancy ($cigs$). The following simple regression was estimated using data on $n = 1388$ births:

$\widehat{bwght} = 119.77 - 0.514\,cigs$

Q3-1 What is the predicted birth weight when $cigs = 0$? What about when $cigs = 20$ (one pack a day)? Comment on the difference.

Q3-2 Does this simple regression necessarily capture a causal relationship between the child’s birth weight and the mother’s smoking habits? Explain.

Q3-3 To predict a birth weight of 125 ounces, what would $cigs$ have to be? Comment.

Q3-4 The proportion of women in the sample who did not smoke while pregnant is about 0.85. Does this help reconcile your finding from Q3-3?
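For Q3-1, the predictions come from plugging values into the fitted line. A minimal sketch (the coefficients are just those reported above; the helper function name is a choice):

```r
# Predicted birth weight from the fitted equation bwght-hat = 119.77 - 0.514 * cigs
predict_bwght <- function(cigs) 119.77 - 0.514 * cigs

predict_bwght(0)   # intercept: predicted weight when the mother does not smoke
predict_bwght(20)  # one pack a day
```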

## Part 2

The following questions use R.

### Question 4

In this question we will compare the finite sample properties and the large sample properties of OLS. Suppose the population regression is

$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$

where

- $\beta_0 = 3$
- $\beta_1 = 5$
- $X_i \sim N(2, 1)$
- $\epsilon_i \sim N(0, 1)$

Q4-1 Simulate $\{(Y_i, X_i)\}_{i=1}^{5000}$ (i.e., 5000 data points), save it as a data frame, and plot histograms of $Y_i$ and $X_i$. Properly label your graphs (you will lose points if you don’t – you can add `+ xlab("appropriate label for X") + ylab("appropriate label for Y")` to your line of code).
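One way to set up this simulation (a sketch; the variable and column names are a choice, and the histograms below use base R’s `hist()`, whose `xlab`/`ylab` arguments play the same role as ggplot2’s `xlab()`/`ylab()`):

```r
# Simulate 5000 draws from the population regression Y = 3 + 5X + eps.
set.seed(123)
n <- 5000
X <- rnorm(n, mean = 2, sd = 1)    # X_i ~ N(2, 1)
eps <- rnorm(n, mean = 0, sd = 1)  # eps_i ~ N(0, 1)
Y <- 3 + 5 * X + eps               # beta0 = 3, beta1 = 5
df <- data.frame(X = X, Y = Y)

# Labeled histograms of X and Y.
hist(df$X, xlab = "X", ylab = "Frequency", main = "Histogram of X")
hist(df$Y, xlab = "Y", ylab = "Frequency", main = "Histogram of Y")
```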

Q4-2 Now let’s show the unbiasedness of $\hat{\beta}$.

Do the following steps in R.

1. Create a function that calculates $\hat{\beta}_0$ and $\hat{\beta}_1$ from a sample of size N.
   - Initiate your function using

         regOLS <- function(N){

         }

   - Inside your function, use `samp <- df[sample(nrow(df), N), ]` to draw a sample of size N from your data frame and save it as `samp`.
   - Calculate the OLS $\hat{\beta}_1$ and $\hat{\beta}_0$ based on your `samp` data.
   - Have your function return `data.frame(b0 = __, b1 = __)`.

2. Create 4 empty data frames to store your values of $\hat{\beta}_1$ and $\hat{\beta}_0$:

       val1 <- data.frame(b0 = double(), b1 = double())
       val2 <- data.frame(b0 = double(), b1 = double())
       val3 <- data.frame(b0 = double(), b1 = double())
       val4 <- data.frame(b0 = double(), b1 = double())

3. Using a for loop, run your `regOLS` function 100, 500, 1000, and 5000 times, saving $\hat{\beta}_1$ and $\hat{\beta}_0$ each time into your `val1`, `val2`, `val3`, `val4` data frames, respectively. Use $N = 5$ for your sample size, so that you are running the regression on a sample of size 5 each time. `val1` should have 100 rows, `val2` should have 500 rows, and so on.

4. Report the average of $\hat{\beta}_1$ and $\hat{\beta}_0$ for each of your `val` data frames by running the following code as is:

       results = data.frame(n = double(), beta0_avg = double(), beta1_avg = double())
       results[1:4, 'n'] = c(100, 500, 1000, 5000)
       results[1, 2:3] = colMeans(val1)
       results[2, 2:3] = colMeans(val2)
       results[3, 2:3] = colMeans(val3)
       results[4, 2:3] = colMeans(val4)

       print(results)

Show the output of `print(results)` for credit.
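Putting steps 1–4 together, one possible implementation looks like the following (a sketch, not the required answer; it assumes a data frame `df` with columns `X` and `Y` as in Q4-1, and shows only `val1` with 100 repetitions — `val2` through `val4` are the same loop with more repetitions):

```r
# Regenerate the Q4-1 data so the sketch is self-contained.
set.seed(123)
df <- data.frame(X = rnorm(5000, mean = 2, sd = 1))
df$Y <- 3 + 5 * df$X + rnorm(5000)

# Step 1: OLS on a random sample of size N, returning both coefficients.
regOLS <- function(N) {
  samp <- df[sample(nrow(df), N), ]
  b1 <- cov(samp$X, samp$Y) / var(samp$X)  # OLS slope formula
  b0 <- mean(samp$Y) - b1 * mean(samp$X)   # OLS intercept formula
  data.frame(b0 = b0, b1 = b1)
}

# Steps 2-3: repeat with N = 5 and collect the estimates (100 draws shown).
val1 <- data.frame(b0 = double(), b1 = double())
for (i in 1:100) {
  val1 <- rbind(val1, regOLS(5))
}

# Step 4: the averages should be close to beta0 = 3 and beta1 = 5.
colMeans(val1)
```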

Q4-3 Interpret your results. Does having a small sample size of 5 matter in terms of expected values?

Q4-4 Having simulated the unbiasedness of the OLS estimator, let’s now simulate its consistency. Using the same `regOLS` function from before, run the function four times, once for each of $N = 10, 50, 500, 5000$. You can just run the code below.

    results = data.frame(n = double(), beta0_avg = double(), beta1_avg = double())
    results[1:4, 'n'] = c(10, 50, 500, 5000)
    results[1, 2:3] = colMeans(regOLS(10))
    results[2, 2:3] = colMeans(regOLS(50))
    results[3, 2:3] = colMeans(regOLS(500))
    results[4, 2:3] = colMeans(regOLS(5000))

    print(results)

Show the output of `print(results)` for credit.

Q4-5 Interpret your results. What happens to your estimators as $N$ increases?

Q4-6 Explain the difference between unbiasedness (finite sample property) and consistency (large sample / asymptotic property).
