This assignment analyzes data from a study that evaluated the electoral impact of ‘Progresa’.

Homework 4
POL 850
Spring 2020
This homework is due by 5 PM on Friday, April 24. Please use this R Markdown template to report your
code, output, and written answers in a single document. You may also submit your R script, output, and
typed written answers separately. In either case, upload a single PDF of your final document to the Assignment
portal on Classes. Comment your code. Report results in the correct units of measurement. Do not report
more than two digits to the right of the decimal point.
Question 1: The Electoral Effects of Conditional Cash Transfers
In this exercise, we analyze the data from a study that estimated the electoral impacts of ‘Progresa’, Mexico’s
conditional cash transfer program (CCT program). The original study relied on a randomized evaluation
of the CCT program in which eligible villages were randomly assigned to receive the program either 21 months
(Early Progresa) or 6 months (Late Progresa) before the 2000 Mexican presidential election. The authors
hypothesized that the CCT program would mobilize voters, leading to both an increase in turnout and an
increase in support for the incumbent party (PRI in this case). The analysis was based on a sample of
precincts that contained at most one participating village in the evaluation. The data we analyze is available
as the CSV file progresa.csv. The names and descriptions of variables in the data set are listed in Table 1.
Question 1.1
First, create two new variables that measure a) the change in turnout between 1994 and 2000, as shares
of the voting eligible population (using t1994 and t2000), and b) the change in incumbent party (PRI)
support between 1994 and 2000, as shares of the voting eligible population (using pri2000s and pri1994s).
Then, estimate the impact of the earlier availability of the CCT program on the changes in turnout and
PRI support using two different strategies. First, construct difference-in-means estimators by comparing
the average changes in outcomes in the ‘treated’ (Early Progresa) precincts versus the ones
observed in ‘control’ (Late Progresa) precincts. Next, estimate these effects by regressing the outcome change
variables on the treatment variable. Interpret and compare the estimates under these approaches. Do the
results support the hypothesis? Provide a brief interpretation.
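As a rough illustration of the two strategies, the sketch below computes a difference-in-means estimate and the matching regression coefficient on a toy data frame. The numbers are invented for illustration and are not taken from progresa.csv; only the column names follow Table 1, and the names toy, t_change, and pri_change are hypothetical.

```r
# Toy data: the values below are invented for illustration only and are
# NOT taken from progresa.csv; only the column names follow Table 1.
toy <- data.frame(
  treatment = c(1, 1, 1, 0, 0, 0),
  t1994     = c(0.60, 0.55, 0.70, 0.65, 0.50, 0.60),
  t2000     = c(0.66, 0.58, 0.79, 0.64, 0.52, 0.61),
  pri1994s  = c(0.35, 0.30, 0.40, 0.38, 0.28, 0.33),
  pri2000s  = c(0.37, 0.33, 0.45, 0.37, 0.29, 0.32)
)

# Outcome change variables: 2000 share minus 1994 share
toy$t_change   <- toy$t2000 - toy$t1994
toy$pri_change <- toy$pri2000s - toy$pri1994s

# Difference-in-means estimator: treated mean minus control mean
dim_turnout <- mean(toy$t_change[toy$treatment == 1]) -
  mean(toy$t_change[toy$treatment == 0])

# Regression of the change on the binary treatment indicator
fit_turnout <- lm(t_change ~ treatment, data = toy)
ols_turnout <- unname(coef(fit_turnout)["treatment"])

# With a single binary regressor, the slope equals the difference in means
all.equal(dim_turnout, ols_turnout)
```

The same two steps apply to pri_change. The agreement between the two estimates is a general property of least squares with one binary predictor, so it should also hold on the real data.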
## insert code here
Insert written answer here.
Question 1.2
Now, fit a regression model for each outcome change variable that includes the average poverty level in a
precinct (avgpoverty), the total precinct population in 1994 (pobtot1994), the total number of voters who
turned out in the previous election (votos1994), and the total number of votes cast for each of the three
main competing parties in the previous election (pri1994 for PRI, pan1994 for Partido Acción Nacional
or PAN, and prd1994 for Partido de la Revolución Democrática or PRD). Use the same outcome change
variables as in the previous question. According to this model, what are the estimated average effects of the
program’s earlier availability on changes in turnout and support for the incumbent party? Are these results
different from what you obtained in the previous question?
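One possible way to set up a regression with pre-treatment covariates is sketched below on simulated data. Everything in the sim data frame is randomly generated under arbitrary assumptions (the outcome is pure noise), so the coefficients say nothing about the real study; only the column names follow Table 1, and t_change is a hypothetical name for the turnout change variable.

```r
# Simulated data for illustration only; NOT the contents of progresa.csv.
set.seed(850)
n <- 200
sim <- data.frame(
  treatment  = rbinom(n, 1, 0.5),
  avgpoverty = runif(n, 3, 5),
  pobtot1994 = round(rlnorm(n, meanlog = 7, sdlog = 1)) + 50
)
# Previous-election counts, generated as arbitrary fractions of population
sim$votos1994 <- round(sim$pobtot1994 * runif(n, 0.3, 0.6))
sim$pri1994   <- round(sim$votos1994 * runif(n, 0.4, 0.6))
sim$pan1994   <- round(sim$votos1994 * runif(n, 0.1, 0.2))
sim$prd1994   <- round(sim$votos1994 * runif(n, 0.1, 0.2))
# Hypothetical outcome: change in turnout, here unrelated to treatment
sim$t_change  <- rnorm(n, mean = 0, sd = 0.05)

# Treatment indicator plus the pre-treatment covariates named in the question
fit <- lm(t_change ~ treatment + avgpoverty + pobtot1994 + votos1994 +
            pri1994 + pan1994 + prd1994, data = sim)

# The coefficient on treatment is the covariate-adjusted estimate
coef(fit)["treatment"]
```

Because treatment was randomized in the original study, adding pre-treatment covariates is expected mainly to change the precision of the estimate rather than what it targets.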
## insert code here
Table 1: Variable descriptions in progresa.csv dataset

Variable     Description
treatment    Whether an electoral precinct contains a village where households received Early Progresa
pri2000s     PRI votes in the 2000 election as a share of precinct population above 18
pri2000v     Official PRI vote share in the 2000 election
t2000        Turnout in the 2000 election as a share of precinct population above 18
t2000r       Official turnout in the 2000 election
pri1994      Total PRI votes in the 1994 presidential election
pan1994      Total PAN votes in the 1994 presidential election
prd1994      Total PRD votes in the 1994 presidential election
pri1994s     Total PRI votes in the 1994 election as a share of precinct population above 18
pan1994s     Total PAN votes in the 1994 election as a share of precinct population above 18
prd1994s     Total PRD votes in the 1994 election as a share of precinct population above 18
pri1994v     Official PRI vote share in the 1994 election
pan1994v     Official PAN vote share in the 1994 election
prd1994v     Official PRD vote share in the 1994 election
t1994        Turnout in the 1994 election as a share of precinct population above 18
t1994r       Official turnout in the 1994 election
votos1994    Total votes cast in the 1994 presidential election
avgpoverty   Precinct average of the village poverty index
pobtot1994   Total population in the precinct
villages     Number of villages in the precinct
Insert written answer here.
Question 1.3
Some variables such as population or income are often skewed in their distributions, and don’t have nice
linear relationships with outcome variables that are not skewed. We often use the log transformations of such
variables in regression models so that we can estimate linear relationships between variables. To see this,
make a scatterplot with precinct population on the x axis, and turnout in 2000 as a share of population 18
and over on the y axis. Label the axes and give your plot a title. Does it look like there is a linear relationship
between population and turnout? Next, plot the natural logarithm transformation of precinct population, or
log(pobtot1994), on the x axis, and turnout in 2000 as a share of population 18 and over on the y axis.
Label the axes and give your plot a title. Does it look like there is a linear relationship between the log of
population and turnout? Also, in both graphs, do you notice anything unusual about the distribution of the
turnout variable?
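The effect of a log transformation on a skewed variable can be sketched as follows. The data are simulated from a log-normal distribution purely for illustration and are not the contents of progresa.csv; the turnout values are arbitrary uniform draws, so the plots show only how the x-axis spreads out, not any real relationship.

```r
# Simulated data for illustration only; NOT the contents of progresa.csv.
set.seed(850)
n <- 200
sim <- data.frame(
  pobtot1994 = round(rlnorm(n, meanlog = 7, sdlog = 1)) + 50, # right-skewed
  t2000      = runif(n, 0.3, 0.9)                             # arbitrary shares
)
sim$log_pop <- log(sim$pobtot1994)

# Two panels side by side: raw population versus log population
par(mfrow = c(1, 2))
plot(sim$pobtot1994, sim$t2000,
     xlab = "Precinct population",
     ylab = "Turnout in 2000 (share of population 18+)",
     main = "Raw population")
plot(sim$log_pop, sim$t2000,
     xlab = "log(precinct population)",
     ylab = "Turnout in 2000 (share of population 18+)",
     main = "Log population")

# A mean far from the median is one quick sign of skew
c(mean = mean(sim$pobtot1994), median = median(sim$pobtot1994))
```

On the raw scale, a few very large precincts compress most points against the left edge of the plot. The log scale spreads them out so that any pattern is easier to see.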
## insert code here
Insert written answer here.
Question 1.4
Now, consider an alternative model specification. Use the same regression model as in Question 1.2, but
include the electoral variables in the previous election measured as shares of the voting age population
(t1994, pri1994s, pan1994s, and prd1994s) instead of measured in counts. In addition, include the natural
logarithm transformation of the precinct population variable instead of the raw population (simply include
log(pobtot1994) as a predictor in your regression). Are the results based on this new model specification
different from what we obtained in Question 1.2? If the results are different, which model fits the data better?
To compare the model fit, use the adjusted R-squared of each model.
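A minimal sketch of comparing two specifications by adjusted R-squared is given below. The data are simulated, and the outcome is deliberately constructed to depend on the 1994 turnout share, so the ranking of the two models here is an artifact of the simulation and not a statement about progresa.csv. The names sim, t_change, m_counts, and m_shares are hypothetical.

```r
# Simulated data for illustration only; NOT the contents of progresa.csv.
set.seed(850)
n <- 200
sim <- data.frame(
  treatment  = rbinom(n, 1, 0.5),
  pobtot1994 = round(rlnorm(n, meanlog = 7, sdlog = 1)) + 50,
  t1994      = runif(n, 0.3, 0.9)
)
# Arbitrary count version of previous turnout
sim$votos1994 <- round(sim$pobtot1994 * 0.6 * sim$t1994)
# Hypothetical outcome, built to depend on the 1994 turnout share
sim$t_change <- 0.05 - 0.1 * sim$t1994 + rnorm(n, sd = 0.05)

# Specification with counts and raw population
m_counts <- lm(t_change ~ treatment + pobtot1994 + votos1994, data = sim)
# Specification with shares and log population
m_shares <- lm(t_change ~ treatment + log(pobtot1994) + t1994, data = sim)

# Adjusted R-squared penalizes extra predictors, which makes it more
# suitable than plain R-squared for comparing models of different sizes
adj_r2 <- c(counts = summary(m_counts)$adj.r.squared,
            shares = summary(m_shares)$adj.r.squared)
round(adj_r2, 2)
```

The comparison is meaningful only when both models use the same outcome variable and the same set of observations.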
## insert code here
Insert written answer here.