Shared Flashcard Set

Details

Title

Program Evaluation Final

Description

Program Evaluation Final

Total Cards

Subject

Political Studies

Level

Graduate

Created

07/26/2013


Term

What is the role of the control variables in cross-sectional multiple regression designs for OIE?

Definition

Control variables capture the factors that may affect Y independently from the program and that may be differently distributed between the treated and non-treated units. Once inserted in a cross-sectional multiple regression design, control variables enable the analysis to separate the impact on Y due to the program from the impact on Y due to other factors that differentiate the treated units from the non-treated units.
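As a minimal sketch of this idea, the example below uses hypothetical data in which the treated units start out with higher values of a control variable X that lowers Y. The `ols` helper, the data and the true impact of 2 are all assumptions made for illustration; they are not from the course material.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical units (T, X, Y), generated as Y = 10 + 2*T - 3*X.
units = [(1, 3, 3), (1, 4, 0), (1, 5, -3), (0, 1, 7), (0, 2, 4), (0, 3, 1)]

# Naive comparison of mean outcomes: mixes the program with the effect of X.
treated_y = [y for t, _, y in units if t == 1]
control_y = [y for t, _, y in units if t == 0]
naive = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)

# Regression of Y on a constant, T and the control variable X.
coef = ols([[1, t, x] for t, x, _ in units], [y for _, _, y in units])
impact = coef[1]  # coefficient on T
```

In this constructed example the naive difference in means is negative, while the coefficient on T recovers the impact used to generate the data.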

Term

The control variables should be selected for a multiple regression model using what criteria in OIE?

Definition

a) They represent factors that may affect the outcome Y of the analysis independently from the program. b) They are distributed differently between the treated and non-treated groups.

Term

What is the meaning of the term: “sensitivity of impact estimates to different functional forms of the control variables”?

Definition

It means that the impact estimates change drastically depending on the functional form (dummy, continuous, categorical, etc.) used to insert the control variables in the estimation model. Such high volatility of the impact estimates is a problem because it prevents drawing meaningful policy recommendations from the results of the analysis.

Term

What possible solutions can be adopted when results are sensitive to the different functional forms of the control variables?

Definition

1. Run an extensive sensitivity analysis by replicating the model estimation under all possible combinations of alternative functional form choices for the control variables.
2. Implement the analysis with a Propensity score matching technique (through the balancing property test, propensity score matching does allow the analysis to be implemented with a validated functional form for the control variables)

Term

Explain the role of the propensity score in OIE.

Definition

The propensity score (PS) is a parameter (ranging from 0 to 1) that summarizes all the control variables adopted in the analysis. For programs targeting units of observation in need of assistance, the PS can be read intuitively as the degree (in %) of initial distress of the units of observation (PS values close to 0 represent low distress, PS values close to 1 represent high distress). PS values are then used to match treated units with comparable non-treated units through different propensity score matching techniques.

Term

When should you use Cross-Sectional Multiple Regression?

Definition

When treated and non-treated units do not have identical characteristics and panel data are not available.

Term

What form of bias do multiple regression control variables combat?

Definition

Selection bias. Inserting the characteristics that differ between treated and non-treated units as “control variables” in a multiple regression model helps reduce selection bias.

Term

How do you choose the correct functional form of a variable in multiple regression?

Definition

Unfortunately there are no certain criteria to choose “correct” functional forms.
Possible Solutions:
1) extensive sensitivity analysis to check whether or not results are stable to different functional form options;
2) using PSM (propensity score matching) to run the analysis instead of multiple regression models.

Term

How does Propensity Score Matching avoid the problems that arise with differing functional forms in regression?

Definition

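PSM avoids them through its “balancing property”, which makes it possible to test whether a given functional form of the control variables is appropriate.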

Term

In what way is using Propensity Scores like regression?

Definition

It is basically a logit regression.

Apart from a couple of technical details, propensity scores work very similarly to multiple regression. The model is only as good as the control variables.

Term

How is the output of the propensity score interpreted?

Definition

It is the output of a logit regression (from 0 to 1)

-close to 1 = highly disadvantaged initial conditions, more likely to see intervention (e.g. areas with high crime, low income, …)
-close to 0 = favorable initial conditions, unlikely to receive intervention

Term

How do we put into practice the "balancing property" for Propensity score matching?

Definition

1) The propensity score, P(T=1) = B0 + B1 + B2 + B3, is estimated based on a specific functional form for each variable

2) All units (both the treated and the non-treated) are sorted according to their Propensity Score value
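The two steps above can be sketched as follows. This is only an illustration under stated assumptions: the data are hypothetical, there is a single control variable, and the logit is fitted with a hand-written gradient ascent (`fit_logit` and `predict` are illustrative helpers, not a library API). In practice a statistical package would be used for the estimation.

```python
import math

def predict(b, x):
    """P(T=1) for one unit under a logit model with coefficients b."""
    z = b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(X, t, lr=1.0, steps=5000):
    """Fit logit coefficients by plain gradient ascent on the log-likelihood."""
    b = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(b)
        for x, ti in zip(X, t):
            err = ti - predict(b, x)
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        b = [bj + lr * g / n for bj, g in zip(b, grad)]
    return b

# Hypothetical control variable (e.g. an index of initial distress) and treatment flags.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8]]
t = [0, 0, 0, 1, 0, 1, 1, 1]

# Step 1: estimate the propensity score of every unit.
b = fit_logit(X, t)
ps = [predict(b, x) for x in X]

# Step 2: sort all units (treated and non-treated) by their PS value.
ranked = sorted(zip(ps, t, range(len(X))))
```

Each entry of `ranked` is a tuple (PS, treatment flag, unit index), ordered from the lowest to the highest propensity score.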

Term

Under the balancing property of Propensity Score matching, how do you know the functional form of your variables is validated?

Definition

The functional form is validated if:
1) The entire sample can be stratified into contiguous strata, each containing at least one treated and one non-treated unit. (We need to divide the data into different strata. In our class data, for example, the first acceptable stratum would be 1-36, since 37 is the first treated unit and 1-36 are all T=0.)

2) Within each stratum, the mean value of each control variable (B1, B2, B3, etc.) shows no statistically significant difference between the treated and the non-treated units.
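A rough sketch of such a check is given below for a single control variable. Several choices here are assumptions made for illustration only: the strata are passed in as PS cut points, the comparison of means uses a Welch t statistic with a threshold of 2 rather than an exact p-value, and each stratum must hold at least two units per group so that a variance can be computed.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch t statistic for the difference in means of two samples."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def balanced(units, cuts, t_crit=2.0):
    """units: list of (ps, treated, x); cuts: PS edges of contiguous strata."""
    for lo, hi in zip(cuts, cuts[1:]):
        stratum = [u for u in units if lo <= u[0] < hi]
        xt = [x for _, t, x in stratum if t == 1]
        xn = [x for _, t, x in stratum if t == 0]
        if len(xt) < 2 or len(xn) < 2:
            return False  # assumption: too few units to compare means
        if abs(welch_t(xt, xn)) >= t_crit:
            return False  # means differ too much within this stratum
    return True

# Hypothetical units: (propensity score, treated flag, control variable).
units = [(0.2, 1, 1.0), (0.3, 1, 1.2), (0.1, 0, 1.1), (0.4, 0, 0.9),
         (0.6, 1, 2.0), (0.8, 1, 2.4), (0.7, 0, 2.1), (0.9, 0, 2.2)]
ok = balanced(units, [0.0, 0.5, 1.0])

# A case where treated and non-treated units clearly differ within the stratum.
bad_units = [(0.2, 1, 5.0), (0.3, 1, 5.2), (0.1, 0, 1.0), (0.4, 0, 1.1)]
not_ok = balanced(bad_units, [0.0, 1.0])
```

With several control variables the same comparison would be repeated for each variable within each stratum.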

Term

How does "Nearest Available" Propensity Score Matching work?

Definition

Each non-treated district can be used only once, so units are removed in matched pairs. As matching proceeds, the matches get worse because fewer suitable comparison units remain.

1) The treated units (NT) are listed in a separate file and sorted based on their PS value (or in a random order)

2) The first of the treated units (NT) is matched with the non-treated
unit having the most similar PS

3) The two matched units are removed from the original lists and
placed in a third file. Steps 1-3 are replicated for each of the NT
treated units.
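The procedure above can be sketched in a few lines. The unit names and PS values are hypothetical, and `nearest_available` is an illustrative helper that keeps the matched pairs in a list instead of separate files.

```python
def nearest_available(treated, controls):
    """Match each treated unit to the closest unused non-treated unit.

    treated, controls: dicts mapping unit id -> propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    pool = dict(controls)  # non-treated units still available
    pairs = []
    # Treated units are processed in order of their PS value.
    for tid, ps in sorted(treated.items(), key=lambda kv: kv[1]):
        if not pool:
            break  # no non-treated units left to match
        cid = min(pool, key=lambda c: abs(pool[c] - ps))
        pairs.append((tid, cid))
        del pool[cid]  # each non-treated unit is used only once
    return pairs

treated = {"A": 0.60, "B": 0.65}
controls = {"x": 0.62, "y": 0.30, "z": 0.90}
pairs = nearest_available(treated, controls)
```

Here unit B would have been closest to x as well, but x is already taken by A, so B is paired with the more distant z. This is the deterioration in match quality described above.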

Term

Impact Estimate for "Nearest Match" Propensity Score

Definition

1) summing the single ΔY (change in the dependent variable) between each treated unit and its matched unit

2) computing the weighted average of the single ΔY between each treated unit and its matched unit, with weight Wi
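A small sketch of this computation follows. The outcome values are hypothetical, and because the card does not define the weights Wi, the function simply assumes equal weights unless others are supplied.

```python
def matched_impact(pairs, y, weights=None):
    """Weighted average of the outcome gaps between matched units.

    pairs: list of (treated_id, control_id); y: dict unit id -> outcome Y.
    weights: one weight per pair (assumed equal when omitted).
    """
    gaps = [y[t] - y[c] for t, c in pairs]
    if weights is None:
        weights = [1.0] * len(gaps)
    return sum(w * g for w, g in zip(weights, gaps)) / sum(weights)

pairs = [("A", "x"), ("B", "z")]
y = {"A": 10, "x": 7, "B": 8, "z": 6}

equal = matched_impact(pairs, y)               # gaps 3 and 2, equal weights
weighted = matched_impact(pairs, y, [3, 1])    # first pair counts three times as much
```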

Term

Impact estimate for "Nearest Match" Propensity Score - formula.

Definition

[image]

Term

[image]

Definition

"Nearest Match" Propensity Score Matching

Term

What are trade-offs involved with choosing the tolerance in radius (PS) matching estimators?

Definition

On the one hand, it would be an advantage to enlarge the radius in order to obtain a larger estimation sample (improving the statistical efficiency of the impact estimates). On the other hand, to limit selection bias it would be an advantage to choose a radius as small as possible.

Term

What are advantages and disadvantages of PS matching with replacement versus nearest available PS matching?

Definition

Advantages: it reduces the risk of having to match the last treated units with non-treated units with too distant PS.
Disadvantages: it is more sensitive to measurement errors in the data affecting PS values of non-treated units (i.e. one non-treated unit with high PS can be matched to a large number of treated units, amplifying the effects of the possible measurement errors on the impact estimates).

Term

What is the main advantage of PS matching over multiple regression models?

Definition

PS matching can exploit the “balancing property” to test whether or not a given functional form of the control variables is appropriate. As a consequence, PS matching does not suffer from possible sensitivity of impact estimates to different functional forms of the control variables.

Term

What is the additional advantage of PS matching versus multiple regression models when Y data have to be retrieved through primary data collection?

Definition

Using a PS matching procedure (except for kernel matching) reduces the number of units used in the analysis, and therefore the number of units for which data collection has to take place. This reduces the cost of primary data collection for Y (in cases in which Y data are not available from statistical offices).

Term

Radius Matching

Definition

Radius matching allows each treated unit to be matched with more than one non-treated unit.

A “tolerance radius” is established, so that each non-treated unit with a PS within the tolerance is selected to be part of the comparison group for a given treated unit.

Radius matching is usually implemented with replacement: the same non-treated unit can be included in the comparison group of more than one treated unit.
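As an illustration, radius matching with replacement can be sketched as below. The unit names, PS values and radius are hypothetical, and `radius_match` is an illustrative helper.

```python
def radius_match(treated, controls, radius):
    """Comparison group of each treated unit: all non-treated units within the radius.

    treated, controls: dicts mapping unit id -> propensity score.
    Matching is with replacement, so a non-treated unit may appear in several groups.
    """
    return {
        tid: [cid for cid, cps in controls.items() if abs(cps - ps) <= radius]
        for tid, ps in treated.items()
    }

treated = {"A": 0.60, "B": 0.65}
controls = {"x": 0.62, "y": 0.30, "z": 0.68}

groups = radius_match(treated, controls, radius=0.04)
tight = radius_match(treated, controls, radius=0.01)
```

With the wider radius, x belongs to the comparison groups of both A and B. With the very tight radius, both comparison groups are empty, which shows why the radius cannot be shrunk indefinitely.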

Term

How to choose the tolerance radius in radius matching PS?

Definition

--Trade-off to be balanced:

•To obtain a larger estimation sample (improving the statistical efficiency of the impact estimates), the radius should not be too small

•To limit selection bias issues, the radius should be as small as possible.

In all cases, once the units outside the common support have been eliminated, the minimum radius has to be chosen so that each treated unit has a non-empty comparison group.
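One way to read the last condition is that the radius cannot be smaller than the largest "distance to the nearest non-treated unit" among the treated units. The sketch below computes that lower bound for hypothetical PS values; `min_radius` is an illustrative helper and assumes the units outside the common support have already been dropped.

```python
def min_radius(treated, controls):
    """Smallest radius that leaves no treated unit without a comparison group.

    treated, controls: dicts mapping unit id -> propensity score.
    """
    nearest = [min(abs(cps - ps) for cps in controls.values())
               for ps in treated.values()]
    return max(nearest)

treated = {"A": 0.60, "B": 0.65}
controls = {"x": 0.62, "y": 0.30, "z": 0.68}

r = min_radius(treated, controls)

# Number of non-treated units within radius r of each treated unit.
group_sizes = {
    tid: sum(1 for cps in controls.values() if abs(cps - ps) <= r)
    for tid, ps in treated.items()
}
```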

Term

[image]

Definition

Radius Matching

Term

Kernel Matching Details

Definition

Most advanced statistical matching procedure in this class

With Kernel matching the outcome (YT) of each treated unit is compared to a weighted average of the outcomes (YNT) of all non-treated units

Weights are inversely proportional to the distance between PS of the given treated unit (i) and the PS of the non-treated units
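A minimal sketch of kernel matching is shown below. The card does not specify the weighting function, so the Gaussian kernel and the bandwidth `h` used here are assumptions; they give larger weights to non-treated units whose PS is closer to that of the treated unit. The data are hypothetical.

```python
import math

def kernel_impact(treated, controls, h=0.1):
    """Average gap between each treated outcome and its kernel-weighted counterfactual.

    treated, controls: lists of (ps, y). h: bandwidth of the Gaussian kernel (assumed).
    """
    effects = []
    for ps_t, y_t in treated:
        # Weight of every non-treated unit decreases with its PS distance.
        w = [math.exp(-0.5 * ((ps_c - ps_t) / h) ** 2) for ps_c, _ in controls]
        counterfactual = sum(wi * y for wi, (_, y) in zip(w, controls)) / sum(w)
        effects.append(y_t - counterfactual)
    return sum(effects) / len(effects)

# All non-treated units are used, but the distant one barely counts.
effect = kernel_impact([(0.5, 10.0)], [(0.5, 4.0), (0.9, 0.0)])

# If every non-treated unit has the same outcome, the weights do not matter.
flat = kernel_impact([(0.5, 8.0), (0.7, 8.0)], [(0.4, 5.0), (0.6, 5.0), (0.9, 5.0)])
```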

Term

[image]

Definition

Kernel Matching

Term

When is Conditional Matching (Kernel Matching) the appropriate matching form?

Definition

Conditional matching is recommended when it is suspected that relevant unobservable characteristics may be differently distributed between treated and non-treated units.

For example, non-treated units:
I) Are located in different regions or;

II) Operate in different sectors (for analyses with firm-level outcomes) or;

III) Have different sizes.

Term

How do you conduct conditional matching?

Definition

Treated and non-treated units are separately sorted into the categories defined by location, business sector or size.

One matching procedure is applied separately for each category, such that the treated units are matched only with the non-treated units belonging to the same category (and therefore possessing the same unobserved characteristics as the treated unit).
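The sketch below illustrates the idea with a nearest available match restricted to units of the same category. The categories, unit names and PS values are hypothetical, and any of the matching procedures could be used inside each category instead.

```python
def conditional_match(treated, controls):
    """Nearest available PS match, restricted to units in the same category.

    treated, controls: lists of (unit id, category, propensity score).
    Returns a list of (treated_id, control_id) pairs.
    """
    used = set()
    pairs = []
    for tid, cat, ps in treated:
        # Only unused non-treated units from the same category are eligible.
        pool = [(cid, cps) for cid, ccat, cps in controls
                if ccat == cat and cid not in used]
        if not pool:
            continue  # no eligible comparison unit in this category
        cid, _ = min(pool, key=lambda c: abs(c[1] - ps))
        used.add(cid)
        pairs.append((tid, cid))
    return pairs

treated = [("A", "north", 0.60)]
controls = [("x", "south", 0.60), ("y", "north", 0.40)]
pairs = conditional_match(treated, controls)
```

Unit x has exactly the same PS as A but sits in a different region, so A is matched with y instead.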

Term

The Importance of using Panel Data

Definition

A simple comparison of treated and non-treated units is not a reliable basis for inferring the impact of the program, because the two groups have different initial characteristics.

Term

Difference in Difference Estimator

Definition

a^=E(Ypost - Ypre | Ti=1) - E(Ypost - Ypre|Ti=0)

DD estimators require the availability of both pre- and post-intervention data

DD allows the analysis to control for some unobservable differences between the treated and the non-treated units
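A minimal numerical sketch of the DD estimator, using hypothetical pre- and post-intervention outcomes:

```python
from statistics import mean

def dd(units):
    """Difference-in-difference estimate.

    units: list of (treated flag, y_pre, y_post).
    """
    change_t = [post - pre for t, pre, post in units if t == 1]
    change_n = [post - pre for t, pre, post in units if t == 0]
    return mean(change_t) - mean(change_n)

units = [(1, 10, 16), (1, 12, 18),   # treated: Y rises by 6
         (0, 20, 23), (0, 22, 25)]   # non-treated: Y rises by 3
impact = dd(units)
```

The treated and non-treated units start from very different levels of Y, but only their changes over time enter the estimate.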

Term

What is the major advantage of panel data?

Definition

Every time we compare outcomes between the treated and non-treated units we face a certain danger: the two groups may not really be comparable, so the comparison suffers from selection bias. The treated units might simply have different characteristics than the non-treated units. The major advantage of panel data is that when you compare not just the outcomes but the pre-post change between T=1 and T=0, any initial characteristic that is potentially different between T=1 and T=0 (as long as it is a fixed effect) does not have to be controlled for with a control variable. At the very least, you don't need AS MANY control variables.

Term

How does adding additional pre-intervention data make impact estimates more reliable?

Definition

The additional data (e.g. 1995) make it possible to estimate whether or not the pre-intervention trend of Y was different between the treated and non-treated units. Any difference that is detected between treated and non-treated units is incorporated in the analysis as a factor used to adjust the initial estimate of the counterfactual trend.

Term

Difference in Difference in Difference impact formula

Definition

a^ = E[(Y2005 - Y2000) - (Y2000 - Y1995)|Ti=1] - E[(Y2005 - Y2000) - (Y2000 - Y1995)|Ti=0]

Term

With a DDD model, in which way is the counter-factual estimated?

Definition

The counterfactual is estimated as the pre-post intervention change of Y recorded in the non-treated units, corrected by the pre-intervention differential change of Y between the treated and non-treated units.
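The sketch below works through a DDD estimate on hypothetical data for 1995, 2000 and 2005, with the intervention assumed to start after 2000.

```python
from statistics import mean

def ddd(units):
    """Difference-in-difference-in-difference estimate.

    units: list of (treated flag, y1995, y2000, y2005).
    """
    def trend_break(flag):
        # Post-intervention change minus pre-intervention change, averaged by group.
        return mean([(y05 - y00) - (y00 - y95)
                     for t, y95, y00, y05 in units if t == flag])
    return trend_break(1) - trend_break(0)

units = [(1, 10, 12, 20), (1, 20, 22, 30),   # treated: +2 before, +8 after
         (0, 10, 11, 13), (0, 30, 31, 33)]   # non-treated: +1 before, +2 after
impact = ddd(units)
```

In these numbers the counterfactual change for the treated units is 3: the non-treated change of 2, corrected by the pre-intervention differential of 2 - 1 = 1. The observed treated change is 8, so the estimate is 5.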

Term

What is the advantage of combining a DD scheme with Multiple regression or PS matching compared to multiple regression or PS matching without a DD scheme?

Definition

When panel data are available, combining a DD scheme with Multiple regression or PS matching reduces the need to include observable control variables in the analysis. This is because all factors that can be assumed to be fixed effects do not have to be controlled for.

Term

How does conditional Difference in Difference with Propensity Score Matching work?

Definition

1) Based on an appropriate set of control variables, a PS variable is estimated
2) A nearest available (with or without replacement) PS matching, or a radius matching procedure is implemented
3) The impact estimates are obtained by comparing the pre-post intervention difference of Y between the treated and the matched non-treated units (i.e. a^=E(Ypost - Ypre|Ti=1) - E(Ypost - Ypre|Ti=0))
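Steps 2 and 3 can be sketched together as follows. The example assumes that the PS values have already been estimated (step 1) and uses a nearest match with replacement; the data are hypothetical.

```python
def cdd_psm(treated, controls):
    """Conditional DD with nearest PS matching (with replacement).

    treated, controls: lists of (ps, y_pre, y_post).
    Returns the average gap in pre-post change between matched units.
    """
    gaps = []
    for ps, pre, post in treated:
        # Closest non-treated unit by PS; the same unit may be reused.
        _, c_pre, c_post = min(controls, key=lambda c: abs(c[0] - ps))
        gaps.append((post - pre) - (c_post - c_pre))
    return sum(gaps) / len(gaps)

treated = [(0.60, 10, 16), (0.70, 12, 20)]
controls = [(0.58, 20, 23), (0.10, 5, 5), (0.72, 30, 34)]
impact = cdd_psm(treated, controls)
```

The non-treated unit with PS 0.10 is never used, since it is not the closest match for any treated unit.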

Term

What is the advantage of Conditional DD with multiple regression models (and of Conditional DD with PS matching models) compared to pure DD models?

Definition

Compared to a pure DD model, CDD with MR models do not require the assumption that the observable control variables X are fixed effects in order to obtain unbiased results.

Term

What is the advantage of Conditional DD with PS matching models compared to Conditional DD with multiple regression models?

Definition

Compared to Conditional DD with multiple regression models, CDD with PS matching offers the following advantages:
-it solves the issue of sensitivity of impact estimates to different functional forms of the control variables
-it reduces costs for data-collection if Y data has to be collected for the evaluation

Term

With Conditional DD with Multiple regression models (and Conditional DD with PS matching), at what time do you have to measure the control variables? Why is this the case?

Definition

Typically you have to measure control variables at the pre-intervention time. This is to reduce the risk of the control variables becoming endogenous to the treatment (i.e. the control variables becoming affected by the treatment itself)

Term

The Endogeneity Problem

Definition

Very often control variables cannot be included if they are measured over the same period as the program intervention. This is because of “endogeneity” problems.

For example, if EZ incentives work very well, they could lower crime rates in the years during the program intervention. Crime rate changes during the program intervention would then not be something to control for; they would be a secondary outcome of the program intervention.
