Impact evaluation
Impact evaluation assesses the extent to which an intervention (such as a project, program or policy) is causally responsible for observed outcomes, intended and unintended. In contrast to outcome monitoring, which examines whether targets have been achieved, impact evaluation is structured to answer the question: how would outcomes such as participants' well-being have changed if the intervention had not been undertaken? This involves counterfactual analysis, that is, "a comparison between what actually happened and what would have happened in the absence of the intervention." Impact evaluations seek to answer cause-and-effect questions. In other words, they look for the changes in outcome that are directly attributable to a program.
Impact evaluation helps people answer key questions for evidence-based policy making: what works, what doesn't, where, why and for how much? It has received increasing attention in policy making in recent years in the context of both developed and developing countries. It is an important component of the armory of evaluation tools and approaches and integral to global efforts to improve the effectiveness of aid delivery and public spending more generally in improving living standards. Originally more oriented towards evaluation of social sector programs in developing countries, notably conditional cash transfers, impact evaluation is now being increasingly applied in other areas such as agriculture, energy and transport.
Counterfactual analysis enables evaluators to attribute cause and effect between interventions and outcomes. The 'counterfactual' is what would have happened to beneficiaries in the absence of the intervention, and impact is estimated by comparing counterfactual outcomes to those observed under the intervention. The key challenge in impact evaluation is that the counterfactual cannot be directly observed and must be approximated with reference to a comparison group. There is a range of accepted approaches to determining an appropriate comparison group for counterfactual analysis, using either prospective (ex ante) or retrospective (ex post) evaluation design. Prospective evaluations begin during the design phase of the intervention and involve collecting baseline and end-line data from intervention beneficiaries (the 'treatment group') and non-beneficiaries (the 'comparison group'); they may also involve selecting individuals or communities into treatment and comparison groups. Retrospective evaluations are usually conducted after the implementation phase and may exploit existing survey data, although the best evaluations will collect data as close to baseline as possible, to ensure comparability of intervention and comparison groups.
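To make the comparison concrete, here is a minimal sketch that assumes baseline and end-line outcome data exist for both groups. It uses a difference-in-differences calculation, a common estimator that is not named in the text above: the comparison group's change over time stands in for the unobservable counterfactual change in the treatment group. The function name and all numbers are hypothetical and purely illustrative.

```python
from statistics import mean


def diff_in_diff(treat_base, treat_end, comp_base, comp_end):
    """Treatment group's average change minus comparison group's average change."""
    treat_change = mean(treat_end) - mean(treat_base)
    comp_change = mean(comp_end) - mean(comp_base)
    return treat_change - comp_change


# Hypothetical outcome scores (illustrative only, not real data).
treat_base = [10.0, 12.0, 11.0, 9.0]   # treatment group at baseline
treat_end = [15.0, 17.0, 16.0, 14.0]   # treatment group at end-line
comp_base = [10.0, 11.0, 12.0, 9.0]    # comparison group at baseline
comp_end = [12.0, 13.0, 14.0, 11.0]    # comparison group at end-line

impact = diff_in_diff(treat_base, treat_end, comp_base, comp_end)
print(impact)  # 3.0: a change of 5.0 in treatment minus 2.0 in comparison
```

The estimate is only as good as the comparison group: it assumes that, without the intervention, the treatment group's outcomes would have changed by the same amount as the comparison group's.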
There are five key principles relating to internal validity (study design) and external validity (generalizability) which rigorous impact evaluations should address: confounding factors, selection bias, spillover effects, contamination, and impact heterogeneity.
Impact evaluation designs are identified by the type of method used to generate the counterfactual and can be broadly classified into three categories – experimental, quasi-experimental and non-experimental designs – that vary in feasibility, cost, whether the evaluation begins during the design phase or after the implementation phase of the intervention, and degree of selection bias. White (2006) and Ravallion (2008) discuss alternative impact evaluation approaches.
Under experimental evaluations the treatment and comparison groups are selected randomly, with the comparison group isolated from the intervention and both groups isolated from any other interventions which may affect the outcome of interest. These evaluation designs are referred to as randomized controlled trials (RCTs). In experimental evaluations the comparison group is called a control group. When randomization is implemented over a sufficiently large sample and the control group is not contaminated by the intervention, the only difference between treatment and control groups on average is that the latter does not receive the intervention. Random sample surveys, in which the sample for the evaluation is chosen randomly, should not be confused with experimental evaluation designs, which require the random assignment of the treatment.
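The distinction between random sampling and random assignment can be sketched in a few lines. The example below is illustrative only: the population, the simulated outcomes, and the assumed true effect of 2.0 are all hypothetical, and the helper names are not drawn from any evaluation library.

```python
import random
from statistics import mean


def randomly_assign(units, seed=0):
    """Shuffle the units and split them into treatment and control halves."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(outcomes, treatment, control):
    """Mean outcome of the treatment group minus mean outcome of the control group."""
    return mean(outcomes[u] for u in treatment) - mean(outcomes[u] for u in control)


rng = random.Random(42)

# Random sampling: decides WHO is in the study (a survey design choice).
population = range(1000)
sample = rng.sample(population, 100)

# Random assignment: decides who RECEIVES the intervention (the experimental step).
treatment, control = randomly_assign(sample, seed=1)
treated = set(treatment)

# Simulated outcomes: noisy baseline plus an assumed true effect of 2.0 if treated.
outcomes = {u: rng.gauss(50, 5) + (2.0 if u in treated else 0.0) for u in sample}

estimate = difference_in_means(outcomes, treatment, control)
print(f"Estimated average treatment effect: {estimate:.2f}")
```

Because assignment is random, the simple difference in means is an unbiased estimate of the average effect in this simulation; with only 50 units per arm, any single run can still land some distance from 2.0.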
The experimental approach is often held up as the 'gold standard' of evaluation. It is the only evaluation design which can conclusively account for selection bias in demonstrating a causal relationship between intervention and outcomes. Randomization and isolation from interventions might not be practicable in the realm of social policy and may be ethically difficult to defend, although there may be opportunities to use natural experiments. Bamberger and White (2007) highlight some of the limitations to applying RCTs to development interventions. Methodological critiques have been made by Scriven (2008), on account of the biases introduced because social interventions cannot be fully blinded, and by Deaton (2009), who has pointed out that in practice the analysis of RCTs falls back on the regression-based approaches they seek to avoid and so is subject to the same potential biases. Other problems include the often heterogeneous and changing contexts of interventions, logistical and practical challenges, difficulties with monitoring service delivery, access to the intervention by the comparison group, and changes in selection criteria and/or the intervention over time. It has been estimated that RCTs are applicable to only 5 percent of development finance.
RCTs are studies used to measure the effectiveness of a new intervention. They are unlikely to prove causality on their own; however, randomization reduces bias while providing a tool for examining cause-effect relationships. RCTs rely on random assignment, meaning that the evaluation almost always has to be designed ex ante, as it is rare that the natural assignment of a project would be on a random basis. When designing an RCT, there are five key questions that need to be asked: What treatment is being tested? How many treatment arms will there be? What will be the unit of assignment? How large a sample is needed? How will the test be randomized? A well-conducted RCT will yield a credible estimate of the average treatment effect within one specific population or unit of assignment. A drawback of RCTs is 'the transportation problem': what works within one population does not necessarily work within another, meaning that the average treatment effect is not necessarily applicable across differing units of assignment.
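The sample-size question can be illustrated with a standard normal-approximation power calculation for comparing two means. This is a sketch under stated assumptions, not a method prescribed by the text: it assumes two equal-sized arms, individual-level random assignment, a two-sided test, and a known outcome standard deviation. The function name and the example inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Approximate units needed per arm to detect `effect` (difference in means).

    Uses n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / effect)^2, rounded up.
    """
    if effect == 0:
        raise ValueError("effect must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


# Hypothetical design: detect a 0.5 standard-deviation effect.
n = sample_size_per_arm(effect=0.5, sd=1.0)
print(n)  # 63 units per arm under these assumptions
```

Halving the detectable effect roughly quadruples the required sample. Designs that randomize groups such as villages or schools rather than individuals generally need more units than this formula suggests, because outcomes within a group tend to be correlated.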