An Evaluation Initiative to Support Learning the Impact of USAID’s Democracy and Governance Programs


Nearly two decades after the U.S. government and other donors began making major investments in promoting democracy and governance (DG) abroad, a number of international studies found that surprisingly little hard empirical evidence exists about the impact of these investments (see Chapter 2 for a discussion of these studies). New cross-national quantitative research suggests that DG funding on average has spurred democracy, but this analysis reveals nothing about the efficacy of specific projects or activities (such as local government capacity building, investments in civil society organizations, or judicial training) that have come to dominate the U.S. Agency for International Development (USAID) DG menu (Al-Momani 2003; Finkel et al 2007, 2008; Kalyvitis and Vlachaki 2007; Azpuru et al 2008). Decades of monitoring and process evaluation reports have yielded significant amounts of data on outputs (e.g., local governments supported, nongovernmental organizations (NGOs) funded, judges trained) and valuable reflections on the process of delivering DG assistance. But as discussed in earlier chapters, they have so far provided little evidence that meets accepted standards of impact evaluation about whether these projects have strengthened local governments, contributed to more robust civil societies, or helped create more legitimate judicial sectors in the countries in which they have been implemented.

Five years from now, the committee hopes that USAID will be in a position not only to clearly and persuasively identify the effects of its DG programs but also to claim leadership in the procedures for conducting sound impact evaluations of them where feasible and appropriate. To do this, USAID must invest in creating an ethos of evaluation, so that at least some of its DG projects are seen as presenting valuable opportunities to learn about what works and what does not in encouraging the growth of democratic institutions and values around the world.

Earlier chapters analyzed current USAID approaches to assessment and evaluation and proposed ways to provide the evidence of project impact that USAID needs both for its own programming and for presenting and defending its programs to the broader policy community in Washington and internationally. They also focused on the specific policy and process changes that the committee believes are needed to help USAID overcome concerns that hinder undertaking sound impact evaluations and to augment USAID’s overall learning to support DG programming. This chapter outlines a suggested strategy for USAID and its Strategic and Operational Research Agenda (SORA) to implement such changes.

The committee recommends a special initiative, a synthesis of many of its earlier proposals for what USAID should do in the future, to examine the feasibility of applying the most rigorous impact evaluation methods to DG projects. Recognizing both the current skepticism in the DG assistance community about impact evaluations and the significant organizational barriers that their implementation faces given current U.S. contracting and management practices, the committee’s recommendation is relatively modest: a pilot or set of demonstration projects undertaken within the current USAID structure.


Obtaining more impact evaluations to determine the effects of DG programs is chiefly a matter of setting priorities, and that is the domain of leadership. Strong leadership is essential if USAID is to become an organization that prizes learning about the successes and failures of its DG projects, whether launched in the missions, regional bureaus, or the central DG office. Because DG programs are such an important, and often controversial, part of U.S. foreign policy, the committee recommends that leadership should come from the top, in the form of a DG evaluation initiative led by a senior USAID official. This initiative should be guided by a policy statement outlining the strategic role of investments in impact evaluations of DG programming. It is particularly important that the “vision” behind impact evaluations make clear that gaining knowledge of what works and what does not work is the primary goal.

Impact evaluations should thus be targeted as far as possible to study projects as designed and carried out; the discussion in Chapters 6 and 7 shows that actual projects, not just artificial or deceptively simple versions of them, could likely be given sound impact evaluations, including the most effective randomized designs. In addition, missions and implementers with generally good records will be positively recognized, and not sanctioned, if they uncover sound evidence that programs do not work or work poorly.

This statement would provide a valuable opportunity to adjust the balance of motivations that currently drive monitoring and evaluation (M&E) in DG. The administrator should see the need for this initiative, both to ensure the sound and effective use of the considerable increases in budgetary resources going into DG programs in the past five years and to create a leading edge for revitalizing evaluation across the agency.1 The initiative would begin a conscious and deliberate effort to undertake the highest-quality impact evaluations (including randomized designs where possible), in order to restore a better balance among different types of M&E activities, which are now largely focused on tracking project outputs or very proximate outcomes. Impact evaluations would help USAID accumulate knowledge that would (1) distinguish project models that work from those that do not, (2) identify the conditions under which particular approaches are more or less effective, and (3) help USAID avoid costly investments that may cause harm or may simply be ineffective.

The committee’s charge is limited to recommendations for improving USAID’s ability to evaluate its DG projects, but there could be advantages to making this an agency-wide initiative. USAID implements social programs in many parts of the agency, so the changes the committee recommends could yield much wider benefits. As discussed in Chapter 2, the World Bank has taken this approach through its Development Impact Evaluation (DIME) Initiative, and NGOs such as the Poverty Action Lab at the Massachusetts Institute of Technology and the Evaluation Gap Working Group of the Center for Global Development are working to promote impact evaluations for a range of social programs.2 This is a time when many policymakers, both within and outside the United States, are calling for reinvigoration and rethinking of foreign assistance programs (among myriad sources, see, e.g., Lancaster 2000, 2006; National Endowment for Democracy 2006; Epstein et al 2007; HELP Commission 2007; Hyman 2008).

1 A 2006 study from the National Research Council addressed the broader issues of the decline in evaluation capacity across USAID (NRC 2006).

2 Information about the evaluation gap initiative can be found at http://www.cgde.org/section/initiaties/_actie/ealgap. Accessed August 27, 2007. Information about the Abdul Latif Jameel Poverty Action Lab can be found at http://www.poertyactionlab.com/. Accessed on August 3, 2007. Information about the DIME initiative can be found at http://econ.worldbank.org/WBSITE/EXTERNAL/EXTDEC/0,,contentMDK:0~menuPK:~pagePK:0~piPK:0~theSitePK:,00.html. Accessed on August 3, 2007.

In addition to its program benefits, a DG evaluation initiative could place USAID among those in the forefront of improving development policy. Although there are sound reasons to think that impact evaluations may often not prove feasible, and committee member Larry Garber has often noted such concerns, the potential gains in accurate and defensible knowledge where such evaluations do prove feasible would be considerable. USAID is unique among donors in the range of assistance projects and the number of countries in which it operates at any given time. The committee is thus unanimous in recommending that USAID undertake a pilot program to learn whether impact evaluations will yield new insights into the effectiveness of DG projects.


Improving the evaluation of DG programs should embrace a multitiered approach. Not all projects need be, or should be, chosen for the most intensive evaluation using the techniques of randomized assignment to treatment and control groups outlined in Chapter 5. Neither USAID staff nor their implementing partners currently have the capacity to implement impact evaluations widely, and these skills require time and experience to develop. Moreover, as already discussed, the skepticism the committee encountered about whether impact evaluations were feasible persuaded members that a well-organized piloting of impact evaluations on a few select programs would be the best way to start. Moving too quickly or too sweepingly could impose an unacceptably high cost on USAID’s efforts to assist the development of democracy and good governance throughout the world.

Tasks for the DG Evaluation Initiative

The committee strongly recommends that, to accelerate the building of a solid core of knowledge regarding project effectiveness, the DG evaluation initiative should immediately develop and undertake a number of well-designed impact evaluations that test the efficacy of key project models or core development hypotheses that guide USAID DG assistance. A portion of these evaluations should use randomized designs, as these are the most accurate and credible means of ascertaining program impact. By key models, the committee refers to projects that (1) are implemented in a similar form across multiple countries and (2) receive substantial funding (examples include projects to support local government, civil society, and judicial training). By core hypotheses, the committee refers to the assumptions guiding USAID project design that, whether drawn from experience or prevailing ideas about how democracy is developed and sustained, have not been tested as empirical propositions. Examples include the assumption that public service delivery improves if citizens have oversight over the spending of public monies or the idea that exposure to democratic practices increases people’s faith in democratic institutions.

The DG evaluation initiative should identify three or four program models that are widely used in DG promotion and two or three core hypotheses that guide DG thinking on democracy assistance and then plan and conduct impact evaluations of these models and hypotheses across a range of countries or contexts over the next five years. As many of these as possible should be chosen to offer feasible designs for random assignment evaluations. However, for important programs for which USAID desires impact evaluations but for which randomization is not feasible, carefully developed alternative designs, of the types discussed in Chapter 5, should be developed and implemented.

At the end of this five-year period, USAID would have:

• practical experience in implementing the evaluation designs that can indicate where such approaches are feasible, what the major obstacles are to wider implementation, and whether and how these obstacles can be overcome;

• where the evaluations prove feasible, a solid empirical foundation to begin (1) assessing the validity of some of the key assumptions that underlie DG projects and (2) learning which commonly used projects work and which do not in achieving program goals; and

• the basis for judging how widely to apply such impact evaluations to DG program evaluations in the future.

