Development, Security, and Cooperation. Policy and Global Affairs. The National Academies Press, 500 Fifth Street, N.W., Washington, DC 20001.
The F Process indicators reflect both of these problems, although they had little impact on day-to-day project implementation during the course of this study. As noted above, these mandate collecting data at the “Objective” and “Area” levels, which correspond to macro- and meso-level indicators in the table, and at the “Element” level, which corresponds mostly to the output level. Data at the outcome level, which seems crucial to evaluating how well specific projects actually achieve their immediate goals, thus suffer relative neglect.
USAID mission staff and program implementers complained that the success of their projects was being judged (in part) on the basis of macro-level indicators that bore little or no plausible connection to the projects they were running, given the limited funds expended and the macro nature of the indicators. The most common example given was the use of changes in the Freedom House Political Rights or Civil Liberties indices as evidence of the effectiveness or ineffectiveness of their projects, even though these national-level indices were often quite evidently beyond the power of any single project to affect. One implementer commented that his group had benefited from an apparent perception that his project had contributed to improvements in the country’s Freedom House scores over the past several years. While this worked in his firm’s favor, he made clear that the connection was purely coincidental; he was also concerned that if the government policies that currently helped his work changed and made his work more difficult, this would be taken as evidence that his project had “failed.”

This is a poor way to measure project effectiveness. To use the example in Table 2-1, although USAID may contribute to better elections or even more democracy in a nation as a whole, there are always multiple forces, and often multiple donors, at work pursuing these broad goals. USAID may be very successful in helping a country train and deploy election monitors and thus reduce irregularities at polling stations. But if national leaders have already excluded viable opposition candidates from running, or deprived them of media access, the resulting flawed elections should not mean that USAID’s specific election project was ineffective.
As a senior USAID official with extensive experience in many areas of foreign assistance has written regarding this problem:
To what degree should a specific democracy project, or even an entire USAID democracy and governance programme, be expected to have an independent, measurable impact on the overall democratic development in a country? Th[at] sets a high and perhaps unreasonable standard of success. Decades ago, USAID stopped measuring the success of its economic development programmes against changes in the recipient countries’ gross domestic product (GDP). Rather, we look for middle-level indicators: we measure our anti-malaria programmes in the health sector against changes in malaria statistics, our support for legume research against changes in agricultural productivity. What seems to be lacking in democracy and governance programmes, as opposed to these areas of development, is a set of middle-level indicators that have two characteristics: (a) we can agree that they are linked to important characteristics of democracy; and (b) we can plausibly attribute a change in those indicators to a USAID democracy and governance programme. It seems clear that we need to develop a methodology that is able to detect a reasonable, plausible relationship between particular democracy activities and processes of democratic change. (Sarles 2007:52)

The appropriate standard for evaluating the effectiveness of specific DG projects, and even broader programs, is how much of the targeted improvement in behavior and institutions can be observed compared to conditions in groups not supported by such projects or programs. It is in identifying how much difference specific programs or projects made, relative to the investment in those programs, that USAID can learn what works best in given conditions.
Of course, it is hoped that such projects do contribute to broader processes of democracy building. But these broader processes are subject to so many varied forces—from strategic interventions to ongoing conflicts to other donors’ actions and the efforts of various groups in the country to obtain or hold on to power—that macro-level indicators are a misleading guide to whether or not USAID projects are in fact having an impact.
USAID efforts in such areas as strengthening election commissions, building independent media, or supporting opposition political parties may be successful at the project level but only become of vital importance to changing overall levels of democracy much later, when other factors internal to the country’s political processes open opportunities for political change (McFaul 2006). Learning “what works” requires that USAID focus its efforts to gather and analyze data on outcomes at the appropriate level for evaluating specific projects—what is labeled “outcome” measures in Table 2-1.
The committee wants to stress that there are good reasons for employing meso- and macro-level indicators of democracy and working to improve them. They are important tools for strategic assessment of a country’s current condition and long-term trajectory regarding democratization. But these indicators are usually not good tools for project evaluation. For the latter purpose, what is needed, as Sarles noted, are measures that are both policy relevant and plausibly linked to a specific policy intervention sponsored by USAID. The committee discusses these policy-relevant outcome measures and provides examples from our field visits in Chapter 7.
If one concern regarding USAID’s evaluation processes is that they may rely too much on meso- and macro-measures to judge program success, the committee also found a related concern regarding USAID’s data collection for M&Es: USAID spends by far the bulk of its M&E efforts on data at the “output” level, the first category in Table 2-1.
Current M&E Practices and the Balance Among Types of Evaluations

In the current guidelines for USAID’s M&E activities given earlier, only monitoring is presented as “an ongoing, routine effort requiring data gathering, analysis, and reporting on results at periodic intervals.” Evaluation, by contrast, is presented as an “occasional” activity to be undertaken “only when needed.” The study undertaken for SORA by Bollen et al. (2005) that is discussed in Chapter 1 found that most USAID evaluations were process evaluations. These can provide valuable information and insights but, as already discussed, do not help assess whether a project had its intended impact.
Although it cannot claim to have made an exhaustive search, the committee asked repeatedly for examples of impact evaluations of DG projects and learned about very few. One example was a well-designed impact evaluation of a project to support CSOs in Mali (Management Systems International 2000). Here the implementers had persuaded USAID to make use of annual surveys being done in the country and to use those surveys to measure changes in attitudes toward democracy in three distinct types of areas: areas that received the program, nearby areas that did not receive the program (to check for spillover effects), and areas that were distant from the sites of USAID activity. The results of this evaluation suggested that USAID programs were not having as much of an impact as the implementers and USAID had hoped to see.
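The logic of this three-group design can be made concrete with a short sketch. The numbers below are entirely hypothetical, invented for illustration only; they are not data from the Mali study. The sketch simply shows how before/after changes in program areas would be compared against changes in distant control areas, with nearby areas used to check for spillover:

```python
# Illustrative three-group comparison for an impact evaluation.
# All numbers are hypothetical mean scores on a pro-democracy
# attitude index (0-100) from annual surveys; NOT actual Mali data.

# (before, after) mean attitude scores by area type
surveys = {
    "program":   (54.0, 58.0),  # areas that received the program
    "spillover": (53.0, 54.5),  # nearby areas without the program
    "control":   (52.0, 53.0),  # areas distant from USAID activity
}

def change(area: str) -> float:
    """Before-to-after change in mean attitude score for an area type."""
    before, after = surveys[area]
    return after - before

# Impact estimate: change in program areas net of the change in
# distant control areas (a simple difference-in-differences).
impact = change("program") - change("control")

# Spillover check: did nearby non-program areas move more than
# distant ones?
spillover = change("spillover") - change("control")

print(f"estimated impact:    {impact:+.1f} points")
print(f"estimated spillover: {spillover:+.1f} points")
```

With these invented numbers the program areas improve 3.0 points more than distant controls, and nearby areas improve only 0.5 points more, suggesting modest spillover. The value of the design is that the control areas absorb nationwide trends that would otherwise be misattributed to the project.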
The response within USAID was informative. Some USAID staff members were concerned that a great deal of money had been spent to find little impact; complaints were thus made that the evaluation design had not followed changes made while the program was in progress or was not designed to be sensitive to the specific changes USAID was seeking. On the other hand, there were also questions about whether annual surveys were too frequent, or came too early, to capture the results of investments that were likely to pay off only in the longer term. And the project, by funding hundreds of small CSOs, might have suffered from its own design flaws; some of those who took part suggested that fewer and larger investments in a select set of CSOs might have had a greater impact. All of these explanations might have been explored further as a way to understand when and how impact evaluations work best. But from the committee’s conversations, the primary “lessons” taken away by some personnel at USAID were that such rigorous impact evaluations either did not work or were not worth the time, effort, and money.
While certainly only a limited number of projects should be subject to full evaluations, proper impact evaluations cannot be carried out unless “ongoing and routine efforts” to gather appropriate data on policy-relevant outcomes before, during, and after the project are designed into an M&E plan from the project’s inception. Current guidelines for M&E activity tend to hinder deliberate choices between impact and process evaluations and in particular make it very difficult to plan the former.
Based on the committee’s field visits to USAID DG missions, Chapter 7 discusses the potential for improving USAID M&E activities, in some cases simply by focusing more effort on obtaining data at the policy outcome level.
Using Evaluations Wisely: USAID as a Learning Organization

Even if USAID were to complete a series of rigorous evaluations with ideal data and obtain valuable conclusions regarding the effectiveness of its projects, these results would be of negligible value if they were not disseminated through the organization in a way that led to substantial learning and were not used as inputs to the planning and implementation of future DG projects. Unfortunately, much of USAID’s former learning capacity has been reduced by recent changes in agency practice.
A longstanding problem is that much project evaluation material is simply maintained in mission archives or lost altogether (Clapp-Wincek and Blue 2001). For example, the committee found that when project evaluations involved surveys, while the results might be filed in formal evaluation reports, the underlying raw data were discarded or kept by the survey firm after the evaluation was completed. While many case studies of past projects, as well as many formal evaluations, are supposed to be available to all USAID staff online, not all evaluations were easy to locate.
Moreover, simply posting evaluations online does not facilitate discussion, absorption, and use of lessons learned. Without a central evaluation office to identify key findings and organize conferences or meetings of DG officers to discuss those findings, the information is effectively lost.
As mentioned above, CDIE is no longer active. USAID also formerly had conferences of DG officers to discuss not only CDIE-sponsored evaluations but also research and reports on DG assistance undertaken by NGOs, academics, and other donors. These activities appear to have significantly atrophied. The committee is concerned about the loss of these learning activities. Even the best evaluations will not be used wisely if their lessons are not actively discussed and disseminated in USAID and placed in the context of lessons learned from other sources, including research on DG assistance from outside the agency and the experience of DG officers themselves. The committee discusses the means to help USAID become a more effective learning organization in Chapters 8 and 9.
CONCLUSIONS

This review of current evaluation practices regarding development assistance in general and USAID’s DG programs in particular leads the committee to a number of findings:
• The use of impact evaluations to determine the effects of many parts of foreign assistance, including DG, has been historically weak across the development community. Within USAID the evaluations most commonly undertaken for DG programs are process and participatory evaluations; impact evaluations are a comparatively underutilized element in the current mix of M&E activities.