With the exception of some activities relating to national-level policies, all interventions under the project took place in seven selected subnational regions (also called departments): Ayacucho, Cusco, Huanuco, Junin, Pasco, San Martin, and Ucayali.4 These seven regions contain 61 provinces, which in turn contain 536 districts.5 Workshops on participatory budgeting, training of civil society organizations (CSOs), and other interventions took place at the regional, provincial, and district levels.6 The ultimate goal of the project was to promote “increased responsiveness of subnational elected governments to citizens at the local level in selected regions” (USAID/Peru 2002). This outcome is potentially measurable on different units of observation. For example, government capacity and responsiveness could be measured at the district or provincial level (through expert appraisals or other means), while citizens’ perceptions of government responsiveness may be measured at the individual level (through surveys).

The PRODES decentralization project represented an ambitious effort. By all accounts it was a well-executed program; the performance of the local contractor received high marks from mission staff at USAID/Peru. The questions of interest here do not relate to the performance of the contractor in relation to project outputs or very proximate outcomes, which were the focus of the project monitoring plan used by the implementer.7 Instead, the question is how we could know whether such a project had impacts on targeted policy outcomes, such as the responsiveness of local governments to citizens’ demands.

4 The regions were nonrandomly selected for programs because they share high poverty rates, significant indigenous populations, and narcotics-related activities and because a number of the departments were strongholds for the Shining Path movement in the 1980s.

5 Peru has 24 departments plus one “constitutional province”; the 24 departments in turn comprise 194 provinces and 1,832 districts. Provinces and districts are often both called “municipalities” in Peru and both have mayors. Sometimes two or more districts combine to form a city, however.

6 Relevant subnational authorities include members of regional councils, provincial mayors, and mayors of districts.

IMPROVING DEMOCRACY ASSISTANCE

Since the project was not designed with impact evaluation, as defined here, in mind, it suffered from a number of serious deficiencies in that regard. The main deficiencies parallel the general points raised in Chapter 5: the absence of indicators for at least some of the most important policy outcomes, the absence of comparison units, and the absence of treatment randomization. Taken together, these shortcomings present almost insuperable obstacles to an impact evaluation. One important finding of the team was that with foresight some of these deficiencies might have been corrected fairly easily and at little additional cost. Indeed, some of the changes outlined below would likely yield cost savings.

As mentioned, the decentralization project sought to foster citizen participation, transparency, and accountability at the local level, with the ultimate objective of promoting “increased responsiveness of subnational elected governments to citizens.” Though some of these outcomes are potentially, albeit imperfectly, measurable, indicators gathered at the local level related almost exclusively to outputs rather than outcomes.

For example, the indicators gathered included the percentage of municipalities that signed “participation agreements” with local contractors; the percentage of participating municipalities from which at least two individuals (local authorities or representatives of CSOs) attended a training course in participatory planning and budgeting; the percentage of targeted provincial governments in which at least two CSOs exercised regular oversight of municipal government operations, as measured by participation in at least two public forums during the year; and the percentage of participating local governments that established technical teams to assist with decentralization efforts (PRODES PMP 2007).

Such indicators are designed to monitor the implementer’s performance and perhaps measure very proximate outcomes, such as formal participation in the decentralization process. However, they do little to help discern the impact of interventions on the main outcomes that the project was designed to affect. For purposes of evaluating impact—and even for improved project monitoring—we want to know not how many training courses there were or how many officials attended them but rather whether they led subnational elected governments to be more responsive to their citizens.8


Several indicators gathered through surveys did tap citizens’ perceptions of the responsiveness of subnational elected governments in targeted municipalities. Surveys taken in 2003, 2005, and 2006 asked respondents: Are the services provided by the (district, provincial, or regional) government very good, good, average, bad, or very bad? Another question, administered only in the 2003 and 2005 surveys,9 asked: Do you think that the (district, provincial, or regional) government is responsive to what the people want almost always, on the majority of occasions, from time to time, almost never, or never? (PRODES PMP 2006, 2007).

In principle, such survey questions may provide useful proxy measures of the outcomes of interest. In practice, however, two issues limited the usefulness of these measures. First, only the first question was asked in a comparable manner across all three surveys, allowing for a very limited time series on the outcome of interest. Second, and perhaps more important, no measures were gathered on control units in any survey except that of 2006, as discussed further below.

Finally, a “baseline” assessment of municipal capacity was prepared at the start of the program by a local institution. All district and provincial municipalities in the seven selected regions were coded along several dimensions, including extent of socioeconomic needs and management capacities of district and provincial governments (GRADE 2003).

Poverty rates and related indicators played a preponderant role in the local institution’s calculations, which may have limited the usefulness of the index for assessing changes in subnational government capacity or responsiveness. In theory, however, repeated assessments of this kind could have provided useful data on municipal capacity, which is an outcome of interest under the decentralization project. As far as the team could determine, the assessment was not repeated.

USAID/Peru’s implementer was tasked with carrying out the decentralization project in all 536 districts of the seven selected regions. Once the rollout of interventions in all municipalities had been completed, no untreated municipalities remained available in the selected regions. The absence of appropriate control units (untreated municipalities) is perhaps the biggest problem for effective evaluation of the decentralization project. In addition, since rollout was completed by the second year of the program, there was little opportunity to compare outcomes in treated and untreated units in the seven regions.

distinction is made in some of the relevant program monitoring plans (e.g., PRODES PMP 2006). However, most of the impact measures appear to be fairly proximate outcome measures related to the process of supporting decentralization.

9 The 2003 and 2005 questions were administered as a part of the Democratic Indicators Monitoring Survey, whereas for 2006, data came from the Latin American Public Opinion Project.

In principle, comparisons could be made across treated municipalities in the seven selected regions and untreated municipalities outside these regions. Since the seven regions were nonrandomly selected on the basis of characteristics that almost surely covary with municipal capacity and subnational government responsiveness (e.g., high poverty rates, narcotics-related activities, past presence of the Shining Path), however, inferences drawn from such comparisons would be problematic, although not completely uninformative. In practice, however, the data do not exist for such comparisons because virtually no data were gathered on control units. The exception is the 2006 commissioned survey taken as a part of the Latin American Public Opinion Project (LAPOP), which administered a questionnaire to a nationwide probability sample of adults including an oversample of residents in the seven regions in which USAID works (Carrión et al. 2007).10 This survey includes several questions that would be useful measures of the outcome variables (though only one question is comparable to questions asked in the earlier non-LAPOP surveys taken in treated municipalities in 2003 and 2005).11 The 2006 LAPOP national survey, had it been carried out beginning in 2003, could have established a national baseline against which the selected regions could have been measured before the program began.12 The project implementers would then have known, for example, if as was hypothesized, satisfaction with local government, participation in local government, corruption in local government, and so forth, were more problematic in the targeted regions than in the rest of the country. Since the regions selected were poorer and more rural than the nation as a whole, covariate controls could have been introduced in an analysis-of-variance design that could have statistically forced the nation and the control groups to look more alike.
Then, in each subsequent round of surveys, comparisons could have been made between the nation and the targeted regions, thereby making it possible to observe the rate of change. Had satisfaction with local government nationwide remained unchanged while the targeted areas showed increased satisfaction, project impact could have been established with a reasonable degree of confidence. Indeed, if national satisfaction had declined over the life of the project while the target areas held steady, this, too, could have been an indicator of project success. It is important to stress that since the mission was already regularly conducting national samples of public opinion, there would have been no added data-gathering costs in the hypothetical strategy just proposed. The only cost would have been the minimal expense of analyzing the data.

10 In addition to 1,500 respondents in the nationwide sample, an oversample of 2,100 (300 per region) was taken from the seven regions (Patricia Zárate, Instituto de Estudios Peruanos, personal communication June 2007). Inter alia, this survey asked respondents their opinions of the quality of local government services, as noted above.

11 The LAPOP instruments include questions that are comparable across 20 surveyed countries; see Seligson (2006). For useful information, the committee is grateful to Patricia Zárate, Instituto de Estudios Peruanos.

12 Of course, the national sample would need to have had removed from it any sample
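The comparison of change in the targeted regions against change in the nation as a whole can be illustrated with a short calculation. The sketch below is illustrative only: the function name and the survey scores are hypothetical and are not drawn from the PRODES or LAPOP data. It also ignores the covariate controls and sampling error that a real analysis would need to address.

```python
from statistics import mean


def difference_in_differences(target_before, target_after,
                              comparison_before, comparison_after):
    """Change in the targeted group minus change in the comparison group."""
    target_change = mean(target_after) - mean(target_before)
    comparison_change = mean(comparison_after) - mean(comparison_before)
    return target_change - comparison_change


# Hypothetical satisfaction scores (1 = very bad, 5 = very good).
# These values are invented for illustration; they are not survey results.
targeted_2003 = [2, 3, 2, 3]
targeted_2006 = [3, 3, 3, 4]
national_2003 = [3, 3, 4, 3]
national_2006 = [3, 4, 3, 3]

estimate = difference_in_differences(
    targeted_2003, targeted_2006, national_2003, national_2006
)
print(f"Estimated change relative to the nation: {estimate:+.2f}")
```

A positive value arises in either scenario described in the text: satisfaction rises in the targeted regions while the national figure is flat, or the targeted regions hold steady while the national figure declines.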

Outside the LAPOP 2006 survey, no data were gathered on untreated municipalities. The universe of the 2003 and 2005 surveys was limited to residents of the seven regions (and thus only to residents of treated municipalities). Evaluations of municipal capacity (e.g., the GRADE study mentioned above) were conducted only on districts and provinces in the seven selected regions.

Although some data were collected in control municipalities outside the seven regions, the absence of a control group within the regions has serious consequences for evaluation. As just one example, many municipalities in the seven regions had been ravaged by the conflict with the Shining Path during the 1980s and 1990s. Investment and population return have picked up in some areas during the past decade, especially the past five years; at least some of this upturn must be due to the end of the war and other factors.13 Improvements in measured municipal capacity or in citizens’ perceptions of local government responsiveness during the life of the program may, therefore, not be readily attributable to USAID support for decentralization. If control municipalities had been selected from the outset at random and the treatment municipalities had outperformed the controls, we would have greater confidence that the project had a positive impact.
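Random selection of control municipalities, as described above, is mechanically simple. The following is a minimal sketch under stated assumptions: the district identifiers are hypothetical placeholders, the even split between treated and control units is an arbitrary choice for illustration, and the actual project treated all 536 districts rather than holding any back.

```python
import random


def assign_treatment(unit_ids, n_treated, seed=0):
    """Randomly split units into treated and control groups.

    A fixed seed makes the assignment reproducible and auditable.
    """
    rng = random.Random(seed)
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    treated = sorted(shuffled[:n_treated])
    control = sorted(shuffled[n_treated:])
    return treated, control


# 536 districts, as in the seven selected regions; the IDs are hypothetical.
district_ids = [f"district_{i:03d}" for i in range(1, 537)]

# Assumption for illustration: half the districts are held back as controls.
treated, control = assign_treatment(district_ids, n_treated=268)
print(len(treated), "treated;", len(control), "control")
```

Because chance alone determines which districts receive the intervention, background trends such as postwar recovery should affect both groups similarly on average, so a difference in outcomes is more plausibly attributable to the project.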

