Measuring impact of inspection –
A theoretical (and critical) overview
Lena Lindgren
School of Public Administration, University of Gothenburg
Agenda

The results agenda:
From public sector management focusing on procedures towards an approach that places greater emphasis on results - beyond outputs.

Key concepts:
Program theory, results, outputs, outcomes, impact, evidence.

Approaches to impact evaluation and outcome measurement.

Measuring and evaluating impact of inspection.
The results agenda
– performance measurement systems

Governments worldwide and international organizations like the
OECD and World Bank have invested considerable resources in the
development of performance measurement systems.

Performance measurement. The process of designing and implementing
(predominantly) quantitative measures of program results, including
outputs and outcomes.


Are the investments over the top? Are we measuring too much?
”USAID has become infected with a very bad case of Obsessive
Measurement Disorder (OMD), an intellectual dysfunction rooted in the
notion that counting everything in government programs will produce
better policy choices and improved management.”
The Evaluation Monster:
Performance measures in the public sector

Performance measures cannot
be designed so as to do justice
to the nature and objectives of
public sector activities.

Once measurement systems
are established, they have a
propensity to trigger strategic
behaviour which produces
perverse consequences.

Creaming, teaching to the test,
goal displacement, etc.
The results agenda – evidence based practice

Social practices (including school inspection?), programs and
policies should be driven by evidence.

Practitioners and policy-makers should base their decisions on
scientific evidence about what works and why and what types of
policy initiatives are likely to be most effective.

The kind of knowledge policy makers need and researchers ought
to produce should be based on research designs that can confidently
establish causal effects (i.e. impact).
An evidence-based society

Medicine, mental health, education, social services, eldercare,
policing, management, coaching, inspection……

While there appears to be strong consensus that evidence is a
highly valued commodity in the public sector, there is heated
disagreement about what counts as evidence of causality.

”Randomistas” are advocating RCT as the main tool for studying
and evaluating impact, and thus causality.

”Pluralistas” believe there are other, equally scientific, ways of coming to causal conclusions about impact.
The results agenda – impact assessment

The new public management paradigm and the “evidence movement”
require accountability for results including outcomes and impact, i.e.
beyond outputs.

Measuring impact requires establishing the links between an
intervention (i.e. school inspection) and the social conditions it
is intended to improve.

As a consequence, conferences, courses, “discourse” and tools for impact assessment abound.
But haven’t we heard this before?
Donald T. Campbell (1916-1996)

The Experimenting Society: a vision of how public policy could be improved through experimentation.

Experimental and Quasi-experimental Designs for Research: a standard book in evaluation.

”The accidental evaluator”: devoted to understanding causality, human behavior and how to solve social questions.
Key concepts – program theory

Program, a generic term for whatever is being evaluated - or inspected.

A program theory is a simple, or more complex, assumption, implicit
in the way a program is designed, about how the program's actions are
supposed to achieve the outcomes it intends.

A tool to assist with conceptualizing a program.

Can be applied to:
- a small program
- a process
- a large, multi-component program
- an organization
- an inspection policy or activity.
Everyday example of a program theory
Hungry → Get food → Eat food → Feel better
(University of Wisconsin-Extension, Program Development and Evaluation)
A simple program theory for school inspection
Resources → Activities → Outputs → Outcomes

Resources: everything that goes into the program (budget, staff, technology, etc.); preparations before inspection.

Activities: functions and activities performed; the inspection itself, inspection feedback and report.

Outputs: products and services provided to program recipients.

Outcomes: what happens when outputs reach the recipients; increased knowledge of improvement needed, deficiencies are looked after, school quality is improved, overall national goals are attained.
Key concepts – outputs, outcomes and results

Outputs are the phenomena that come out of government bodies
(or any organization) in the form of various services and goods.

Outcomes are what happens when the outputs reach the addressees (their actions), but also what occurs beyond the addressees in the organization or society.
Input → Activities → Output → Outcome 1 → Outcome 2 → Outcome 3
(input through output = implementation; output through outcomes = results)

Results: a) a summarizing term for outputs and outcomes, or b) a term that can indicate either outputs or outcomes.
Key concepts – impact, two definitions

A category of outcomes: long-term and broad outcomes (end outcomes), or all the outcomes of an intervention.
→ Any evaluation which refers to impact is an impact evaluation, including outcome monitoring, which only measures the state of a target population.

Impact is that portion of an outcome that can be attributed uniquely to an intervention.
→ Impact evaluation is an assessment which tries to make a causal inference that connects the intervention with an outcome.
Impact evaluation requires judgement about causality

All impact evaluation begins with a representation of activities
and outcomes (i.e. a program theory).

Impact evaluation is a study which attempts to attribute observed changes to the intervention by identifying what the outcomes would have been had there been no intervention, or a different intervention.

How to address this issue in a way that produces credible
evidence of causality is a matter of much method controversy.
Enormous efforts notwithstanding…

“There is no satisfactory, widely recognized solution to the causality problem in the social sciences generally. Enormous efforts notwithstanding, the best minds among Western and Oriental social scientists have not proved capable of providing more than ambitious attempts, however genial, at resolving the issue” (Evert Vedung 1989).

Experimental designs might be the best possible approach in some situations, but other designs are more suitable and feasible in others.

Good designs need to be combined or patched up to fit a specific
situation.
Educational research – the hardest science of all?
The important distinction is really not between the hard and soft sciences.
Rather, it is between the hard and the easy sciences. Easy-to-do science is what
those in physics, chemistry, geology do. Hard-to-do science is what…
educational researchers do.
We do our science under conditions that physical scientists find intolerable.
Simultaneously, student behavior is interacting with teacher characteristics,
such as the teacher’s training and conceptions of learning. But it doesn’t end
there because other variables interact too; the curriculum materials, the
socioeconomic status of the community, and so forth.
Moreover, we are not even sure in which directions the influences work.
(David Berliner, professor, Arizona State University)
Experiments with randomized controls (RCT)

Targets are randomly divided into a treatment (experimental) group, to whom the intervention is administered, and a control group, from whom the intervention is withheld or which is given “treatment as usual”.

Schools to be inspected are randomly assigned to treatment and control groups. To be meaningful, the activities involved in the inspection strategy (e.g. strategy X) must be carefully specified, as well as its expected outcomes (i.e. again a program theory).

Collect baseline data for variables of interest.

Put inspection activities into action (treatment, treatment as usual, or no
treatment).

Measure the impact.

Estimate if the impact is a) not due to chance, b) large enough to be conclusive.
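
A minimal sketch of the steps above, in Python, using invented schools and outcome scores (all names and numbers are hypothetical): schools are randomly assigned to inspection strategy X or treatment as usual, and the impact is estimated as the difference in mean outcomes between the two groups.

```python
# Minimal sketch of an RCT on school inspection (hypothetical data throughout).
import random
from statistics import mean, stdev

random.seed(1)
schools = [f"school_{i}" for i in range(40)]      # hypothetical population
random.shuffle(schools)
treatment, control = schools[:20], schools[20:]   # random assignment

# Hypothetical outcome scores collected after the inspection cycle
# (in practice: baseline and follow-up measures of the specified outcomes).
outcome = {s: random.gauss(60 + (5 if s in treatment else 0), 10) for s in schools}

t_scores = [outcome[s] for s in treatment]
c_scores = [outcome[s] for s in control]
impact_estimate = mean(t_scores) - mean(c_scores)  # "measure the impact"

# Rough check of whether the estimate could be due to chance alone;
# a real study would use a proper significance test and power analysis.
std_error = (stdev(t_scores) ** 2 / len(t_scores)
             + stdev(c_scores) ** 2 / len(c_scores)) ** 0.5
print(f"estimated impact: {impact_estimate:.1f} (approx. standard error {std_error:.1f})")
```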
When randomization is not possible

Matched, generic, statistical, reflexive or shadow controls.

Matched controls. Outcome measures among schools that have been exposed to inspection strategy X are compared to outcome measures among a theoretically equivalent group of schools from which inspection strategy X is withheld, or which has been exposed to another inspection strategy (a minimal matching sketch follows below).

Shadow controls. Outcomes among schools that receive or have received inspection strategy X are compared to the judgments of experts, inspection managers, staff, or participants on what outcomes they believe would have happened without inspection strategy X.
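
A minimal sketch, with hypothetical schools and a single matching variable, of the matched-controls idea: each inspected school is paired with the most similar non-inspected school, and the impact estimate is the average outcome difference within matched pairs. Real matching would use several covariates and balance checks.

```python
# Minimal sketch of matched controls (hypothetical schools, sizes and outcomes).
inspected = {"A": {"size": 420, "outcome": 71}, "B": {"size": 180, "outcome": 66}}
not_inspected = {"C": {"size": 400, "outcome": 68}, "D": {"size": 200, "outcome": 64},
                 "E": {"size": 900, "outcome": 75}}

def closest_match(school, pool):
    # Nearest neighbour on school size only; real matching would use more covariates.
    return min(pool, key=lambda name: abs(pool[name]["size"] - school["size"]))

pair_differences = []
for name, school in inspected.items():
    match = closest_match(school, not_inspected)
    pair_differences.append(school["outcome"] - not_inspected[match]["outcome"])

print("matched-pairs impact estimate:", sum(pair_differences) / len(pair_differences))
```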
Contribution analysis (John Mayne 1999, 2001)
Does not attempt to prove that one factor – e.g. inspection strategy X – caused the desired outcome, but rather explores the contribution strategy X is making to observed results.
Steps
1. Set out the attribution problem to be assessed: identify which outcome you hope to improve or change with inspection strategy X.
2. Develop a program theory in order to understand and articulate how strategy X is expected to bring about that change, and be clear about the expected short-, medium- and long-term outcomes.
3. Populate the model with existing data (mixed methods).
4. Assemble and assess the “performance story” based on existing data, and through critical discussions with colleagues and stakeholders. Identify and gather new data where needed, e.g. on a certain link in the program theory.
5. Revise the performance story.
Systematic review

A review of the literature that aims to provide an account of a domain that is comprehensive, capable of replication, and transparent in its approach.

Steps:
1) Define the question: what is known about the impact of school inspections, or of particular school inspection strategies?
2) Formulate criteria to guide the selection of studies.
3) Seek out and incorporate studies that meet the criteria in 2), based on
keywords relevant to the question defined in step 1.
4) Identify key features of each study (e.g. date, location, sample size, data
collection methods, and main findings).
5) Synthesize the results.
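
A minimal sketch, with invented study records and invented inclusion criteria, of how steps 2 to 4 can be made transparent and replicable: explicit criteria are applied to candidate studies, and key features of the included studies are extracted for synthesis.

```python
# Minimal sketch of study selection and feature extraction (hypothetical records).
candidate_studies = [
    {"title": "School inspection and pupil outcomes", "year": 2011,
     "keywords": ["school inspection", "impact"], "peer_reviewed": True,
     "location": "NL", "sample_size": 120, "main_finding": "small positive effect"},
    {"title": "Hospital accreditation review", "year": 2009,
     "keywords": ["accreditation"], "peer_reviewed": True,
     "location": "US", "sample_size": 40, "main_finding": "mixed"},
]

def meets_criteria(study):
    # Hypothetical criteria from step 2: topic keywords, recency, peer review.
    return ("school inspection" in study["keywords"]
            and study["year"] >= 2000
            and study["peer_reviewed"])

included = [s for s in candidate_studies if meets_criteria(s)]
for s in included:  # step 4: key features of each included study
    print(s["title"], s["year"], s["location"], s["sample_size"], s["main_finding"])
```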
Performance measurement
– outcome monitoring

The process of designing and implementing (predominantly) quantitative
measures of program results, including outputs and outcomes.

Steps:
1. Describe what school inspection intends to achieve, and how this
achievement is to come about (i.e. program theory again...).
2. Select the outcomes to be tracked by the performance measurement
system, e.g. compliance with rules.
3. Identify the indicators (numerical measurements) that signal progress toward achieving the outcomes selected.
4. Collect and report relevant data.
5. Analyse data in order to find out whether desired outcomes were reached.
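
A minimal sketch, with hypothetical records and a hypothetical target level, of outcome monitoring along the steps above: a selected indicator (here the share of inspected schools complying with rules) is computed per year and compared with the desired outcome level.

```python
# Minimal sketch of outcome monitoring with one indicator (hypothetical data).
records = [
    {"school": "A", "year": 2013, "complies": True},
    {"school": "B", "year": 2013, "complies": False},
    {"school": "A", "year": 2014, "complies": True},
    {"school": "B", "year": 2014, "complies": True},
]
target = 0.9  # hypothetical desired outcome level

for year in sorted({r["year"] for r in records}):
    subset = [r for r in records if r["year"] == year]
    rate = sum(r["complies"] for r in subset) / len(subset)
    status = "target reached" if rate >= target else "below target"
    print(f"{year}: compliance rate {rate:.0%} ({status})")
```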
Measuring and/or evaluating
outcomes and impact of school inspection

Should it be done? Could it be done?

Who will use the results? How will the results be used?

100% compliance with rules among inspected schools means that inspection is effective and works as intended.

Does compliance with rules inevitably lead to improved school quality and the attainment of overall national goals?

What if the rules are inadequate and do not produce the outcomes
that are expected in overall national goals?