
Evidence-Based Public Health: A Course in Chronic Disease Prevention
MODULE 9: Evaluating the Program or Policy
Ross Brownson
Anjali Deshpande
Darcy Scharff
March 2013
Learning Objectives
1. Understand the basic components of program evaluation.
2. Describe the differences and unique contributions of quantitative and qualitative evaluation.
3. Understand the various types of evaluation designs useful in program evaluation.
4. Understand the concepts of measurement validity and reliability.
5. Understand some of the advantages and disadvantages of various types of qualitative data.
6. Understand some of the steps involved in conducting qualitative evaluations.
7. Describe organizational issues in evaluation.
[Diagram: evidence-based decision-making sits at the intersection of the best available research evidence; population characteristics, needs, values, and preferences; and resources, including practitioner expertise, all within the environment and organizational context.]
[Slide: possible decisions following evaluation: Discontinue, Disseminate widely, or Retool]
What is program evaluation?
“a process that attempts to determine as
systematically and objectively as possible the
relevance, effectiveness, and impact of activities in
light of their objectives.”
A Dictionary of Epidemiology, 2008
The best evaluations often “triangulate”
The combination of quantitative and qualitative methods
- Looking in a room from two windows
Prominent example: the evaluation of California Proposition 99
Evaluation is basically …
a process of measurement & comparison
Why evaluate?
• Improve existing programs
• Measure effectiveness
• Demonstrate accountability
• Share effective strategies and lessons learned
• Ensure funding and sustainability
Evaluation is a tool that can both measure and contribute to the success of your program.
Evaluation versus research
Evaluation
• Controlled by stakeholders
• Flexible design
• Ongoing
• Used to improve programs
Research
• Controlled by investigator
• Tightly controlled design
• Specific timeframe
• Used to further knowledge
Do you have…
• A research/evaluation person on staff?
• Time and other resources?
• Staff to assist?
• Necessary skills?
From Mattessich, 2003
What are the most significant challenges you
face in program evaluation?
• Program personnel may be threatened by the
evaluation
• Need for personnel involvement vs. objectivity
• Comprehensive evaluation versus nothing at all
• The “10% Rule” (devote roughly 10% of program resources to evaluation) as you design and implement programs
In program planning when should you begin
planning an evaluation?
Logic Model (Analytic Framework) Worksheet: Evidence-Based Public Health

Program Title: __________________________________________
Goal:
Long-Term Objective:
What are the evidence-based determinants?
Intermediate Objectives (one per level): Individual, Social, Environmental, Government/Organizational
Activities (for each objective): Based on an evidence review, what activities will address these determinants? What do you do? How long will it take?
Costs (for each objective): How much will it cost? What other resources are needed?
Evaluation: Process and Impact

Instructions: First discuss your target population. Using data, evidence-based recommendations (the Community Guide or others), your own knowledge, and group discussion, develop a program strategy for controlling diabetes in your community. Define the goal, objectives, activities, and costs, and describe them in this sample logic model.
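To make the worksheet's structure concrete, here is a minimal sketch of the same skeleton as a data object; every entry is a hypothetical placeholder for a diabetes-control program, not content from the module.

```python
# A hypothetical logic-model skeleton mirroring the worksheet above.
# All values are illustrative placeholders.
logic_model = {
    "program_title": "Community Diabetes Control (example)",
    "goal": "Reduce the burden of type 2 diabetes in the community",
    "long_term_objective": "Lower diabetes incidence over 10 years",
    "evidence_based_determinants": ["physical inactivity", "diet"],
    "intermediate_objectives": {          # one per level in the worksheet
        "individual": "Increase moderate physical activity",
        "social": "Strengthen walking-group networks",
        "environmental": "Improve access to safe walking routes",
        "government_org": "Adopt supportive worksite policies",
    },
    "activities": ["evidence-based walking program", "provider training"],
    "costs": {"staff": 50_000, "materials": 5_000},  # plus other resources
    "evaluation": ["process", "impact"],
}
```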
Some important questions:
What are sources of data?
How might program evaluation differ from policy
evaluation?
Some notes on policy evaluation
• Same principles apply
• Lack of control over the intervention (policy)
• Time frame may be much shorter
• No evaluation is completely “objective,” value-free, or neutral
Logic Model
PROGRAM PLANNING → EVALUATION
Goal → Outcome
Objective → Impact
Activities → Formative/Process
Evaluation Framework
Evaluation Types (Adapted from Green et al., 1980)
• Program: instructors? content? methods? time allotments? materials?
• Behavior/cognition (impact): knowledge gain? attitude change? habit change? skill development?
• Health: mortality? morbidity? disability? quality of life?
Types of Evaluation
Formative evaluation
– Is an element of a program or policy (e.g., materials,
messages) feasible, appropriate, and meaningful for
the target population?
– Often, in the planning stages of a new program
– Often, examining contextual factors
Types of Evaluation
• Considerations for formative evaluation
1. Sources of data
• (pre-) program data
2. Limitations of data (completeness)
3. Time frame
4. Availability & costs
• Examples
– Attitudes among school officials toward a proposed
healthy eating program
– Barriers in policies toward healthy eating
Types of Evaluation
Process evaluation
– “Field of Dreams” evaluation (“if you build it, will they come?”)
– shorter-term feedback on program implementation, content, methods, participant response, practitioner response
– what is working, what is not working
Types of Evaluation
Process evaluation (cont.)
– direct extension of action planning in the previous module
– uses quantitative or qualitative data
– data usually involve counts, not rates or ratios
Unraveling the “Black Box”
Types of Evaluation
• Considerations for process evaluation
1. Sources of data
• program data
2. Limitations of data (completeness)
3. Time frame
4. Availability & costs
• Examples
– Satisfaction with a diabetes self-management
training
– How resources are being allocated
California Local Health Department Funding by Core Indicator
[Figure: paired pie charts comparing funding shares in 2001/04 and 2004/07 across core indicators, including in-store ads, exterior ads, sponsorship, school instruction, bar compliance, smoke-free homes, outdoor smoke-free areas, tobacco sales to minors, retail licensing, SSD ban, cessation availability, school cessation, and tobacco-free schools.]
Types of Evaluation
Impact evaluation
– long-term or short-term feedback on knowledge,
attitudes, beliefs, behaviors
– uses quantitative or qualitative data
– also called summative evaluation
– probably more realistic endpoints for most public
health programs and policies
Types of Evaluation
• Considerations for impact evaluation
1. Sources of data
• surveillance or program data
2. Limitations of data (validity and reliability)
3. Time frame
4. Availability & costs
• Example
– Smoking rates (tobacco consumption) in California
California and U.S. minus California adult per capita cigarette pack consumption, 1984/1985-2004/2005
[Figure: line graph of packs per person for California and for the U.S. minus California, with annotations marking the $0.02, $0.25, and $0.50 tax increases; consumption declines in both series, with California lower.]
Source: California State Board of Equalization (packs sold) and California Department of Finance (population); U.S. Census, Tax Burden on Tobacco, and USDA. Note that data are by fiscal year (July 1-June 30).
Prepared by: California Department of Health Services, Tobacco Control Section, February 2006.
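The y-axis in that figure is simple arithmetic: packs sold divided by the adult population for each fiscal year. A one-line sketch with hypothetical numbers:

```python
# Per capita consumption as plotted above: packs sold / adult population.
# Both values below are hypothetical, for illustration only.
packs_sold = 1_100_000_000       # packs sold statewide in a fiscal year
adult_population = 22_000_000    # adults in the state that year
print(packs_sold / adult_population)   # 50.0 packs per person per year
```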
Obesity maps in the US
[Figures: maps of obesity prevalence across U.S. states.]
Types of Evaluation
Outcome evaluation
– long-term feedback on health status, morbidity,
mortality
– uses quantitative data
– also called summative evaluation
– often used in strategic plans
Types of Evaluation
Considerations for outcome evaluation
1. Sources of data
• routine surveillance data
2. Limitations of data (validity and reliability)
3. Time frame
4. Availability & costs
– often the least expensive to find
Example
• Geographic dispersion of heart disease
[Map: acute myocardial infarction rates, Missouri, 2010-2011 (age-adjusted).]
Quantitative Evaluation
Program Evaluation Designs
Reasons for research on causes
• to identify risks associated with health-related conditions (type 1 evidence)
Reasons for evaluating programs
• to evaluate the effectiveness of public health interventions (type 2 evidence)
Program Evaluation Designs
[Figure: monthly counts plotted from July of one year through December of the following year, with the intervention initiated partway through the series.]
Can we conclude that the intervention is effective?
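One way to answer that question quantitatively is segmented (interrupted time-series) regression, which separates any pre-existing trend from a level change at the intervention. The sketch below is illustrative, not the module's data: it assumes NumPy and uses synthetic monthly counts with a secular decline and no true intervention effect, so the naive before/after comparison is misleading while the trend-adjusted estimate is near zero.

```python
import numpy as np

rng = np.random.default_rng(0)
months = np.arange(24)                          # intervention at month 12
after = (months >= 12).astype(float)
# Synthetic counts: steady secular decline, NO true intervention effect.
counts = 12 - 0.3 * months + rng.normal(0, 0.5, size=24)

# Naive before/after means differ purely because of the trend.
print("before mean:", counts[:12].mean(), "after mean:", counts[12:].mean())

# Segmented regression: intercept, underlying trend, level change at policy.
X = np.column_stack([np.ones_like(months, dtype=float), months, after])
beta, *_ = np.linalg.lstsq(X, counts, rcond=None)
print("estimated level change at intervention:", beta[2])  # close to zero
```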
Program Evaluation Designs
[Figure: a second time series of monthly counts over the same 18-month span, again with the intervention initiated partway through.]
Can we conclude that the intervention is effective?
Program Evaluation Designs
Experimental
• randomized controlled trial
• group randomized trial
Observational
• cohort
• case-control
• cross-sectional
Quasi-experimental
• pre-test / post-test with external control group (non-randomized trial)
• pre-test / post-test without external control group (before-after or time series)
Program Evaluation Designs
[Design flow: Population → Eligibility (vs. ineligibility) → Participation (vs. no participation) → Randomization? → Intervention group → Outcome(s); No-intervention group → Outcome(s)]
Pretest-Posttest with comparison group
General population
• Exclusion criteria: institutionalized adults; adults without a working telephone
• Inclusion criteria: 6 pairs of communities in the Missouri Ozark Region, matched on size, race/ethnicity, and proportion of population living below the poverty level
• Non-random assignment of communities
• Follow-up (July-Sept 2004): intervention group and control group each measured on rates of walking and moderate physical activity in the last week
Reference: Brownson et al. Preventive Medicine 2005;41:837-842
Pretest-Posttest with comparison group

Dose category                        Walking %   OR (95% CI)          Moderate PA %   OR (95% CI)
All participants
  Low                                20.1        Referent             36.6            Referent
  Medium                             21.4        2.88 (1.04, 7.98)    31.7            0.73 (0.37, 1.44)
  High                               25.1        3.31 (0.93, 11.86)   34.5            1.96 (0.81, 4.75)
Low access to physical environment
  Low                                19.7        Referent             38.2            Referent
  Medium                             19.4        7.03 (1.64, 30.07)   30.4            0.79 (0.33, 1.91)
  High                               0.967       2.25 (0.36, 14.12)   34.5            2.76 (0.73, 10.41)
High access to physical activity
  Low                                20.7        Referent             33.8            Referent
  Medium                             23.8        1.82 (0.36, 9.11)    33.5            0.89 (0.29, 2.68)
  High                               32.2        4.63 (0.72, 29.84)   34.5            1.78 (0.50, 6.40)
Reference: Brownson et al. Preventive Medicine 2005;41:837-842
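As a reminder of what the odds ratios in this table mean mechanically, here is a minimal sketch computing a crude OR and Wald 95% CI from a hypothetical 2x2 table. The counts are invented, and the published ORs above come from adjusted models, so this will not reproduce them.

```python
import math

# Hypothetical counts: walkers vs. non-walkers by intervention dose.
a, b = 54, 215   # high dose:      walkers, non-walkers
c, d = 48, 239   # low dose (ref): walkers, non-walkers

odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)      # Wald standard error
lo = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
hi = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
print(f"OR = {odds_ratio:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```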
Pre-test / Post-test without external control group
Employees & visitors at JHMC
• Before the new smoke-free policy: cigarettes smoked per day; cigarette remnant counts per day; nicotine concentrations
• After the new smoke-free policy: cigarettes smoked per day; cigarette remnant counts per day; nicotine concentrations
Reference: Stillman et al. JAMA 1990;264:1565-1569
% People Smoking
                Cafeterias             Lounges
                Before     After       Before     After
Visitors        13%        0.3%        41%        0%
Staff           2%         0.0%        39%        0%
Overall         12%        0.2%        40%        0%
Reference: Stillman et al. JAMA 1990;264:1565-1569
Avg. Daily Cigarette Remnant Counts
Location                Number before policy   Number after policy   % change
Elevator lobbies
  morning               256                     89                    -65
  afternoon             702                     95                    -86
Lounges
  morning               49                      5                     -89
  afternoon             293                     6                     -98
Outside entrances
  morning               17                      32                    +88
  afternoon             73                      65                    -10
Reference: Stillman et al. JAMA 1990;264:1565-1569
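The % change column is straightforward arithmetic, shown here for elevator lobbies in the morning:

```python
# Percent change as used in the table: (after - before) / before * 100.
before, after = 256, 89                         # elevator lobbies, morning
print(round((after - before) / before * 100))   # -65, matching the table
```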
Median Nicotine Vapor Concentrations (μg/m³)
Location                          Before policy   After policy
Cafeteria                         7.06            10.00
Visitor / patient waiting areas   3.88            0.12
Restrooms                         0.22            0.12
Patient areas                     0.28            2.43
Offices                           17.71           2.28
Staff lounges                     0.84            0.12
Corridors / elevators             2.05            0.20
Reference: Stillman et al. JAMA 1990;264:1565-1569
% Employees Smoking
[Figure: bar chart of the percentage of employees smoking before and after the policy, by job category: nurses, physicians, clerical, supervisory, and service staff.]
Pre-test / Post-test without external control group
California residents
• 1980-1982 (before federal tax): cigarette sales per capita
• 1983-1988 (before state tax): cigarette sales per capita
• 1989-1990 (after state tax): cigarette sales per capita
References:
Emery et al. AJPH 2001;91(4):278-283
Flewelling et al. AJPH 1992;82:867-869
Siegel et al. AJPH 2000;90:372-379
California and U.S. minus California adult per capita cigarette pack consumption, 1984/1985-2004/2005
[Figure: the same line graph shown earlier, repeated here; see the source notes above.]
Program Evaluation Designs
Quality of evidence from program evaluation depends on…
• type of program evaluation design
• execution of the program evaluation
• generalizability of program evaluation results
Consider for ‘generic’ evaluation design
[Design flow: Population → Eligibility (vs. ineligibility) → Participation (vs. no participation) → Randomization? → Intervention group → Outcome(s); Control group → Outcome(s)]
Concepts of validity and reliability and their importance (evaluation “threats”)
Validity vs. Reliability
What is validity?
Measurement Issues
• Evaluation “threats”
– Validity
• Is the instrument or design measuring exactly
what was intended?
Measurement Issues
• Validity: best available approximation to the “truth”
• Internal
– the extent of causality (the effects are really attributable to the program)
– related to specific design and program execution
• External
– the extent of generalizability
– importance??
Measurement Issues
• Example
– validity
• self-reported rate of smoking among pregnant
women compared with cotinine validation
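A hedged sketch of how such a validation is typically summarized: cross-tabulate self-report against the cotinine "gold standard" and report sensitivity and specificity. The counts below are hypothetical, not from the module.

```python
# Hypothetical validation counts (illustrative only):
tp = 40    # cotinine-positive women who reported smoking
fn = 10    # cotinine-positive women who denied smoking
tn = 145   # cotinine-negative women who denied smoking
fp = 5     # cotinine-negative women who reported smoking

sensitivity = tp / (tp + fn)   # smokers the self-report catches
specificity = tn / (tn + fp)   # non-smokers correctly classified
print(f"sensitivity {sensitivity:.2f}, specificity {specificity:.2f}")
```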
What is reliability?
Measurement Issues
• Evaluation “threats”
– Applies across any study design
– Reliability
• Is the measurement being conducted consistently?
Measurement Issues
• Example
– reliability
• test-retest data on self-reported smoking rates
among women
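Test-retest reliability for a yes/no item is often reported as percent agreement plus Cohen's kappa, which corrects agreement for chance. A minimal sketch with hypothetical responses:

```python
# Hypothetical test-retest answers from the same eight respondents.
test   = ["yes", "no", "no", "yes", "no", "no", "yes", "no"]
retest = ["yes", "no", "yes", "yes", "no", "no", "no", "no"]

n = len(test)
observed = sum(t == r for t, r in zip(test, retest)) / n

# Chance-expected agreement from the marginal "yes"/"no" proportions.
expected = sum((test.count(v) / n) * (retest.count(v) / n)
               for v in ("yes", "no"))
kappa = (observed - expected) / (1 - expected)
print(f"agreement {observed:.2f}, kappa {kappa:.2f}")   # 0.75, 0.47
```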
Measurement Issues
• The literature (or contacting researchers) may point you to accepted methods
• Check out existing tools like the BRFSS
• Evaluation instruments often need tailoring to the community
– participatory methods may prevent use of existing instruments/questions
Qualitative Evaluation
Qualitative data used for evaluation
• Qualitative data help to define issues/concerns and strengths from the community perspective
– What needs to be done?
– Existing or new data
– Needs, assets, attitudes
– Status of available health promotion programs/health care facilities
– Resources/physical structures in the community
Characteristics of qualitative data
• Verbal or narrative data
• Multiple methods: ethnography, natural
experiment, focus groups, in-depth interviews,
direct observation, etc.
• Words, text chunks, photographs, gestures,
tones
• Fuller, richer understanding of what people
believe, think, do
Qualitative Methods
• Observation
– Audiovisual methods
– Spatial mapping
• Interviews
– Individual
– Group/focus group
• Written documentation
From LeCompte & Schensul, 1999
Group interviews
• Relatively homogeneous group
• 6-10 people
• Semi-structured
• Focus group - builds on each other’s ideas - advantage of group process… can see the influence of social networks on the issue at hand
• Group interview - several people together, not necessarily taking advantage of group process
In-depth interviewing
• Informal conversational interview
• General interview guide
• Standardized open-ended interview
Capturing Data -- How
• Permission and consent
• Tape record and transcribe groups versus notes or memory
• Listen
– for inconsistent, vague, or cryptic comments
• Probe for understanding
Capturing Data -- What
• Debriefing
– Key points
– Group process
• Notes
– Key points
– Notable quotes
– Observations: silent agreement, obvious body language, indications of group mood, ironic or contradictory statements
– Use a diagram of the seating arrangement
Interview Guide
• Used as a framework
– General outline of a predetermined subject to discuss, with probes to get additional information
– Maintain conversational quality
• Question flow
– Opening: getting to know you
– Introduction: begin to think about the topic
– Transition: move into the topic more specifically
– Key: address the evaluation questions
– Summary: confirmation, all things considered
Recording the data
• Permission and consent
• Notes - during and after the interview; include your feelings (tired/excited; how did the interview seem to you?)
• Note surroundings
• Tape recording - make sure it is on, make sure it worked - two tape recorders
Data Verification
• Participants: clarify statements; summarize beliefs
• Other researchers/team members
• Field notes
• Recordings and transcribed text
• Oral summaries of key points
• Post-group debriefing
Units of Analysis
• Words
– Reduce text to the fundamental meaning of specific words
– Lose context
– Misinterpretation
• Text
– Chunks of text are coded so that data is reduced
– Codebooks developed a priori and continually refined throughout the research process
• Context
– Non-verbal communication
– Group process
Analysis
• From transcriptions and notes
• Focused coding - with predetermined categories in mind
• Open coding - categories and themes from the data itself
• Multiple coders
• Label so you can go back to context
• Triangulation
• Computer programs - NUD*IST, ATLAS.ti
Coding Example – What are some ways to group the following?
• Red tulip
• Yellow rose
• Red apple
• Yellow tulip
• Yellow apple
• Green pear
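The point of the exercise is that the same data support multiple coding schemes. A small sketch showing two of them, coding by color and by kind:

```python
from collections import defaultdict

items = ["red tulip", "yellow rose", "red apple",
         "yellow tulip", "yellow apple", "green pear"]
flowers = {"tulip", "rose"}

by_color, by_kind = defaultdict(list), defaultdict(list)
for item in items:
    color, kind = item.split()
    by_color[color].append(item)                      # code by color
    by_kind["flower" if kind in flowers else "fruit"].append(item)

print(dict(by_color))   # red / yellow / green groups
print(dict(by_kind))    # flower vs. fruit groups
```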
Analysis Based On …
• Frequency of comments
– Number of times a comment was made, not the number of people making it
• Extensiveness of comments
– Number of people who made the comment
• Intensity of comments
– Passion, depth of feeling
• Specificity of responses
– Specific, based on experience
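Frequency and extensiveness are easy to conflate. This sketch, with hypothetical coded comments tagged by speaker, counts total mentions versus distinct speakers:

```python
from collections import Counter

# (speaker, code) pairs from a hypothetical focus-group transcript.
comments = [("p1", "cost"), ("p1", "cost"), ("p2", "cost"),
            ("p2", "access"), ("p3", "access")]

frequency = Counter(code for _, code in comments)           # total mentions
extensiveness = {code: len({p for p, c in comments if c == code})
                 for code in frequency}                     # distinct people
print(frequency)        # cost mentioned 3 times, access 2 times
print(extensiveness)    # cost by 2 people, access by 2 people
```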
Class Exercise
Dissemination
• Report should include:
– purpose of the evaluation
– stakeholders involved
– description of the program
– description of the methodology used
– the evaluation update and/or results and recommendations
• Don’t wait until the end of the program
Dissemination
• Be creative!!
– Formal or informal
– Written or oral
– Newsletters, internet, community forums
Productive Use of Evaluation Consultants
• Have a program theory
• Intend to use the results
• Make expectations clear
• Develop a good advisory committee
• Consider every step collaborative
• Focus on the information needs of users
• Budget enough time & money
• Develop standards for communication
• Embrace some ambiguity
From Mattessich, 2003
Organizational Issues/Summary
• “How can I do evaluation when there’s so much
‘real work’ to do?”
• “Independent” (outside) evaluation may be useful
• What to look for in a “good” evaluation
• Useful to consider qualitative as well as
quantitative methods
• Remember the potential role of economic
evaluation
A few final thoughts…
• Involve stakeholders in the development of program objectives and evaluation questions
• Measure program processes, impacts, and outcomes using measures appropriate for the questions asked
• Expect frustration from those collecting the data, resistance from those feeling judged… and appreciation if you can present the data in an objective and useful light.