Part-time MSc course: Epidemiology & Statistics Module

The following lecture has been approved for
Adults
This lecture may contain information, ideas, concepts and
discursive anecdotes that may be thought provoking and
challenging
It is not intended for the content or style of delivery to cause
offence
Any issues raised in the lecture may require the viewer to engage
in further thought, insight, reflection or critical evaluation
Critical Evaluation of Clinical Research
Dr. Craig A. Jackson
Senior Lecturer in Health Psychology
Division of Trauma & Critical Care
Faculty of Health
UCE Birmingham
hcc.uce.ac.uk/craigjackson
Session Outline
• Main Research Designs
  • Experiments
    RCTs
  • Observation
    Case-Control
    Cohort studies
• Critical Evaluation Criteria
  • Ethical clearance / considerations
  • Sample and Population issues
  • Methods & Data collection
  • Analyses
  • Write up & Publication issues
Brief Research History
Role of Bran Fibre dietary increases in IBS patients -- 1997
(Randomised Controlled Trial)
Mental Health of UK Farmers using OP Pesticides (X2) -- 1997-2000
(Epidemiological Surveys)
Neurobehavioural Performance of desert-based Oil Drillers -- 1998-2000
(Clinical assessment)
Temporary Hearing Loss in Student Bar Staff – 2000-2002
(Epidemiological Survey)
Benefits of Occupational Health Advice in Primary Care Settings -- 2001-2004
(Randomised Controlled Trial)
Smaller-Scale projects – (Tri-Services, NHS Personnel, NHS Patients)
(Cross-sectional Surveys, Clinical Trials)
Budget Airline Pilot Fatigue – 2002
(Cross-sectional Survey)
Multiple roles of psychologist, statistician, and methodology designer
Formats of Clinical Research
Experimental vs. Observational
Longitudinal vs. Cross-sectional
Prospective vs. Retrospective
Experimental -> Longitudinal -> Prospective -> Randomised Controlled Trial
Observational -> Longitudinal -> Prospective -> Cohort studies
Observational -> Longitudinal -> Retrospective -> Case control studies
Observational -> Cross-sectional -> Survey
Qualitative vs. Quantitative Research
False opposition
Observational methods equally valid
Complementary roles
Quantitative
Qualitative equally as hard to do
(if not harder)
Qualitative
Quantitative Research Designs
Patients
Staff
Healthy
Laboratory
Experimental
RCT
approach
Case - control
Epidemiology
Cohort study
Observational
Survey
Postal questionnaire
Experimental Studies
Investigator makes intervention
A “manipulation”
Then studies the effects of that intervention
Features:
Comparison e.g. before vs. after, control vs. treatment
Always longitudinal
Always prospective
Experimental Clinical Trials: RCTs
Experimental Studies
Evaluate effectiveness of intervention / therapy
Use similar samples who reflect population
Comparable groups
Differences in outcomes due to interventions (not differences between groups)
Independent Variable (IV) alters Dependent Variable (DV)
Best “evidence” of cause and effect
Sometimes inconclusive
Types of Experimental Studies
Between Subjects Studies
Each group receives different treatment
Groups compared
Within Subjects Studies
Each individual is measured before & after intervention
Advantage that each participant is own control
Between subject variability removed
Traditional Experimental Designs
Between subjects studies
patients -> Treatment group -> Outcome measured
patients -> Control group -> Outcome measured
Within Subjects studies
patients -> Outcome measured #1 -> Treatment -> Outcome measured #2
Within Subjects Studies
Cross-over-studies
Each patient receives treatment in sequence
“Washout” period between treatments
Order of treatments randomised
Group A: Treatment 1, then Treatment 2
Group B: Treatment 2, then Treatment 1
Matched-pairs study
Parallel study
Patient in arm 1 matched with patient in arm 2
Match based on prognostic / socio-economic factors
Data is linked
Paired individuals
Control Groups
Allow comparison in Between Group studies
Evaluations without comparison?
Types of Control Groups
•“no treatment” group
likely to be confounded by having condition
•“placebo” group
ethically dodgy?
•“low dose” group
avoids ethical issues
•“standard treatment” group
avoids ethical issues
•“gold standard” group
avoids ethical issues
•“historical controls”
unreliable due to many confounders
Comparison Groups: Random Sampling
Ensures generalizability of findings to larger pop.
e.g. in-patient sample limitations
Treatment effects better detected if there is little between-group variability
Exclusion Criteria & Inclusion Criteria keep groups comparable
Paradox:
greater uniformity of sample = less generalisable to general population
Control Groups: Random Allocation
Population (60 million) -> Sample (1000) -> Gp A (500): Drug X, Gp B (500): Drug Y
Ensures allocation independent of patient features
Avoids (sub)conscious allocation bias
e.g. severely sick people into treatment groups
Doesn’t guarantee groups will be homogeneous
Non-homogeneous groups may still occur due to chance – random errors
e.g. 53 yrs vs. 27 years; 53% male / 47% fem vs. 81% male / 19% fem
Stratified randomisation
for each prognostic factor e.g. weight, age, sex
Guarantees allocation to be bias-free
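The contrast between simple and stratified random allocation can be sketched as follows. This is a minimal illustration, not the lecture's own procedure: the patient records, the single prognostic factor (sex), the 670/330 split and the function names `simple_allocation` and `stratified_allocation` are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def simple_allocation(patients, rng):
    """Shuffle all patients together and split them into two equal arms."""
    shuffled = patients[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"A": shuffled[:half], "B": shuffled[half:]}


def stratified_allocation(patients, key, rng):
    """Randomise separately within each stratum of a prognostic factor."""
    strata = defaultdict(list)
    for patient in patients:
        strata[key(patient)].append(patient)
    arms = {"A": [], "B": []}
    for members in strata.values():
        rng.shuffle(members)
        half = len(members) // 2
        arms["A"].extend(members[:half])
        arms["B"].extend(members[half:])
    return arms


def sex_counts(arm):
    """Count males and females in one arm."""
    return Counter(p["sex"] for p in arm)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
# Hypothetical sample of 1000 patients: 670 male, 330 female
patients = [{"id": i, "sex": "M" if i < 670 else "F"} for i in range(1000)]

simple = simple_allocation(patients, rng)
stratified = stratified_allocation(patients, key=lambda p: p["sex"], rng=rng)

print("simple    :", dict(sex_counts(simple["A"])), dict(sex_counts(simple["B"])))
print("stratified:", dict(sex_counts(stratified["A"])), dict(sex_counts(stratified["B"])))
```

Simple allocation leaves the sex split of each arm to chance, whereas stratifying balances the arms on the chosen factor. Factors that were not stratified on are still left to chance.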
Randomised Controlled Trials in GP & Primary Care
90% consultations take place in GP surgery
RCT is actually 50 years old
Potential problems
2 Key areas:
Recruitment Bias
Randomisation Bias
Over-focus on failings of RCTs
RCTs in General Practice & Primary Care
• RCTs justified in situations of genuine clinical uncertainty
• Provides rigorous, sound basis for evaluating treatments
• Samples large enough to establish any worthwhile benefit
(effectiveness or cost, or both)
Need for larger numbers of patients
More than are available to single practices
Requires “club together” approach
GPs: no contractual obligation
(i) unwilling to take part if no immediate benefit for patients
(ii) while possibly disrupting the delivery of health care
RCTs in General Practice & Primary Care
GPs conflict of interest between:
Role and Wish to benefit future patients
Academic merit
Long term nature of practitioner and patient relationship
May engender loyalties
Unfairly coerce patients to give consent
Patients' fears about:
Confidentiality
Risks of the intervention
Apparent disadvantage of being allocated to a control group
may further inhibit recruitment
Failure to recruit consecutive patients may introduce selection bias
RCTs in General Practice & Primary Care
May disrupt primary care
Too much disruption = no reflection of real practice
Methodological problems reduce scientific reliability of the results
(Recruitment & Randomisation)
General practice not a laboratory
Patients are not experimental animals
Case-control studies, retrospective and prospective cohort studies, and
descriptive studies are all acceptable methods.
Observation is OK
Should accept alternative methods when RCT too difficult or flawed
RCT Deficiencies
Trials too small
Trials too short
Poor quality
Poorly presented
Address wrong question
Methodological inadequacies
Inadequate measures of quality of life (changing)
Cost-data poorly presented
Ethical neglect
Patients given limited understanding
Poor trial management
Politics
Marketeering
Why still the dominant model?
Observational Studies
Investigator observes existing situation
Describes
Analyses
Interprets
No influence on events
Longitudinal observation studies
case-control studies: retrospective
cohort-studies: prospective
Cross-sectional observation studies
surveys examining subjects at one point in time
based on random sample of interest population
Observational Studies
Look for associations
• Cause -> Effect
• Exposure – Illness
• Epidemiological
• Incidence, cause, prevention
No control group necessary
Cannot use classical experimentation
No randomisation
Bias is a realistic problem
Case-Control Study
Identify group with condition / illness (cases)
Identify group without condition / illness (controls)
Both groups compared for exposure to (hypothesized) risk factors
Greater exposure to risk factor in cases than controls = “causal relation”
Beware:
Lead time bias
Recruitment of cases at similar points in time
Newly diagnosed cases (biases?)
Selection of Controls
Cases have Lung Cancer + Smoking Exposure
Controls could be other hospital patients (other disease) or “normals”
Matched Cases & Controls for age & gender
Option of 2 Controls per Case
Smoking years of Lung Cancer cases and controls (matched for age and sex)
Group      n     Smoking years
Cases      456   13.75 (± 1.5)
Controls   456   6.12 (± 2.1)
F = 7.5, P = 0.04
Case-Control Study: Other Biases
Recall Bias
Cases report more associations with exposures than controls
Unreliable Memories
Retrospective nature
Over-reliance on recall
Unreliable Records
Poor hospital records
Repetitive, incomplete, inaccurate, irretrievable, interpretation
Interview Bias
Different interviewers
Cohort Study
ID and examination of a group (cohort)
Followed over time (20 years common!)
Looking for disease development / other end-point
Aetiology of disease (based on data collected)
Data more reliable than case-control studies
• Requires large N
• Requires long follow up
• Inefficient
• Expensive (espec. rare outcomes)
Cohort Study: Methods
Subjects classified into 2 (or more) groups
e.g. exposed vs non exposed
End point: groups compared for cancer symptom status
Comparison of Brain cancers between users and non-users of mobile phones
                     Brain Ca   No Ca   Total
mobile phone user    292        108     400
non-phone user       89         313     402
Total                381        421     802
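The slide gives only the counts. One common way to summarise such a 2 x 2 cohort table is the relative risk (with the odds ratio alongside for comparison). The sketch below assumes that is the intended comparison; the function names are illustrative.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed, a / (a + b), divided by risk in the unexposed, c / (c + d)."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of disease in the exposed divided by odds of disease in the unexposed."""
    return (a * d) / (b * c)


# Counts taken from the table on the slide
a, b = 292, 108  # mobile phone users: Brain Ca, No Ca
c, d = 89, 313   # non-phone users:    Brain Ca, No Ca

rr = relative_risk(a, b, c, d)
or_ = odds_ratio(a, b, c, d)

print(f"risk in users     = {a / (a + b):.3f}")
print(f"risk in non-users = {c / (c + d):.3f}")
print(f"relative risk     = {rr:.2f}")
print(f"odds ratio        = {or_:.2f}")
```

With these counts the relative risk comes out at about 3.3 and the odds ratio at about 9.5. The gap between the two shows why the odds ratio only approximates the relative risk when the outcome is rare.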
Cohort Study: Other Biases
Lost to follow up
Bias if reason related to exposure
Validity affected
Group sizes change
Membership changes e.g. ex-smokers
Differential mortality
Change in circumstance
e.g. job change
Exposures need calculation or re-calculation
Surveillance bias
Investigator aware of group membership
Investigating exposed members more
Observational studies
Cohort (prospective)
cohort
prospectively measure risk factors
end point measured
aetiology
prevalence
development
odds ratios
Case-Control (retrospective)
start point measured
aetiology
odds ratios
prevalence
development
retrospectively measure risk factors
cases
Cross Sectional Study
Subjects contacted & surveyed just once
Questionnaire (post, email, phone)
Random sample of defined pop.
Limited causality
Not temporal relationships
Little insight into aetiology
Source of descriptive data
Prevalence rates
Volunteer bias
Non responses
Self-selection
Unrepresentative sample
Critical Appraisal Criteria
• Researchers’ plan
• Ethical clearance / considerations
• Sample
•Size
•Bias
•Allocation
• Methods & Data collection
•Valid
•Reliable
•Measurable
•Accurate
• Analysis
• Write up
•Accurate
•Clear
•Replicable
Planning and Design
Ethical clearance & considerations
Good research should be...
Justified
Well planned
Appropriately designed
Ethically approved
• Research should be driven by protocol
• Pilot studies should have a written rationale
• Protocols should answer specific questions
• Not just “collecting data”
• Protocols must be agreed by all contributors & participants
• Keep the protocol as part of the Research record / log
Ethical misconduct not to meet this standard? – Not yet
Design & Ethical Approval
Statistical issues should be considered before data collection
Power calculations are (becoming) essential
Formal documented ethical approval is required for all research involving
(i) people
(ii) medical records
(iii) anonymous human tissue (Nuffield Council on Bioethics)
Fully informed consent should always be sought
If not possible (deceptive studies) a research ethics committee should decide
WMA Research Ethics Checklist
• people’s rights and claims
• different sorts of interests and their relative strength
• human well-being
• loss of life
• what would be good or bad for people
• democratic acceptance
• consultation
• sensitive moments
• benefits and harms
• grief and distress
• an obligation to make sacrifices for the community;
• entitlement of the community to deny autonomy and violate bodily integrity in public interest
• the system of justice
• public safety
• public policy considerations
• danger
• civil liberties
• individual autonomy
• lives and liberties of citizens
Sample size
Population characteristics
The Importance of Sample Size
• Forgotten in many studies
• Little consideration given
• Appropriate size needed to confirm / refute hypotheses
• Small samples far too small to detect anything but the grossest difference
• Real differences go undetected and are reported as non-significant – Type 2 errors occur
• Too large a sample –
unnecessary waste of (clinical) resources
waste of patient time, inconvenience, discomfort
Essential to assess optimal sample size before investigation
How Many Make a Sample?
“8 out of 10 owners who expressed a preference, said their cats
preferred it.”
How confident can we be about such statistics?
8 out of 10?
80 out of 100?
800 out of 1000?
80,000 out of 100,000?
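One way to put a number on "how confident" is a confidence interval for the proportion. The sketch below uses the Wilson score interval, which is a choice made here (it behaves sensibly for samples as small as 10), not a method named in the lecture.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# The same 80% preference observed in samples of increasing size
for successes, n in [(8, 10), (80, 100), (800, 1000), (80_000, 100_000)]:
    low, high = wilson_interval(successes, n)
    print(f"{successes} out of {n}: 95% CI {low:.3f} to {high:.3f}")
```

The observed proportion is 80% every time, but the interval narrows sharply as the sample grows: for 8 out of 10 it runs from roughly 0.49 to 0.94.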
Multiple Measurement of small sample
Four measurements: 25, 22, 24 and 21 cell clusters (plotted on a scale of 20 to 26)
Total = 92 cell clusters
Mean = 23 cell clusters
SD = 1.8 cell clusters
It all depends on the size of your needle
Small samples spoil research
N      Age    IQ     |  N      Age     IQ      |  N      Age    IQ
1      20     100    |  1      18      100     |  1      18     100
2      20     100    |  2      20      110     |  2      20     110
3      20     100    |  3      22      119     |  3      22     119
4      20     100    |  4      24      101     |  4      24     101
5      20     100    |  5      26      105     |  5      26     105
6      20     100    |  6      21      113     |  6      21     113
7      20     100    |  7      19      120     |  7      19     120
8      20     100    |  8      25      119     |  8      25     119
9      20     100    |  9      20      114     |  9      20     114
10     20     100    |  10     21      101     |  10     45     156
Total  200    1000   |  Total  216     1102    |  Total  240    1157
Mean   20     100    |  Mean   21.6    110.2   |  Mean   24     115.7
SD     0      0      |  SD     ± 4.2   ± 19.2  |  SD     ± 8.5  ± 30.2
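The effect of a single unusual participant in a sample of 10 can be checked directly. The sketch below takes the second and third data sets from the slide, which differ only in participant 10, and recomputes the summaries with Python's `statistics` module. `stdev` is the sample standard deviation; the slide does not say how its ± figures were calculated, so the recomputed values may differ from them.

```python
from statistics import mean, stdev

# Second data set on the slide
ages_b = [18, 20, 22, 24, 26, 21, 19, 25, 20, 21]
iq_b = [100, 110, 119, 101, 105, 113, 120, 119, 114, 101]

# Third data set: identical except for participant 10
ages_c = ages_b[:-1] + [45]
iq_c = iq_b[:-1] + [156]

for label, values in [
    ("Age, set 2", ages_b),
    ("Age, set 3", ages_c),
    ("IQ,  set 2", iq_b),
    ("IQ,  set 3", iq_c),
]:
    print(
        f"{label}: total={sum(values)}, "
        f"mean={mean(values):.1f}, sample SD={stdev(values):.1f}"
    )
```

Changing one participant out of ten moves the mean age from 21.6 to 24 and widens the spread considerably. In a large sample the same participant would barely register.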
Qualitative studies need to sample wisely too…
Asian GPs’ attitudes to ANP
Objective:
To determine attitudes to ANP among Asian doctors in East Birmingham PCT
Method:
Send invitation to 55 Asian GPs (Approx 47% of East Birmingham PCT)
Intends to interview (30 mins) the first 20 GPs who respond
Sample would be 36% of Asian GPs – and only 17% of GPs in PCT
Severely Biased Research (and ethically dodgy too)
Population Samples
Achieving a high response rate to a questionnaire is vital
as it helps ensure a normal distribution of responses?
Postal questionnaires rarely get a response rate > 40%
Unless respondents have a vested interest in the outcome
Bias?
Most efficient (best) response rates usually happen when respondents have to
do very little to take part in the study
Multiple phase projects see a depletion in numbers at every stage
Quick “in and out” one-stop approach is best
A Normally Distributed Sample of a Population
[Figure: normal curve of % of population against Height (5’6” to 6’4”), marked with RANDOM, OPPORTUNISTIC, CONSCRIPTIVE and QUOTA sampling]
Sampling a Population
A POPULATION
REPRESENTATIVE SAMPLE
(theoretical)
ACCESSIBLE
SAMPLE
(actual)
Is this lot REPRESENTATIVE of the POPULATION?
Sampling Keywords
POPULATIONS
Can be mundane or extraordinary
SAMPLE
Must be representative
INTERNAL VALIDITY OF SAMPLE
Sometimes validity is more important than generalisability
SELECTION PROCEDURES
Random
Opportunistic
Conscriptive
Quota
ECOLOGICAL VALIDITY
Participants in their natural environment
Deployment
RANDOM SAMPLING
RANDOM ASSIGNMENT
How to assign the sample into different treatments or groups
Related to the INTERNAL VALIDITY of the research
Ensures groups are similar (EQUIVALENT) to each other prior to TREATMENT
Waste of time randomly sampling but not randomly allocating
Having a choice in this matter is a luxury
How many makes a sample?
POWER OF STUDY CALCULATION
Statistical method of calculating the number of subjects needed in a project
Based upon…..
Expected variance of subjects’ scores
Useful size of any differences between groups
Significance level (e.g. 5 % or 1 %)
Power level
The larger the differences you are looking for between groups, then the fewer
subjects are needed. Looking for small differences between groups requires
larger numbers of subjects
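The four ingredients listed above can be combined in a short calculation. The sketch below uses the standard normal-approximation formula for comparing the means of two equal-sized groups with a two-sided test. That choice of design, the function name `n_per_group` and the example values (SD of 15, differences of 5, 10 and 15 points) are assumptions for illustration, not figures from the lecture.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(sd, difference, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect `difference` between two means.

    sd         -- expected standard deviation of subjects' scores
    difference -- smallest difference between groups worth detecting
    alpha      -- two-sided significance level (0.05 = 5%)
    power      -- probability of detecting the difference if it truly exists
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)


# Hypothetical outcome with SD = 15: smaller differences need more subjects
for difference in (15, 10, 5):
    print(f"difference of {difference:>2}: {n_per_group(15, difference)} per group")
```

Under these assumptions a difference of 15 points needs 16 subjects per group, while a difference of 5 points needs 142. Dedicated power software may give slightly larger numbers because it uses the t distribution rather than the normal approximation.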
Bias
Validity of study depends on avoiding bias
Bias = “Systematic distortion of results due to unforeseen factors”
Group 1 = pill
Group 2 = no pill
How will the “no pill” group progress?
Any effects of them “knowing” they have no treatment?
Handling differences may influence + complicate trial results
Known as confounding factors
To minimize bias…
control group
randomisation
blinding
Selection Bias
Sampling properly is crucial
Samples may be askew
Specialist publications attract a specialist response group
A self-selection bias exists among those with special interests
Controversial topics, or litigious areas:
Gulf War Syndrome, Call Centres, Bird Flu, Depleted Uranium Weaponry, Stress, Pesticides, Hospital infection, Telecomms
THIS IS AN INHERENT PROBLEM WITH
HEALTH RESEARCH
COMBAT IT WITH LARGE SAMPLES
AND CLEVER METHODOLOGY
Bias – The placebo effect really does work!
Most effective medication known
In approx. 30% of pop.
Subjected to more clinical trials than any other medicament
Nearly always does better than anticipated
The range of susceptible conditions seems limitless
Does not always occur
Present in subjective and objective outcomes
Negative outcomes can occur (Nocebo effect)
•Big pills better than smaller pills
•Red pills better than blue
Patient’s “knowledge” of their treatment causes bias
•4 pills better than 2
e.g. Benedetti & the Turin study
•30% of pop.
•Sham surgery vs arthroscopy for osteoarthritis
Subject Variables that potentially bias / confound research
STABLE FACTORS
Age
Education
Sex
Socioeconomics
Language
Handedness
Computer experience
Caffeine (habitual use)
Alcohol (habitual use)
Nicotine (habitual use)
Medication (habitual use)
Paints, glues, pesticides (habitual use)
Diabetes
Epilepsy
Other CNS / PNS disease
Head injury (out >1 hr)
Alcohol / drug addiction
Physical activity
SITUATIONAL FACTORS
Alcohol (recent use)
Caffeine (recent use)
Nicotine (recent use)
Medication (recent use)
Paints, glues, pesticides (recent)
Near visual acuity
Restricted movement (injury)
Cold / flu
Stress
Arousal / Fatigue
Sleep
Screen luminance
Time of day
Time of year
Blinding: Importance of doing it
Investigator or Patient know treatment = Bias
Observations and Judgements become less reliable
Patient responses change:
Positive outcomes in active arm
Negative outcomes in passive arm
e.g. known cancer diagnoses and deterioration
Use max. degree of blindness possible
e.g. make patient and investigator both blind if possible
e.g. A.A.Mason & Congenital Ichthyosis and Hypnosis 1951
Blinding: Methods of doing it
Double-blind
patient & investigator blind
Treatment type
Patient interaction
Data manager
Blinding: Methods
Double-blind
patient & investigator blind
Single-blind
patient blind
Triple-blind
patient & investigator & data monitor blind
Double-dummy 2 treatments
patients get 2 pills (1 active, 1 dummy)
Open trials
patient & investigator aware of treatment
Randomisation in a double-blind trial
Envelope technique common
Un-blinding – ethical necessity
Un-blinding a problematic study
Breaking code – anticipated in the planning stages
Criteria for breaking code – established and agreed
Emergency access to randomisation code
Treatment stopped and patient withdrawn
Formal monitoring process – review and make recommendations
Methods
Data Collection
Background on Research
• Large-scale
• Quantitative
• Can be descriptive
“2% of women think they are beautiful”
• Can be inferential
“Significantly more singletons think they’re beautiful (46%) than married (23%)”
• Done with a sample of patients, respondents, consumers, or professionals
• Differences between any groups assessed with hypothesis testing
Sample size must be large enough to detect any such difference if it truly exists
Survey Research
Questionnaire is a fundamental component of most research
Most MSc / MPhil /PhD projects use survey methods
Can be very efficient
Weaknesses
weak / dubious questionnaires
non-valid questionnaires
biased samples
biased responses
poor response rate
Validity
Does the survey measure what it says it is measuring?
Reliability
Does the survey yield stable data over time?
How accurate are large scale surveys / questionnaires?
Likert scales
“How do you feel right now?”
[Figure: individual responses marked X along a scale running from “happy” to “sad”]
Bending the Data
“How do you feel right now?”
[Figure: response marks (X) along a scale running from “happy” to “sad”]
Non-Responders just as Important
Postal surveys may accrue poor response rates (e.g. 20%) from pop.
May need to re-write to pop. to re-recruit bigger sample
Inefficient to write to all pop. again
Need to re-write to non-responders and NOT responders
Impossible in anonymous studies with no linkage
Can be done with confidential studies
Diminishing returns of multi-stage recruitment
Researcher sends invitation letter & consent form
Potential Sample: 1000 people
Acceptance letter & consent form returned: 540 consents
540 blank questionnaires sent out
210 completed questionnaires returned
Response rate of 21% (n = 210)
Under-powered study
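The attrition in this example can be tallied stage by stage. The sketch below uses only the three counts from the slide; the stage labels and variable names are illustrative.

```python
# Counts from the slide: invited, consented, returned a completed questionnaire
stages = [
    ("invited", 1000),
    ("consented", 540),
    ("completed questionnaire", 210),
]

invited = stages[0][1]
previous = invited
for name, count in stages:
    print(
        f"{name:<24} n={count:>4}  "
        f"{count / previous:6.1%} of previous stage  "
        f"{count / invited:6.1%} of invited"
    )
    previous = count

overall_response_rate = stages[-1][1] / invited
```

Each stage keeps only part of the one before it (54% consent, then about 39% of those complete), so the overall response rate is the product of the stage rates: 21%.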
“Unethical” practices proved to increase response rates
Technique                                                   Likelihood of participation
Cash incentive (Brown, et al. 1997, Roberts et al. 2000)    x 2
Warn respondents of follow up (need linkage)                x 1.4
Drop out must be explained by the respondent                x 1.3
Choice to opt out given to respondents                      x 0.7
(Edwards et al. 2002)
Finally... The best research is simple in design
“Some people hate the very name of statistics but.....their power of
dealing with complicated phenomena is extraordinary. They are the
only tools by which an opening can be cut through the formidable
thicket of difficulties that bars the path of those who pursue the science
of man.”
Sir Francis Galton, 1889
Further Reading
Altman, D.G. “Designing Research”. In: Altman, D.G., (ed.) Practical Statistics
For Medical Research. London, Chapman and Hall, 1991; 74-106.
Bland, M. “The design of experiments”. In: Bland, M., (ed.) An introduction to
medical statistics. Oxford, Oxford Medical Publications, 1995; 5-25.
Daly, L.E., Bourke, G.J. “Epidemiological and clinical research methods”.
In: Daly L.E., Bourke, G.J., (eds.) Interpretation and uses of medical statistics.
Oxford, Blackwell Science Ltd, 2000; 143-201.
Jackson, C.A. “Study Design” & “Sample Size and Power”. In: Gao Smith, F.
and Smith, J. (eds.) Key Topics in Clinical Research. Oxford, BIOS scientific
Publications, 2002.
Jackson, C.A. “Planning Health & Safety Research Projects in the
Workplace”. Croner Health and Safety at Work Special Report 2002; 62: 1-16.
Kumar, R. Research Methodology: a step by step guide for beginners.
Sage, London 1999.
Abbott, P. and Sapsford. Research methods for nurses and the caring
professions. Open University Press, Buckingham 1988.
Bowling, A. Measuring Health. Open University Press, Milton Keynes 1994
Polit, D. & Hungler, B. Nursing research: Principles and methods (7th ed.).
Philadelphia: Lippincott, Williams & Wilkins 2003.
Council for International Organizations of Medical Sciences (CIOMS).
International Guidelines for Ethical Review of Epidemiological Studies
World Health Organisation, Geneva 1991.
Nuffield Council on Bioethics. Human tissue: Ethical and legal issues. Nuffield
Council on Bioethics, London 1995.
World Medical Association. Ethical Principles for Medical Research Involving
Human Subjects. Declaration of Helsinki, 2002. (Washington Amendment).