Transcript Slide 1
The design of animal experiments
Michael FW Festing, c/o Understanding Animal Research, 25 Shaftesbury Av., London, UK. [email protected]
1

Principles of Humane Experimental Technique (Russell and Burch 1959)
Replacement: e.g. in-vitro methods, less sentient animals
Refinement: e.g. anaesthesia and analgesia, environmental enrichment
Reduction: research strategy; controlling variability; experimental design and statistics
2

A well designed experiment
Absence of bias: experimental unit, randomisation, blinding
High power: low noise (uniform material, blocking, covariance); high signal (sensitive subjects, high dose); large sample size
Wide range of applicability: replicate over other factors (e.g. sex, strain): factorial designs
Simplicity
Amenable to a statistical analysis
3

The animal as the experimental unit
N=8 n=4
Animals individually treated. May be individually housed or grouped
4

A cage as the experimental unit
Treated Control Treated Control
N=4 n=2
Treatment in water or diet.
5

An animal for a period of time: repeated measures or crossover design
[Diagram: animals 1, 2 and 3, N=4 each, allocated to Treatment 1 and Treatment 2]
N=12 n=6
6

Teratology: mother treated, young measured
N=2 n=1
Mother is the experimental unit.
7

Failure to identify the experimental unit correctly in a 2 (strains) x 3 (treatments) x 6 (times) factorial design
[Figure labels: ELD group, ELD group]
Single cage of 8 mice killed at each time point (288 mice in total)
8

Experimental units must be randomised to treatments
Physical: numbers on cards. Shuffle and take one
Tables of random numbers in most text books
Use a computer, e.g.
EXCEL or a statistical package such as MINITAB
9

Randomisation
Original:   1 1 1 1 2 2 2 2 3 3 3 3
Randomised: 2 3 3 1 2 1 2 1 3 2 3 1
NB Randomisation should include housing and the order in which observations are made
10

Failure to randomise and/or blind leads to more “positive” results
Blind/not blind: odds ratio 3.4 (95% CI 1.7-6.9)
Random/not random: odds ratio 3.2 (95% CI 1.3-7.7)
Blind and random/not blind and random: odds ratio 5.2 (95% CI 2.0-13.5)
290 animal studies scored for blinding, randomisation and positive/negative outcome, as defined by authors
Bebarta et al 2003 Acad. Emerg. Med. 10:684-687
11

Some factors (e.g. strain, sex) cannot be randomised, so special care is needed to ensure comparability
Six cages of 7-9 mice of each strain: error bars are SEMs
"CBA mice showed greater variability in body weights than TO mice..."
Outbred TO (8-12 weeks, commercial); Inbred CBA (12-16 weeks, home bred)
12

A well designed experiment
Absence of bias: experimental unit, randomisation, blinding
High power: low noise (uniform material, blocking, covariance); high signal (sensitive subjects, high dose); large sample size
Wide range of applicability: replicate over other factors (e.g.
sex, strain): factorial designs
Simplicity
Amenable to a statistical analysis
13

High power: (good chance of detecting the effect of a treatment, if there is one)
High power = High signal/noise ratio = High standardized effect size = High Student's t
Standardized effect size: $d = |\mu_1 - \mu_2| / \sigma$, i.e. (difference between means)/SD
Student’s t: $t = (\bar{X}_1 - \bar{X}_2) / \sqrt{2S^2/n}$
14

Power Analysis for sample size and effects of variation
A mathematical relationship between six variables
Needs subjective estimate of effect size to be detected (signal)
Has to be done separately for each character
Not easy to apply to complex designs
Essential for expensive, simple, large experiments (clinical trials)
Useful for exploring effect of variability
A second method, “The Resource Equation”, is described later
15

Power analysis: the variables
Signal: a) effect size of scientific interest or b) actual response
Chance of a false positive result: significance level (0.05)
Sample size
Sidedness of statistical test (usually 2-sided)
Power of the experiment (80-90%?)
Noise: variability of the experimental material
16

Group size and signal/noise ratio
[Figure: group size (0-140) against signal/noise ratio, i.e. effect size in standard deviations (0-3), with curves for 90% and 80% power and regions marked Bad, Neutral and Good]
Assuming 2-sample, 2-sided t-test and 5% significance level
17

Comparison of two anaesthetics for dogs under clinical conditions (Vet. Anaesthes. Analges.)
Unsexed healthy clinic dogs,
• Weight 3.8 to 42.6 kg.
• Systolic BP 141 (SD 36) mm Hg
Assume:
• a 20 mmHg difference between anaesthetics is of clinical importance,
• a significance level of α=0.05
• a power=90%
• a 2-sided t-test
Signal/Noise ratio 20/36 = 0.56
Required sample size 68/group
18

Power and sample size calculations using nQuery Advisor
19

A second paper described:
• Male Beagles weight 17-23 kg
• mean BP 108 (SD 9) mm Hg.
• Want to detect 20 mmHg difference between groups (as before)
With the same assumptions as previous slide:
Signal/noise ratio = 20/9 = 2.22
Required sample size 6/group
20

Summary for two sources of dogs: aim is to be able to detect a 20mmHg change in blood pressure
Type of dog    SDev  Signal/noise  Sample size/gp (1)  %Power (n=8) (2)
Random dogs    36    0.56          68                  18
Male beagles   9     2.22          6                   98
(1) Sample size: 90% power
(2) Power, sample size 8/group
Assumes α=5%, 2-sided t-test and effect size 20mmHg
The scientific dilemma: With small sample sizes we cannot detect an important effect in genetically heterogeneous animals. We can detect the effect in genetically homogeneous animals, but are they representative?
21

Variation in kidney weight in 58 groups of rats
[Figure: variability (0-90) by sample number (1-57) for groups labelled Mycoplasma, Outbred, F1 and F2]
Gartner, K. (1990), Laboratory Animals, 24:71-77.
22

Required sample sizes
Factor    Type             Std.Dev  Signal/noise*  Sample size  Power**
Genetics  F1 hybrid        13.5     0.74           30           80
          F2 hybrid        18.4     0.54           55           53
          Outbred          20.1     0.49           67           46
Disease   Mycoplasma free  18.6     0.54           55           53
          With Mycoplasma  43.3     0.23           298          14
* signal is 10 units, two-sided t-test, α=0.05, power = 80%
** Assuming fixed sample size of 30/group
23

The randomised block design: another method of controlling noise
Treatments A, B & C
Block B1: B C A
Block B2: A C B
Block B3: B A C
Block B4: A C B
Block B5: B C A
• Randomisation is within-block
• Can be multiple differences between blocks
• Heterogeneous age/weight
• Different shelves/rooms
• Natural structure (litters)
• Split experiment in time
24

A randomised block experiment
Apoptosis score
Week  Control  CGP  STAU
1     365      398  421
2     423      432  459
3     308      320  329
[Figure: bar chart of apoptosis score (0-500) for Control, CGP and STAU in weeks 1-3]
Treatment effect p=0.023 (2-way ANOVA)
25

Analysis of apoptosis data
Analysis of Variance for Score
Source     DF  SS       MS       F       P
Block      2   21764.2  10882.1  114.82  0.000
Treatment  2   2129.6   1064.8   11.23   0.023
Error      4   379.1    94.8
Total      8   24272.9
26

Residual Model Diagnostics
[Figure: four residual diagnostic plots: Normal Plot of Residuals (residual against normal score); I Chart of Residuals (residual against observation number; mean=3.16E-14, UCL=20.17, LCL=-20.17); Histogram of Residuals (frequency against residual); Residuals vs. Fits (residual against fit)]
27

Another method of determining sample size: The Resource Equation
Depends on the law of diminishing returns
Simple. No subjective parameters
Useful for complex designs and/or multiple outcomes (characters)
Does not require estimate of Standard Deviation
Crude compared with Power Analysis
E = (Total number of animals) - (number of groups)
10<E<20 (but give some tolerance)
28

The Resource Equation & Sample Size
[Figure: Student's t, 5% critical value (2.0-12.0) against degrees of freedom (0-35)]
E = (total numbers) - (number of groups)
10<E<20
But if experimental subjects are cheap (e.g. multi-well plates), E can be much higher
29

A well designed experiment
Absence of bias: experimental unit, randomisation, blinding
High power: low noise (uniform material, blocking, covariance); high signal (sensitive subjects, high dose); large sample size
Wide range of applicability: replicate over other factors (e.g. sex, strain) to increase generality: factorial designs
Simplicity
Amenable to a statistical analysis
30

Factorial designs
Factorial design: Treated, Control; E=16-4 = 12
Single factor design: Treated, Control; E=16-2 = 14
One variable at a time (OVAT): Treated, Control; Treated, Control; E=16-2 = 14 and E=16-2 = 14
31

Factorial designs
(By using a factorial design) “.... an experimental investigation, at the same time as it is made more comprehensive, may also be made more efficient if by more efficient we mean that more knowledge and a higher degree of precision are obtainable by the same number of observations.” R.A. Fisher, 1960
32

A 4x2 factorial design
Analysed with Student’s t-test: This is not appropriate because:
1.
Each test is based on too few animals (n=3-4), so lacks power
2. It does not indicate whether there are strain differences in protein thiol status
3. It does not indicate whether dose/response differs between strains
4. A two-way design should be analysed using a 2-way ANOVA
33

Incorrect statistical analysis leading to excessive numbers of animals
One experiment or 4 separate experiments?
8 mice per group, 8 groups = 64 mice. E = 64-8 = 56
Alternative: 3 mice per group, 8 groups. E = 24-8 = 16
Saving: 40 mice
Formal test of interaction
34

2 (strains) x 4 (Animal units) factorial
35

Effect of chloramphenicol (2000mg/kg) on RBC count
Strain  Control              Treated
C3H     7.85 8.77 8.48 8.22  7.81 7.21 6.96 7.10
CD-1    9.01 7.76 8.42 8.83  9.18 8.31 8.47 8.67
Tests:
Should not be analysed using two t-tests
1. Each test lacks power due to small sample size
2. Will not give a test of whether strains differ in response
Use a two-way ANOVA with interaction
1. Do the treatment means averaged across strains differ?
2. Do the strains differ, averaged across treatments?
3. Do the two strains respond to the same extent?
36

A 2x2 factorial design with interaction
Source         DF  SS      MS      F      P
strain         1   2.4414  2.4414  13.13  0.003
Treatment      1   0.8236  0.8236  4.43   0.057
strain*treat.  1   1.4702  1.4702  7.91   0.016
Error          12  2.2308  0.1859
Total          15  6.9659
[Figure: red blood cell count (6.5-9) for Control and Treated in CD-1 and C3H, plotted by strain and treatment; annotated "Pooled variance"]
37

Use of several inbred strains to reduce noise, increase signal and explore generality
Effect of chloramphenicol on mouse haematology
                 Dose of chloramphenicol (mg/kg)
Strain           0  500  1000  1500  2000  2500
Outbred  CD-1    8  8    8     8     8     8
Inbred   CBA     2  2    2     2     2     2
         C3H     2  2    2     2     2     2
         BALB/c  2  2    2     2     2     2
         C57BL   2  2    2     2     2     2
Festing et al (2001) Fd. Chem. Tox.
39:375
38

Example of a factorial compared with a single factor design
WBC
Four inbred strains
Strain   Control  Treated
CBA      1.90     0.40
CBA      2.60     0.20
C3H      2.10     0.40
C3H      2.20     0.40
BALB/c   1.60     1.30
BALB/c   0.50     1.40
C57BL    2.30     0.80
C57BL    2.20     1.10
One outbred stock
CD-1 Control: 3.00 1.70 1.50 2.00 3.80 0.90 2.60 2.30
CD-1 Treated: 1.90 1.90 3.50 1.20 2.30 1.00 1.30 1.60
39

WBC counts following chloramphenicol at 2500mg/kg
White blood cell counts
Strain  N   0     2500  Signal (Difference)  Noise (SD)  Signal/noise  p
CD-1    16  2.23  1.83  0.40                 0.86        0.47          0.38

Strain         N   0     2500  Signal (Difference)  Noise (SD)  Signal/noise  p
CBA            4   2.25  0.30  1.95                 0.34        5.73
C3H            4   2.15  0.40  1.85                 0.34        5.44
BALB/c         4   1.05  1.35  (-0.30)              0.34        (-0.88)
C57BL          4   2.25  0.95  1.30                 0.34        3.82
Mean           16  1.93  1.20  0.73                 0.34        2.15          <0.001
Dose * strain                                                                 <0.001
40

Genetics is important: Twenty-two Nobel Prizes since 1960 for work depending on inbred strains
[Diagram labels:] Cell mediated immunity; Immunological tolerance; H2 restriction, immune responses; Medawar, Burnet, Doherty, Zinkernagel; Benacerraf (G. pigs); Genetics; Snell; ES cells & “knockouts”; Humoral immunity/antibodies; T-cell receptor; Tonegawa, Jerne; monoclonal antibodies; BALB/c mice; Köhler and Milstein; C.C.
Little, DBA, 1909; Inbred Strains and derivatives; Jackson Laboratory; Evans, Capecchi, Smithies; Cancer; MMTV; Transmissible encephalopathies/prions; Prusiner; Smell; Axel & Buck; Retroviruses, Oncogenes & growth factors; Cohen, Levi-Montalcini, Varmus, Bishop, Baltimore, Temin
41

18th Annual Short Course on Experimental Models of Human Cancer
August 21-30, 2009, Bar Harbor, ME
courses.jax.org
42

Conclusions
Five requirements for a good design:
Unbiased (randomisation, blinding)
Powerful (signal/noise ratio: control variability)
Wide range of applicability (factorial designs, common but frequently analysed incorrectly)
Simple
Amenable to statistical analysis
Mistakes in design and analysis are common
Better training in experimental design would improve the quality of research, save money, time and animals
43
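
Worked example (not part of the original slides): the group sizes quoted on the power-analysis slides and the E values from the Resource Equation can be roughly reproduced with a few lines of Python. This is a minimal sketch under stated assumptions. The slides used nQuery Advisor, whose algorithm is not given in the talk; the sketch below swaps in a normal approximation to the two-sample, two-sided t-test with a small-sample correction term. The function names `group_size` and `resource_equation_e` are hypothetical.

```python
import math
from statistics import NormalDist


def group_size(signal_to_noise, alpha=0.05, power=0.90):
    """Approximate animals per group for a two-sample, two-sided t-test.

    ASSUMPTION: normal approximation plus the correction term z_alpha^2 / 4.
    This is a stand-in for the nQuery Advisor calculation, not a copy of it.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    n = 2 * (z_alpha + z_power) ** 2 / signal_to_noise ** 2 + z_alpha ** 2 / 4
    return math.ceil(n)


def resource_equation_e(total_animals, groups):
    """E = (total number of animals) - (number of groups); the talk suggests 10 < E < 20."""
    return total_animals - groups


print(group_size(0.56))              # clinic dogs: signal 20 mmHg, SD 36
print(group_size(2.22))              # male beagles: signal 20 mmHg, SD 9
print(group_size(0.74, power=0.80))  # F1 hybrid rats: signal 10, SD 13.5
print(resource_equation_e(64, 8))    # 8 groups of 8 mice
print(resource_equation_e(24, 8))    # 8 groups of 3 mice
```

With the rounded signal/noise ratios used on the slides, this approximation gives 68 and 6 dogs per group and 30 rats per group, and E values of 56 and 16, matching the figures quoted in the talk. For unrounded ratios or very small groups the result can differ by an animal or two from an exact t-based calculation.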