Inferential Statistics

Parametric

Ch 1. Descriptive Statistics
  The Mean, the Number of Observations, and the Standard Deviation
  N / Population / Parameters
  Measures of Central Tendency – Median, Mode, Mean
  Measures of Variability – Range, Sum of Squares, Variance, Standard Deviation

Ch 2. Frequency Distributions and Histograms
  Frequency Distributions
  Bar Graphs / Histograms
  Continuous vs Discrete Variables

Ch 3. The Normal Curve
  Z scores & percentiles
  Least Squares, Unbiased Estimates

Ch 4. Translating To and From Z Scores
  Normal Scores
  Scale Scores
  Raw Scores
  Percentiles

Ch 5. Inferential Statistics
  Random Samples Estimate Population Statistics
  Correlation of Two Variables
  Experimental Methods
  Standard Error of the Mean

Ch 6. t Scores / t Curves
  Estimates of Z Scores
  Computing t Scores
  Critical Values
  Degrees of Freedom

Ch 7. Correlation
  Variable Relationships – linearity, direction, strength
  Correlation Coefficient
  Scatter Plots
  Best Fitting Lines

Ch 8. Regression
  Predicting Using the Regression Equation
  Generalizing – The Null Hypothesis
  Degrees of Freedom and Statistical Significance

Ch 9. Experimental Studies
  Independent and Dependent Variables
  The Experimental Hypothesis
  The F Test and the t Test

Ch 10. Two Way Factorial Analysis of Variance
  Three Null Hypotheses
  Graphing the Means
  Factorial Designs

Ch 11. A Variety of t Tests

Ch 12. Tukey's Significant Difference
  Testing Differences in Group Means
  Alpha for the Whole Experiment
  HSD – Honestly Significant Difference

Ch 12. Power Analysis
  Type 1 Error and Alpha
  Type 2 Error and Beta
  How Many Subjects Do You Need?

Ch 13. Assumptions Underlying Parametric Statistics
  Sample Means Form a Normal Curve
  Subjects Are Randomly Selected from the Population
  Homogeneity of Variance
  Experimental Error Is Random Across Samples

Non Parametric (other material to come)

Ch 14. Chi Square
  Nominal Data


Chapter 11 – Lecture 1
t tests: single sample, repeated measures, and two independent samples

Conceptual overview

t and F tests: two approaches to measuring the distance between means
There are two ways to tell how far apart things are. When there are only two things, you can directly determine their distance from each other. If they are two scores, as they usually are in Psychology, you simply subtract one from the other to find their difference. That is the approach used in the t test and its variants.

F tests
Alternatively, when you want to describe the distance of three or more things from each other, the best way to index their distance is to find a central point and talk about their average squared distance (or average unsquared distance) from that point. The further apart things are from each other, the further they will be, on the average, from that central point. That is the approach you have used in the F test (and when treating the t test as a special case of the F test: t for one or two, F for more).

One way or another
The two methods will yield identical results. We can use either method, or a combination of the two, to ask the key question in this part of the course: "Are two or more means further apart than they are likely to be when the null hypothesis is true?"

H0: It's just sampling fluctuation
If the only thing that makes the two means different is random sampling fluctuation, the means will be fairly close to the population mean and to each other.
If an independent variable is pushing the means apart, their distance from each other, or from some central point, will tend to be too great to be explained by the null hypothesis.

Generic formula for the t test
These ideas lead to a generic formula for the t test:
t(dfW) = (actual difference between the 2 means) / (estimated average difference between the two means that should exist if H0 is correct)

Calculation and theory
As usual, we must work on calculation and theory. Again we'll do calculation first.

The first of three types of simple t tests – the single or one sample t test
One sample t test: a t test in which a sample mean is compared to a population mean. The population mean is almost always the value postulated by the null hypothesis. Since it is a mean obtained from a theory (H0 is a theory), we call that mean muT. To do the single sample t test, we divide the actual difference between the sample mean and muT by the estimated standard error of the mean, sX-bar.

Let's do a problem
You may recognize this problem. We used it to set up confidence intervals in Ch. 6.

For example
Let's say that we had a new antidepressant drug we wanted to peddle. Before we can do that, we must show that the drug is safe. Drugs like ours can cause problems with body temperature. People can get chills or fever. We want to show that body temperature is not affected by our new drug.

Testing a theory
"Everyone knows" that normal body temperature for healthy adults is 98.6°F. Therefore, it would be nice if we could show that after taking our drug, healthy adults still had an average body temperature of 98.6°F. So we might test a sample of 16 healthy adults, first giving them a standard dose of our drug and, when enough time had passed, taking their temperature to see whether it was 98.6°F on the average.

Here's the formula
t(df) = (X-bar - muT) / sX-bar = (X-bar - muT) / (s / √n)

Data for the one sample t test
We randomly select a group of 16 healthy individuals from the population. We administer a standard clinical dose of our new drug for 3 days. We carefully measure body temperature.
RESULTS: We find that the average body temperature in our sample is 99.5°F with an estimated standard deviation of 1.40° (s = 1.40). In Chapter 7 we asked whether 99.5°F was in the 95% CI around muT. It wasn't. We should get the same result with a t test.

Here's the computation
t(15) = (99.5 - 98.6) / (1.40 / √16) = .9 / .35 = 2.57

Notice that the critical value of t changes with the number of degrees of freedom for s, our estimate of sigma, and must be taken from the t table. If n = 16 in a single sample, dfW = n - k = 15.

Critical values of t (two-tailed)

df     .05      .01
1      12.706   63.657
2      4.303    9.925
3      3.182    5.841
4      2.776    4.604
5      2.571    4.032
6      2.447    3.707
7      2.365    3.499
8      2.306    3.355
9      2.262    3.250
10     2.228    3.169
11     2.201    3.106
12     2.179    3.055
13     2.160    3.012
14     2.145    2.997
15     2.131    2.947
16     2.120    2.921
17     2.110    2.898
18     2.101    2.878
19     2.093    2.861
20     2.086    2.845
21     2.080    2.831
22     2.074    2.819
23     2.069    2.807
24     2.064    2.797
25     2.060    2.787
26     2.056    2.779
27     2.052    2.771
28     2.048    2.763
29     2.045    2.756
30     2.042    2.750
40     2.021    2.704
60     2.000    2.660
100    1.984    2.626
200    1.972    2.601
500    1.965    2.586
1000   1.962    2.581
2000   1.961    2.578
10000  1.960    2.576

With 15 df, the observed t of 2.57 exceeds the critical value of 2.131 at the .05 level. We have falsified the null.
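To make the arithmetic concrete, here is a minimal Python sketch of the one-sample t computation for this example (Python and SciPy are my choice of tooling, not part of the lecture). The numbers, mean 99.5, muT 98.6, s = 1.40, n = 16, come from the slides; scipy.stats.t.ppf is used only to look up the two-tailed critical value.

```python
from scipy import stats

# Summary statistics from the body-temperature example in the slides
x_bar, mu_T = 99.5, 98.6     # sample mean and mean under H0
s, n = 1.40, 16              # estimated standard deviation and sample size

se_mean = s / n ** 0.5                  # sX-bar = s / sqrt(n) = 0.35
t_obs = (x_bar - mu_T) / se_mean        # t(15) = 0.9 / 0.35 = 2.57

df = n - 1                              # dfW = n - k = 15 for a single sample
t_crit = stats.t.ppf(0.975, df)         # two-tailed critical value at .05, about 2.131

print(f"t({df}) = {t_obs:.2f}, critical value at .05 = {t_crit:.3f}")
print("falsify H0" if abs(t_obs) >= t_crit else "retain H0")
```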
We would write the results as follows: t(15) = 2.57, p < .05.
Since we have falsified the null, we reject 98.6° as the population mean for people who have taken the drug. Instead, we would predict that the average person, drawn from the same population as our sample, would respond as our sample did. We would predict they will have an average body temperature of 99.5° after taking the drug. That is, they would have a slight fever.

An obvious problem with the one sample experimental design: no control group.

So, we can use a single random sample of participants as their own controls if we measure them two or more times. If they are measured twice, we can use the repeated measures t test.

Computation of the repeated measures t test
Let's say we measured 5 moderately depressed inpatients and rated their depression with the Hamilton rating scale for depression. Then we treated them with CBT for 10 sessions and again got Hamilton scores. Lower scores = less depression.

Here are pre, post and difference scores, with post-treatment scores subtracted from pretreatment scores. Al scored 28 before treatment and 18 after. Bill scored 22 before and 14 after. Carol scored 23 before and 14 after. Dora scored 38 before and 27 after. Ed scored 33 before and 21 after.

         Before   After   Difference
Al       28       18      10
Bill     22       14      8
Carol    23       14      9
Dora     38       27      11
Ed       33       21      12

Mean difference = 10.00

In this case, there are 5 - 1 = 4 degrees of freedom. The estimated standard deviation of the difference scores is sD = 1.58 (the full computation appears later in the difference-score table). Now we can compute the estimated standard error of the difference scores:
sD-bar = sD / √nD = 1.58 / √5 = .71

Now we are ready for the formula for the repeated measures t test: t equals the actual average difference between the means, minus their difference under H0, divided by the estimated standard error of the difference scores:
t(dfD) = (D-bar - muT) / sD-bar

Here is the computation in this case:
t(4) = (D-bar - muT) / sD-bar = (10.00 - 0.00) / .71 = 14.14

Here is how we would write the result: t(4) = 14.14, p < .01.
In this case the means are 14.14 estimated standard errors apart. We wrote the results as p < .01, but these results are so strong that they far exceed any value in the table, even with just a few degrees of freedom. This treatment works!!!!
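Here is a short Python sketch, offered as a cross-check rather than lecture material, that recomputes the repeated measures t from the pre- and post-treatment Hamilton scores above. The call to scipy.stats.ttest_rel assumes SciPy is available; everything else is plain arithmetic.

```python
from scipy import stats

# Hamilton depression scores from the CBT example (lower = less depressed)
before = [28, 22, 23, 38, 33]
after = [18, 14, 14, 27, 21]

diffs = [b - a for b, a in zip(before, after)]   # difference scores: 10, 8, 9, 11, 12
n = len(diffs)
d_bar = sum(diffs) / n                           # mean difference = 10.00
ss = sum((d - d_bar) ** 2 for d in diffs)        # sum of squares = 10.00
s_d = (ss / (n - 1)) ** 0.5                      # estimated SD of differences = 1.58
se_d = s_d / n ** 0.5                            # estimated standard error = 0.71
t_obs = (d_bar - 0.0) / se_d                     # t(4) = 14.14

t_check, p = stats.ttest_rel(before, after)      # paired-samples t test for comparison
print(f"t({n - 1}) = {t_obs:.2f}  (scipy: t = {t_check:.2f}, p = {p:.4f})")
```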
There are times when a repeated measures design is appropriate and times when it is not. When it is not, we use a two sample, independent groups t test.

The t test for two independent groups
You already know a formula for the two sample t test: t(n-k) = sB / s. But now we want an alternative formula that allows us to directly compare two means by subtracting one from the other. It takes a little algebra, but the formula is pretty straightforward.

Three steps to computing the t test for two independent groups
First, we need to compute the actual difference between the two means. (That's easy: we just subtract one from the other.)

Step 2
Then we compare that difference to the difference predicted by H0. That's also easy, because the null, as usual, predicts no difference between the means. H0: mu1 - mu2 = 0.00. That is, the null says that there is actually no average difference between the means of the two populations represented by the two groups.

Step 3 – this one is a little harder
Here we compute the standard error of the difference between two means, sX-bar1-X-bar2. Although the population means may be identical, samples will vary because of random sampling fluctuation. The amount of fluctuation is determined by MSW and the sizes of the two groups.

So we take the actual difference between the mean of the first group and the mean of the second group. Then we subtract the theoretical difference (which is 0.00 according to the null). Finally, we divide by the estimated standard error of the difference between the means of two independent groups.

Let's learn the conceptual basis and computation of the estimated standard error of the difference between 2 sample means.

The estimated average squared distance between a sample mean and the population mean, due solely to sampling fluctuation, is MSW/n, where n is the size of the sample. The estimated average squared distance between two sample means is their two squared differences from mu added together: MSW/n1 + MSW/n2.

So, if the samples are the same size, their average squared distance from each other equals MSW/n1 + MSW/n2 = 2MSW/n. But if the samples have different numbers of scores, we have to use the average size of the two groups.

The problem is that we can't use the usual arithmetic average; we need to use a different kind of average, the harmonic mean, nH. Then the average squared distance between two independent sample means equals 2MSW/nH. The square root of that is the average unsquared difference between the means of the two samples, the denominator in the t test.

Here is the formula for the estimated standard error of the difference between the means of two independent samples:
sX-bar1-X-bar2 = √(2MSW / nH)

Here's the formula for the independent groups t test:
t(df) = [(X-bar1 - X-bar2) - (mu1 - mu2)] / sX-bar1-X-bar2
where sX-bar1-X-bar2 = √(2MSW / nH)

So, to do that computation we need to learn to compute nH.

Calculating the harmonic mean
nH = k / (1/n1 + 1/n2 + 1/n3 + ... + 1/nk)
Notice that this technique allows different numbers of subjects in each group. Oh no!! My rat died! What is going to happen to my experiment?

If the groups are the same size, the harmonic and ordinary mean number of participants is the same. For example, with 3 groups of 4 subjects each:
nH = 3 / (1/4 + 1/4 + 1/4) = 3 / .75 = 4

When groups do not have equal numbers, the harmonic mean is smaller than the ordinary mean. For example, with 4 groups of 6, 4, 8 and 4 participants, the ordinary mean is 22/4 = 5.50 participants per group, while
nH = 4 / (1/6 + 1/4 + 1/8 + 1/4) = 4 / (8/48 + 12/48 + 6/48 + 12/48) = 4 / (38/48) = (4 × 48) / 38 = 96/19 = 5.05
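As a quick sanity check on the harmonic mean arithmetic (Python is my choice here, not the lecture's), the sketch below recomputes the two examples above and then shows how nH feeds into the estimated standard error of the difference. The MSW value in the last step is purely illustrative and not taken from the lecture.

```python
def harmonic_mean_n(group_sizes):
    """Harmonic mean of the group sizes: nH = k / (1/n1 + 1/n2 + ... + 1/nk)."""
    k = len(group_sizes)
    return k / sum(1.0 / n for n in group_sizes)

# Equal group sizes: harmonic mean equals the ordinary mean
print(harmonic_mean_n([4, 4, 4]))                 # 4.0

# Unequal group sizes from the slide: ordinary mean 5.50, harmonic mean about 5.05
print(round(harmonic_mean_n([6, 4, 8, 4]), 2))    # 5.05

# Estimated standard error of the difference between two independent means,
# sqrt(2 * MSW / nH), using an illustrative MSW of 2.50
ms_within = 2.50
n_h = harmonic_mean_n([6, 4])
se_diff = (2 * ms_within / n_h) ** 0.5
print(round(se_diff, 2))                          # about 1.02
```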
The theory part

ZX-bar scores
As you know from Chapter 4, the Z score of a sample mean is the number of standard errors of the mean that the sample mean lies from mu. Here is the formula:
ZX-bar = (X-bar - mu) / sigmaX-bar

Confidence intervals with Z
As you learned in Chapter 4, if a sample mean differs from its population mean solely because of sampling fluctuation, 95% of the time it will fall somewhere in a symmetrical interval that goes 1.96 standard errors in both directions from mu. That interval is, of course, the CI.95.
CI.95 = mu ± 1.960 sigmaX-bar
Or, for theoretical population means:
CI.95 = muT ± 1.960 sigmaX-bar

muT, the CI.95 and H0
Most of the time we don't know mu, so we are really talking about muT. Most of the time, muT will be the value of mu suggested by the null hypothesis. If a sample falls outside the 95% confidence interval around muT, we have to assume that it has been pushed away from mu by some factor other than sampling fluctuation.

ZX-bar and the null hypothesis
If H0 says that the only reason a sample mean differs from muT is sampling fluctuation, as H0 usually does, then the value of ZX-bar can be used as a test of the null hypothesis. If H0 is correct, X-bar should fall within the CI.95, within 1.960 standard errors of muT. If ZX-bar has an absolute value greater than 1.960, the sample mean falls outside the 95% confidence interval around muT and falsifies the null hypothesis.

The underlying logic of the Z test
Here is the formula for ZX-bar again:
ZX-bar = (X-bar - muT) / sigmaX-bar
When used as a test of the null, most textbooks identify ZX-bar simply as Z. We will follow that lead and, when we use it in a test of the null, call ZX-bar simply "Z." Here is the formula for the Z test:
Z = (X-bar - muT) / sigmaX-bar
If the absolute value of Z equals or exceeds 1.960, Z is significant at .05. If the absolute value of Z equals or exceeds 2.576, Z is significant at .01.

In the Z test
You start with a random sample and then expose it to an IV. You determine muT, the mean predicted if the null hypothesis is true. If the absolute value of Z > 1.960, X-bar falls outside the CI.95 around muT, and the null hypothesis is probably not correct. Since you have falsified the null, you must turn to H1, the experimental hypothesis. Also, since Z was significant, you conclude that were other individuals from the population treated the same way, they would respond similarly to the sample you studied.
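The decision rule just described is easy to express directly. The sketch below is a minimal illustration in Python (my tooling choice); the sample mean, muT, sigma and n in the example call are made-up numbers, not data from the lecture.

```python
def z_test(x_bar, mu_t, sigma, n):
    """Z test of a sample mean against muT when sigma is known.

    Returns the Z score and whether it is significant at the .05 and .01
    levels, using the two-tailed critical values 1.960 and 2.576.
    """
    sigma_x_bar = sigma / n ** 0.5          # standard error of the mean
    z = (x_bar - mu_t) / sigma_x_bar
    return z, abs(z) >= 1.960, abs(z) >= 2.576

# Illustrative numbers only: X-bar = 103, muT = 100, sigma = 15, n = 100
z, sig_05, sig_01 = z_test(103, 100, 15, 100)
print(f"Z = {z:.2f}, significant at .05: {sig_05}, at .01: {sig_01}")
```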
There are two problems
We seldom know sigma, and it would be nice to have a control group. Let's deal with those problems one at a time, starting with the fact that we don't know sigma and therefore can't compute sigmaX-bar.

The first problem
Since we don't know sigma, we must use our best estimate of sigma, s, the square root of MSW, and then estimate sigmaX-bar by dividing s by the square root of n, the size of the sample. We therefore must use the critical values of the t distribution to determine the CI.95 and CI.99 around muT in which the null hypothesis predicts that X-bar will fall. The exact value will depend on the degrees of freedom for s. Since s is the square root of MSW, dfW = n - k.

t curves and degrees of freedom revisited
[Figure: the Z curve compared with t curves for 5 df and 1 df; frequency plotted against standard deviations from the mean.]
To get 95% of the population in the body of the curve when there are 5 degrees of freedom, you go out over 3 standard deviations. To get 95% of the population in the body of the curve when there is 1 degree of freedom, you go out over 12 standard deviations.

Critical values of the t curves
The following table defines t curves with 1 through 10,000 degrees of freedom. Each curve is defined by how many estimated standard deviations you must go from the mean to define a symmetrical interval that contains proportions of .9500 and .9900 of the curve, leaving proportions of .0500 and .0100 in the two tails of the curve (combined). Values for .9500/.0500 are shown in the .05 column; values for .9900/.0100 are shown in the .01 column.

df     .05      .01
1      12.706   63.657
2      4.303    9.925
3      3.182    5.841
4      2.776    4.604
5      2.571    4.032
6      2.447    3.707
7      2.365    3.499
8      2.306    3.355
9      2.262    3.250
10     2.228    3.169
11     2.201    3.106
12     2.179    3.055
13     2.160    3.012
14     2.145    2.997
15     2.131    2.947
16     2.120    2.921
17     2.110    2.898
18     2.101    2.878
19     2.093    2.861
20     2.086    2.845
21     2.080    2.831
22     2.074    2.819
23     2.069    2.807
24     2.064    2.797
25     2.060    2.787
26     2.056    2.779
27     2.052    2.771
28     2.048    2.763
29     2.045    2.756
30     2.042    2.750
40     2.021    2.704
60     2.000    2.660
100    1.984    2.626
200    1.972    2.601
500    1.965    2.586
1000   1.962    2.581
2000   1.961    2.578
10000  1.960    2.576

Estimated distance of sample means from mu: the estimated standard error of the mean
We can compute the standard error of the mean when we know sigma; we just divide sigma by the square root of n, the size of the sample. Similarly, we can estimate the standard error of the mean, the estimated average unsquared distance of sample means from mu; we just divide s by the square root of n, the size of the sample in which we are interested.

Here's the formula:
t(df) = (X-bar - muT) / sX-bar = (X-bar - muT) / (s / √n)

The one sample t test
t(df) = (X-bar - muT) / sX-bar = (X-bar - muT) / (s / √n)
If the absolute value of t exceeds the critical value at .05 in the t table, you have falsified the null and must accept the experimental hypothesis.
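If SciPy is available, any row of the table above and the estimated standard error of the mean can be regenerated directly; the short sketch below is my own illustration, not part of the lecture.

```python
from scipy import stats

# Reproduce a few rows of the two-tailed critical-value table above
for df in (1, 5, 15, 30, 10000):
    t_05 = stats.t.ppf(0.975, df)    # .05 column: leaves .025 in each tail
    t_01 = stats.t.ppf(0.995, df)    # .01 column: leaves .005 in each tail
    print(f"df = {df:>5}   .05 = {t_05:.3f}   .01 = {t_01:.3f}")

# Estimated standard error of the mean: s divided by the square root of n
s, n = 1.40, 16
print("sX-bar =", s / n ** 0.5)      # 0.35, as in the body-temperature example
```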
The second problem: no control group
Participants as their own controls: the repeated measures t test.

3 experimental designs: first = unrelated groups
There are three basic ways to run experiments. The first is to create different groups, each of which contains different individuals randomly selected from the population. You then measure the groups once to determine whether the differences among their means exceed what would be expected from sampling fluctuation. That's what we've done until now.

Second type of design – repeated measures
The second is to create one random sample from the population. You then treat the group in different ways and measure that group two or more times, once for each different way the group is treated. Again, you want to determine whether the differences among the group's means, taken at different times, exceed what would be expected from sampling fluctuation.

Baseline vs. post-treatment
If the first measurement is done before the start of the experiment, the result will be a baseline measurement. This allows participants to function as their own controls. In any event, the question is always whether the change between conditions is larger than you would expect from sampling fluctuation alone.

From this point on, we look only at the difference scores. That is, we ignore the original pre and post absolute scores altogether and only look at the differences between time 1 and time 2. Of course, our first computation is the mean and estimated standard deviation of the difference scores.

Here is the example we used in learning the computation of the repeated measures t test:

S#    X     X-bar    (X - X-bar)   (X - X-bar)²
A     10    10.00     0.00          0.00
B      8    10.00    -2.00          4.00
C      9    10.00    -1.00          1.00
D     11    10.00     1.00          1.00
E     12    10.00     2.00          4.00

ΣX = 50, n = 5, X-barD = 10.00
Σ(X - X-bar) = 0.00
Σ(X - X-bar)² = 10.00 = SSW
MSW = SSW / (n - k) = 10.00 / 4 = 2.50
s = √MSW = 1.58

The null hypothesis in our repeated measures t test
Theoretically, the null can predict any difference. Pragmatically, the null almost always predicts that there will be no change at all from the first to the second measurement, that is, that the average difference between time 1 and time 2 will be 0.00. Mathematically, H0: muD = 0.00, where muD is the average difference score.

Does this look familiar?
We have a single set of difference scores to compare to muT. In the single sample t test, we compared a set of scores to muT. So the repeated measures t test is just like the single sample t test; only this time our scores are difference scores.

To do a t test, we need the expected mean under the null. We have that: muT = 0.00. We also need the expected amount of difference between the two means given random sampling fluctuation: sD-bar = sD / √nD.

The expected fluctuation of the difference scores is called the estimated standard error of the difference scores, sD-bar. It equals the estimated standard deviation of the difference scores divided by the square root of the number of difference scores, and it has nD - k = nD - 1 degrees of freedom, where nD is the number of difference scores. Here is the formula for sD-bar:
sD-bar = sD / √nD

The repeated measures t is a version of the single sample t test: t equals the actual average difference between the means, minus their difference under H0, divided by the estimated standard error of the difference scores:
t(dfD) = (D-bar - muT) / sD-bar

By the way
Repeated measures designs are the simplest form of related measures designs, in which each participant in each group is related to one participant in each of the other groups. The simplest way for the participants in one group to be related to those in the other groups is to use the same participants in each group. But there are other ways. For example, each mouse in a four-condition experiment could have one litter-mate in each of the other conditions. The commonest design, though, is repeated measures, and that is what we will study.