The Scientific Method and the Uses of Statistics

PSYCH 3400
Statistical Methods
CUNY Brooklyn College, Department of Psychology
Alla Chavarga
[email protected]
MTWR 11:50am-12:50pm
Room: 4607J
Office hours:
MT 1-3pm 4305 James
Approach of the Course
• In this class you will learn both the theory and
practice of statistics.
• Homework is practice for the exams
• Essay type answers
• Statistical calculations by hand
• SPSS analysis
Lab Format
• Announcements (make sure you are on time)
• Demonstration of new computer techniques required for
that week’s homework
• Period of questions and answers
• Opportunity for you to work with SPSS when
your TA is present
You should think of the lab section as training; you will complete
most of the homework on your own time.
http://psychfiles.net
• Contact info
• Syllabus/Semester Schedule
• Lecture Slides
• Homework Assignments/Problem Sets
• Announcements
Notices and updates from me will mainly be handled over email.
• Please log into your email account and send an email to
[email protected]
• Subject: PSYC3400 – YOUR NAME – TA’s NAME
• Ex: PSYC3400 – JANE SMITH – NAOMI / KAMIL
Required Text:
Pagano, Robert R. (2009) Understanding Statistics
in the Behavioral Sciences. 9th Ed. Wadsworth Pub
Co; ISBN: 0534353908
Any edition from the 5th on will work
Appendix A if you feel shaky on the math
Required reading in ADVANCE of lecture.
Definition of a Statistic
OUR WORKING DEFINITION:
A number that organizes, summarizes or makes
understandable a collection of data.
THE FORMAL DEFINITION:
A number calculated on sample data that
quantifies a characteristic of the sample.
Which of these makes more sense?
“In our calculations, we noted large differences in pupil size
between males and females. The male group had pupil diameters
(mm) of 3.2, 4.1, 4.6, 7.2, 4.1, 5.3, 8.1, 6.3, 4.8, 4.6, 4.8, while
females had the following pupil diameters: 4.6, 7.1, 4.7, 3.7, 8.0,
4.8, 6.2, 4.5, 4.9, 7.1, 6.8. Obviously, there is a noticeable
difference.”
vs.
“In our calculations, we noted large differences in pupil size
between males and females. The male group had an average
pupil diameter of 5.19 mm, while females had an average pupil
diameter of 5.67 mm. Obviously, there is a noticeable difference.”
We can also use statistics to describe relationships
that we can depict graphically, such as in these
SCATTERPLOTS.
[Figure: scatterplots of Pay and Hours worked]
How do we acquire knowledge?
Authority
Scientific Method
Intuition
Rationality
WHY do I have to learn Statistics?
Some VERY important definitions:
• Experimental vs. Observational Methods
• Population – the complete set of individuals, objects, or
scores that the investigator is interested in studying.
• Sample – a subset of the population.
• Variable – any property or characteristic of some event,
object, or person that may have different values at different
times depending on the conditions
– Independent: the variable that is systematically manipulated by the
investigator
– Dependent: the variable that is measured to determine the effect of
the independent variable
• Data - the measurements made on the subjects of an
experiment
• Statistic – a number calculated on sample data that
quantifies a characteristic of the sample. (Note: a Parameter is the
corresponding number calculated on population data.)
– Descriptive vs. inferential statistics
The Concept of a Variable
Any measurable property of a person, event or object that may take
on different values at different times or under different conditions.
[Figure: scatterplot for Textile Workers; Height (inches) on the y-axis, 45 to 75; Weight (lbs) on the x-axis, 80 to 160]
Compare with a CONSTANT like π
Continuous and Discrete Variables
Discrete Variable: 1, 2, 3, 4, 5, 6
Continuous Variable: can divide in half infinitely, e.g. between 2 and 3: 2.5 (1/2), 2.25 (1/4), 2.125 (1/8)
Scales of Measurement
• Nominal: Names or categories
• Ordinal: Order: a sense of greater or lesser, but not by how much
• Interval: Ordinal, plus how much greater or lesser: each interval is equal
• Ratio: Interval scale with an absolute zero; ratios of scores have meaning.
Summarizing Samples with Math and Graphs
[Figure: frequency histogram of Class Heights (raw scores); x-axis Height (inches), 54 to 64; y-axis Frequency (number of individuals), 0 to 15]
Significant Figures and Rounding
It does not make sense to carry our calculations
beyond the real limits of the variables we measure.
Ex: On a thermometer the smallest unit is half of a degree.
By convention, in this class we will round all
numbers to the hundredths place (two places
after the decimal).
5.624 → 5.62 when the 3rd decimal place is ≤4.
1.287 → 1.29 when the 3rd decimal place is ≥5.
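A minimal sketch of this rounding convention in Python. The helper name round_to_hundredths is an assumption made for illustration. It uses the standard decimal module with ROUND_HALF_UP, because Python's built-in round() sends exact ties to the nearest even digit, which is not the rule stated above.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_hundredths(value: str) -> Decimal:
    """Round to two places after the decimal; a 3rd decimal of 5 or more rounds up."""
    # Passing the number as a string avoids binary floating-point surprises.
    return Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(round_to_hundredths("5.624"))  # 5.62
print(round_to_hundredths("1.287"))  # 1.29
```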
Mathematical Notation
This is probably new to you.
Σ
It means “summation”
Mathematical Notation: Summation Calculation
Student ID   Grade (X)
1            93
2            75
3            88
4            77
5            65
6            55
7            97
ΣX = 93 + 75 + 88 + 77 + 65 + 55 + 97
ΣX = 550
Average of the variable X:
(1/n)(ΣX) = (1/7)(550)
= 78.57
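The same summation and average can be checked with a few lines of Python. This is only a sketch; the variable names (grades, sum_x, mean_x) are chosen here for illustration.

```python
grades = [93, 75, 88, 77, 65, 55, 97]  # X for student IDs 1 through 7

sum_x = sum(grades)   # ΣX: add up every score
n = len(grades)       # number of scores
mean_x = sum_x / n    # (1/n)(ΣX)

print(sum_x)             # 550
print(round(mean_x, 2))  # 78.57
```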
Order of Operations
Order of operations:
Parentheses,
Exponents,
Summation,
Multiplication/Division,
Addition/Subtraction
Read them like English sentences or lists of things to do in order.
Important Example
x: {1, 2, 3}
Σx²: “Sum of the squared x’s”
x    x²
1    (1)² = 1
2    (2)² = 4
3    (3)² = 9
Σx² = 14
(Σx)²: “Square of the summed x’s”
x
1
2
3
Σx = 6
(Σx)² = 6² = 36
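A short Python sketch of the two quantities, using the same three scores. The variable names are assumptions for illustration; the point is that the order of operations changes the answer.

```python
x = [1, 2, 3]

# Σx²: square each score first (exponents), then sum.
sum_of_squares = sum(value ** 2 for value in x)

# (Σx)²: sum first (parentheses), then square the total.
square_of_sum = sum(x) ** 2

print(sum_of_squares)  # 14
print(square_of_sum)   # 36
```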
How can data be described? Summarized?
Here is a set of 15 height measurements (in inches).
{ 55, 56, 56, 58, 60, 61, 57, 57, 59, 60, 60, 61, 54, 57, 57}
Value   Frequency
54      1
55      1
56      2
57      4
58      1
59      1
60      3
61      2
[Figure: Frequency Histogram of HEIGHT (Mean = 57.9, Std. Dev = 2.20, N = 15)]
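As a rough check, the frequency counts and the summary numbers for these 15 heights can be reproduced in Python with the standard library. This is a sketch, not the SPSS procedure used in lab. It assumes the Std. Dev reported with the histogram is the sample standard deviation (n - 1 in the denominator), which is what statistics.stdev computes.

```python
from collections import Counter
from statistics import mean, stdev

heights = [55, 56, 56, 58, 60, 61, 57, 57, 59, 60, 60, 61, 54, 57, 57]

# Tally how many times each value occurs.
frequency = Counter(heights)
for value in sorted(frequency):
    print(value, frequency[value])

print(len(heights))             # N = 15
print(round(mean(heights), 1))  # 57.9
print(stdev(heights))           # sample standard deviation, about 2.20
```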
How can data be described? Summarized?
How to create a detailed frequency table:
Example: How many siblings do you have?
Set of scores: x: {2, 1, 5, 0, 2, 1, 2, 0, 1, 1, 3, 1, 2, 1, 1, 0, 0, 2, 3 , 1}
Value   Frequency   Percent
0       4           20
1       8
2       5
3       2
4       0
5       1
Total   20
Percent for Value 0 = (4/20) x 100
= .20 x 100
= 20
Value   Frequency   Percent   Cumulative Frequency   Cumulative Percent
0       4           20        4                      20
1       8           40        12                     60
2       5           25        17                     85
3       2           10        19                     95
4       0           0         19                     95
5       1           5         20                     100
Total   20
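The columns of the detailed frequency table can be built step by step in Python. This is a minimal sketch using the sibling scores above; the names (siblings, rows, cumulative) are assumptions for illustration.

```python
from collections import Counter

siblings = [2, 1, 5, 0, 2, 1, 2, 0, 1, 1, 3, 1, 2, 1, 1, 0, 0, 2, 3, 1]

counts = Counter(siblings)
n = len(siblings)

rows = []
cumulative = 0
# Walk through every value from lowest to highest, including values
# that nobody reported (frequency 0), so the table has no gaps.
for value in range(min(siblings), max(siblings) + 1):
    freq = counts[value]
    cumulative += freq
    percent = freq * 100 / n                  # (frequency / total) x 100
    cumulative_percent = cumulative * 100 / n
    rows.append((value, freq, percent, cumulative, cumulative_percent))

for row in rows:
    print("{:>5} {:>9} {:>7.0f} {:>10} {:>10.0f}".format(*row))
```

Each printed row follows the column order Value, Frequency, Percent, Cumulative Frequency, Cumulative Percent.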
How can data be described? Summarized?
How to create a detailed frequency table:
Example: TEST GRADES!!?
Set of scores: x: {100, 23, 65, 98, 84, 72, 50, 49, 52, 99, 83, 79, 89, 90,
56, 63, 72, 92, 83, 100}
What if our range is very large?
-We use class intervals instead of single values
-Rule for # of intervals for use in this class: 10
-To determine the width that each interval should have, given the range of
data we have, use the following formula:
Interval width = (Highest score – Lowest score)/10
= (100 – 23)/10
= 77/10
= 7.7 → round this up to the next whole number, 8.
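The width calculation can be sketched in Python as follows. The names are assumptions for illustration, and math.ceil is used here to stand in for "round up to the next whole number"; note that it leaves a result that is already whole unchanged.

```python
import math

scores = [100, 23, 65, 98, 84, 72, 50, 49, 52, 99,
          83, 79, 89, 90, 56, 63, 72, 92, 83, 100]

NUM_INTERVALS = 10  # rule used in this class

# (Highest score - Lowest score) / 10
raw_width = (max(scores) - min(scores)) / NUM_INTERVALS
width = math.ceil(raw_width)  # round up to a whole number

print(raw_width)  # 7.7
print(width)      # 8
```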
Intervals   Frequency   Percent   Cumulative Frequency   Cumulative Percent
23-30       1           5         1                      5
31-38       0           0         1                      5
39-46       0           0         1                      5
47-54       3           15        4                      20
55-62       1           5         5                      25
63-70       2           10        7                      35
71-78       2           10        9                      45
79-86       4           20        13                     65
87-94       3           15        16                     80
95-102      4           20        20                     100
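A sketch of how the grouped table could be tallied in Python, assuming the intervals start at the lowest score and each is 8 wide, as in the table above. The names (intervals, table) are chosen for illustration.

```python
scores = [100, 23, 65, 98, 84, 72, 50, 49, 52, 99,
          83, 79, 89, 90, 56, 63, 72, 92, 83, 100]

width = 8             # interval width from (100 - 23)/10, rounded up
lowest = min(scores)  # first interval starts at the lowest score
n = len(scores)

# Ten intervals: 23-30, 31-38, ..., 95-102.
intervals = [(lowest + i * width, lowest + (i + 1) * width - 1) for i in range(10)]

table = []
cumulative = 0
for low, high in intervals:
    freq = sum(1 for s in scores if low <= s <= high)  # scores inside this interval
    cumulative += freq
    table.append((f"{low}-{high}", freq, freq * 100 / n,
                  cumulative, cumulative * 100 / n))

for row in table:
    print(row)
```

Each tuple holds the interval, frequency, percent, cumulative frequency, and cumulative percent, in that order.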
Choice of Interval is Important
[Figure: two frequency histograms of HEIGHT; one uses the intervals 43-48, 49-54, 55-60, 61-66, 67-72; the other uses the intervals 45-47, 48-50, 51-53, 54-56, 57-59, 60-62, 63-65, 66-68, 69-71]
Frequency Polygons
[Figure: frequency polygon of HEIGHT; x-axis HEIGHT, 54 to 61; y-axis Count, 0 to 5]
By Comparison…
[Figure: frequency histogram of HEIGHT (Mean = 57.9, Std. Dev = 2.20, N = 15) shown with the frequency polygon of HEIGHT]
These are commonly referred to as DISTRIBUTIONS
Common Shapes of Frequency
Distributions
[Figure: three frequency histograms of HEIGHT, labeled as follows]
• Symmetrical, Bell-shaped
• Positively Skewed
• Negatively Skewed
Multimodal Distributions
[Figure: two frequency histograms of HEIGHT]
When describing a distribution, always specify:
-Is it unimodal, bimodal, multimodal?
-Is it symmetrical?
-Is it skewed, positive or negative?
A real example…
Psych Stats 3400 First Exam Grades
N=66 students
[Figure: frequency histogram; x-axis Grade in intervals 20-28, 29-36, 37-44, 45-52, 53-60, 61-68, 69-76, 77-84, 85-92, 93-100; y-axis Frequency, 0 to 16]
IT’S THE HUMAN HISTOGRAM!
Is this a histogram?