Chi-Square Slides from : Heibatollah Baghi 1

Chi-Square
Slides from : Heibatollah Baghi
Chi-Square (χ²) and Frequency Data
Up to this point, the inference to the population has been
concerned with “scores” on one or more variables, such as
CAT scores, mathematics achievement, and hours spent on
the computer.
We used these scores to make inferences about
population means. To be sure, not all research questions
involve score data.
Today the data that we analyze consist of frequencies; that
is, the number of individuals falling into categories. In
other words, the variables are measured on a nominal
scale.
The test statistic for frequency data is Pearson Chi-Square.
The magnitude of Pearson Chi-Square reflects the amount
of discrepancy between observed frequencies and expected
frequencies.
Steps in Test of Hypothesis
1. Determine the appropriate test
2. Establish the level of significance: α
3. Formulate the statistical hypothesis
4. Calculate the test statistic
5. Determine the degrees of freedom
6. Compare the computed test statistic against a tabled/critical value
1. Determine Appropriate Test
Chi Square is used when both variables are
measured on a nominal scale.
It can be applied to interval or ratio data that have
been categorized into a small number of groups.
It assumes that the observations are randomly
sampled from the population.
All observations are independent (an individual
can appear only once in a table and there are no
overlapping categories).
It does not make any assumptions about the shape
of the distribution nor about the homogeneity of
variances.
2. Establish Level of Significance
α is a predetermined value
Conventional values:
• α = .05
• α = .01
• α = .001
3. Determine The Hypothesis: Whether There is an Association or Not
H0: The two variables are independent
Ha: The two variables are associated
4. Calculating Test Statistics
The test contrasts the observed frequencies in each cell of a
contingency table with the expected frequencies.
The expected frequencies represent the number of
cases that would be found in each cell if the null
hypothesis were true (i.e., the nominal variables
are unrelated).
Under independence, the expected frequency of a cell is the
product of its row frequency and its column frequency, divided
by the total number of cases.
Fe = (Fr × Fc) / N
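The formula Fe = Fr Fc / N can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `expected_frequencies` and the 2x2 table of counts are hypothetical and were chosen only to give round numbers.

```python
def expected_frequencies(observed):
    """Return the table of expected counts, Fe = Fr * Fc / N, for each cell."""
    row_totals = [sum(row) for row in observed]          # Fr for each row
    col_totals = [sum(col) for col in zip(*observed)]    # Fc for each column
    n = sum(row_totals)                                  # N, total number of cases
    return [[fr * fc / n for fc in col_totals] for fr in row_totals]


# Hypothetical 2x2 table of counts, used only to illustrate the formula.
observed = [[10, 20],
            [30, 40]]
expected = expected_frequencies(observed)
print(expected)  # [[12.0, 18.0], [28.0, 42.0]]
```

The expected counts keep the same row and column totals as the observed counts; only the distribution within the table changes.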
$$\chi^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{exp}}$$
• The “χ” is the Greek letter chi; the “∑” is a
sigma; it means to sum the following terms over
all categories (the cells of the table; phenotypes in a genetics setting).
• “obs” is the number of individuals observed in the
given category.
• “exp” is the number of individuals expected in that category
under the null hypothesis.
• Note that you must use the number of individuals,
the counts, and NOT proportions, ratios, or
relative frequencies.
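One way to see why counts rather than proportions must be used is to compute the statistic for two tables with identical proportions but different numbers of cases. The sketch below assumes a contingency table whose expected counts come from the row and column totals; the function name `pearson_chi_square` and the counts are hypothetical.

```python
def pearson_chi_square(observed):
    """Sum (obs - exp)^2 / exp over every cell of a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n   # expected count under H0
            chi_sq += (obs - exp) ** 2 / exp
    return chi_sq


# Hypothetical counts, chosen only for illustration.
counts = [[10, 20], [30, 40]]
doubled = [[2 * c for c in row] for row in counts]   # same proportions, twice the cases

chi_counts = pearson_chi_square(counts)
chi_doubled = pearson_chi_square(doubled)
print(round(chi_counts, 4), round(chi_doubled, 4))   # 0.7937 1.5873
```

Doubling every count doubles the statistic although the proportions are unchanged, so replacing counts with proportions would discard the sample size that the test depends on.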
5. Determine Degrees of Freedom
df = (R-1)(C-1), where R is the number of rows and C the number of columns in the contingency table
6. Compare computed test statistic
against a tabled/critical value
The computed value of the Pearson chi-square statistic is compared with the critical
value to determine whether the computed value is
improbable under the null hypothesis.
The critical tabled values are based on the
sampling distribution of the Pearson chi-square statistic.
If the calculated χ² is greater than the tabled χ²
value, reject H0.
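The decision rule can be sketched as a lookup against a few rows of a standard chi-square table. The critical values below are the usual upper-tail values rounded to three decimals; the names `CRITICAL_VALUES`, `degrees_of_freedom` and `reject_h0`, and the statistic 7.2, are hypothetical and serve only as an illustration.

```python
# Upper-tail critical values of the chi-square distribution, as found in
# standard printed tables (rounded to three decimals).
CRITICAL_VALUES = {
    1: {0.05: 3.841, 0.01: 6.635, 0.001: 10.828},
    2: {0.05: 5.991, 0.01: 9.210, 0.001: 13.816},
    3: {0.05: 7.815, 0.01: 11.345, 0.001: 16.266},
    4: {0.05: 9.488, 0.01: 13.277, 0.001: 18.467},
}


def degrees_of_freedom(n_rows, n_cols):
    """df = (R - 1)(C - 1) for an R x C contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def reject_h0(chi_sq, n_rows, n_cols, alpha=0.05):
    """Return True when the computed statistic exceeds the tabled value."""
    df = degrees_of_freedom(n_rows, n_cols)
    return chi_sq > CRITICAL_VALUES[df][alpha]


# Hypothetical computed statistic for a 2 x 3 table (df = 2).
print(reject_h0(7.2, 2, 3, alpha=0.05))   # True: 7.2 > 5.991
print(reject_h0(7.2, 2, 3, alpha=0.01))   # False: 7.2 < 9.210
```

The same statistic can lead to different decisions at different levels of significance, which is why α is fixed before the data are examined.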
Chi-Square Table
Example
A researcher is interested in studying depression among
elderly patients hospitalized in the hospitals of the city of Kerman.
To this end, the researcher uses a questionnaire to collect the necessary
data from hospitalized elderly patients.
Frequency of depression among the elderly studied, by sex
(f row = row frequency; f column = column frequency)

            Depress    Not Depress    f row
Male          120          25          145
Female         86          35          121
f column      206          60        n = 266
1. Determine Appropriate Test
Both variables are nominal, so chi-square is appropriate:
1. Sex (2 levels)
2. Depression condition (2 levels)
2. Establish Level of Significance
α = .05
3. Determine The Hypothesis
• H0: There is no difference between hospitalized elderly women and men in the occurrence of depression
• H1: There is a difference between hospitalized elderly women and men in the occurrence of depression
4. Calculating Test Statistics

            Depress              Not Depress         f row
Male        120 (Exp = 112.3)     25 (Exp = 32.7)     145
Female       86 (Exp = 93.7)      35 (Exp = 27.3)     121
f column    206                   60                n = 266
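The expected counts in this table follow from Fe = Fr Fc / N applied to each cell. A minimal sketch that recomputes them from the observed counts (the variable names are illustrative, not from the slides):

```python
# Observed counts from the example table (rows: Male, Female;
# columns: Depress, Not Depress).
observed = {"Male": {"Depress": 120, "Not Depress": 25},
            "Female": {"Depress": 86, "Not Depress": 35}}

row_totals = {sex: sum(cells.values()) for sex, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for status, count in cells.items():
        col_totals[status] = col_totals.get(status, 0) + count
n = sum(row_totals.values())

expected = {}
for sex, cells in observed.items():
    for status in cells:
        exp = row_totals[sex] * col_totals[status] / n   # Fe = Fr * Fc / N
        expected[(sex, status)] = exp
        print(f"{sex:6s} {status:12s} Exp = {exp:.1f}")
```

For example, the Male/Depress cell gives 145 × 206 / 266 ≈ 112.3.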
$$\chi^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{exp}}$$

$$\chi^2 = \frac{(120 - 112.3)^2}{112.3} + \frac{(25 - 32.7)^2}{32.7} + \frac{(86 - 93.7)^2}{93.7} + \frac{(35 - 27.3)^2}{27.3} = 5.15$$
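The arithmetic can be checked cell by cell. The sketch below uses unrounded expected counts, so it gives about 5.155 where the hand calculation with expected counts rounded to one decimal gives 5.15. The 2x2 shortcut N(ad - bc)² / (product of the four marginal totals) is not on the slides; it is a standard algebraic identity, included here only as a cross-check.

```python
observed = [[120, 25],   # Male:   Depress, Not Depress
            [86, 35]]    # Female: Depress, Not Depress

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / n
        contribution = (obs - exp) ** 2 / exp
        chi_sq += contribution
        print(f"obs={obs:3d}  exp={exp:6.1f}  contribution={contribution:.3f}")

# Cross-check with the algebraically equivalent 2x2 shortcut
# N(ad - bc)^2 / (product of the four marginal totals).
(a, b), (c, d) = observed
shortcut = n * (a * d - b * c) ** 2 / (
    row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1])

print(f"chi-square = {chi_sq:.3f}")
```

The two cells in the Not Depress column contribute most of the total, because their expected counts are the smallest.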
5. Determine Degrees of Freedom
df = (R-1)(C-1) = (2-1)(2-1) = 1
6. Compare computed test statistic against a tabled/critical value
The computed chi-square is compared with the critical value in the table for the obtained degrees of freedom and α = 0.05.
Because the computed chi-square is larger than the critical value in the table (5.15 > 3.84), we conclude
that the observed difference is significant, with a P value less than 0.05.
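The comparison with the tabled value, and the claim that the P value is below 0.05, can be verified directly. The sketch below relies on a shortcut that holds for df = 1 only: the upper-tail probability of the chi-square distribution equals erfc(√(χ²/2)). It is an illustration using the Python standard library, not the method used on the slides.

```python
import math

chi_sq = 5.15       # computed statistic from the example
critical = 3.84     # tabled value for df = 1, alpha = .05

# For df = 1 only, the upper-tail probability of the chi-square
# distribution can be written with the complementary error function.
p_value = math.erfc(math.sqrt(chi_sq / 2))

print(chi_sq > critical)       # True, so H0 is rejected
print(round(p_value, 3))       # 0.023
```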
Reporting the test result using SPSS software
Crosstab: sex * q1

                             q1: Yes    q1: No     Total
Male      Count                120        25        145
          % within sex        82.8%     17.2%     100.0%
Female    Count                 86        35        121
          % within sex        71.1%     28.9%     100.0%
Total     Count                206        60        266
          % within sex        77.4%     22.6%     100.0%
Chi-Square Tests

                                Value      df    Asymp. Sig.    Exact Sig.    Exact Sig.
                                                 (2-sided)      (2-sided)     (1-sided)
Pearson Chi-Square              5.155 (b)   1      .023
Continuity Correction (a)       4.508       1      .034
Likelihood Ratio                5.144       1      .023
Fisher's Exact Test                                               .027          .017
Linear-by-Linear Association    5.135       1      .023
N of Valid Cases                266

a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 27.29.
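Most rows of this output can be reproduced by hand from the 2x2 table. The sketch below is an illustration under stated assumptions, not SPSS's own code: the continuity correction is taken to be the Yates correction, the likelihood ratio is G² = 2 Σ obs · ln(obs/exp), and for a 2x2 table the linear-by-linear statistic equals the Pearson value times (N − 1)/N. The p-value helper is valid for df = 1 only, and Fisher's exact test is omitted.

```python
import math

observed = [[120, 25],   # Male:   yes, no
            [86, 35]]    # Female: yes, no
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pair every observed count with its expected count Fr * Fc / N.
cells = [(obs, row_totals[i] * col_totals[j] / n)
         for i, row in enumerate(observed)
         for j, obs in enumerate(row)]

pearson = sum((o - e) ** 2 / e for o, e in cells)
# Yates correction: shrink each |obs - exp| by 0.5 before squaring.
yates = sum((abs(o - e) - 0.5) ** 2 / e for o, e in cells)
likelihood_ratio = 2 * sum(o * math.log(o / e) for o, e in cells)
# 2x2 case only: (N - 1) * r^2, with r^2 = Pearson chi-square / N.
linear_by_linear = pearson * (n - 1) / n


def p_value_df1(x):
    """Upper-tail chi-square probability, valid for df = 1 only."""
    return math.erfc(math.sqrt(x / 2))


for name, value in [("Pearson Chi-Square", pearson),
                    ("Continuity Correction", yates),
                    ("Likelihood Ratio", likelihood_ratio),
                    ("Linear-by-Linear Association", linear_by_linear)]:
    print(f"{name:30s} {value:6.3f}  p = {p_value_df1(value):.3f}")
```

All four statistics lead to the same decision at α = .05; the continuity correction is the most conservative of them.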
Example 2
The same study also examined the relationship between the severity of
depression and the ward in which the elderly patients were hospitalized,
and the following results were obtained.
Hospital ward * GDSSCORE Crosstabulation

                                                   GDSSCORE
                                       No            Mild          Severe
                                       Depression    Depression    Depression    Total
Surgical            Count                 41            24            13           78
                    % within ward       52.6%         30.8%         16.7%       100.0%
Internal medicine   Count                 44            85            59          188
                    % within ward       23.4%         45.2%         31.4%       100.0%
Total               Count                 85           109            72          266
                    % within ward       32.0%         41.0%         27.1%       100.0%
Chi-Square Tests

                                Value       df    Asymp. Sig.
                                                  (2-sided)
Pearson Chi-Square              21.886 (a)   2      .000
Likelihood Ratio                21.224       2      .000
Linear-by-Linear Association    17.986       1      .000
N of Valid Cases                266

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 21.11.
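The same steps apply to this 2 x 3 table. The sketch below recomputes the Pearson chi-square, the degrees of freedom and the minimum expected count from the crosstabulation. The p-value line uses a shortcut that holds for df = 2 only, where the upper-tail probability reduces to exp(−χ²/2); it is an illustration, not how SPSS computes its output.

```python
import math

# Rows: surgical ward, internal medicine ward.
# Columns: no depression, mild depression, severe depression.
observed = [[41, 24, 13],
            [44, 85, 59]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[fr * fc / n for fc in col_totals] for fr in row_totals]

chi_sq = sum((obs - exp) ** 2 / exp
             for obs_row, exp_row in zip(observed, expected)
             for obs, exp in zip(obs_row, exp_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)
min_expected = min(min(row) for row in expected)

# For df = 2 only, the upper-tail probability reduces to exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

print(f"chi-square = {chi_sq:.3f}, df = {df}")
print(f"minimum expected count = {min_expected:.2f}")
print(f"p < .001: {p_value < 0.001}")
```

The P value is far below .001, which SPSS displays as .000; it is small, not exactly zero.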