Presentation
Download
Report
Transcript Presentation
CBR221
Introduction to Survey Data Analysis
with Excel
Workshop Objectives
Use Excel to help you:
– Organize data for analysis
– Systematically work with data
– Analyze data
– Graphically display analysis
2
We’ll Examine
– Types of variables used in analysis
– Types of measurement scales used in analysis
– How to describe data with Frequency counts,
Descriptive Statistics, Histograms, and Pivot Tables
– How to create charts
Survey
Questions:
1.
Area of the city child lives in?
2.
Number of colds child had last year?
3.
Gender:
4.
Age
5.
Describe cold symptoms __________
Male_____ Female _____
Analysis
Helps describe, conclude, recommend
Systematic exploration for interpreting data
5
Survey Data
Answers can be in text or numbered formats
Statistics
Systematic method of converting and analyzing data by
using numbers
7
Excel
Support tool for statistical methods
8
Start Excel
© The Wellesley Institute
www.wellesleyinstitute.co
9
Moving from Model to Excel
Data Analysis
• Assumption, hypothesis, or model
• Collect data
• Organize data for analysis
10
Example Model or
Assumption
• “West area children get fewer colds than central area
children.”
• Want proof
• Analysis: Find mean (average) number of colds by area
When Data Fit Model
• Data cluster as expected
• Findings support assumption or model
For 33 west area children,
the mean is 5 colds.
For 31 central area children,
the mean is 6 colds.
The results indicate west area children do have fewer colds than central
area children.
Table Illustrates
Table 1. Colds by City Area
Zone
Participants Colds
x
n
West
33
5
Central
31
Total
64
6
Graph Illustrates
Mean
Colds
6
5
West
Central
Figure 1. West area children had a lower mean
number of colds than central area children.
Pie Chart Illustrates
West, 5
Central, 6
Figure 1. West area children had a lower mean of five (5)
colds than central area children who had a mean value of six
(6).
Data Organizing Tips
As required, ensure:
• All questions answered
• Questions needing to be skipped, were skipped
• Split multiple choice into 1 answer per column
• Open File 1
16
Coding Open Ended
Questions
What cold symptoms did your child have?
• Take first 100 answers
• Group similar answers together
• You define what is similar
• Reduce to 10-20 codes or fewer if useful
• Pilot test
17
Exercise: Create 3 codes
Question: Describe child’s symptoms?
•
•
•
•
•
•
•
•
•
Response 1
Response 2
Response 3
Response 4
Response 5
Response 6
Response 7
Response 8
Response 9
Stuffy nose
Sinus congestion, Runny nose
Difficulty breathing through nose
Phlegm, Body ache, Runny nose
No energy, Cough
No energy
Sore throat
Cough
No energy
18
Exercise:
Code Symptoms
Code
Description
Block
Stuffy nose, Sinus congestion, Difficulty
breathing through nose, No energy
Expel
Cough, Phlegm, Runny nose
Pain
Body Ache, Headache, Sore throat
19
Assign Values to Codes
Code
Description
Value
Block
Stuffy nose, Sinus congestion,
Difficulty breathing through nose,
No energy
Yes=1
No=0
Expel
Cough, Phlegm, Runny nose
Yes=1
No=0
Pain
Body Ache, Headache, Sore throat
Yes=1
No=0
20
I entered Code Variables and
Values on new Excel sheet
21
I entered Responses with
values on new Excel Sheet
22
To Enter Data
•
•
•
•
Row 1 has label for each variable
Enter data 1 survey at a time
1 row per ID, work left to right
1 answer per column
row 1
A
ID
B
Location
C
Block
row 2
1
1
0
row 3
2
2
1
row 4
3
1
0
23
To Enter Data
• Enter 1 survey at a time
• For each question, work left to right across single Excel
row
• Look at File 1b
24
Practice
Enter answers on File 1b, sheet 3
Tip: Split answers across 3 columns
Question 2
• ID#1 Answer
• ID#2 Answer
• ID#3 Answer
• ID#4 Answer
• ID#5 Answer
a)Block X
a)Block X
a)Block X
a)Block X
a)Block X
b)Expel
b)Expel X
b)Expel
b)Expel X
b)Expel X
c)Pain
c)Pain
c)Pain
c)Pain X
c)Pain
Compare your results with Data sheet
25
Other Possible Open-Ended
Question Codes
1 = positive comment
2 = negative comment
3 = neutral comment, positive and negative
Verbatim codes:
e.g., “ache”, “congested”, “cough”
Code only those related to research question
26
Text or Numbers
• Text codes often easier to remember, fewer entry errors
• e.g. M for male and F for female
• But numbers often faster entry
• Easier to work with numbers in Excel
27
Check for Response
Accuracy
• See File 1b, Accuracy Sheet
• Take 1 question at a time - i.e. pick out single column
and check answers
• ID numbers at left
• For unanswered question, create blank cell (pivot tables
count blanks)
28
Use Excel
When have larger number of respondents
Makes manual calculations easier
29
Organize Data
• Make sure data are entered into Excel in such a way
that mathematical transactions can be performed on
them
• E.g., If studying gender, let male equal 1 and female
equal 2. Can then count the 1s and 2s in your study.
30
To Explain Findings
Use common terms e.g., Variable
Use accepted methods of analyses
(Certain variables and measurements scales use certain
tests)
Variable
An object or human characteristic that:
•
•
•
Is observable
Can be subject to variation
Can be classified according to a type (discrete,
continuous as well as dependent, independent)
Variable and Measurement
Scale
• Certain variables also use certain scales
• Can do frequency test for all variables
• But variable and scale type may also further inform with
additional statistical test e.g., descriptive statistics
Discrete vs Continuous
Variable
Discrete
Continuous
• Finite values
• No valid values
in-between
• E.g.,
male/female
full/part-time/co-op
• Infinite values
• Valid values in-between
• E.g., distance, height, age
Discrete or Continuous
Variable
Hypothesis:
City areas have different water temperatures.
Discrete Variable:
e.g., West, Central, East area
Yes or No Value, no values in-between, no equi-distance
Continuous:
e.g., water temperature
Infinite number or range of possible values in-between
35
Dependent vs Independent
Variable
Dependent
• Assumed to change
Independent
• Assumed to influence or does
not change
Hypothesis:
City areas have different water temperatures
Dependent
Water temperature
Independent
City area
Measurement Scales
How do we measure what we are working with?
Types:
1.
2.
3.
4.
Nominal
Ordinal
Interval
Ratio
Nominal Scale
•
•
•
•
•
•
Finite values
No values in-between
No logical order
Yes/No answer
E.g., East, West, Central
E.g., colour, gender
Ordinal Scale
•
•
•
•
•
Finite values
No values in-between
Yes logical order
Yes/No answer
E.g., letter grades
Grade B
Grade C
Grade A
Interval Scale
•
•
•
•
•
•
•
•
•
Infinite values
Logical order
Values in-between
Equal distance between data points
How much? Numerical value
No natural zero, keeps going
No meaningful ratio between
numbers
E.g., temperature,
20 degrees NOT twice as hot as 10
Ratio Scale
•
•
•
•
•
•
•
•
Infinite values
Logical order
Values in-between
Equal distance between data points
Comparing how much? Numerical values
“0” value means something
Meaningful ratio between numbers
E.g., AGE
Adult earns $50K; Teenager earns $25K Adult earns twice
as much as teenager
:
Variable Type Review
Variable Type
Values Values
Value Values
Scale *
Infinite inin
Equidist
Range between order ant
Discrete
(Yes/no)
X
X
X
X
Nominal
Discrete Grade
(Yes/no) A B C
X
X
√
X
Ordinal
Continuous
(How much?)
Temperature or
$$, Age
√
√
√
√
Interval or
Ratio
Dependent
Assumed changes due to an independent variable
Independent
Assumed does not change
Areas
* Certain variables lend themselves to using certain types of measurement scales
Measurement Scale Review
Scale
Values
finite in
a range
Values Values
Values are Test *
have
equidistant
inorder between
Nominal
(yes/no), AREAS
√
X
X
X
Frequency
Ordinal A,B,C,
(yes/no)
√
√
X
X
Frequency
Interval/ Ratio
X
(how much for
1 sample?)
Age
√
√
√
Descriptive
Statistics
Interval/ Ratio
(how much for
comparing 2 or
more samples?)
G1 age G2 age
√
√
√
Inferential
Statistics
e.g. t-test
X
Note: * Certain scales lend themselves to using certain statistical tests.
Statistical Tests
• Frequency count for discrete data (nominal, ordinal scale) *
- quantity
• Descriptive Statistics for continuous data (interval, ratio scale)
– mean, median, mode
– Characteristics of single sample
• Inferential Statistics for continuous data (ratio scale)
– t-test
– Comparison of two or more samples
– Making inference from samples to populations
Note: Can do frequency count for all data types
44
Methodology
• “By using a discrete/continuous independent variable
called area (for city areas West, East, and Central), this
study examined the discrete/continuous, dependent
variable (with its possible multiple range of values) namely,
water temperature.”
• “To measure the independent variable, (to indicate if
someone lived in the West, East, or Central area), a
nominal/ordinal/interval/ratio scale was used.”
• “To measure the dependent variable, to indicate how much
the water temperature is), a(n) nominal/ordinal/interval/ratio
scale was used.”
45
Methodology
• “By using a discrete independent variable called area, this
study examined the continuous, dependent variable
namely, water temperature.”
• “To measure the independent variable, a nominal scale
was used.”
• “To measure the continuous dependent variable, an
interval scale was used.”
46
Calculations
The statistical calculations we would conduct with Excel
would be:
• Frequency counts for each of the 3 city areas
• Descriptive statistics of mean, median, mode for water
temperatures
47
Review of Descriptive
Statistical Terms
• Mean – average
• Median – where 50% of scores lie above a certain score
and where 50% lie below a certain score
• Mode – score that results most often
48
Format Cells
Open Booklet
Open File 1c)