
Introduction to Statistics
Chapter 1
MSIS 111
Prof. Nick Dedeke
1
Objectives
Define statistics
Differentiate between descriptive and
inferential statistics
Define statistical variables
Classify numbers
2
What is Statistics?
A general way to view statistics is as
follows: it is a language and the set of
rules that enables us to make sense of
data about events, people, places and
things.
3
Valid Statistic?: Example 1
An online survey conducted recently led
some to the conclusion that Apple’s
iPhone product will not succeed in the
U.S. market. 75% of the men and 89%
of the women surveyed answered
“never” when asked the question:
Would you buy an iPod?
4
Valid Statistic?: Example 2
When you vote, consider this
information. A mail survey showed that
in the years when Democrats controlled
Congress, the U.S. had a higher number
of destructive, level 5 hurricanes. In the
years when Republicans controlled
Congress, the U.S. had more extremely
cold and extremely hot days.
5
Valid Statistic?: Example 3
If you want to get a job quickly after
you graduate, do not wear white
clothing to your interview.
A recent phone survey of fifty human
resources managers at the top 10 retail
firms in America revealed that only 2% of
them wear white clothing to work.
6
Facts
There is such a thing as bad statistics
Poor methods, samples, and/or
interpretation
You can always make bad statistics say
anything you want them to say
The cure for bad statistics is good
statistics
7
Do we really need statistics?
Imagine a government that never gathers
data about population growth
Imagine a hospital that never stores
data about patients and their care
Imagine a car firm that never analyzes
data about vehicle rollovers
Imagine an insurance firm that never
interprets the causes of increases
in health care costs
8
Definition of statistics?
Statistics is a science dealing with the
collection, organization, analysis,
interpretation and presentation of
quantitative and qualitative data.
Statistics is a means to an end. The objective
is not statistics for its own sake; it is the
effective use of statistics for decision-making
that matters for firms.
9
Challenge of statistics?
Statistics has two primary challenges:
Describing a group of entities using a segment of
the group. For example, there are over 300 million
U.S. citizens, and I want to answer the question:
How tall are Americans? This kind is called
descriptive statistics. FOCUS – Present or Past
Generating conclusions about future trends of a
large group of data using a smaller set from the
same or a related group. For example, I have the
question: At what rate are we depleting the fish in
our rivers? This kind is called inferential statistics.
FOCUS – Present or Future
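The two challenges can be sketched in a few lines of Python. This is only an illustration: the heights below are made-up values, not survey data, and the 1.96 multiplier assumes a normal approximation that is rough for a sample this small. Summarizing the sample itself is the descriptive step; using the sample to say something about the whole population is the inferential step.

```python
import math
import statistics

# Hypothetical heights in inches for a sample of 8 people (illustrative only).
sample_heights = [64, 66, 67, 68, 69, 70, 71, 73]

# Descriptive: summarize the data we actually have.
sample_mean = statistics.mean(sample_heights)
sample_sd = statistics.stdev(sample_heights)  # sample standard deviation

# Inferential: estimate the population mean from the sample.
standard_error = sample_sd / math.sqrt(len(sample_heights))
margin = 1.96 * standard_error  # normal approximation, an assumption
interval = (sample_mean - margin, sample_mean + margin)

print(f"Sample mean: {sample_mean:.1f} in")
print(f"Rough 95% interval for the population mean: "
      f"{interval[0]:.1f} to {interval[1]:.1f} in")
```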
10
Terminologies in statistics?




Census: Gathering of data from every member of
a group or population, e.g. all voters in a
presidential election, all subscribers to cable TV
Sample: A randomly selected subset of the members
of a population (a fraction of the size of a census)
Variable: Attribute of interest of each member of
a group
Observation or measurement: The value of a
variable for a member of a group (population or
sample)
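The four terms can be made concrete with a short Python sketch. The population, the "age" variable, and the sizes (1000 members, sample of 50) are all hypothetical choices made for illustration.

```python
import random

# Hypothetical population: member IDs 1..1000 with a made-up "age" variable.
rng = random.Random(42)  # fixed seed so the run is repeatable
population = {member_id: rng.randint(18, 90) for member_id in range(1, 1001)}

# Census: one observation from every member of the population.
census = list(population.values())

# Sample: a randomly selected subset of members (here 50 out of 1000).
sample_ids = rng.sample(sorted(population), k=50)
sample = [population[member_id] for member_id in sample_ids]

# Observation: the value of the variable for one member.
first_id = sample_ids[0]
print(f"Member {first_id} has age {population[first_id]}")
print(f"Census size: {len(census)}, sample size: {len(sample)}")
```

Here `age` is the variable, each stored number is an observation, and the sample is a small fraction of the census.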
11
Exercise 1:
How many members
are in this sample?
Bill, Marty, Mary, Sue,
Buba, Dub,
Anne, Ali Baba, Jane,
Phil, Don, Monki
If I were interested in
the physical attributes
of the members, which
two variables would I
survey?
If I were interested in
the opinions of the
sample, which two
variables would I survey?
If I were interested in
the identity of the
members, which two
variables would I survey?
12
Exercise 1 Responses
How many members
are in this sample
(data set)? 12
Physical attributes:
height, weight, hair
color, gender
Opinion: political
affiliation, political
worldview
Identity: last name,
nationality, ID
number, Social
Security number
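The count in the first response can be checked with a short Python sketch. The names are copied from the exercise; the dictionary of candidate variables simply restates the responses above and is not the only valid answer.

```python
# Members of the sample, as listed in Exercise 1.
sample = ["Bill", "Marty", "Mary", "Sue", "Buba", "Dub",
          "Anne", "Ali Baba", "Jane", "Phil", "Don", "Monki"]

# Candidate variables for each area of interest, taken from the responses.
variables = {
    "physical attributes": ["height", "weight", "hair color", "gender"],
    "opinion": ["political affiliation", "political worldview"],
    "identity": ["last name", "nationality", "ID number",
                 "Social Security number"],
}

sample_size = len(sample)
print(f"Members in the sample: {sample_size}")
for interest, candidates in variables.items():
    print(f"{interest}: {', '.join(candidates)}")
```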
13
Exercise 2
For each of the
underlined variables
write down an example
of what the observation
(responses to survey)
would be when you
survey a member of the
population.
Physical attributes:
height, weight, hair
color, gender
Opinion: political
affiliation, political
worldview
Identity: last name,
nationality, ID number,
Social Security number
14
Exercise 2 Responses
Weight: 200 pounds
Gender: Female
Political affiliation:
Republican
Political view:
Liberal
Nationality: Nigerian
Social Security number: 123974
Numerical data:
Permits the use of
arithmetic
operations
Categorical data:
Permits only the
building of
subgroups
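A minimal Python sketch of the difference between the two kinds of data. The four respondents and their values are hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical observations for four respondents (illustrative only).
weights_lb = [200, 150, 180, 170]                  # numerical variable
genders = ["Female", "Male", "Female", "Female"]   # categorical variable

# Numerical data: arithmetic such as an average is meaningful.
average_weight = mean(weights_lb)

# Categorical data: we can only form subgroups and count their members.
gender_counts = Counter(genders)

print(f"Average weight: {average_weight} pounds")
print(f"Subgroup sizes: {dict(gender_counts)}")
```

Averaging the gender labels would make no sense, which is why categorical data only supports building and counting subgroups.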
15
Data Measurement
The question that one puts on a survey
determines how a variable is measured.
Consider the following questions:
How much income do you make per year (in
thousands of dollars)?
Do you make more than the US national average
of $30,000 per year?
[Yes] [No]
How much income do you make per year?
[Below $10k] [$10k to $30k] [$30k to $50k]
[$50k to $70k] [Above $70k]
16
Data Measurement
Many variables could be measured at different levels.
Do you make more than the US national average of $30,000
per year?
[Yes] [No]
Nominal level. Grouping only; ranking is not advisable or
permissible.
How much income do you make per year?
[Below $10k] [$10k to $30k] [$30k to $50k] [$50k to $70k]
[Above $70k]
Ordinal level. Ranking is possible; absolute zero is not
emphasized.
How much income do you make per year (in thousands of dollars)?
Ratio level. Absolute zero and ratios of numbers are
meaningful. Arithmetic operations are possible.
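The same income can be recorded at each of the three levels, depending on the question asked. The Python sketch below is one way to show this; the function names and the example income of $45,000 are hypothetical. Because the brackets on the slide share their edges, the sketch assumes an income exactly on a boundary goes to the higher bracket.

```python
def above_average(income, national_average=30_000):
    """Nominal level: a Yes/No answer to the $30,000 question."""
    return "Yes" if income > national_average else "No"


# Upper edge of each bracket, in ranked order.
BRACKETS = [
    (10_000, "Below $10k"),
    (30_000, "$10k to $30k"),
    (50_000, "$30k to $50k"),
    (70_000, "$50k to $70k"),
]


def income_bracket(income):
    """Ordinal level: the ranked bracket that contains the income.

    Assumption: an income exactly on a boundary goes to the higher bracket.
    """
    for upper, label in BRACKETS:
        if income < upper:
            return label
    return "Above $70k"


def income_in_thousands(income):
    """Ratio level: the amount itself, in thousands of dollars."""
    return income / 1000


income = 45_000  # hypothetical respondent
print(above_average(income))
print(income_bracket(income))
print(income_in_thousands(income))
```

Each step down from the ratio answer to the Yes/No answer discards information that cannot be recovered afterwards.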
17
Exercise 3: Data Measurement
What is the level of measurement of each of the
following observations?
1980: date of birth
Social security number
Temperature, e.g. 90 degrees Fahrenheit
Age: 19 years old
Rating of customer service: Excellent (7)
18
Exercise 3: Responses
What is the level of measurement of each of the
following observations?
1980: date of birth [ORDINAL]
Social security number [NOMINAL]
Temperature, e.g. 90 degrees Fahrenheit [INTERVAL]
Age: 19 years old [RATIO]
Rating of customer service: Excellent (7) [ORDINAL]
19
Analyzing Data




Nonparametric statistics [ORDINAL]
Nonparametric statistics [NOMINAL]
Parametric statistics [INTERVAL]
Parametric statistics [RATIO]
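The pairing on this slide can be written as a simple lookup in Python. The function name is hypothetical, and the mapping is only the slide's rule of thumb; in practice the choice of method also depends on things like sample size and the shape of the data.

```python
# Mapping taken from the slide: measurement level -> family of methods.
ANALYSIS_FAMILY = {
    "nominal": "nonparametric",
    "ordinal": "nonparametric",
    "interval": "parametric",
    "ratio": "parametric",
}


def analysis_family(level):
    """Return the family of statistics the slide pairs with a level."""
    return ANALYSIS_FAMILY[level.lower()]


print(analysis_family("ORDINAL"))
print(analysis_family("Ratio"))
```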
20
Data Measurement: Examples
Two respondents: A ($20,000 income/yr) and B ($40,000 income/yr). Many
variables could be measured at different levels.
Do you make more than the US national average of $30,000 per year?
[Yes] [No]
Nominal level. Grouping only; ranking is not advisable or permissible.
Analyses: Income class of B ranks higher than A. Difference in incomes:
not possible. Ratio of incomes of the classes: not possible.
How much income do you make per year?
[Below $10k] [$10k to $30k] [$30k to $50k] [$50k to $70k] [Above $70k]
Ordinal level. Ranking is possible; absolute zero is not emphasized.
Analyses: Income class of B ranks higher than A. Difference between the
income classes: anywhere from $1 to $40,000. Ratio of incomes of the
classes: not possible.
If you divide your salary by $20,000 per year, what do you get?
[¼] [½] [¾] [1] [1¼] [1½] [1¾] [2] [2¼] [2½] [2¾]
Interval level. Absolute zero is convenient and ratios of numbers are
meaningful. Analyses: Income B ranks higher than A. Difference between
consecutive income classes = $5,000; income of B is twice as high as
that of A (2 divided by 1).
How much income do you make per year (in thousands of dollars)?
___________ $ thousands
Ratio level. Absolute zero and ratios of numbers are meaningful. Analyses:
Income B ranks higher than A. Difference in income = $20,000; income of B
is twice as high as that of A (40,000/20,000).
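The arithmetic behind these analyses can be checked with a short Python sketch. The two incomes come from the slide; the variable names are hypothetical, and the bracket edges are read directly from the answer choices above.

```python
income_a, income_b = 20_000, 40_000  # the two respondents on the slide

# Ordinal: only the brackets are known, as (lower edge, upper edge).
bracket_a = (10_000, 30_000)  # "$10k to $30k"
bracket_b = (30_000, 50_000)  # "$30k to $50k"
b_ranks_higher = bracket_b[0] >= bracket_a[1]
largest_possible_gap = bracket_b[1] - bracket_a[0]

# Salary divided by $20,000, reported in quarter steps:
# one step between consecutive answers is worth $5,000.
step_dollars = 20_000 / 4
units_a = income_a / 20_000
units_b = income_b / 20_000

# Ratio: differences and ratios of the raw amounts are meaningful.
difference = income_b - income_a
ratio = income_b / income_a

print(b_ranks_higher, largest_possible_gap)
print(units_a, units_b, step_dollars)
print(difference, ratio)
```

With only the brackets, the gap between A and B can be bounded but not computed; with the raw amounts, both the $20,000 difference and the factor of two follow directly.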
21