GEOGRAPHICAL STATISTICS GE 2110
GEOGRAPHICAL STATISTICS
GE 2110
ZAKARIA A. KHAMIS
Descriptive statistics
Statistics are interesting... "only when they are set in
wider context that they begin to come to life"
Five Rules for using statistics by Danny Dorling
1. Often there is little point in using statistics
2. If you do use statistics, make sure they can be understood
3. Do not overuse statistics in your work
4. If you find a complex statistic useful then explain it clearly
5. Recognize and harness the power of statistics
Zakaria Khamis
7/17/2015
Measures of Central Tendency
In most cases, it is helpful to describe data by a
single number that is most representative of the
entire collection of data
Single numbers that tend to appear in the middle of the
data distribution are called measures of central tendency (MCT)
They act as the fulcrum (center of gravity) at which
the data balance
Means
There are many types of mean. The most commonly used is
the arithmetic mean; others include the geometric and
harmonic means
Arithmetic Mean
It is simply the average of the observations (data)
The arithmetic mean is in most cases referred to simply as
the mean and is denoted by $\bar{x}$
The mean, or average, of n numbers is the sum of the
numbers divided by n
Mathematically,
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
where $x_i$ denotes the value of observation i, and n
denotes the number of observations
Mean value is influenced by extreme measurements
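The definition above, and the sensitivity to extreme measurements, can be illustrated with a minimal Python sketch. The marks and the added extreme value are illustrative assumptions, not data from the lecture.

```python
def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


marks = [46, 48, 50, 52, 54]      # illustrative observations
with_extreme = marks + [500]      # same data plus one extreme measurement

mean_plain = arithmetic_mean(marks)
mean_extreme = arithmetic_mean(with_extreme)

print(mean_plain)    # 50.0
print(mean_extreme)  # 125.0
```

A single extreme observation moves the mean from 50 to 125, although five of the six values lie between 46 and 54.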
Geometric Mean
The geometric mean only applies to positive
numbers.
It is also often used for a set of numbers whose
values are meant to be multiplied together or are
exponential in nature, such as data on the growth of
the human population or interest rates of a financial
investment
The Geometric mean of n numbers is the nth root of
the product of the numbers
Mathematically,
$GM = \sqrt[n]{\prod_{i=1}^{n} x_i}$
where $x_i$ denotes the value of observation i, and n
denotes the number of observations
This is rarely used in statistical analysis
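As a rough illustration of the nth-root-of-the-product definition, the sketch below averages two growth factors. The factors are hypothetical, and `math.prod` assumes Python 3.8 or later.

```python
import math


def geometric_mean(values):
    """nth root of the product of n positive numbers."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean only applies to positive numbers")
    return math.prod(values) ** (1 / len(values))


# Hypothetical growth factors: a quantity doubles in one period
# and grows eightfold in the next.
growth_factors = [2, 8]
gm = geometric_mean(growth_factors)
print(gm)  # 4.0: growing fourfold twice gives the same overall growth (16x)
```

The arithmetic mean of the same factors would be 5, which overstates the combined growth (5 x 5 = 25 rather than 16).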
Harmonic Mean
This is most commonly used when an average rate is
what is of interest, e.g. the average speed of a car or
the average rate of population increase
The harmonic mean of n numbers is given by
$HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$
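The average-speed case can be sketched in Python as follows. The two speeds are hypothetical, and the example assumes the car covers the same distance at each speed.

```python
def harmonic_mean(values):
    """n divided by the sum of the reciprocals of n positive numbers."""
    if any(v <= 0 for v in values):
        raise ValueError("harmonic mean requires positive numbers")
    return len(values) / sum(1 / v for v in values)


# Hypothetical trip: the same distance is driven at 40 km/h and then at 60 km/h.
speeds = [40, 60]
average_speed = harmonic_mean(speeds)
print(round(average_speed, 2))  # 48.0 km/h, lower than the arithmetic mean of 50
```

The result is below 50 km/h because more time is spent driving at the slower speed.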
Mode and Median
The median is defined as the observation that splits the
ranked list of observations (arranged in ascending or
descending order) in half
When the number of observations is odd, the median is
simply equal to the middle value in the ranked list of
observations
When the number of observations is even, the median is the
average of the two values in the middle of the ranked list
Mode refers to the most frequently occurring value
If two numbers tie for most frequent occurrence, the
collection has two modes and is called bimodal.
Which of the three measures of central tendency is the
most representative?
The answer is that it depends on the distribution of the
data and the way in which you plan to use the data
Measures of Central Tendency
Class examples:
12, 33, 11, 45, 45, 34, 20, 67, 87, 19, 12, 12
Mean =
Mode =
Median =
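One way to check answers to the class example is Python's standard `statistics` module. This is a sketch for verification only; the variable names are illustrative.

```python
import statistics

# The twelve observations from the class example.
data = [12, 33, 11, 45, 45, 34, 20, 67, 87, 19, 12, 12]

mean_value = statistics.mean(data)      # sum (397) divided by n (12)
median_value = statistics.median(data)  # average of the 6th and 7th ranked values
mode_value = statistics.mode(data)      # most frequently occurring value

print(sorted(data))
print(round(mean_value, 2))  # 33.08
print(median_value)          # 26.5
print(mode_value)            # 12
```

Because n is even, the median is the average of the two middle ranked values, 20 and 33.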
Measures of Dispersion/Variability
The phenomena and aspects of the world we live in
change spatially (from location to location) and temporally
(from time to time)
For example: changes in human population, in the
standard of living, in the literacy rate and in prices
This variability attracts experts to study these changes
in detail and then relate them to human life.
In statistics, the MCT measure the center of the data
while dispersion measures how the observations
spread away from the center
If the observations are close to the center (arithmetic
mean or median), the dispersion, scatter or variation
is small
If the observations are spread away from the center,
the dispersion is large.
Suppose we have three groups of students who have
obtained the following marks in a test
Group A: 46, 48, 50, 52, 54 (Mean = 50)
Group B: 30, 40, 50, 60, 70 (Mean = 50)
Group C: 40, 50, 60, 70, 80 (Mean = 60)
The idea of dispersion is important in the study of
wages of workers, prices of commodities, standard of
living of different people, distribution of wealth,
distribution of land among farmers and various
other fields of life.
It helps to identify such variation and to address
any problems that arise from it.
Dispersion Range
The range is the difference between the highest and the
lowest value in a series of data
$\text{Range} = x_{\max} - x_{\min}$
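Applied to the three groups of test marks given earlier, the range can be computed with a short Python sketch. The dictionary layout and function name are illustrative choices.

```python
def value_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)


groups = {
    "A": [46, 48, 50, 52, 54],
    "B": [30, 40, 50, 60, 70],
    "C": [40, 50, 60, 70, 80],
}

# For each group: (mean, range)
summary = {
    name: (sum(marks) / len(marks), value_range(marks))
    for name, marks in groups.items()
}

for name, (mean, spread) in summary.items():
    print(name, mean, spread)
# A 50.0 8
# B 50.0 40
# C 60.0 40
```

Groups A and B share the same mean but differ in range, while B and C share the same range but differ in mean, so neither number alone describes the data.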
Variance and Standard Deviation
The variance represents the average squared
deviation of an observation from the mean
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$
The standard deviation refers to the square root of
the variance
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$
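A minimal Python sketch of the two formulas, dividing by n as written above, is given here. The two sets of marks are Groups A and B from the earlier slide; the function names are illustrative.

```python
import math


def variance(values):
    """Average squared deviation from the mean (divisor n)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))


group_a = [46, 48, 50, 52, 54]
group_b = [30, 40, 50, 60, 70]

var_a, var_b = variance(group_a), variance(group_b)
sd_a, sd_b = standard_deviation(group_a), standard_deviation(group_b)

print(var_a, round(sd_a, 2))  # 8.0 2.83
print(var_b, round(sd_b, 2))  # 200.0 14.14
```

In Python's standard library, `statistics.pvariance` and `statistics.pstdev` use this divisor-n form, whereas `statistics.variance` and `statistics.stdev` divide by n - 1.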
The standard deviation of a set is a measure of
how much a typical number in the set differs from
the mean. The greater the standard deviation, the
more the numbers in the set vary from the mean
Imagine a researcher examining the monthly salaries of
Zanzibar secondary school teachers. He took a sample
of 10 secondary school teachers (values in '0000):
44, 50, 38, 96, 42, 47, 40, 39, 46, 50
He calculated the mean = 49.2
Taken alone, this suggests that secondary school
teachers receive 49.2 per month.
However, there may be variation, because there are
different categories of teacher in Zanzibar: diploma,
bachelor's degree and master's degree holders, in
privately and publicly owned schools.
Standard deviation = 17
Mean +/- standard deviation
49.2 - 17 = 32.2
49.2 + 17 = 66.2
This means that most of the secondary school
teachers receive between 32.2 and 66.2 per month
(in the units of the sample)
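The salary example can be checked with the sketch below, which uses Python's `statistics` module. One observation, offered as a reading of the numbers rather than a statement of the lecturer's method: the quoted standard deviation of 17 matches the form that divides by n - 1, while dividing by n gives about 16.1.

```python
import statistics

# Monthly salaries of the 10 sampled teachers, in the units of the example.
salaries = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]

mean = statistics.mean(salaries)         # 49.2
sd_n = statistics.pstdev(salaries)       # divisor n, about 16.12
sd_n_minus_1 = statistics.stdev(salaries)  # divisor n - 1, about 17.00

low, high = mean - sd_n_minus_1, mean + sd_n_minus_1
inside = [x for x in salaries if low <= x <= high]

print(mean, round(sd_n, 2), round(sd_n_minus_1, 2))
print(round(low, 1), round(high, 1))  # 32.2 66.2
print(len(inside), "of", len(salaries), "salaries fall within one SD of the mean")
```

Nine of the ten salaries fall inside the interval; only the value 96 lies outside it.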
Quartiles
While the standard deviation (SD) is the measure of
dispersion associated with the mean, quartiles
measure the dispersion associated with the
median
Consider an ordered set of numbers whose median is
m. The lower quartile is the median of the numbers
that occur before m. The upper quartile is the
median of the numbers that occur after m.
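The median-of-halves definition above can be sketched in Python as follows. The data are the twelve observations from the class example; treating "before m" and "after m" as the lower and upper halves of the ranked list (excluding the middle value when n is odd) is an assumption about how the definition is applied.

```python
import statistics


def quartiles(values):
    """Return (lower quartile, median, upper quartile) by the median-of-halves rule."""
    ranked = sorted(values)
    half = len(ranked) // 2
    lower_half = ranked[:half]
    # When n is odd the middle observation is the median itself and is left out.
    upper_half = ranked[half + 1:] if len(ranked) % 2 else ranked[half:]
    return (
        statistics.median(lower_half),
        statistics.median(ranked),
        statistics.median(upper_half),
    )


data = [12, 33, 11, 45, 45, 34, 20, 67, 87, 19, 12, 12]
q1, m, q3 = quartiles(data)
print(q1, m, q3)  # 12.0 26.5 45.0
```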
Inter-Quartile Range
In some statistical analyses we need the difference
between the quartiles; for this, the inter-quartile
range is calculated
The inter-quartile range is the difference between the
25th and 75th percentiles
When the data have been ranked from lowest to
highest, with n observations, the 25th percentile is
represented by observation
$\frac{n+1}{4}$
The 75th percentile is represented by observation
$\frac{3(n+1)}{4}$
This provides more detailed information about
the data, because it gives a picture of the variability
within the data with the outlying values removed
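The rank positions (n + 1)/4 and 3(n + 1)/4 can be turned into a small Python sketch. The slides do not say what to do when a position is not a whole number; linear interpolation between the two neighbouring observations is assumed here, and the helper names are illustrative. The data are the ten teacher salaries from the standard deviation example.

```python
def observation_at(ranked, position):
    """Value at a 1-based rank; fractional ranks are linearly interpolated (assumption)."""
    whole = int(position)
    fraction = position - whole
    value = ranked[whole - 1]
    if fraction == 0:
        return value
    return value + fraction * (ranked[whole] - value)


def interquartile_range(values):
    """75th percentile minus 25th percentile, using the (n + 1)/4 rank rule.

    Assumes at least 3 observations so that both ranks fall inside the data.
    """
    ranked = sorted(values)
    n = len(ranked)
    p25 = observation_at(ranked, (n + 1) / 4)
    p75 = observation_at(ranked, 3 * (n + 1) / 4)
    return p25, p75, p75 - p25


salaries = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]
p25, p75, iqr = interquartile_range(salaries)
print(p25, p75, iqr)  # 39.75 50.0 10.25
```

The full range of these data is 58, driven by the single value 96, while the inter-quartile range is 10.25.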
Skewness and Kurtosis
Skewness measures the degree of asymmetry
exhibited by the data
The data can exhibit positive or negative skewness
If the mean of the data is greater than its median, the
data is positively skewed; and if the mean of the data
is less than its median, the data is negatively skewed
Mathematically,
$\text{skewness} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{n s^3}$
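A minimal Python sketch of the skewness formula follows. It assumes s is the standard deviation computed with divisor n, and it reuses two data sets from earlier slides: the symmetric marks of Group A and the teacher salaries.

```python
import math


def skewness(values):
    """Sum of cubed deviations from the mean, divided by n * s**3 (s uses divisor n)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sum((x - mean) ** 3 for x in values) / (n * s ** 3)


symmetric_marks = [46, 48, 50, 52, 54]
salaries = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]

skew_marks = skewness(symmetric_marks)
skew_salaries = skewness(salaries)

print(skew_marks)         # 0.0 for perfectly symmetric data
print(skew_salaries > 0)  # True: the single high salary pulls the tail to the right
```

The salaries also have a mean (49.2) above their median (45), in line with the mean-versus-median rule stated above.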
Kurtosis measures the peakedness of the data relative to
the normal distribution
Data with a high degree of peakedness is said to be
leptokurtic and has a kurtosis value of more than 3
Flat data has a kurtosis value of less than 3, and it
is called platykurtic
Mathematically,
$\text{kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n s^4}$
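The kurtosis formula can be sketched in Python in the same way. As before, s is assumed to use divisor n, so s^4 is the squared variance; the two data sets are taken from earlier slides.

```python
def kurtosis(values):
    """Sum of fourth-power deviations from the mean, divided by n * s**4."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n  # s squared, divisor n
    return sum((x - mean) ** 4 for x in values) / (n * variance ** 2)


evenly_spread_marks = [46, 48, 50, 52, 54]
salaries = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]

kurt_marks = kurtosis(evenly_spread_marks)
kurt_salaries = kurtosis(salaries)

print(round(kurt_marks, 2))  # 1.7, below 3: platykurtic
print(kurt_salaries > 3)     # True: leptokurtic, driven by the extreme value 96
```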