Statistics for Business and Economics, 6/e


Averages
Lecture presented on the 21st of
September, 2011
Where we are

Variables that generate raw data occur at random.
To obtain useful information about these variables,
raw data can be organized into frequency
distributions, and the organized data can be
presented in various graphs, as shown in the
previous lecture
Describing Data Numerically
Central Tendency: Arithmetic Mean, Median, Mode
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 3-3
Averages

Statistical methods are used to organize and present
data, and to summarize data. The most familiar of
these methods is finding averages. For example, one
may read that the average salary of top-managers is
$5 mln and the salary of a Russian professor is less
than $1,000
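What is the Russian for “average”?
• noun: среднее число (average number); средняя величина (average value); убыток от аварии судна (loss from a ship accident); распределение убытка от аварии между владельцами (apportionment of an accident loss among the owners); авария (accident); среднее (mean); среднее арифметическое (arithmetic mean)
• verb: составлять (to amount to); равняться в среднем (to equal on average); выводить среднее число (to work out an average); распределять убытки (to apportion losses); усреднять (to average)
• adjective: средний (average, middle); обычный (ordinary); нормальный (normal)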
What is the Russian for “mean”?
• noun: середина (middle); среднее, средняя величина, среднее значение, среднее число (mean, mean value); среднее арифметическое (arithmetic mean); средство, способ (means, way)
• verb: намереваться (to intend); иметь в виду (to have in mind); подразумевать (to imply); думать (to think); предназначать (to destine); значить, означать (to signify); предвещать (to portend); иметь значение (to matter)
• adjective: средний, серединный (middle); посредственный (mediocre); плохой (bad); слабый (weak); скупой, скаредный (stingy); захудалый (shabby); жалкий (pitiful); убогий (wretched); низкий, подлый (base); нечестный (dishonest); низкого происхождения (of low birth); придирчивый (captious); недоброжелательный (ill-disposed); бедный (poor); скромный (modest); смущающийся (embarrassed); трудный (difficult); неподдающийся (intractable)
“Average” or “Mean”?
• Judging from a variety of texts, the word
“average” is used in general contexts, when
discussing a typical or middle value, while
the word “mean” is used in technical contexts
that involve specific formulas for finding
the average
Measures of Central Tendency
Overview
Mean: the arithmetic average, $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Median: the midpoint of ranked values
Mode: the most frequently observed value
Some quotes from Batueva
In the book “American averages” by Mike
Feinsilber and William B. Meed, the authors
state:
"Average" when you stop to think of it is a funny
concept. Although it describes all of us it
describes none of us... While none of us wants
to be average American, we all want to know
about him and her”

Some quotes from Batueva
The authors go on to give examples of
averages:
• “The average American man is five feet, nine
inches tall; the average woman is five feet, 3.6
inches.
• The average American is sick in bed seven
days a year, missing five days of work.
• On the average day, 24 million people receive
animal bites.
• By his or her 70th birthday, the average
American will have eaten 14 steers, 1050
chickens, 3.5 lambs, and 25.2 hogs.”
In the above examples, the word “average” is
ambiguous, since there are several different methods
used to obtain an average. Loosely stated, the average
means the center of the distribution or the most typical
case. Measures of central tendency are also called
measures of average and include the mean, mode and
median
Some quotes from Batueva

The mean $\bar{x}$, also known as the arithmetic mean
(or arithmetic average), is the sum of the
values divided by the total number of values in
the sample or in the population.
For discrete (ungrouped) data the simple mean
should be used
Logical formula
• The calculation of any average should
start with determining a logical formula.
Before multiplying, dividing, or adding
anything, it is necessary to write down
the initial ratio of the average, otherwise
known as the logical formula
The initial ratio of the average IRA
$$\mathrm{IRA} = \frac{A}{B},$$
where A is the amount of the studied events
in the population or sample (A is the
summary absolute value);
B is the size of the population or sample
(B is the number of units in the population
or sample).
The IRA gives us the level of the studied
events per unit of the population or sample
Examples of IRA
• The average salary shows how much one
employee earns on average.
What do we take as the numerator and
denominator of the IRA?
A - the amount of funds paid to all employees
= the wages & salaries fund;
B - the number of employees
Average salary
• The salary of an individual employee is
the individual value, the wages & salaries
fund is the summary value, and the
average salary is the average
Examples of IRA
• The average price shows how much a unit
of the good costs on average.
What do we take as the numerator and
denominator of the IRA?
A - the turnover gained from the sale of
the goods;
B - the total quantity of goods sold
Examples of IRA
• The average cost shows how much money
was spent per unit of production.
What do we take as the numerator and
denominator of the IRA?
A - the cost of production;
B - the total quantity of goods produced =
the quantity of output
Examples of IRA
• The average age shows the average
number of years lived by the population
under investigation; this indicator does not
necessarily concern animate objects: it
may be the average age of cars, students,
buildings, chickens, or equipment.
What do we take as the numerator and
denominator of the IRA?
A - the total number of years;
B - the number of surveyed units
Examples of IRA
• Average lifespan: life expectancy of people,
service life, or the average age of used
equipment shows the average number of years
lived by the investigated units, whether they
are living or inanimate objects.
What do we take as the numerator and
denominator of the IRA?
A - the total number of life (service) years;
B - the number of surveyed units
Logical formula
For a specific economic indicator we may
form only one true logical formula
Types of averages
Mathematicians have proved that most of the
averages we use can be expressed in general
terms by the formula of the power mean
(average power).
The averages used in statistics belong to the class of power
means. The general formula of the power mean is as
follows:
$$\bar{x}_k = \sqrt[k]{\frac{\sum x^k}{n}},$$
where $\bar{x}_k$ is the power mean of degree k;
k is the exponent, which determines the type, or
form, of the average;
x denotes the values of the variants;
n is the number of variants
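If k = 1, we get the
simple arithmetic mean AM:
$$\bar{x} = \frac{\sum x}{n};$$
if k = 2, we get the square
mean SM:
$$\bar{x}_q = \sqrt{\frac{\sum x^2}{n}};$$
in the limiting case k → 0, we get the
geometric mean GM:
$$\bar{x}_g = \sqrt[n]{x_1 \cdot x_2 \cdot \ldots \cdot x_n} = \sqrt[n]{\prod_{i=1}^{n} x_i};$$
if k = -1, we have the
harmonic mean HM:
$$\bar{x}_h = \frac{n}{\sum \frac{1}{x}}$$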
A well-known inequality relating the quadratic,
arithmetic, geometric, and harmonic
means of any set of positive numbers
is
$$\bar{x}_q \ge \bar{x} \ge \bar{x}_g \ge \bar{x}_h$$
It is easy to remember by noting that the
alphabetical order of the letters A, G,
and H is preserved in the inequality.
The higher the exponent in the power mean
formula, the greater the value of
the average
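The general power mean formula can be sketched in a few lines of Python. This is only an illustration: the function name `power_mean` is an assumption of this sketch, the case k = 0 is handled as the geometric-mean limit, and the sample values are the productivity figures used in Example 1 of this lecture.

```python
import math

def power_mean(values, k):
    """Power mean of degree k for positive values (k = 0 is the geometric-mean limit)."""
    n = len(values)
    if k == 0:
        # Limit k -> 0: the n-th root of the product, computed through logarithms.
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** k for v in values) / n) ** (1 / k)

data = [58, 50, 46, 44, 42]        # illustrative positive values
harmonic = power_mean(data, -1)
geometric = power_mean(data, 0)
arithmetic = power_mean(data, 1)
quadratic = power_mean(data, 2)
```

For any positive data the four results should come out in the order harmonic, geometric, arithmetic, quadratic, which is the inequality stated above.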
Simple arithmetic mean SAM
• The simple arithmetic mean SAM is used
when we have individual variants without
grouping.
The numerator is the sum of the variants,
the denominator is the number of variants
Example 1. The productivity of 5 workers
was 58, 50, 46, 44, 42 products per
shift. Determine the average
productivity of the five workers. In this case,
the solution is as follows:
$$\bar{x} = \frac{\sum x}{n} = \frac{50 + 46 + 58 + 42 + 44}{5} = 48 \text{ products}$$
Simple Arithmetic Mean SAM

In this case, the simple arithmetic mean SAM
was used; it can be calculated by the
following equation:
$$\bar{x} = \frac{\sum x_i}{n},$$
where n is the number of values in the sample or
the size of the population
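As a quick check of Example 1, the simple arithmetic mean can be computed directly from the formula and compared with Python's standard library. The variable names are assumptions of this sketch.

```python
from statistics import fmean

productivity = [58, 50, 46, 44, 42]   # products per shift, Example 1

# Simple arithmetic mean: the sum of the values divided by their number.
mean_by_formula = sum(productivity) / len(productivity)

# The standard library function gives the same result.
mean_by_library = fmean(productivity)
```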
Example 2
The miles-per-gallon fuel tests for ten
automobiles are given below: 22.2, 23.7, 16.8,
19.7, 18.3, 19.7, 16.9, 17.2, 18.5, 21.0
Find the mean (the average miles-per-gallon)

Example 2
This solution was taken from one text-book, but
I consider it wrong. We need to proceed first to
understand how this problem can be solved

Simple Arithmetic Mean SAM

The arithmetic mean (mean) is the most
common measure of central tendency

For a population of N values:
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N},$$
where the $x_i$ are the population values and N is the population size

For a sample of size n:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n},$$
where the $x_i$ are the observed values and n is the sample size
Weighted arithmetic mean
• The weighted arithmetic mean WAM is used
for grouped data. It is the most commonly
used power mean
Calculating the WAM for a frequency distribution
Example 3

For data in an ungrouped frequency
distribution, the mean can be found as shown in
the next example
Example 3. The scores for 25 students on a 5-point
quiz are shown below. Find the mean (the
average score)
Example 3
| Score, xi | Number of students (frequency), fi | xi fi |
|---|---|---|
| 0 | 1 | 0*1=0 |
| 1 | 2 | 1*2=2 |
| 2 | 6 | 2*6=12 |
| 3 | 12 | 3*12=36 |
| 4 | 3 | 4*3=12 |
| 5 | 1 | 5*1=5 |
| Total | 25 | 67 |

To find the average score it is necessary to
calculate the total score for all students. For
this, the score is multiplied by the frequency
(xi fi) for each class. Then the sum of these
products is found (see the last line
in the table)

Further, the total sum has to be divided by the
number of students:
$$\bar{x} = \frac{67}{25} = 2.68 \text{ points}.$$
The average score for 25 students on a 5-point
quiz is 2.68 points
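The calculation in Example 3 can be reproduced with a short Python sketch. The helper name `weighted_mean` is an assumption introduced here for illustration.

```python
def weighted_mean(values, frequencies):
    """Weighted arithmetic mean: sum(x_i * f_i) / sum(f_i)."""
    total = sum(x * f for x, f in zip(values, frequencies))
    return total / sum(frequencies)

scores = [0, 1, 2, 3, 4, 5]        # x_i
students = [1, 2, 6, 12, 3, 1]     # f_i
average_score = weighted_mean(scores, students)
```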
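Example 4
| The number of machines serviced by one worker, x | Number of workers, f | x·f |
|---|---|---|
| 1 | 10 | 10 |
| 2 | 37 | 74 |
| 3 | 43 | 129 |
| 4 | 34 | 136 |
| 5 | 16 | 80 |
| Total: | 140 | 429 |

Calculation:
$$\bar{x} = \frac{\sum x f}{\sum f} = \frac{429}{140} = 3.06 \text{ machines}$$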
Weighted mean WAM

In the last two examples the weighted arithmetic
mean was used; it can be calculated by
the following formula:
$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}.$$

It is also used for data in a grouped frequency
distribution after it has been transformed into
an ungrouped frequency distribution by means of
the class midpoints.
Example 5
The following frequency distribution shows the
amount of money (USD) 100 families spend for
gasoline per month.
Find the mean (the average amount of money)

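Example 5
| Classes of families by the amount of money | Number of families, fi | Midpoint xi | xi fi |
|---|---|---|---|
| 24.5-29.5 | 5 | 27 | 135 |
| 29.5-34.5 | 10 | 32 | 320 |
| 34.5-39.5 | 16 | 37 | 592 |
| 39.5-44.5 | 32 | 42 | 1344 |
| 44.5-49.5 | 27 | 47 | 1269 |
| 49.5-54.5 | 6 | 52 | 312 |
| 54.5-59.5 | 4 | 57 | 228 |
| Total | 100 | | 4200 |

For the first class the midpoint is $x_i = \frac{24.5 + 29.5}{2} = 27$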
The midpoint

The procedure of finding the mean for grouped
frequency distribution assumes that all of the
raw data values in each class are equal to the
midpoint of the class. In reality, this is not true,
since the average of the raw data values in
each class will not be exactly equal to the
midpoint
Example 5

However, using this procedure will give an
acceptable approximation of the mean, since
some values fall above the midpoint and some
values fall below the midpoint for each class.
$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i} = \frac{4200}{100} = \$42.$$
The average amount of money spent on gasoline by
a family is $42 per month
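The midpoint procedure of Example 5 can be sketched as follows. The function name `grouped_mean` and the representation of classes as (lower, upper) pairs are assumptions of this sketch.

```python
def grouped_mean(classes, frequencies):
    """Approximate mean of a grouped distribution, using class midpoints as variants."""
    midpoints = [(low + high) / 2 for low, high in classes]
    total = sum(m * f for m, f in zip(midpoints, frequencies))
    return total / sum(frequencies)

gasoline_classes = [(24.5, 29.5), (29.5, 34.5), (34.5, 39.5), (39.5, 44.5),
                    (44.5, 49.5), (49.5, 54.5), (54.5, 59.5)]
families = [5, 10, 16, 32, 27, 6, 4]
average_spend = grouped_mean(gasoline_classes, families)   # USD per month
```

The result is an approximation, because every value in a class is replaced by the class midpoint.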
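Example 6
| Productivity, m | Number of workers, f | x | x·f | x′ | x′·f | S | x − x̄ | (x − x̄)²·f | x′²·f |
|---|---|---|---|---|---|---|---|---|---|
| Up to 200 | 3 | 190 | 570 | -3 | -9 | 3 | -63.9 | 12249.63 | 27 |
| 200-220 | 12 | 210 | 2520 | -2 | -24 | 15 | -43.9 | 23126.52 | 48 |
| 220-240 | 50 | 230 | 11500 | -1 | -50 | 65 | -23.9 | 28560.50 | 50 |
| 240-260 | 56 | 250 | 14000 | 0 | 0 | 121 | -3.9 | 851.76 | 0 |
| 260-280 | 47 | 270 | 12690 | 1 | 47 | 168 | 16.1 | 12182.87 | 47 |
| 280-300 | 23 | 290 | 6670 | 2 | 46 | 191 | 36.1 | 29973.83 | 92 |
| 300-320 | 7 | 310 | 2170 | 3 | 21 | 198 | 56.1 | 22030.47 | 63 |
| 320 and more | 2 | 330 | 660 | 4 | 8 | 200 | 76.1 | 11582.42 | 32 |
| Total: | 200 | | 50780 | | 39 | | | 140558 | 359 |

$$\bar{x} = \frac{\sum x f}{\sum f} = \frac{50780}{200} = 253.9 \text{ m}$$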
Modification of the WAM formula
• If f is a relative frequency (SR, i.e. a share in
the population is given), the classic
formula of the weighted arithmetic mean WAM
is not applicable, and we can use its
modification:
$$\bar{x} = \sum_{i=1}^{n} x_i d_i,$$
where
$$d_i = \frac{f_i}{\sum f_i};$$
$f_i$ is the frequency in absolute value;
$\sum f_i$ is the size of the population.
In fact, we multiply the variants by the SR expressed in shares
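A small sketch of the modified formula, reusing the quiz data of Example 3 and converting the absolute frequencies into shares first. The variable names are assumptions of this sketch.

```python
scores = [0, 1, 2, 3, 4, 5]        # x_i
students = [1, 2, 6, 12, 3, 1]     # absolute frequencies f_i

# Shares d_i = f_i / sum(f_i); together they add up to 1.
shares = [f / sum(students) for f in students]

# Modified formula: the mean is the sum of the products x_i * d_i.
mean_from_shares = sum(x * d for x, d in zip(scores, shares))
```

The result agrees with the 2.68 points obtained from the classic WAM formula, since dividing each frequency by the total does not change the ratio.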
Conclusion

For discrete data the simple arithmetic mean
should be used. For ungrouped frequency
distribution and for grouped frequency
distribution the weighted arithmetic mean WAM
should be used
Properties of the WAM

The arithmetic mean has the following
mathematical properties, which can be used in
solving tasks. These mathematical properties
let us simplify the problem
1st property of the WAM
The product of the arithmetic mean and
the sum of frequencies is equal to the
total volume of the studied events in the
population (look at the IRA formula):
$$\bar{x} \sum f_i = \sum x_i f_i$$
2nd property of the WAM

The sum of deviations between the variants and
the mean is equal to zero:
$$\sum (x_i - \bar{x}) f_i = \sum x_i f_i - \bar{x} \sum f_i = \sum x_i f_i - \frac{\sum x_i f_i}{\sum f_i} \sum f_i = \sum x_i f_i - \sum x_i f_i = 0$$
This means that in the AM the deviations
from the average cancel each other
out
Properties of the arithmetic mean
• Properties 3 to 5 are used to simplify
the calculation when you need to
calculate the average of inconvenient values
3rd property of the arithmetic mean

If all variants are multiplied or divided by the same
constant B, the mean increases or decreases by
the same number of times:
$$\frac{\sum \frac{x_i}{B} f_i}{\sum f_i} = \frac{\frac{1}{B} \sum x_i f_i}{\sum f_i} = \frac{1}{B} \bar{x}$$
4th property of the arithmetic mean

If all variants are reduced or increased by a definite
amount A, then the mean is reduced or
increased by the same amount:
$$\frac{\sum (x_i - A) f_i}{\sum f_i} = \frac{\sum x_i f_i - A \sum f_i}{\sum f_i} = \frac{\sum x_i f_i}{\sum f_i} - \frac{A \sum f_i}{\sum f_i} = \bar{x} - A$$
5th property of the arithmetic mean

If all frequencies are multiplied or divided by the
same constant k, then the mean does not
change:
$$\frac{\sum x_i \frac{f_i}{k}}{\sum \frac{f_i}{k}} = \frac{\frac{1}{k} \sum x_i f_i}{\frac{1}{k} \sum f_i} = \bar{x}.$$
Properties of the arithmetic mean
• If these properties have been used during
the calculation of the WAM, the result is
not the actual mean but a transformed
value. To get the actual WAM, you must
perform the reverse operations in
reverse order
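The properties above can be checked numerically. The sketch below uses the midpoints and frequencies of Example 5; the constants A, B and k are arbitrary illustrative choices, and the helper name `weighted_mean` is an assumption.

```python
def weighted_mean(values, frequencies):
    return sum(x * f for x, f in zip(values, frequencies)) / sum(frequencies)

x = [27, 32, 37, 42, 47, 52, 57]      # class midpoints from Example 5
f = [5, 10, 16, 32, 27, 6, 4]
mean = weighted_mean(x, f)

A, B, k = 42, 5, 2                    # illustrative constants

# Property 2: the weighted deviations from the mean sum to zero.
deviation_sum = sum((xi - mean) * fi for xi, fi in zip(x, f))
# Property 3: dividing every variant by B divides the mean by B.
mean_divided = weighted_mean([xi / B for xi in x], f)
# Property 4: subtracting A from every variant lowers the mean by A.
mean_shifted = weighted_mean([xi - A for xi in x], f)
# Property 5: dividing every frequency by k leaves the mean unchanged.
mean_rescaled = weighted_mean(x, [fi / k for fi in f])
```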
Simplified calculation of the arithmetic mean
for a frequency distribution
is based on the properties of the AM:
$$\bar{x} = \frac{\sum x' f'}{\sum f'} \cdot h + c,$$
where
$$x' = \frac{x - c}{h}; \qquad f' = \frac{f}{A};$$
h is the interval width;
c is one of the variants near the middle of the
distribution;
A is a number, the greatest common divisor of all frequencies
Example 6 (continued)
Using the columns x′ and x′·f of the Example 6 table, with
h = 20; c = 250; f′ = f; A = 1, so that Σx′f′ = 39 and Σf′ = 200:
$$\bar{x} = \frac{\sum x' f'}{\sum f'} \cdot h + c = \frac{39}{200} \cdot 20 + 250 = 253.9 \text{ m}$$
Example 7
The following data represent the grouping of workers by size of
payment:
| Classes | Number of workers |
|---|---|
| 500-600 | 10 |
| 600-700 | 15 |
| 700-800 | 20 |
| 800-900 | 25 |
| 900-1000 | 15 |
| 1000-1100 | 10 |
| More than 1100 | 5 |
| Total: | 100 |

Find the average size of payment, using the mathematical properties of
the mean
Example 7

Transform the grouped frequency distribution
into an ungrouped frequency distribution by
calculating the midpoints (column 3). After that,
using property #4, each item can be
reduced by the same constant A. The constant A
can be any number, but it is recommended to
take it equal to the variant with the maximal
frequency (column 4)

On the basis of property #3, the reduced
variants should be divided by the constant B,
which can also be any number. It is
recommended, however, to take B equal to the
width of the classes (column 5). The next table
shows the procedure of solving the problem
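Example 7
| Classes, USD | Number of workers, fi | Midpoint xi | xi − A, A = 850 (because f850 = max) | xi′ = (xi − A)/B, B = 100 | fi′ = fi/5 | xi′ fi′ |
|---|---|---|---|---|---|---|
| 500-600 | 10 | 550 | -300 | -3 | 2 | -6 |
| 600-700 | 15 | 650 | -200 | -2 | 3 | -6 |
| 700-800 | 20 | 750 | -100 | -1 | 4 | -4 |
| 800-900 | 25 | 850 | 0 | 0 | 5 | 0 |
| 900-1000 | 15 | 950 | 100 | 1 | 3 | 3 |
| 1000-1100 | 10 | 1050 | 200 | 2 | 2 | 4 |
| More than 1100 | 5 | 1150 | 300 | 3 | 1 | 3 |
| Total: | 100 | | | | 20 | -6 |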
Example 7
We now have new variants, denoted
$x_i' = \frac{x_i - A}{B}$. Since changing the
variants changes the mean, it is
necessary to get back to the actual magnitude of
the mean. Using property #5, all frequencies
were divided by k = 5, because 5 is their greatest
common divisor (column 6)
Example 7
The new frequencies are denoted $f_i'$. According
to property #5, changing the frequencies in this way
does not change the mean.
Thus, using the mathematical properties, the formula of
the mean looks as follows:
$$\bar{x} = \frac{\sum x_i' f_i'}{\sum f_i'} \cdot B + A = \frac{-6}{20} \cdot 100 + 850 = \$820.$$
The average wage is $820 per worker.
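The transformation used in Example 7 can be sketched in Python as below. The function name `coded_mean` is hypothetical, and the midpoint 1150 for the open class "More than 1100" follows the table above, which treats that class as having the same width as the others.

```python
from functools import reduce
from math import gcd

def coded_mean(midpoints, frequencies, A, B):
    """Mean via transformed variants x' = (x - A) / B and reduced frequencies f' = f / k."""
    k = reduce(gcd, frequencies)                 # greatest common divisor of the frequencies
    x_coded = [(x - A) / B for x in midpoints]
    f_reduced = [f // k for f in frequencies]
    coded = sum(x * f for x, f in zip(x_coded, f_reduced)) / sum(f_reduced)
    return coded * B + A                         # reverse operations in reverse order

midpoints = [550, 650, 750, 850, 950, 1050, 1150]
workers = [10, 15, 20, 25, 15, 10, 5]
average_wage = coded_mean(midpoints, workers, A=850, B=100)
```

The same answer is obtained for any choice of A and B, because the last line undoes both transformations; choosing A near the middle and B equal to the class width only keeps the intermediate numbers small.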
Arithmetic Mean
(continued)

• The most common measure of central tendency
• Mean = sum of values divided by the number of values
• Affected by extreme values (outliers)

For the values 1, 2, 3, 4, 5:
$$\text{Mean} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$$
For the values 1, 2, 3, 4, 10:
$$\text{Mean} = \frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4$$
Arithmetic Mean

The arithmetic mean, as a single number
representing a whole data set, has important
advantages. First, its concept is familiar to most
people and intuitively clear. Second, every data
set has a mean. It is a measure that can be
calculated, and it is unique because every data
set has one and only one mean. Finally, the
mean is useful for performing statistical
procedures such as comparing the means of
several data sets
Harmonic Mean HM

The harmonic mean HM is defined as the
number of values divided by the sum of the
reciprocals of the values. The formulas of the
HM are shown below:
simple harmonic mean:
$$\bar{x}_h = \frac{n}{\sum \frac{1}{x_i}};$$
weighted harmonic mean:
$$\bar{x}_h = \frac{\sum W_i}{\sum \frac{1}{x_i} W_i},$$
where $W_i = x_i f_i$.
• The HM is the reciprocal of the arithmetic
mean of the reciprocals: the arithmetic mean
of the values $1/x_i$ equals 1 divided by the
harmonic mean, and vice versa.
• There are simple and weighted HMs. The
weighted formula of the HM is used more
often
Harmonic Mean HM

In the case of the HM, the frequencies are not known,
but we know the total sum of the values.
In fact, the arithmetic mean and the harmonic
mean are applied in the same situations, but to
differently presented data. So, before choosing
the mean formula, it is necessary to construct
the logical (economic) formula
The HM is applied when the volumes of
the investigated variants are used as
weights.
Sometimes the question arises: which
formula should be used, the harmonic
mean HM or the arithmetic mean AM?
The answer is as follows:
the suitable formula is the one in which both
the numerator and the denominator have
an economic meaning
AM or HM ?
• The tip:
If the initial information gives the averaged
value (variant) and the denominator of the
logical formula, the AM is used.
If the variant and the numerator of the logical
formula are given, the HM is used
• In other words:
If the numerator of the IRA is unknown,
we use the AM.
If the denominator of the IRA is unknown,
the HM should be used
Example 8
| Enterprise | Number of employees, persons (fi) | Average salary, RUR (xi) |
|---|---|---|
| 1 | 540 | 31046 |
| 2 | 275 | 31210 |
| 3 | 458 | 31130 |
| Total: | 1273 | ? |

$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i} = \frac{31046 \cdot 540 + 31210 \cdot 275 + 31130 \cdot 458}{540 + 275 + 458} = 31112 \text{ RUR}$$
Example 9
| Enterprise | Monthly fund of wages & salaries, RUR thousands (wi) | Average salary, RUR (xi) |
|---|---|---|
| 1 | 16 764.84 | 31 046 |
| 2 | 8 582.75 | 31 210 |
| 3 | 14 257.54 | 31 130 |
| Total: | 39 605.13 | ? |

$$\bar{x} = \frac{\sum w_i}{\sum \frac{w_i}{x_i}} = \frac{16764840 + 8582750 + 14257540}{\frac{16764840}{31046} + \frac{8582750}{31210} + \frac{14257540}{31130}} = 31112 \text{ RUR}$$
Example 10

A carpenter buys $500 worth of nails at $50 per pound and
$500 worth of nails at $10 per pound. Find the mean
(the average price of a pound of nails).
To solve the task we have to define how the average price
can be found. Construct the logical formula expressing the
relation between price and worth:
$$\text{Average price (price per pound)} = \frac{\text{Total worth}}{\text{The number of pounds}}$$
Example 10
In this case, the total worth is known, but the number of
pounds is not known. Therefore, the number of pounds
can be found as the ratio between total worth and
price:
$$\text{Average price (price per pound)} = \frac{\sum \text{Total worth}}{\sum \frac{\text{Total worth}}{\text{Price per pound}}} = \frac{\$500 + \$500}{\frac{\$500}{\$50} + \frac{\$500}{\$10}} = \frac{\$1000}{60 \text{ pounds}} = \$16.67.$$
The average price per pound of nails is $16.67.
The logical formula allows us to choose the mean formula
correctly, without breaking the relation between economic processes
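Examples 8 to 10 can be checked with the sketch below, which shows that the weighted arithmetic and weighted harmonic means give the same average salary when each is fed the data it expects. The function names are assumptions of this sketch, and the wage funds are expressed in rubles rather than thousands.

```python
def weighted_arithmetic_mean(x, f):
    """Use when the denominator data (frequencies f_i) are known."""
    return sum(xi * fi for xi, fi in zip(x, f)) / sum(f)

def weighted_harmonic_mean(x, w):
    """Use when the numerator data (volumes w_i = x_i * f_i) are known."""
    return sum(w) / sum(wi / xi for xi, wi in zip(x, w))

salaries = [31046, 31210, 31130]          # RUR, average salary per enterprise
employees = [540, 275, 458]               # Example 8: number of employees known
funds = [16764840, 8582750, 14257540]     # Example 9: wage funds known, RUR

salary_am = weighted_arithmetic_mean(salaries, employees)
salary_hm = weighted_harmonic_mean(salaries, funds)

# Example 10: $500 spent at $50 per pound and $500 spent at $10 per pound.
nail_price = weighted_harmonic_mean([50, 10], [500, 500])
```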
Example 11
Below is the data about wages:
| # of workshop | Base month: salary per person, thousand rubles, Xj | Base month: number of personnel, people, fj | Reporting month: salary per person, thousand rubles, Xj | Reporting month: wages fund, thousand rubles, Wi |
|---|---|---|---|---|
| 1 | 11 | 30 | 11 | 275 |
| 2 | 14.5 | 40 | 15.2 | 684 |
| 3 | 16 | 20 | 17 | 408 |
| 4 | 18 | 10 | 21 | 336 |

Find the average salary per person in each month
Example 11
First, construct the logical formula describing the relation
between salary, wages fund, and the number of personnel:
$$\text{Average salary (salary per person)} = \frac{\text{Wages fund}}{\text{The number of personnel}}$$
Example 11
For the base month the wages fund is not known, but
the number of personnel is known. The wages fund is
the total sum of the values Wi and can be found by
multiplying the salary per person (variants
xi) by the number of personnel (frequencies fi).
Thus, the logical formula will be as follows:
$$\text{Average salary} = \frac{\text{Wages fund}}{\text{The number of personnel}} = \frac{\sum \text{Salary per person} \cdot \text{The number of personnel}}{\sum \text{The number of personnel}}$$
Example 11

Using this logical formula, the average salary for
the base month can be calculated:
$$\bar{x}_{base} = \frac{11 \cdot 30 + 14.5 \cdot 40 + 16 \cdot 20 + 18 \cdot 10}{30 + 40 + 20 + 10} = 14.1 \text{ thousand rubles}.$$
To calculate the average salary, the weighted arithmetic mean
was used:
$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$
Example 11

For the reporting month, the wages fund is known, but
the number of personnel is not known. The
number of personnel can be expressed by
dividing the wages fund by the salary per person.
Thus, the logical formula is transformed in
the following way:
$$\text{Average salary} = \frac{\text{Wages fund}}{\text{The number of personnel}} = \frac{\sum \text{Wages fund}}{\sum \frac{\text{Wages fund}}{\text{Salary per person}}}$$
Example 11

Using this transformation of the initial logical
formula, the average salary can be calculated on
the basis of the weighted harmonic mean:
$$\bar{x}_{reporting} = \frac{\sum W_i}{\sum \frac{1}{x_i} W_i} = \frac{275 + 684 + 408 + 336}{\frac{275}{11} + \frac{684}{15.2} + \frac{408}{17} + \frac{336}{21}} = \frac{1703}{110} = 15.5 \text{ thousand rubles}$$
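The two calculations of Example 11 can be sketched side by side; the variable names are assumptions of this sketch.

```python
base_salary = [11, 14.5, 16, 18]        # thousand rubles per person
base_staff = [30, 40, 20, 10]           # persons (denominator known)

report_salary = [11, 15.2, 17, 21]      # thousand rubles per person
report_fund = [275, 684, 408, 336]      # thousand rubles (numerator known)

# Base month: wages fund = salary * staff, so the weighted arithmetic mean applies.
base_fund = sum(x * f for x, f in zip(base_salary, base_staff))
base_mean = base_fund / sum(base_staff)

# Reporting month: staff = fund / salary, so the weighted harmonic mean applies.
report_staff = sum(w / x for x, w in zip(report_salary, report_fund))
report_mean = sum(report_fund) / report_staff
```

In both months the code first rebuilds the missing part of the logical formula (the fund in the base month, the headcount in the reporting month) and only then divides.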
Once more AM or HM ?
The correct choice of the mean formula
follows a rule:
• If the numerator of the logical formula is
unknown, the arithmetic mean should be used.
• If the denominator of the logical formula is
unknown, the harmonic mean should be used
The geometric mean GM

Sometimes, when we are dealing with quantities
that change over a period of time, we need to
know an average rate of change, such as an
average growth rate over a period of several
years. In such cases, the arithmetic mean AM
and the harmonic mean HM are inappropriate,
because they give wrong answers. What we
need to find is the geometric mean GM.
The geometric mean GM is defined as the n-th
root of the product of n values
The geometric mean GM
• The formula of the simple geometric mean
GM is:
$$\bar{x}_g = \sqrt[n]{x_1 \cdot x_2 \cdot \ldots \cdot x_n} = \sqrt[n]{\prod_{i=1}^{n} x_i}.$$
The geometric mean is useful for finding the
average of percentages, ratios, indexes or
growth rates
Example 12
The growth rate of the Living Life Insurance
Corporation for the past three years was 35%,
24%, and 18%. Find the average growth rate.
• First, it is necessary to transform the growth rate
into a growth factor, using the following equation:
$$\text{Growth factor} = 1 + \frac{\text{Growth rate}}{100\%}$$
Example 12
• Then the geometric mean can be applied:
$$\bar{x}_g = \sqrt[n]{x_1 \cdot x_2 \cdot \ldots \cdot x_n} = \sqrt[3]{1.35 \cdot 1.24 \cdot 1.18} = 1.2547 \ (25.47\%).$$
The average growth rate is 25.47% per year
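Example 12 can be reproduced with a short sketch; the function name `average_growth_rate` is an assumption, and `math.prod` requires Python 3.8 or later.

```python
import math

def average_growth_rate(rates_percent):
    """Average growth rate (%) via the geometric mean of the growth factors."""
    factors = [1 + r / 100 for r in rates_percent]
    mean_factor = math.prod(factors) ** (1 / len(factors))
    return (mean_factor - 1) * 100

rate = average_growth_rate([35, 24, 18])   # about 25.47 % per year
```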
Example 13
• The geometric mean can also be applied when the
data have a large spread. Suppose
you want to calculate the average winning
amount between the maximal and minimal winning
amounts.
Example 13. The minimal winning amount is $1000 and
the maximal winning amount is $100,000. Find the
average winning amount
Example 13
• The raw data differ greatly, so the
application of the arithmetic mean would be
incorrect. In this case, the application of the
geometric mean is more justified:
$$\bar{x}_g = \sqrt{\$1000 \cdot \$100{,}000} = \$10{,}000.$$
The average winning amount is $10,000
Weighted geometric mean WGM
For the analysis of time series the weighted geometric
mean can be used. The formula is shown
below:
$$\bar{x}_g = \sqrt[\sum k_i]{x_1^{k_1} \cdot x_2^{k_2} \cdot x_3^{k_3} \cdot \ldots \cdot x_n^{k_n}}.$$
The weighted geometric mean
will be needed when solving problems on time
series
The quadratic (square) mean QM
• A useful mean for the physical sciences is the
quadratic mean, which is found by taking the
square root of the average of the
squares of the values. There are two kinds of
quadratic mean: the simple quadratic mean
and the weighted quadratic mean:
$$\bar{x}_q = \sqrt{\frac{\sum x_i^2}{n}}$$ is the simple quadratic mean;
$$\bar{x}_q = \sqrt{\frac{\sum x_i^2 f_i}{\sum f_i}}$$ is the weighted quadratic mean.
The quadratic mean is applied when the
measures of dispersion are calculated
Chronological mean TM
• This mean formula is applied to a
series of moment (point-in-time) indicators,
especially in time series:
$$\bar{X} = \frac{\frac{1}{2} x_1 + x_2 + x_3 + \ldots + x_{n-1} + \frac{1}{2} x_n}{n - 1}$$
• Take half of the first and half of the last value, add
all the values in the middle of the
series, and divide the sum by “the
number of moment indicators minus 1”
The TM is widely used in time series analysis and in
socio-economic statistics to determine the
average population and the average size of
a fund, as well as other indicators
calculated at certain points in time
• When the average is calculated for two moment
indicators, the formula for the chronological
mean TM turns into the formula of the
simple arithmetic mean AM:
$$\bar{X} = \frac{\frac{1}{2} x_1 + \frac{1}{2} x_2}{2 - 1} = \frac{x_1 + x_2}{2}$$
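A minimal sketch of the simple chronological mean for equally spaced moments follows. The function name and the headcount figures are hypothetical and serve only to illustrate the formula.

```python
def chronological_mean(levels):
    """Simple chronological mean for moment (point-in-time) levels at equal intervals."""
    if len(levels) < 2:
        raise ValueError("at least two levels are required")
    inner = sum(levels[1:-1])                        # all values in the middle of the series
    return (levels[0] / 2 + inner + levels[-1] / 2) / (len(levels) - 1)

# Hypothetical headcount on the first day of four consecutive months.
headcount = [100, 104, 110, 108]
average_headcount = chronological_mean(headcount)

# With only two levels the formula reduces to the simple arithmetic mean.
two_point_mean = chronological_mean([10, 20])
```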
The universal set of rules for calculating the
average
I offer a universal set of rules for calculating the
average, which disciplines students and allows them
to choose the necessary mean correctly.
I. Write down the logical formula (IRA) of the
average to be calculated. Remember that the logical
formula does not depend on the initial data, so
it is unique and appropriate only for the
calculation of the required average.
II. Compare the logical formula with the initial data. There
may be eight cases.
1. The numerator A is unknown, the data are not grouped:
the formula of the simple arithmetic mean AM is used.
2. The numerator A is unknown, the data are grouped:
use the weighted arithmetic mean WAM formula. If you
know the relative frequencies in the form of shares, the
modified formula MAM of the weighted arithmetic mean is
the best way to calculate.
3. The denominator B is unknown, the data
are not grouped: the simple harmonic
mean HM is the most suitable formula for
calculating the average.
4. The denominator B is unknown, the data
are grouped and the weights Wi are different:
the weighted harmonic mean WHM should
be chosen; if the weights Wi coincide, we
recommend applying Case #3.
5. The data are incomplete, insufficient and
not grouped: the simple geometric mean
GM formula is necessary and sufficient.
6. The data are incomplete, insufficient, and
grouped: use the formula of the weighted
geometric mean WGM.
7. Time series with instant levels and equal
time intervals: the simple chronological
mean TM is recommended.
8. Time series with instant levels and
unequal time intervals: the weighted
chronological mean WTM is suitable.
III. Write the appropriate formula.
IV. Plug the initial data into the formula and
perform the necessary calculations.
V. Specify the unit of measure of the result
and formulate the economic
meaning of this numerical answer
Structural averages
Power means alone are not enough for the
analysis of a distribution.
Structural averages are used for the initial
analysis of the distribution of units in the
population
Of the many structural averages,
we'll discuss the mode, median, quartiles,
deciles, and percentiles
Mode Mo
The mode is the value of the variant
that occurs in the population the
largest number of times. In
everyday life the word “mode”
has a different meaning:
fashion
Mode is the most common
variant of a frequency
distribution. For a discrete
series this is the value
that corresponds to the
highest frequency
Mode Mo
The mode is the value that is repeated most often in
the data set. A data set can have more than one mode
or no mode at all.
If we analyze a discrete series and there are several
variants with the highest frequency (which is quite
rare), then the mode is defined as the arithmetic
average of all the modal variants
The mode can be defined for an ungrouped
frequency distribution and for a grouped frequency
distribution. For an ungrouped frequency
distribution the mode can be found by inspection,
directly from the definition, as the variant with
the highest frequency
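Example 4
| The number of machines serviced by one worker, x | Number of workers, f | x·f |
|---|---|---|
| 1 | 10 | 10 |
| 2 | 37 | 74 |
| 3 | 43 | 129 |
| 4 | 34 | 136 |
| 5 | 16 | 80 |
| Total: | 140 | 429 |

The highest frequency is 43, so Mo = 3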
Example 14
The number of books read by each of 28 students
in a literature class is given below:
| Number of books | Number of students, frequency |
|---|---|
| 0 | 2 |
| 1 | 6 |
| 2 | 12 |
| 3 | 5 |
| 4 | 3 |
| Total | 28 |

The highest frequency is 12, which means the mode is 2 books. Thus, the
largest group of students has read only two books
The mode for a grouped frequency distribution can
be found using the following equation. It is used
for an interval frequency distribution with equal
interval widths:
$$M_o = x_{Mo} + h_{Mo} \cdot \frac{f_{Mo} - f_{Mo-1}}{2 f_{Mo} - f_{Mo-1} - f_{Mo+1}},$$
where $x_{Mo}$ is the lower boundary of the modal class;
$h_{Mo}$ is the width of the modal class;
$f_{Mo}$ is the frequency of the modal class;
$f_{Mo-1}$ is the frequency of the pre-modal class;
$f_{Mo+1}$ is the frequency of the after-modal class
Example 6 (continued)
In the Example 6 table the highest frequency is 56, so the modal class is 240-260;
the pre-modal frequency is 50 and the after-modal frequency is 47:
$$M_o = 240 + 20 \cdot \frac{56 - 50}{2 \cdot 56 - 50 - 47} = 248 \text{ m}$$
Finding the modal interval
• First, identify the modal class, that is, the class with the highest frequency. In Example 7 the highest frequency is 25, which corresponds to the class 800-900; this class is the modal interval
Example 7
The following data represent the grouping of workers by size of payment:

The size of payment, USD   Number of workers, %
500-600                    10
600-700                    15
700-800                    20
800-900                    25
900-1000                   15
1000-1100                  10
More than 1100             5
Total:                     100

Find the mode

$$M_o = 800 + 100 \cdot \frac{25 - 20}{(25 - 20) + (25 - 15)} \approx 833.3$$

Thus, the most common size of payment is about $833.3
Mode Mo
• If the modal interval is the first or the last one, the missing frequency (pre-modal or after-modal) is taken to be zero
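The grouped-mode equation, together with the rule for a modal interval at either end, can be written as a short Python function. This is a sketch under stated assumptions: the name `grouped_mode` and its parameter names are hypothetical, the modal class is assumed to have been identified already, and all classes are assumed to have equal width.

```python
def grouped_mode(lower, width, f_modal, f_prev=0, f_next=0):
    """Mode of an interval distribution with equal class widths.

    lower   - lower boundary of the modal class
    width   - width of the modal class
    f_modal - frequency of the modal class
    f_prev  - frequency of the pre-modal class (0 if the modal class is first)
    f_next  - frequency of the after-modal class (0 if the modal class is last)
    """
    denominator = 2 * f_modal - f_prev - f_next
    if denominator == 0:
        # Both neighbours have the same frequency as the modal class.
        raise ValueError("the formula is undefined for this distribution")
    return lower + width * (f_modal - f_prev) / denominator


mo_productivity = grouped_mode(240, 20, 56, f_prev=50, f_next=47)  # Example 6
mo_payment = grouped_mode(800, 100, 25, f_prev=20, f_next=15)      # Example 7
print(mo_productivity)       # 248.0
print(round(mo_payment, 1))  # 833.3
```

Leaving `f_prev` or `f_next` at its default of zero reproduces the rule for a modal interval that is the first or the last one.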
The mode on a graph
• The mode can be found using the histogram. Select the highest bar, then connect its upper right corner with the upper right corner of the previous bar, and its upper left corner with the upper left corner of the next bar. From the point where the two segments intersect, drop a perpendicular to the X-axis. The point where the perpendicular meets the abscissa axis is the mode
The mode on a graph
• To determine the mode of a discrete series, the frequency polygon is drawn. The distance from the vertical axis to the highest point of the polygon is the graphical mode
Median
• In an ordered list, the median is the “middle” number (50% above, 50% below)
[Figure: two dot plots on a 0 to 10 axis; in each, Median = 3]
• Not affected by extreme values
Median
The median is a measure of central tendency distinct from any of the means. The median is a single value from the data set that measures the central item in the data. This single item is the middlemost or most central item in the set of numbers. Half of the items lie above this point, and the other half lie below it.
The median is the midpoint of the data array.
To calculate the median, the data must first be arranged in ascending or descending order
Median Me
• Me is the central, “middle” value of a population: the value of the variant located in the middle of the ordered list. Me is the variant that lies in the middle of the frequency distribution and divides it into two equal parts.
• In a discrete list Me is determined directly from the definition; in an interval frequency distribution it is determined by a formula
Finding the Median
• The location of the median:

$$\text{Median position} = \frac{n + 1}{2} \text{ position in the ordered data}$$

• If the number of values is odd, the median is the middle number
• If the number of values is even, the median is the average of the two middle numbers
• Note that (n + 1)/2 is not the value of the median, only the position of the median in the ranked data
Finding the Median
• If a discrete list contains an odd number of values, Me is the single value that has the same number of values to its right and to its left:

$$Me = x_{\frac{n+1}{2}}$$
Example 15
• Find the median for the ages of seven preschool children. The ages are 2, 3, 4, 2, 3, 5 and 5. First, arrange the data in ascending order: 2, 2, 3, 3, 4, 5, 5. According to the equation, the median is the (7+1)/2 = 4th item in the array, which is 3 years. Thus, at least half of the children are 3 or younger and at least half are 3 or older
Median
• For ungrouped data the median can be found in the following way. If the data set contains an odd number of items, the middle item of the array is the median. If there is an even number of items, the median is the average of the two middle items and can be calculated using the equation on the next slide
Finding the Median
• If a discrete list contains an even number of values, there are two values that have the same number of values to their right and to their left. Me is the arithmetic mean of these two values:

$$Me = \frac{x_{\frac{n}{2}} + x_{\frac{n+2}{2}}}{2}$$
Example 16
• The ages of ten college students are given below. Find the median: 18, 24, 20, 35, 19, 23, 26, 23, 19, 20. The data set contains an even number of items. Arrange the data in ascending order: 18, 19, 19, 20, 20, 23, 23, 24, 26, 35. Using the equation, the median is the (10+1)/2 = 5.5th item in the data set. In other words, the median lies between the 5th and the 6th items. Thus, the median is:
Me = (20 + 23)/2 = 21.5 years
Therefore, half of the students are under 21.5 and the other half of the students are over 21.5 years old
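The two cases (odd and even number of values) can be checked with a short Python sketch. The function name `median_of_list` is hypothetical; the only assumption is that the raw values are supplied as a non-empty list of numbers.

```python
def median_of_list(values):
    """Median of raw (ungrouped) data."""
    if not values:
        raise ValueError("the data set is empty")
    ordered = sorted(values)          # ascending order, as the method requires
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: item number (n + 1) / 2, counting from 1
        return ordered[middle]
    # Even n: arithmetic mean of items number n / 2 and (n + 2) / 2
    return (ordered[middle - 1] + ordered[middle]) / 2


children = median_of_list([2, 3, 4, 2, 3, 5, 5])                      # Example 15
students = median_of_list([18, 24, 20, 35, 19, 23, 26, 23, 19, 20])   # Example 16
print(children)  # 3
print(students)  # 21.5
```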
Finding the Median
• For an ungrouped frequency distribution the median Me can be found using the cumulative frequencies Si
The golden rule
For a discrete frequency distribution, the median is the value whose cumulative frequency is the first to exceed half of the sum of the frequencies
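Example 4
x = the number of machines serviced by one worker; f = number of workers; S = cumulative frequency

x        f      S
1        10     10
2        37     47
3        43     90
4        34     124
5        16     140
Total:   140    -

Half of the sum of the frequencies is 140/2 = 70. The cumulative frequency first exceeds 70 at x = 3 (S = 90), so Me = 3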
Example 14
The number of books read by each of 28 students in a literature class is given below:

Number of books   Number of students (frequency)   Cumulative frequencies, Si
0                 2                                2
1                 6                                8
2                 5                                13
3                 12                               25
4                 3                                28
Total             28
Example 14
To locate the middle point, divide n by 2, which gives 28/2 = 14. Then locate the point where 14 values fall below and 14 values fall above. The cumulative frequency first exceeds 14 in the fourth class (S = 25), which corresponds to 3 books, so Me = 3 books.
This means that at least half of the students have read 3 books or fewer, and at least half have read 3 books or more
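The golden rule for a discrete frequency distribution translates directly into a loop over the cumulative frequencies. The sketch below is illustrative only: the name `discrete_median` is hypothetical, and the values are assumed to be listed in ascending order. It follows the rule exactly as stated, so when a cumulative frequency equals exactly half of the total it moves on to the next value rather than averaging two neighbouring values.

```python
from itertools import accumulate


def discrete_median(values, freqs):
    """First value whose cumulative frequency exceeds half of the total.

    values - the values of the variable, in ascending order
    freqs  - the frequency of each value, in the same order
    """
    half = sum(freqs) / 2
    for value, cumulative in zip(values, accumulate(freqs)):
        if cumulative > half:
            return value
    raise ValueError("at least one frequency must be positive")


machines = discrete_median([1, 2, 3, 4, 5], [10, 37, 43, 34, 16])  # Example 4
books = discrete_median([0, 1, 2, 3, 4], [2, 6, 5, 12, 3])         # Example 14
print(machines)  # 3
print(books)     # 3
```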
For a grouped frequency distribution the median can be found using the following equation:

$$Me = x_{Me} + h_{Me} \cdot \frac{\frac{\sum f}{2} - S_{Me-1}}{f_{Me}},$$

where x_{Me} - lower boundary of the median class;
h_{Me} - width of the median class;
f_{Me} - frequency of the median class;
S_{Me-1} - cumulative frequency of the class immediately preceding the median class;
Σf - the sum of all frequencies (the size of the population or of the sample)
Finding the Median
The median class can be found from the definition and the golden rule: it is the first interval whose cumulative frequency exceeds ½ of the sum of all frequencies
Example 6

Productivity, m   Number of workers, f   x     x·f     x'    x'·f   S     x − x̄   (x − x̄)²·f   x'²·f
Up to 200         3                      190   570     -3    -9     3     -63.9   12249.63     27
200-220           12                     210   2520    -2    -24    15    -43.9   23126.52     48
220-240           50                     230   11500   -1    -50    65    -23.9   28560.50     50
240-260           56                     250   14000   0     0      121   -3.9    851.76       0
260-280           47                     270   12690   1     47     168   16.1    12182.87     47
280-300           23                     290   6670    2     46     191   36.1    29973.83     92
300-320           7                      310   2170    3     21     198   56.1    22030.47     63
320 and more      2                      330   660     4     8      200   76.1    11582.42     32
Total:            200                          50780         39                   140558       359

$$Me = 240 + 20 \cdot \frac{\frac{200}{2} - 65}{56} = 252.5 \text{ m}$$
This means that half of the workers have labor productivity of less than 252.5 m, while the other half have productivity of more than 252.5 m
Example 7
The following data represent the grouping of workers by size of payment:

The size of payment, USD   Number of workers, %   Cumulative frequencies, Si
500-600                    10                     10
600-700                    15                     25
700-800                    20                     45
800-900                    25                     70
900-1000                   15                     85
1000-1100                  10                     95
More than 1100             5                      100
Total:                     100

Find the median
Example 7
• First, divide the sum of all frequencies by 2 to find the halfway point: 100/2 = 50. Next, find the class that contains the 50th value. This class is called the median class and it contains the median. The median class is $800-900:

$$Me = 800 + 100 \cdot \frac{\frac{100}{2} - 45}{25} = 820$$

Thus, half of the workers are paid less than $820 and the other half of the workers are paid more than $820
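Locating the median class and applying the grouped-median equation can be combined in one Python function. This is a sketch, not a definitive implementation: the name `grouped_median` is hypothetical, the classes are assumed to have equal width, and the lower boundary 180 used for the open class “Up to 200” in Example 6 is an assumption inferred from its midpoint of 190 (it does not affect the result).

```python
def grouped_median(lower_bounds, width, freqs):
    """Median of an interval distribution with equal class widths.

    lower_bounds - lower boundary of each class, in ascending order
    width        - common class width
    freqs        - frequency of each class, in the same order
    """
    half = sum(freqs) / 2
    cumulative_before = 0  # cumulative frequency of the preceding classes
    for lower, f in zip(lower_bounds, freqs):
        if cumulative_before + f > half:  # golden rule: first class past the half
            return lower + width * (half - cumulative_before) / f
        cumulative_before += f
    raise ValueError("at least one frequency must be positive")


# Example 6 (180 is an assumed lower boundary for the class "Up to 200")
me_productivity = grouped_median(
    [180, 200, 220, 240, 260, 280, 300, 320], 20,
    [3, 12, 50, 56, 47, 23, 7, 2],
)
# Example 7
me_payment = grouped_median(
    [500, 600, 700, 800, 900, 1000, 1100], 100,
    [10, 15, 20, 25, 15, 10, 5],
)
print(me_productivity)  # 252.5
print(me_payment)       # 820.0
```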
The median can also be found using the ogive. Select the point on the Y-axis that corresponds to ½ of all frequencies and draw a line from it parallel to the X-axis. From the point where this line intersects the ogive, drop a perpendicular to the abscissa axis. The point where the perpendicular meets the X-axis is the median. The next slide shows this procedure on a cumulative frequency graph
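The graphical reading can be imitated numerically. The sketch below assumes that the ogive is drawn as straight segments joining the cumulative frequencies plotted at the upper class boundaries, starting from zero at the lowest boundary; the name `read_from_ogive` and the point list are illustrative, not part of the lecture.

```python
def read_from_ogive(points, level):
    """Return the x at which a piecewise-linear ogive reaches the given level.

    points - (x, cumulative frequency) pairs in ascending order of x
    level  - the cumulative frequency to look up, e.g. half of the total
    """
    for (x0, s0), (x1, s1) in zip(points, points[1:]):
        if s0 <= level <= s1 and s1 > s0:
            # Horizontal line at `level` meets this segment; drop to the X-axis
            return x0 + (x1 - x0) * (level - s0) / (s1 - s0)
    raise ValueError("level lies outside the range covered by the ogive")


# Example 7: cumulative % of workers at the upper class boundaries
# (the starting point (500, 0) and the straight segments are assumptions)
ogive = [(500, 0), (600, 10), (700, 25), (800, 45),
         (900, 70), (1000, 85), (1100, 95)]
median_payment = read_from_ogive(ogive, 100 / 2)
print(median_payment)  # 820.0
```

Under these assumptions the graphical method and the grouped-median equation agree, because both interpolate linearly inside the median class.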
Мо & Ме
• In practical calculations the values of Mo and Me may be far removed from each other. To better reflect the nature of the distribution, statisticians also use other structural averages
Mode
• A measure of central tendency
• Value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical data
• There may be no mode
• There may be several modes
[Figure: two dot plots; on a 0 to 14 axis, Mode = 9; on a 0 to 6 axis, No Mode]
Review Example
• Five houses on a hill by the beach
House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000
[Figure: the five houses labelled $2,000 K, $500 K, $300 K, $100 K, $100 K]
Review Example: Summary Statistics
House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000 (Sum: $3,000,000)
• Mean: $3,000,000/5 = $600,000
• Median: middle value of ranked data = $300,000
• Mode: most frequent value = $100,000
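The three summary statistics for the house prices can be reproduced with Python's standard `statistics` module. This is just a quick check of the slide's arithmetic; the variable names are illustrative.

```python
import statistics

# House prices from the review example, in dollars
prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean_price = statistics.mean(prices)      # sum of the prices divided by 5
median_price = statistics.median(prices)  # middle value of the ranked data
mode_price = statistics.mode(prices)      # most frequent value

print(mean_price, median_price, mode_price)  # 600000 300000 100000
```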
Which measure of location is the “best”?
• Mean is generally used, unless extreme values (outliers) exist
• Then median is often used, since the median is not sensitive to extreme values
• Example: Median home prices may be reported for a region, since they are less sensitive to outliers
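The difference in sensitivity can be seen by making the most expensive house in the review example even more expensive. The $20,000,000 price below is a hypothetical value chosen only for illustration; it does not appear in the lecture.

```python
import statistics

prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]
# Hypothetical variant: the top house costs ten times more
extreme = [20_000_000, 500_000, 300_000, 100_000, 100_000]

mean_before = statistics.mean(prices)       # 600000
mean_after = statistics.mean(extreme)       # 4200000
median_before = statistics.median(prices)   # 300000
median_after = statistics.median(extreme)   # 300000

print(mean_after - mean_before)      # 3600000
print(median_after - median_before)  # 0
```

The mean is pulled far towards the extreme value, while the median does not move at all.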
The end
• Wishing you all Great Success and Good Luck!