Statistics - Ain Shams University

Statistics

When guessing is an informed decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of Medicine
Ain Shams University

Objectives
 Define statistics and understand its terminology
 Discuss the importance of, and need for, statistics in the medical field
 Distinguish types of data and variables
 Describe types of statistics and statistical tests

What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.

Collecting data → Describing data → Analyzing data → Good decision

Why do we need statistics?
Variability (atropine)
 Causes of variability
 Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors
 Effects of variability
 A large amount of data describing the same thing (many values for one variable)
 No certainty (deterministic vs probabilistic)
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)
Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value can vary. For example, age, gender and blood type are variables.
 Data are the values you get when you measure a variable.

The variables (column headings) and the data (values)

             Age   Gender   Blood type
Mrs Brown    32    Female   O
Mr Patel     24    Male     O
Ms Manda     20    Female   A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories

Gender (dichotomous, binary)
 Male
 Female

Type of ICU admission
 Medical
 Surgical
 Physical injuries
 Poisoning
 Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered

Degree of illness
 Mild
 Moderate
 Severe

Physical status
 ASA I
 ASA II
 ASA III
 ASA IV
 ASA V

Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken from a population.
 A population is the group of ALL individuals (entities) sharing specific characteristics.

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human skeletal muscle fibers
 Cardiac muscles of rats

Descriptive statistics
 Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
 This may mean sorting the data by size, putting it into a table, presenting it in an appropriate chart, summarizing it numerically, and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation
(What about numerical variables? Grouping)
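
The three frequency measures listed above can be sketched in a few lines of Python. This is only an illustration: the observations are made up (they are not the lecture's data), and the name frequency_table is a hypothetical helper. Cumulative frequency is computed here in alphabetical order purely for demonstration; it is most meaningful when the categories are ordered.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations of one qualitative variable (not from the lecture).
complications = ["nausea", "pain", "nausea", "vomiting", "pain",
                 "pain", "cough", "nausea", "pain", "vomiting"]


def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)                  # frequency of each category
    total = sum(counts.values())              # number of observations
    categories = sorted(counts)               # fixed (alphabetical) order
    freqs = [counts[c] for c in categories]
    cumulative = list(accumulate(freqs))      # running total of frequencies
    return [(c, f, f / total, cum)
            for c, f, cum in zip(categories, freqs, cumulative)]


for category, freq, rel, cum in frequency_table(complications):
    print(f"{category:<10} {freq:>3} {rel:>7.1%} {cum:>4}")
```

Relative frequency is simply frequency divided by the total, so the relative frequencies of all categories always add up to 1 (100%).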

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Figure: pie chart of postoperative complications (nausea, vomiting, pain, cough); slice labels: 23.5%, 27.5%, 9.8%, 39.2%]

Advantages:
1. Summarizes the data (area represents relative frequency)
2. Shows magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity if there are more than 4-5 categories
3. No cross tabulation (needs separate pies)

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Figure: simple bar chart of postoperative complications; y-axis: number of patients (0 to 25); x-axis: nausea, vomiting, pain, cough]
Slide annotations: alternative, width, spacing

[Figure: simple bar chart of postoperative complications; y-axis: number of patients (%) (0% to 45%); x-axis: nausea, vomiting, pain, cough]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Figure: clustered bar chart of postoperative complications for Group I and Group II; y-axis: number of patients (%) (0% to 45%); x-axis: nausea, vomiting, pain, cough]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Figure: bar chart of postoperative complications for Group I and Group II; y-axis: number of patients (%) (0% to 45%); x-axis: nausea, vomiting, pain, cough]

[Figure: stacked bar chart of postoperative complications; y-axis: number of patients (%) (0% to 100%); x-axis: Group I, Group II; stacked segments: cough, pain, vomiting, nausea]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

One variable; no gaps between bars

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Figure: postoperative complications for Group I and Group II; y-axis: number of patients (%) (0% to 40%); x-axis: nausea, vomiting, pain, cough]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 – 10 (odd number of values)
1 – 3 – 5 – 9 (even number of values)

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 – 10
 Mean = 4
 Median = 3
 Mode = 1
 Midrange = 5.5
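
As a quick check of these summary measures, the following minimal sketch recomputes them with Python's standard statistics module for the slide's data set. The midrange is taken here as (smallest value + largest value) / 2, which for this data set is (1 + 10) / 2 = 5.5. The second data set (1, 3, 5, 9) is the even-sized example from the median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]        # the data set used on the slides

mean = statistics.mean(data)                # sum of the values / number of values
median = statistics.median(data)            # middle value of the sorted data
mode = statistics.mode(data)                # most frequent value
midrange = (min(data) + max(data)) / 2      # halfway between smallest and largest

# With an even number of values the median is the average of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("midrange:", midrange)
print("median of 1, 3, 5, 9:", median_even)
```

Note how the single large value (10) pulls the mean above the median, while the median and mode are unaffected by it.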

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 – 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2 (dividing by n = 5)
 Standard deviation = 3.35
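
The spread measures above can be reproduced with the standard statistics module. This is a sketch under two assumptions that the slides do not state: quartiles are computed with the "inclusive" method (other quartile conventions can give a different inter-quartile range), and the variance divides by n (the population formula), which is what yields 11.2. Dividing by n - 1 (the sample formula) gives 14 instead.

```python
import statistics

data = [1, 1, 3, 5, 10]        # the data set used on the slides

value_range = max(data) - min(data)

# Quartiles: assumption, the "inclusive" method; other methods may differ.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance = statistics.pvariance(data)        # population variance, divides by n
std_dev = statistics.pstdev(data)            # square root of the population variance
sample_variance = statistics.variance(data)  # sample variance, divides by n - 1

print("range:", value_range)
print("inter-quartile range:", iqr)
print("variance (n):", variance)
print("standard deviation (n):", round(std_dev, 2))
print("variance (n - 1):", sample_variance)
```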

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
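
A minimal sketch of these two ideas for a sample mean, treating the slide data set as a sample. It assumes the usual formula SE = SD / sqrt(n) with the sample SD (dividing by n - 1), and a normal-approximation 95% interval, mean ± 1.96 × SE. With only five values a t-based multiplier would give a wider interval, so this is an illustration rather than a recommended analysis.

```python
import math
import statistics

sample = [1, 1, 3, 5, 10]      # slide data set, treated here as a sample

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)              # sample SD (divides by n - 1)
se = sd / math.sqrt(n)                     # standard error of the mean

# Multiplier for a 95% interval under the normal approximation (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = mean - z * se
ci_high = mean + z * se

print(f"mean = {mean}, SE = {se:.2f}")
print(f"approximate 95% CI: {ci_low:.2f} to {ci_high:.2f}")
```

The standard error shrinks as the sample size grows, which is why larger samples give narrower confidence intervals.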

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers take samples, measure outcomes, and decide whether the data from the sample provide strong enough evidence to reject the null hypothesis or not.
 If the evidence against the null hypothesis is strong enough for us to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labeled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people taking Inderal be μ1.
Let the mean heart rate of all people taking the new drug be μ2.
The null hypothesis of no difference is then H0: μ1 = μ2.
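
To make the comparison of the two mean heart rates concrete, here is a small sketch of one way such a null hypothesis could be tested. The lecture does not name a specific test; an exact permutation test is used here only because it needs nothing beyond the standard library. The heart-rate values and the function name permutation_p_value are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates in beats/min (not data from the lecture).
inderal = [72, 75, 78, 80, 74]
new_drug = [60, 62, 65, 63, 61]


def permutation_p_value(a, b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    extreme = 0
    total = 0
    # Under H0 the group labels are interchangeable, so try every relabeling.
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count relabelings at least as extreme as what was observed.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


p = permutation_p_value(inderal, new_drug)
print(f"p = {p:.4f}")
```

A small p-value means that a difference this large would rarely arise if the two drugs had the same effect, which is the evidence used to reject H0.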

Inferential statistics
(Informed guess)

 Type I & Type II Errors
                        H0 is true       H0 is false
Decision: accept H0     Good decision    Type II error
Decision: reject H0     Type I error     Good decision

Inferential statistics
(Informed guess)

 α: probability of committing a type I error
 β: probability of committing a type II error
 p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true.
 Sample size & power of the study (power = 1 − β)

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Figure: scatter plot of weight (kg, 0 to 45) against age (years, 0 to 14) with fitted line y = 2.03x + 8]
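
The fitted line from the figure, y = 2.03x + 8, can be used for prediction as sketched below. The slope and intercept are taken from the slide; the function name predict_weight is hypothetical, and the assumption that x is age in years and y is weight in kg follows the axis labels. Predictions outside the plotted age range (about 0 to 14 years) would be extrapolation.

```python
SLOPE = 2.03       # kg gained per year of age, from the line on the slide
INTERCEPT = 8.0    # predicted weight in kg at age 0


def predict_weight(age_years):
    """Predicted weight (kg) from age (years) using the slide's fitted line."""
    return SLOPE * age_years + INTERCEPT


for age in (2, 6, 10):
    print(f"age {age:>2} y -> predicted weight {predict_weight(age):.1f} kg")
```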

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests

                   Disease present   Disease absent
Test positive      TP                FP
Test negative      FN                TN

TP: true positive; FP: false positive; FN: false negative; TN: true negative
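
The four cells of this table are what sensitivity and specificity (the axes of an ROC curve) are computed from. The sketch below uses the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP); the counts and the function name diagnostic_summary are hypothetical.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Sensitivity and specificity from the four cells of a 2 x 2 table."""
    sensitivity = tp / (tp + fn)     # diseased patients correctly testing positive
    specificity = tn / (tn + fp)     # healthy patients correctly testing negative
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "1 - specificity": 1 - specificity,   # false positive rate (ROC x-axis)
    }


# Hypothetical counts, for illustration only.
summary = diagnostic_summary(tp=40, fp=10, fn=10, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

For a continuous test, each possible cut-off value gives one such table, and plotting sensitivity against 1 - specificity over all cut-offs traces the ROC curve.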

ROC Curve

[Figure: two ROC curves; y-axis: sensitivity (0.0 to 1.0); x-axis: 1 - specificity (0.0 to 1.0). Diagonal segments are produced by ties.]

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 3

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Postoperative Complications

Number of patients (%)

100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%

Group I

Group II

Couph
Pain
Vomiting
Nausea

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

Number of patients (%)

Postoperative Complications
Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Mid range = 4.5

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th quartile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35
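The spread measures listed above can be reproduced with the standard `statistics` module. This sketch assumes that the variance of 11.2 was obtained by dividing by n (the population formula) and that the quartiles use the "inclusive" method; dividing by n - 1 (the sample formula) gives a larger variance for the same data.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles by the "inclusive" method (minimum and maximum are the
# 0th and 100th percentiles).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population formulas divide by n; sample formulas divide by n - 1.
pop_variance = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)
sample_variance = statistics.variance(data)

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

When the five values are a sample from a larger population, the n - 1 version (variance 14 here) is the usual choice.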

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
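As an illustration of the two terms above, the following sketch computes the standard error of a sample mean and an approximate 95% confidence interval for the population mean. The heart-rate values are hypothetical, the function name is introduced here, and the interval assumes a normal approximation (critical value about 1.96).

```python
import math
import statistics

def mean_confidence_interval(sample, level=0.95):
    """Return (mean, standard error, lower limit, upper limit)."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample SD divided by sqrt(n).
    se = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean, se, mean - z * se, mean + z * se

# Hypothetical heart rates (beats/min) from a sample of 10 patients.
heart_rates = [72, 75, 68, 80, 77, 74, 69, 71, 78, 76]
mean, se, lower, upper = mean_confidence_interval(heart_rates)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

For a sample this small, a critical value from the t distribution would give a somewhat wider interval than the normal approximation used here.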

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) at decreasing heart
rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ₁
Let the mean heart rate of all people
taking the new drug be μ₂
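Writing $\mu_1$ for the population mean heart rate on Inderal and $\mu_2$ for the population mean heart rate on the new drug, the hypotheses for this example could be stated as below. This is a sketch of the usual two-sided formulation; a one-sided alternative ($\mu_2 < \mu_1$) would also fit the research question.

```latex
H_0:\ \mu_1 = \mu_2 \qquad \text{(no difference in mean heart rate)}
\\
H_1:\ \mu_1 \neq \mu_2 \qquad \text{(the mean heart rates differ)}
```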

Inferential statistics
(Informed guess)

 Type I & Type II Errors
                          H0 is true       H0 is false
Decision: accept H0       Good decision    Type II error
Decision: reject H0       Type I error     Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study
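The definition of the p-value can be made concrete with an exact permutation test, used here purely as an illustration and not as the method of the original slides. Under the null hypothesis the group labels are interchangeable, so the p-value is the fraction of all possible relabellings whose difference in means is at least as extreme as the one observed. The heart-rate values and the function name are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Under H0 the group labels are arbitrary, so every way of choosing
    # which n_a values form "group A" is equally likely.
    for indices in combinations(range(len(pooled)), n_a):
        chosen = set(indices)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical heart rates (beats/min) on the conventional and the new drug.
conventional = [72, 75, 78, 80]
new_drug = [60, 62, 65, 70]
p = permutation_p_value(conventional, new_drug)
print(round(p, 4))
```

With these values only 2 of the 70 possible relabellings are as extreme as the observed split, so p = 2/70, about 0.029. If α had been set at 0.05 in advance, the null hypothesis would be rejected.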

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Figure: scatter plot of weight (kg) against age (y), with fitted regression line y = 2.03x + 8]
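A prediction line such as weight = 2.03 × age + 8 is typically obtained by ordinary least squares. The sketch below shows the calculation on hypothetical (age, weight) pairs invented for illustration; they are not the data behind the plot, so the fitted coefficients will differ from 2.03 and 8.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariation of x and y divided by variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical (age in years, weight in kg) pairs, invented for illustration.
ages = [1, 3, 5, 7, 9, 11]
weights = [10.5, 13.8, 18.4, 22.0, 26.5, 30.2]

slope, intercept = fit_line(ages, weights)
predicted = slope * 6 + intercept  # predicted weight of a 6-year-old
print(f"weight = {slope:.2f} * age + {intercept:.2f}; at age 6: {predicted:.1f} kg")
```

Predictions are only trustworthy within the range of ages that was actually observed.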

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
                  Disease present    Disease absent
Test positive     TP                 FP
Test negative     FN                 TN
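From the four cells of this table (TP = true positives, FP = false positives, FN = false negatives, TN = true negatives) the usual summary measures of a dichotomous diagnostic test follow directly. The counts below are hypothetical and the function name is introduced here for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summary measures of a dichotomous test from its 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # disease-free patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are true positives
        "npv": tn / (tn + fn),          # negative tests that are true negatives
    }

# Hypothetical counts, invented for illustration.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values (PPV, NPV) also depend on how common the disease is in the group tested.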

[Figure: two ROC curves; y-axis: sensitivity, x-axis: 1 - specificity, both from 0.0 to 1.0. Plot note: diagonal segments are produced by ties.]
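For a continuous diagnostic test, an ROC curve is built by sliding the positivity threshold across the observed values and recording sensitivity against 1 - specificity at each step. The sketch below assumes that higher scores indicate disease; the scores, labels and function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return ROC points (1 - specificity, sensitivity), one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; a case is "test positive" when its
    # score is at or above the threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, labels) if s >= threshold and d == 1)
        fp = sum(1 for s, d in zip(scores, labels) if s >= threshold and d == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test scores; label 1 = disease present, 0 = disease absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points[-1], round(auc, 4))
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to perfect discrimination. When a diseased and a non-diseased case share the same score, both coordinates change at one threshold, which produces the diagonal segments mentioned in the plot note.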

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 6

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rats

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size,
putting it into a table, presenting it in
an appropriate chart, summarizing it
numerically, and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation
(What about numerical variables? Grouping)
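
The frequency measures listed above can be sketched in a few lines of Python. This is only an illustration: the observations are invented, and the helper name frequency_table is an assumption, not something from the lecture. Note that a cumulative frequency is most meaningful when the categories have a natural order (ordinal data); here the categories are simply sorted alphabetically.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations of one qualitative variable (not data from the slides).
complications = ["Nausea", "Pain", "Pain", "Vomiting", "Nausea",
                 "Pain", "Cough", "Pain", "Nausea", "Pain"]

def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = sum(counts.values())
    categories = sorted(counts)  # fixed (alphabetical) order for the rows
    running = accumulate(counts[c] for c in categories)
    return [(c, counts[c], counts[c] / total, cum)
            for c, cum in zip(categories, running)]

table = frequency_table(complications)
for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>4}{rel:>8.1%}{cum:>4}")
```

A cross tabulation would extend the same idea by counting pairs such as (group, complication) instead of single values.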

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Pie chart: Postoperative Complications (Nausea, Vomiting, Pain, Cough); slices of 23.5%, 27.5%, 9.8% and 39.2%]

Advantages:
1. Summarizes the data (area = relative frequency)
2. Shows magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Bar chart: Postoperative Complications; y-axis: Number of patients (0 to 25); x-axis: Nausea, Vomiting, Pain, Cough. Alternatives: bar width and spacing]

[Bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; bars for Group I and Group II]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; bars for Group I and Group II]

[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 100%; x-axis: Group I, Group II; segments: Cough, Pain, Vomiting, Nausea]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

One variable, no gaps between bars

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 40%; x-axis: Nausea, Vomiting, Pain, Cough; series for Group I and Group II]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10
 Mean = 4
 Median = 3
 Mode = 1
 Midrange = 5.5
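
As a quick check, the four measures of location can be computed for the slide's data set with Python's standard statistics module. This is a minimal sketch; the variable names are illustrative only. The midrange is taken as the average of the smallest and largest value, and the second call shows the "odd vs even" case for the median using the series 1-3-5-9.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the data set used on the slides

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the two extremes

# With an even number of values the median is the mean of the two middle ones.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```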

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22
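
The quartiles and percentiles of the 11-value series can be sketched with statistics.quantiles. Several conventions for computing percentiles exist and they can give slightly different cut points; the choice of method="inclusive" below is an assumption made for this illustration, not necessarily the convention used in the lecture.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # the 11-value series from the slide

# method="inclusive" treats the data as the whole population, so the minimum
# and maximum act as the 0th and 100th percentiles.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")   # quartiles
deciles = statistics.quantiles(data, n=10, method="inclusive")     # 10th ... 90th
p70, p90 = deciles[6], deciles[8]                                  # 7th and 9th deciles

print(q1, q2, q3, p70, p90)
```

The second quartile is the median, and the 70th and 90th percentiles are read off as the 7th and 9th deciles.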

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2 (dividing by n; 14 if dividing by n - 1)
 Standard deviation = 3.35 (3.74 if dividing by n - 1)
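
The measures of spread for the same data set can be reproduced with a short sketch. Two assumptions are worth flagging: the quartiles use the "inclusive" convention, and the variance of 11.2 corresponds to the population formula (dividing by n), while the sample formula (dividing by n - 1) gives a larger value.

```python
import math
import statistics

data = [1, 1, 3, 5, 10]  # the data set used on the slides

value_range = max(data) - min(data)

# Quartiles under the "inclusive" convention; other conventions may differ.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_var = statistics.pvariance(data)   # divides by n
pop_sd = statistics.pstdev(data)       # square root of pop_var
samp_var = statistics.variance(data)   # divides by n - 1
samp_sd = statistics.stdev(data)       # square root of samp_var

print(value_range, iqr, pop_var, round(pop_sd, 2), samp_var, round(samp_sd, 2))
```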

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
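
A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed is given below. The heart-rate values are invented, the function name is hypothetical, and the interval uses a large-sample normal approximation (z of about 1.96); for a sample this small a t-based interval would be somewhat wider.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, lower, upper) using a normal approximation."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # SD of the sample mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return mean, se, mean - z * se, mean + z * se

# Hypothetical heart rates (beats/min); not data from the lecture.
heart_rates = [72, 68, 75, 80, 66, 74, 71, 69, 77, 73]
mean, se, lower, upper = mean_confidence_interval(heart_rates)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

The sample mean is the statistic; the interval is the informed guess about where the population mean (the parameter) lies.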

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1.
Let the mean heart rate of all people
taking the new drug be μ2.
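
One way to test such a comparison can be sketched as follows, assuming the null hypothesis is that the two drugs give the same mean heart rate. The sketch uses a permutation test, chosen here only because it needs nothing beyond the standard library; it is not necessarily the test used in the lecture. The heart rates and the function name are invented for illustration.

```python
import random
import statistics

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test of H0: both groups share the same mean."""
    rng = random.Random(seed)
    observed = statistics.fmean(group_a) - statistics.fmean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign the group labels at random
        diff = statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add 1 to numerator and denominator so the p-value is never exactly 0.
    return observed, (extreme + 1) / (n_permutations + 1)

# Hypothetical heart rates (beats/min) after each drug; invented for illustration.
inderal = [70, 72, 68, 75, 71, 69, 73, 74]
new_drug = [64, 66, 61, 68, 65, 63, 67, 62]
diff, p_value = permutation_test(inderal, new_drug)
print(f"difference in means = {diff:.1f}, p = {p_value:.4f}")
```

A small p-value means that a difference this large would rarely arise if the group labels were irrelevant, which is evidence against the null hypothesis.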

Inferential statistics
(Informed guess)

 Type I & Type II Errors
                H0 is true       H0 is false
Accept H0       Good decision    Type II error
Reject H0       Type I error     Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study (power = 1 - β)

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Scatter plot with fitted line: Weight (kg) against Age (y); regression equation y = 2.03x + 8]
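
Prediction with the fitted line amounts to plugging a value of x into the equation. The sketch below hard-codes the slope and intercept from the slide's equation; the function name is hypothetical.

```python
def predict_weight_kg(age_years: float) -> float:
    """Predicted weight from the fitted line y = 2.03x + 8 shown on the slide."""
    slope, intercept = 2.03, 8.0
    return slope * age_years + intercept

# The plotted ages run from 0 to 14 years; predictions outside that range
# would be extrapolation and should be treated with caution.
for age in (2, 6, 10):
    print(age, round(predict_weight_kg(age), 1))
```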

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests

                   Disease present   Disease absent
Test positive      TP                FP
Test negative      FN                TN
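
The usual summary measures of a dichotomous diagnostic test follow directly from the four cells of this table. The sketch below uses invented counts and a hypothetical function name; sensitivity and specificity are the two quantities plotted on a ROC curve (as sensitivity against 1 - specificity).

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summary measures of a dichotomous test from the 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # disease-free patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly disease-free
    }

# Hypothetical counts, invented for illustration.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```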

[Two ROC curve panels: Sensitivity (0.0 to 1.0) plotted against 1 - Specificity (0.0 to 1.0). Diagonal segments are produced by ties.]

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 8

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Postoperative Complications

Number of patients (%)

100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%

Group I

Group II

Couph
Pain
Vomiting
Nausea

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

Number of patients (%)

Postoperative Complications
Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Mid range = 4.5

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th quartile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.,
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker is more
efficient than another conventional βblocker (e.g. Inderal) in decreasing heart
rate or not.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2

Inferential statistics
(Informed guess)

 Type I & Type II Errors
                           H0 is true       H0 is false
Decision: accept H0        Good decision    Type II error
Decision: reject H0        Type I error     Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study
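
To illustrate how sample size and power are linked, here is a sketch that approximates the power (1 - β) of a two-sided, two-sample z-test with equal group sizes. The effect size, standard deviation and group size are hypothetical, and the formula ignores the very small chance of rejecting H0 in the wrong direction.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    delta: true difference between the population means (assumed)
    sigma: common standard deviation of both groups (assumed known)
    """
    standard_normal = NormalDist()
    z_critical = standard_normal.inv_cdf(1 - alpha / 2)
    standard_error = sigma * sqrt(2 / n_per_group)
    return standard_normal.cdf(abs(delta) / standard_error - z_critical)


# Hypothetical planning values: detect a 5 beats/min difference, SD 10.
power = approximate_power(delta=5, sigma=10, n_per_group=64)
beta = 1 - power
print(f"power = {power:.2f}, beta = {beta:.2f}")
```

Raising n_per_group increases the power for the same α, which is why the sample size is usually chosen before the study starts.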

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Figure: scatter plot of Weight (kg, 0 to 45) against Age (y, 0 to 14) with fitted line y = 2.03x + 8]
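
The prediction chart shows a fitted line, y = 2.03x + 8, relating weight (kg) to age (years). A small sketch of how such a line could be estimated by least squares and then used for prediction follows. The data points are hypothetical and are placed exactly on the slide's line, so the fit simply recovers the slope and intercept; real data would scatter around the line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical ages (years) and weights (kg) lying on the slide's line.
ages = [2, 4, 6, 8, 10, 12]
weights = [2.03 * age + 8 for age in ages]

slope, intercept = fit_line(ages, weights)

# Predicted weight for a 7-year-old child using the fitted line.
predicted_weight = slope * 7 + intercept
print(f"y = {slope:.2f}x + {intercept:.2f}; predicted weight at 7 y = "
      f"{predicted_weight:.2f} kg")
```

Predictions are only reasonable inside the age range that was observed (roughly 0 to 14 years on the chart).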

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
                   Disease present         Disease absent
Test positive      TP (true positive)      FP (false positive)
Test negative      FN (false negative)     TN (true negative)
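
The four cells TP, FP, FN and TN are enough to compute the usual summary measures of a dichotomous diagnostic test. The sketch below uses hypothetical counts; the function name and the choice of measures are illustrative and not taken from the lecture.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Common accuracy measures from a 2x2 diagnostic table."""
    return {
        # Proportion of diseased patients the test detects.
        "sensitivity": tp / (tp + fn),
        # Proportion of disease-free patients the test clears.
        "specificity": tn / (tn + fp),
        # Probability of disease given a positive test.
        "ppv": tp / (tp + fp),
        # Probability of no disease given a negative test.
        "npv": tn / (tn + fn),
    }


# Hypothetical counts: 100 diseased and 100 disease-free patients.
summary = diagnostic_summary(tp=90, fp=20, fn=10, tn=80)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Unlike sensitivity and specificity, the predictive values depend on how common the disease is in the group tested.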

[Figure: two ROC curves, each plotting Sensitivity (0.0 to 1.0) against 1 - Specificity (0.0 to 1.0). Diagonal segments are produced by ties.]
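
For a continuous diagnostic test, a ROC curve plots sensitivity against 1 - specificity at every possible cut-off. The following sketch builds the curve and its area (AUC) by hand. The scores and disease labels are hypothetical, and the trapezoidal rule used for the area is one common choice rather than something specified in the lecture.

```python
def roc_points(scores, labels):
    """(1 - specificity, sensitivity) pairs, one per cut-off.

    A patient tests positive when score >= cut-off; labels are 1 for
    disease present and 0 for disease absent.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test results for 8 patients (4 diseased, 4 not).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a test no better than chance, and 1.0 to a test that separates the two groups perfectly.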

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 11

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]
[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 100%; one bar each for Group I and Group II, stacked by Cough, Pain, Vomiting, Nausea]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

One variable, no gaps between bars

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 40%; x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





 Mean = 4
 Median = 3
 Mode = 1
 Midrange = 5.5
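As a quick check of the four measures of location, here is a minimal sketch using Python's standard `statistics` module on the slide's data set. The midrange is taken as (minimum + maximum) / 2, which gives 5.5 for these values. The second data set, 1-3-5-9, shows the even-count case, where the median is the mean of the two middle values.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the data set used on the slides

mean = statistics.mean(data)              # sum of the values / number of values
median = statistics.median(data)          # middle value of the sorted data
mode = statistics.mode(data)              # most frequent value
midrange = (min(data) + max(data)) / 2    # halfway between the two extremes

# Even number of values: the median is the mean of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

Note how the single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected.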

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35
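The spread measures above can be reproduced with the standard `statistics` module; the sketch below makes two assumptions explicit. First, the slide's variance of 11.2 corresponds to the population formula (dividing by n); the sample formula (dividing by n - 1) gives 14 for the same data. Second, quartile conventions differ between textbooks and software; the "inclusive" method is used here because it reproduces the inter-quartile range of 4.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the data set used on the slides

value_range = max(data) - min(data)

# Quartiles: the "inclusive" convention is an assumption chosen to match the slide.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)      # divides by n
sample_variance = statistics.variance(data)    # divides by n - 1
pop_sd = statistics.pstdev(data)               # square root of pop_variance

print(value_range, iqr, pop_variance, sample_variance, round(pop_sd, 2))
```

Which variance formula is appropriate depends on whether the values are the whole population or a sample from it; most statistical software reports the n - 1 version by default.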

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
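A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed. The heart-rate sample is hypothetical, and the interval uses the normal approximation (about 1.96 standard errors either side of the mean); for a sample this small a t-distribution multiplier would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min); not data from the slides.
heart_rates = [68, 72, 75, 70, 66, 74, 71, 69, 73, 72]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)     # n - 1 in the denominator
standard_error = sample_sd / math.sqrt(n)     # SD of the sample mean

# Normal approximation: the 97.5th percentile of the standard normal is about 1.96.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean}, SE = {standard_error:.2f}, "
      f"95% CI = {ci_low:.1f} to {ci_high:.1f}")
```

The sample mean is a statistic; the interval is a statement about where the unknown population mean (the parameter) plausibly lies.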

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate or not?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2

Inferential statistics
(Informed guess)

 Type I & Type II Errors
The decision about H0    H0 is true       H0 is false
Accept                   Good decision    Type II error
Reject                   Type I error     Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study (power = 1 - β)
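To make the p-value definition concrete, here is a sketch of an exact permutation test for the β-blocker example. Everything in it is an assumption made for illustration: the heart rates are hypothetical, the null hypothesis is taken to be that both drugs give the same mean heart rate, and the permutation test is just one simple way to obtain a p-value (it is not necessarily the test the lecture has in mind). The p-value is the proportion of all possible relabellings of the patients that give a difference in means at least as extreme as the one observed.

```python
from itertools import combinations

# Hypothetical heart rates (beats/min) in two small groups; not data from the slides.
inderal = [72, 75, 78, 80]
new_drug = [64, 66, 68, 70]


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    n_b = len(group_b)
    total = sum(pooled)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    extreme = 0
    splits = 0
    # Under H0 the group labels are interchangeable, so every way of choosing
    # which values are labelled "group A" is equally likely.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = sum_a / n_a - (total - sum_a) / n_b
        splits += 1
        # Small tolerance guards against floating-point rounding in the comparison.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return extreme / splits


p_value = permutation_p_value(inderal, new_drug)
alpha = 0.05                      # accepted probability of a type I error
reject_h0 = p_value < alpha

print(f"p = {p_value:.4f}, reject H0: {reject_h0}")
```

With only four patients per group there are 70 possible relabellings, so the smallest two-sided p-value this design can ever produce is 2/70; this is one way to see why sample size limits the power of a study.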

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Scatter plot with fitted line: Weight (kg) on the y-axis, 0 to 45, against Age (y) on the x-axis, 0 to 14; regression equation y = 2.03x + 8]
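The regression line on the slide can be used for prediction, and a line of this kind is typically estimated by least squares. The sketch below shows both steps under stated assumptions: the age and weight pairs are hypothetical (chosen only so that the fitted line is close to the slide's), and `fit_line` and `predict_weight` are names invented for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares estimates of slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Apply the line shown on the slide: weight (kg) = 2.03 * age (y) + 8."""
    return slope * age_years + intercept


# Hypothetical ages (years) and weights (kg); not data from the slides.
ages = [2, 4, 6, 8, 10]
weights = [12, 16, 20, 25, 28]

slope, intercept = fit_line(ages, weights)
print(f"fitted line: y = {slope:.2f}x + {intercept:.1f}")
print(f"predicted weight at age 5: {predict_weight(5):.2f} kg")
```

Predictions are only trustworthy within the range of ages that was actually observed; extending the line far beyond it is extrapolation.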

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
The test is:    Disease present    Disease absent
Positive        TP                 FP
Negative        FN                 TN
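From the four cells of this table the usual summary measures of a dichotomous diagnostic test follow directly. The sketch below uses hypothetical counts, and `diagnostic_summary` is a name chosen for this example.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures computed from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients who test positive
        "specificity": tn / (tn + fp),   # healthy patients who test negative
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly healthy
    }


# Hypothetical counts; not data from the slides.
summary = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group being tested.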

[Two ROC curves: Sensitivity on the y-axis against 1 - Specificity on the x-axis, both from 0.0 to 1.0. Diagonal segments are produced by ties.]
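For a continuous diagnostic test, an ROC curve is obtained by trying every possible cut-off and plotting sensitivity against 1 - specificity. The following sketch illustrates the idea with hypothetical test scores; the function names are invented for this example, and a higher score is assumed to indicate disease. When a diseased and a healthy patient share the same score, both coordinates change at one cut-off, which is what produces the diagonal segments mentioned in the chart note.

```python
def roc_points(diseased_scores, healthy_scores):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off."""
    cutoffs = sorted(set(diseased_scores) | set(healthy_scores), reverse=True)
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in cutoffs:
        # A patient tests positive when the score is at or above the cut-off.
        tp = sum(score >= cutoff for score in diseased_scores)
        fp = sum(score >= cutoff for score in healthy_scores)
        points.append((fp / len(healthy_scores), tp / len(diseased_scores)))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC points (already ordered by cut-off)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test results; not data from the slides.
diseased = [0.9, 0.8, 0.7, 0.4]
healthy = [0.6, 0.3, 0.2, 0.1]

points = roc_points(diseased, healthy)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc}")
```

An area of 0.5 corresponds to a test no better than chance, and an area of 1.0 to a test that separates the two groups perfectly.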

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 14

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance of, and need for,
statistics in the medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
make a good decision.

[Diagram: collecting data → describing data → analyzing data → good decision]

Why do we need statistics?
Variability (atropine)
 Causes:
   Uncontrollable (too many) factors
   Immeasurable factors
   Unknown factors

Why do we need statistics?
Variability (atropine)
 Effects of variability:
   Large amount of data describing the same
    thing (many values for one variable)
   No certainty (deterministic vs probabilistic)
   Sampling

Functions of statistics
(new hypothetical β-blocker drug)
 Describe (Descriptive statistics)
 Infer (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The variables and the data

              Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories.
 Gender (dichotomous, binary)
   Male
   Female
 Type of ICU admission
   Medical
   Surgical
   Physical injuries
   Poisoning
   Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

A categorical variable whose values are ordered.
 Degree of illness
   Mild
   Moderate
   Severe
 Physical status
   ASA I
   ASA II
   ASA III
   ASA IV
   ASA V
 Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rats

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size,
putting it into a table, presenting it in
an appropriate chart, or summarizing it
numerically, and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation
(What about numerical variables? Group the values into classes first.)
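As a rough illustration of how frequency, relative frequency and cumulative frequency relate, the sketch below builds a small frequency table. The patient counts, their assignment to the four complication categories, and the helper name `frequency_table` are all assumptions made for the example; they are not data from the lecture.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical raw observations: one recorded complication per patient.
observations = (
    ["Nausea"] * 12 + ["Vomiting"] * 14 + ["Pain"] * 20 + ["Cough"] * 5
)


def frequency_table(values, order):
    """Return one row per category with its frequency, relative
    frequency (share of the total) and cumulative frequency."""
    counts = Counter(values)
    total = sum(counts.values())
    freqs = [counts[category] for category in order]
    relative = [f / total for f in freqs]
    cumulative = list(accumulate(freqs))  # running total of the counts
    return [
        {"category": c, "frequency": f, "relative": r, "cumulative": k}
        for c, f, r, k in zip(order, freqs, relative, cumulative)
    ]


table = frequency_table(observations, ["Nausea", "Vomiting", "Pain", "Cough"])
for row in table:
    print(
        f"{row['category']:<10}{row['frequency']:>4}"
        f"{row['relative']:>8.1%}{row['cumulative']:>4}"
    )
```

Cumulative frequency is most informative when the categories have a natural order (for example mild, moderate, severe); for unordered categories the running total depends on the order chosen.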

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Figure: pie chart, "Postoperative Complications". Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 27.5%, 9.8%, 39.2%.]

Advantages:
1. Summarize (area represents relative frequency)
2. Show magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Figure: simple bar chart, "Postoperative Complications". Y-axis: number of patients (0 to 25). X-axis: Nausea, Vomiting, Pain, Cough. Annotations: alternative, width, spacing.]

[Figure: simple bar chart, "Postoperative Complications". Y-axis: number of patients (%), 0% to 45%. X-axis: Nausea, Vomiting, Pain, Cough.]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Figure: clustered bar chart, "Postoperative Complications", Group I vs Group II. Y-axis: number of patients (%), 0% to 45%. X-axis: Nausea, Vomiting, Pain, Cough.]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Figure: clustered bar chart, "Postoperative Complications", Group I vs Group II. Y-axis: number of patients (%), 0% to 45%. X-axis: Nausea, Vomiting, Pain, Cough.]

[Figure: stacked bar chart, "Postoperative Complications". Y-axis: number of patients (%), 0% to 100%. X-axis: Group I, Group II. Stacks: Cough, Pain, Vomiting, Nausea.]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

One variable, no gaps between bars

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Figure: "Postoperative Complications" chart for Group I and Group II. Y-axis: number of patients (%), 0% to 40%. X-axis: Nausea, Vomiting, Pain, Cough.]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10
 Mean = 4
 Median = 3
 Mode = 1
 Midrange = 5.5
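The four measures of central tendency can be checked with Python's standard `statistics` module. This is only a sketch on the slide's data set 1, 1, 3, 5, 10; the variable names are arbitrary, and the midrange is computed directly as (minimum + maximum) / 2 because the module has no function for it.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)              # sum of values / number of values
median = statistics.median(data)          # middle value of the sorted data
mode = statistics.mode(data)              # most frequent value
midrange = (min(data) + max(data)) / 2    # halfway between the extremes

# With an even number of values the median is the average of the
# two middle values ("odd vs even").
even_median = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, even_median)
```

The mean uses every value and is therefore pulled toward the extreme value 10, while the median and mode are not.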

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35
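The measures of spread for the same data set can be reproduced with a short sketch. Two conventions are assumed here because they match the values listed: the quartiles use the "inclusive" method of `statistics.quantiles`, and the variance divides by n (population variance). Dividing by n - 1 (sample variance) would give a different number, shown for comparison.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles with the inclusive method (the median counts in both halves).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population variance and standard deviation (divide by n).
variance = statistics.pvariance(data)
sd = statistics.pstdev(data)

# Sample variance (divide by n - 1), shown only for comparison.
sample_variance = statistics.variance(data)

print(value_range, iqr, variance, round(sd, 2), sample_variance)
```

Statistical packages differ in their default quartile method, so the inter-quartile range of a small data set can vary between programs.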

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
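A minimal sketch of the standard error of the mean and an approximate 95% confidence interval follows. It assumes the large-sample normal approximation (multiplier 1.96); for small samples a t-distribution multiplier would be more appropriate. The heart-rate values and the function name `mean_ci_95` are hypothetical.

```python
import math
import statistics


def mean_ci_95(sample):
    """Return (mean, standard error, (lower, upper)) for a sample,
    using the normal approximation for the 95% interval."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error = sample standard deviation / sqrt(n)
    se = statistics.stdev(sample) / math.sqrt(n)
    z = 1.96  # normal quantile covering the central 95%
    return mean, se, (mean - z * se, mean + z * se)


# Hypothetical heart rates (beats per minute) from a sample of patients.
heart_rates = [72, 75, 68, 80, 77, 70, 74, 69, 73, 76, 71, 78]
mean, se, (lower, upper) = mean_ci_95(heart_rates)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

The standard error shrinks as the sample size grows, so larger samples give narrower intervals around the same mean.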

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate, or not?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1.
Let the mean heart rate of all people
taking the new drug be μ2.

Inferential statistics
(Informed guess)

 Type I & Type II errors

                     H0 is true       H0 is false
Accept H0            Good decision    Type II error
Reject H0            Type I error     Good decision

Inferential statistics
(Informed guess)

 : Probability of conducting type I error
 : Probability of conducting type II error
 p–value: the probability of getting the

outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study
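The definition of the p-value can be made concrete with a small exact calculation. The scenario is hypothetical: heart rate falls in 9 of 10 patients given a drug, and the null hypothesis says a fall and a rise are equally likely (probability 0.5). The one-sided p-value is then the probability of 9 or more falls out of 10 under that null hypothesis. The function name `binomial_p_value_upper` is an assumption for this sketch.

```python
from math import comb


def binomial_p_value_upper(successes, trials, p_null=0.5):
    """One-sided exact p-value: probability of observing `successes`
    or more out of `trials` if the null probability is `p_null`."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical result: heart rate decreased in 9 of 10 patients.
p_value = binomial_p_value_upper(9, 10)
alpha = 0.05  # conventional type I error rate, chosen before the study
print(f"p = {p_value:.4f}, reject H0: {p_value < alpha}")
```

A p-value below the chosen α leads to rejecting the null hypothesis; it is not the probability that the null hypothesis is true.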

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Figure: scatter plot of weight (kg, 0 to 45) against age (years, 0 to 14) with fitted regression line y = 2.03x + 8]
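Prediction with a fitted line can be sketched with ordinary least squares. The ages below are hypothetical and the weights are synthetic, generated exactly from the line y = 2.03x + 8 shown on the slide, so the fit simply recovers that slope and intercept; real data would scatter around the line. The function name `fit_line` is an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic data lying exactly on the slide's regression line.
ages = [2, 4, 6, 8, 10, 12]
weights = [2.03 * age + 8 for age in ages]

slope, intercept = fit_line(ages, weights)
predicted = slope * 7 + intercept  # predicted weight (kg) at age 7
print(f"weight = {slope:.2f} * age + {intercept:.2f}; at age 7: {predicted:.2f} kg")
```

Predictions are only trustworthy inside the age range covered by the data used for the fit.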

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests

                    Disease present    Disease absent
Test positive       TP                 FP
Test negative       FN                 TN
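The usual summary measures of a dichotomous diagnostic test follow directly from the four cells of this table. The counts in the sketch are hypothetical and the function name `diagnostic_summary` is an assumption.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are true positives
        "npv": tn / (tn + fn),          # negative tests that are true negatives
    }


# Hypothetical counts: 50 diseased and 100 healthy patients.
summary = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.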

[Figure: two ROC curves. Y-axis: sensitivity (0.0 to 1.0). X-axis: 1 - specificity (0.0 to 1.0). Note on each plot: diagonal segments are produced by ties.]
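For a continuous diagnostic test, each possible cut-off gives one pair (1 - specificity, sensitivity), and joining those pairs gives the ROC curve. The sketch below computes the points and the area under the curve by the trapezoid rule. The scores, the disease labels and the function names `roc_points` and `auc` are hypothetical.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off.
    A patient is called test-positive when score >= cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


# Hypothetical test results; label 1 = disease present, 0 = absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
print(points)
print(f"AUC = {auc(points):.3f}")
```

An area of 0.5 corresponds to a test no better than chance, and an area of 1.0 to a test that separates diseased from healthy patients perfectly.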

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 16

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Postoperative Complications

Number of patients (%)

100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%

Group I

Group II

Couph
Pain
Vomiting
Nausea

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

Number of patients (%)

Postoperative Complications
Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Mid range = 4.5

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22
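Quartiles and percentiles depend on the interpolation convention, and the slides do not say which one is intended. The sketch below assumes the default "exclusive" method of Python's `statistics.quantiles`; other software may return slightly different cut points for the same data.

```python
from statistics import quantiles

# The 11-value example from the slide, already sorted.
data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# n=4 returns the three quartile cut points (Q1, median, Q3).
q1, q2, q3 = quantiles(data, n=4)

# n=10 returns nine cut points: the 10th, 20th, ..., 90th percentiles.
deciles = quantiles(data, n=10)
p70 = deciles[6]
p90 = deciles[8]

print(f"Q1={q1}, median={q2}, Q3={q3}, P70={p70}, P90={p90}")
```

Under this convention the quartiles fall on observed values (5, 9 and 17), while the 70th and 90th percentiles are interpolated between neighbouring values.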

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35
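The figures on this slide can be reproduced with a short sketch using Python's `statistics` module. Two assumptions are needed to match them: the variance of 11.2 corresponds to the population formula (dividing by n), and the inter-quartile range of 4 corresponds to the "inclusive" quartile method. The sample formula (dividing by n - 1) is shown alongside for comparison.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles with the "inclusive" method (an assumption about the convention).
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

population_variance = pvariance(data)  # sum of squared deviations / n
population_sd = pstdev(data)           # square root of the population variance
sample_variance = variance(data)       # sum of squared deviations / (n - 1)

print(value_range, iqr, population_variance, round(population_sd, 2), sample_variance)
```

When the five values are treated as a sample from a larger population, the n - 1 version (14 here) is the usual choice.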

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
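As a rough illustration, the standard error of the mean and an approximate 95% confidence interval can be computed for the five-value example used earlier in the deck. The sketch assumes a normal approximation (multiplier of about 1.96) to stay within the standard library; with only five observations a t-based multiplier would be wider and more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

sample = [1, 1, 3, 5, 10]
n = len(sample)

sample_mean = mean(sample)
standard_error = stdev(sample) / sqrt(n)  # sample SD / sqrt(n)

# Two-sided 95% multiplier from the standard normal distribution.
z = NormalDist().inv_cdf(0.975)

ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean={sample_mean}, SE={standard_error:.2f}, 95% CI=({ci_low:.2f}, {ci_high:.2f})")
```

The interval is wide because the sample is small and variable; a larger sample shrinks the standard error and therefore the interval.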

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
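With the two population means named, the null hypothesis is that they are equal and the alternative is that they differ. The sketch below tests this with a permutation test, chosen only because it needs nothing beyond the standard library; a two-sample t-test would be the more conventional choice. The heart-rate values are invented for illustration and do not come from the slides.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference and the share of random relabelings
    whose difference is at least as extreme (an estimate of the p-value).
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel patients at random, as if H0 were true
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations


# Hypothetical heart rates (beats/min); not data from the presentation.
inderal = [72, 75, 70, 74, 73, 71, 76, 72]
new_drug = [68, 70, 66, 71, 69, 67, 70, 68]

observed, p_value = permutation_test(inderal, new_drug)
print(f"observed difference = {observed:.2f} beats/min, p = {p_value:.4f}")
```

A small p-value means a difference this large would rarely arise if the two drugs had the same mean effect, so the null hypothesis would be rejected at the usual 5% level.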

Inferential statistics
(Informed guess)

 Type I & Type II Errors
                     H0 is true       H0 is false
Accept H0            Good decision    Type II error
Reject H0            Type I error     Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study
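The link between the type I error rate, the type II error rate and power can be illustrated by simulation. The sketch below is a simplified setup under stated assumptions: two groups of 30 patients, normally distributed heart rates with a known SD of 10, and a two-sided z-test at the 5% level. All of these numbers are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def rejection_rate(true_difference, n_per_group=30, sigma=10.0,
                   alpha=0.05, n_trials=2000, seed=1):
    """Share of simulated studies in which H0 (no difference) is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma * sqrt(2 / n_per_group)  # SE of the difference in means
    rejections = 0
    for _ in range(n_trials):
        a = [rng.gauss(70.0, sigma) for _ in range(n_per_group)]
        b = [rng.gauss(70.0 - true_difference, sigma) for _ in range(n_per_group)]
        z = (sum(a) / n_per_group - sum(b) / n_per_group) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_trials


type_i_rate = rejection_rate(true_difference=0.0)  # H0 true: rejections are type I errors
power = rejection_rate(true_difference=5.0)        # H0 false: rejections are correct
type_ii_rate = 1 - power

print(f"type I rate ~ {type_i_rate:.3f}, power ~ {power:.3f}, type II rate ~ {type_ii_rate:.3f}")
```

When there is truly no difference, the rejection rate stays close to the chosen 5%. With a true difference of 5 beats/min, 30 patients per group detect it only about half of the time, which is why sample size is planned around the desired power.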

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Figure: scatter plot of weight (kg, 0 to 45) against age (years, 0 to 14) with the fitted line y = 2.03x + 8.]
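The fitted line from the chart, y = 2.03x + 8, can be used for prediction. In this minimal sketch the function name and the range check are illustrative additions; only the slope, the intercept and the 0 to 14 year age range are taken from the figure.

```python
SLOPE = 2.03      # kg gained per year of age, from the fitted line
INTERCEPT = 8.0   # predicted weight (kg) at age 0, from the fitted line


def predict_weight(age_years):
    """Predict weight (kg) from age (years) using the fitted line."""
    if not 0 <= age_years <= 14:
        # The line was fitted on ages 0 to 14; outside that range it is extrapolation.
        raise ValueError("age outside the range shown on the chart (0 to 14 years)")
    return SLOPE * age_years + INTERCEPT


for age in (2, 6, 10):
    print(f"age {age} y -> predicted weight {predict_weight(age):.1f} kg")
```

The slope reads as "about 2 kg per additional year of age"; the prediction is an average for that age, not a statement about any individual child.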

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
                   Disease present   Disease absent
Test positive      TP                FP
Test negative      FN                TN
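The four cells (true positives TP, false positives FP, false negatives FN, true negatives TN) give the usual accuracy measures of a dichotomous test. The counts in this sketch are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }


# Hypothetical counts; not data from the presentation.
result = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group being tested.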

[Figure: two ROC curves, each plotting sensitivity (0.0 to 1.0) against 1 - specificity (0.0 to 1.0). Note on each plot: diagonal segments are produced by ties.]
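For a continuous test, an ROC curve is built by trying every possible cut-off and plotting sensitivity against 1 - specificity. The sketch below does this for a small set of invented scores and estimates the area under the curve with the trapezoid rule; the function names and data are illustrative assumptions.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per distinct cut-off.

    labels: 1 = disease present, 0 = disease absent.
    A subject is called positive when the score is >= the cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoid-rule area under a curve given as ordered (x, y) points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test scores and true disease status.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

curve = roc_points(scores, labels)
auc = area_under_curve(curve)
print(curve)
print(f"AUC = {auc}")
```

If a diseased and a non-diseased subject share the same score, one cut-off moves the curve up and to the right in a single step, which is what produces a diagonal segment on the plot. An area of 0.5 corresponds to a test no better than chance and 1.0 to perfect separation.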

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 17

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in the medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.

[Diagram: collecting data, describing data and analyzing data, leading to a good decision]

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)
 Describe (Descriptive statistics)
 Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The variables (row labels) and the data (cell values)

              Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories

 Gender (dichotomous, binary)
 Male
 Female

 Type of ICU admission
 Medical
 Surgical
 Physical injuries
 Poisoning
 Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered

 Degree of illness
 Mild
 Moderate
 Severe

 Physical status
 ASA I
 ASA II
 ASA III
 ASA IV
 ASA V

 Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rats

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation
(What about numerical variables? Grouping.)
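A frequency table for a qualitative variable can be sketched with the standard library. The counts below are hypothetical: they were chosen so that the percentages match those on the deck's pie chart of postoperative complications, but which complication has which count is an assumption.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations: one entry per patient.
observations = ["nausea"] * 12 + ["vomiting"] * 14 + ["pain"] * 20 + ["cough"] * 5

counts = Counter(observations)  # frequency of each category
total = sum(counts.values())

categories = list(counts)       # categories in order of first appearance
frequencies = [counts[c] for c in categories]
relative = [f / total for f in frequencies]   # relative frequency
cumulative = list(accumulate(frequencies))    # running total of frequencies

print(f"{'Category':<10}{'n':>4}{'%':>8}{'Cum. n':>8}")
for c, f, r, cum in zip(categories, frequencies, relative, cumulative):
    print(f"{c:<10}{f:>4}{r:>8.1%}{cum:>8}")
```

For a numerical variable the same table can be built after grouping the values into class intervals (for example age bands), which is the grouping step the slide alludes to.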

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Figure: pie chart "Postoperative Complications" with four slices (nausea, vomiting, pain, cough) of 23.5%, 27.5%, 9.8% and 39.2%.]

Advantages:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation (“separate pies”)

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Figure: simple bar chart "Postoperative Complications". Y-axis: number of patients, 0 to 25. X-axis: nausea, vomiting, pain, cough. Annotations: alternative, width, spacing.]

[Figure: simple bar chart "Postoperative Complications". Y-axis: number of patients (%), 0% to 45%. X-axis: nausea, vomiting, pain, cough.]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Figure: clustered bar chart "Postoperative Complications". Y-axis: number of patients (%), 0% to 45%. X-axis: nausea, vomiting, pain, cough. Series: Group I, Group II.]

Slide 19

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness
 Mild
 Moderate
 Severe

Physical status
 ASA I
 ASA II
 ASA III
 ASA IV
 ASA V

Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rats

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size,
putting it into a table, presenting it in
an appropriate chart, summarizing it
numerically, and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation
(what about numerical variables?, grouping)
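As a rough sketch of how these four summaries might be computed, the example below tabulates an ordinal variable (degree of illness) and cross-tabulates it against gender. The patient records are hypothetical and are not data from the lecture; the function names are likewise assumptions made for illustration.

```python
from collections import Counter

# Hypothetical records (degree of illness, gender); not data from the lecture.
patients = [
    ("Mild", "Female"), ("Mild", "Male"), ("Moderate", "Female"),
    ("Severe", "Male"), ("Mild", "Female"), ("Moderate", "Male"),
    ("Moderate", "Female"), ("Mild", "Male"), ("Severe", "Female"),
    ("Mild", "Female"),
]
ORDER = ["Mild", "Moderate", "Severe"]  # ordinal scale, lowest first


def frequency_table(values, order):
    """Return rows of (category, frequency, relative, cumulative relative)."""
    counts = Counter(values)
    total = len(values)
    rows, running = [], 0
    for category in order:
        running += counts[category]
        rows.append((category, counts[category],
                     counts[category] / total, running / total))
    return rows


def cross_tabulate(pairs):
    """Count each (row category, column category) combination."""
    return Counter(pairs)


table = frequency_table([illness for illness, _ in patients], ORDER)
crosstab = cross_tabulate(patients)

for category, freq, rel, cum in table:
    print(f"{category:<9}{freq:>3}{rel:>7.0%}{cum:>7.0%}")
for illness in ORDER:
    print(illness, {g: crosstab[(illness, g)] for g in ("Female", "Male")})
```

Cumulative frequency is only meaningful when the categories have an order, which is why an ordinal variable is used here. A numerical variable would first have to be grouped into intervals before it could be tabulated in the same way.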

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Pie chart: Postoperative Complications; categories: Nausea, Vomiting, Pain, Cough; slices: 23.5%, 27.5%, 9.8%, 39.2%]

Advantages:
1. Summarizes the data (area = relative frequency)
2. Shows magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity with more than 4-5 categories
3. No cross tabulation ("separate pies")

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Bar chart: Postoperative Complications; y-axis: Number of patients (0 to 25); x-axis: Nausea, Vomiting, Pain, Cough. Alternative: width, spacing]

[Bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]

[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%) (0% to 100%); x-axis: Group I, Group II; stacked categories: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Frequency polygon: Postoperative Complications; y-axis: Number of patients (%); x-axis: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
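The four measures of location can be checked on the slide's sample with Python's standard `statistics` module. This is a minimal sketch; the variable names are chosen here for illustration, and the second sample (1, 3, 5, 9) is the even-sized example from the median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the sample used on the slides

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the extremes

# With an even number of values, the median is the mean of the two middle ones.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The single large value 10 pulls the mean and the midrange upward, while the median and the mode are unaffected by it.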

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2 (dividing by n; 14 if dividing by n − 1)
 Standard deviation = 3.35 (3.74 if dividing by n − 1)
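These measures of spread can be reproduced with the standard `statistics` module, as sketched below. Two assumptions are worth stating: quartiles are computed with the "inclusive" method, which matches the slide's inter-quartile range of 4 (other quartile conventions give different values for so small a sample), and the slide's variance of 11.2 corresponds to dividing by n rather than n − 1.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the sample used on the slides

value_range = max(data) - min(data)

# Quartiles: the three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

population_variance = statistics.pvariance(data)  # divides by n
population_sd = statistics.pstdev(data)
sample_variance = statistics.variance(data)       # divides by n - 1
sample_sd = statistics.stdev(data)

print(value_range, iqr, population_variance, round(population_sd, 2))
print(sample_variance, round(sample_sd, 2))
```

When the data are a sample used to estimate a population's spread, the n − 1 version is the one usually reported.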

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
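A minimal sketch of how the standard error of a mean and an approximate 95% confidence interval might be computed is given below. The heart-rate values are hypothetical, the function name is an assumption, and the interval uses the normal approximation (z ≈ 1.96); for a sample this small a t-based multiplier would give a somewhat wider interval.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample SD divided by the square root of n.
    se = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)


# Hypothetical heart rates (beats/min); not data from the lecture.
heart_rates = [72, 75, 68, 80, 77, 70, 74, 69, 73, 76]
mean, se, (low, high) = mean_confidence_interval(heart_rates)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({low:.1f}, {high:.1f})")
```

The sample mean is the statistic, the unknown population mean is the parameter, and the interval expresses how precisely the former estimates the latter.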

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.,
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate, or not?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
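One way such a comparison might be tested is sketched below, taking the null hypothesis to be that the two population means are equal (H0: μ1 = μ2) against the alternative that they differ. The test shown is an exact permutation test, chosen here because it needs only the standard library; it is not necessarily the test used in the lecture, and the heart-rate values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    relabelings = 0
    # Under H0 the group labels are interchangeable, so every way of
    # choosing which values belong to group A is equally likely.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        if difference >= observed - 1e-12:
            extreme += 1
        relabelings += 1
    return extreme / relabelings


# Hypothetical heart rates (beats/min) after each drug; not data from the lecture.
inderal = [66, 70, 68, 72, 69, 71]
new_drug = [60, 63, 61, 65, 62, 64]
p = permutation_p_value(inderal, new_drug)
print(f"p = {p:.4f}")
```

The returned value is the proportion of relabelings that produce a difference at least as large as the one observed, which is the p-value under the null hypothesis of no difference.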

Inferential statistics
(Informed guess)

 Type I & Type II Errors
                               The H0 is:
                               True             False
Decision about H0:   Accept    Good decision    Type II error
                     Reject    Type I error     Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Scatter plot with fitted line: Weight (kg) against Age (y); y = 2.03x + 8]
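The fitted line from the chart, y = 2.03x + 8, can be used for prediction as sketched below. The function names are assumptions, and `fit_line` is an ordinary least-squares fit included only to show where a slope and intercept of this kind come from; the original data behind the chart are not available.

```python
from statistics import mean


def predict_weight_kg(age_years, slope=2.03, intercept=8.0):
    """Predict weight from age using the line shown on the slide."""
    return slope * age_years + intercept


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mean_x, mean_y = mean(xs), mean(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


print(f"Predicted weight at age 6: {predict_weight_kg(6):.2f} kg")
```

Predictions are only trustworthy within the range of ages used to fit the line, which on the chart is roughly 0 to 14 years.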

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
                          The disease (outcome) is:
                          Present    Absent
The test is:   Positive   TP         FP
               Negative   FN         TN
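The usual summary measures of a dichotomous diagnostic test follow directly from the four cells of this table. The sketch below uses hypothetical counts, and the function name is an assumption.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Summary measures derived from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }


# Hypothetical counts; not data from the lecture.
measures = diagnostic_measures(tp=45, fp=10, fn=5, tn=90)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.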

[Two ROC curves: Sensitivity against 1 - Specificity, both axes from 0.0 to 1.0. Diagonal segments are produced by ties.]
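For a continuous diagnostic test, each possible cut-off yields one pair of sensitivity and 1 − specificity, and plotting these pairs gives the ROC curve. The sketch below shows one way this might be computed; the scores and disease labels are hypothetical, and the function names are assumptions.

```python
def roc_points(scores, diseased):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off."""
    positives = sum(diseased)
    negatives = len(diseased) - positives
    points = [(0.0, 0.0)]
    # A result is called positive when the score is at or above the cut-off.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, diseased) if d and s >= cutoff)
        fp = sum(1 for s, d in zip(scores, diseased) if not d and s >= cutoff)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test scores and true disease status; not data from the lecture.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
diseased = [True, True, False, True, False, False]
points = roc_points(scores, diseased)
print(points)
print(f"AUC = {area_under_curve(points):.3f}")
```

When diseased and non-diseased patients share the same score, both rates rise at that cut-off, which is what produces a diagonal segment on the curve.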

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
The disease (outcome) is:

The test Positive
is:
Negative

Present

Absent

TP

FP

FN

TN

ROC Curve

1.0

1.0

0.8

0.8

0.6

0.6

Sensitivity

Sensitivity

ROC Curve

0.4

0.2

0.4

0.2

0.0

0.0
0.0

0.2

0.4

0.6

0.8

1 - Specificity
Diagonal segments are produced by ties.

1.0

0.0

0.2

0.4

0.6

0.8

1 - Specificity
Diagonal segments are produced by ties.

1.0

Inferential statistics
(Informed guess)

 Some example of testing of hypothesis?
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 22

Statistics

When guessing is an informed decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of Medicine
Ain Shams University

Objectives
• Define statistics and understand its terminology
• Discuss the importance of, and need for, statistics in the medical field
• Distinguish types of data and variables
• Describe types of statistics and statistical tests

What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.

[Diagram: Collecting data → Describing data → Analyzing data → Good decision]

Why do we need statistics?
Variability (atropine)
• Causes:
  - Uncontrollable (too many) factors
  - Immeasurable factors
  - Unknown factors

Why do we need statistics?
Variability (atropine)
• Effect of variability
  - Large amount of data describing the same thing (many values for one variable)
  - No certainty ("deterministic vs. probabilistic")
  - Sampling

Functions of statistics
(new hypothetical β-blocker drug)
• Describe (descriptive statistics)
• Infer (inferential statistics)

Variables & Data
• A variable is something whose value can vary. For example, age, gender and blood type are variables.
• Data are the values you get when you measure a variable.

The variables ... and the data

             Mrs Brown   Mr Patel   Ms Manda
Age          32          24         20
Gender       Female      Male       Female
Blood type   O           O          A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories.
• Gender (dichotomous, binary)
  - Male
  - Female
• Type of ICU admission
  - Medical
  - Surgical
  - Physical injuries
  - Poisoning
  - Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable
A categorical variable whose values are ordered.
• Degree of illness
  - Mild
  - Moderate
  - Severe
• Physical status
  - ASA I
  - ASA II
  - ASA III
  - ASA IV
  - ASA V
• Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
• Episodes of myocardial ischemia

b) Continuous (interval or ratio) variable
• Cardiac index
• Creatinine clearance

Types of variables
Changing data scales

Sample and Population
• A sample is a group (subset) taken from a population.
• A population is the group of ALL individuals (entities) sharing specific characteristics.

Sample and Population
• All human beings
• Day-case surgical patients undergoing general anesthesia
• Low-risk CABG surgery patients
• ICU patients with septic shock
• Women undergoing CS under spinal anesthesia
• Human skeletal muscle fibers
• Cardiac muscles of rats

Descriptive statistics
• Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
• This may mean sorting the data by size, putting it into a table, presenting it in an appropriate chart, summarizing it numerically, and so on.

Descriptive statistics
• Qualitative variables
  - Frequency
  - Relative frequency
  - Cumulative frequency
  - Cross tabulation
  (What about numerical variables? Grouping)
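
The frequency measures listed above can be sketched in a few lines of Python. The category labels and counts here are made-up illustrative data, not the figures behind the charts in this deck.

```python
from collections import Counter

# Hypothetical observations of one qualitative variable (one value per patient).
observations = ["Nausea"] * 4 + ["Vomiting"] * 3 + ["Pain"] * 2 + ["Cough"] * 1

counts = Counter(observations)   # frequency of each category
total = sum(counts.values())

table = []
cumulative = 0
for category, freq in counts.most_common():
    cumulative += freq
    # (category, frequency, relative frequency, cumulative relative frequency)
    table.append((category, freq, freq / total, cumulative / total))

for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>4}{rel:>8.0%}{cum:>8.0%}")
```

Cumulative frequency is most meaningful when the categories have a natural order (ordinal data) or when numerical data have been grouped into classes.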

Descriptive statistics
• Qualitative variables
  - In terms of describing data, an appropriate chart is almost always a good idea.
  - What "appropriate" means depends primarily on the type of data, as well as on which particular features of it you want to explore.
  - Finally, a chart can often be used to illustrate or explain a complex situation for which a form of words or a table might be clumsy, lengthy or otherwise inadequate.

Descriptive statistics
• Qualitative variables
  - Charts
    - Pie chart

[Pie chart: Postoperative Complications (Nausea, Vomiting, Pain, Cough); slice labels: 23.5%, 27.5%, 9.8%, 39.2%]

Advantages:
1. Summarizes the data (area = relative frequency)
2. Shows magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity with more than 4-5 categories
3. No cross tabulation ("separate pies")

Descriptive statistics
• Qualitative variables
  - Charts
    - Pie chart
    - Bar chart
      - Simple

[Bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (0 to 25); annotations: alternative, width, spacing]

[Bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%)]

Descriptive statistics
• Qualitative variables
  - Charts
    - Pie chart
    - Bar chart
      - Simple
      - Clustered bar chart

[Clustered bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%); series: Group I, Group II]

Descriptive statistics
• Qualitative variables
  - Charts
    - Pie chart
    - Bar chart
      - Simple
      - Clustered bar chart
      - Stacked bar chart

[Bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%); series: Group I, Group II]

[Stacked bar chart: Postoperative Complications; x-axis: Group I, Group II; y-axis: Number of patients (%) (0% to 100%); segments: Cough, Pain, Vomiting, Nausea]

Descriptive statistics
• Qualitative variables
  - Charts
    - Pie chart
    - Bar chart
      - Simple
      - Clustered bar chart
      - Stacked bar chart
    - Histogram (one variable, no gaps between bars)

Descriptive statistics
• Qualitative variables
  - Charts
    - Pie chart
    - Bar chart
      - Simple
      - Clustered bar chart
      - Stacked bar chart
    - Histogram
    - Frequency polygon
    - Cumulative frequency polygon (ogive)

[Line chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 40%); series: Group I, Group II]

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
    - Mean (average)

Example data: 1 – 1 – 3 – 5 – 10

Advantages – Disadvantages – Ratio

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
    - Median

Odd number of values: 1 – 1 – 3 – 5 – 10
Even number of values: 1 – 3 – 5 – 9

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
    - Mode

Example data: 1 – 1 – 3 – 5 – 10

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
    - Midrange

Example data: 1 – 1 – 3 – 5 – 10

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)

Example data: 1 – 1 – 3 – 5 – 10
• Mean = 4
• Median = 3
• Mode = 1
• Midrange = 5.5
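
The four measures of location can be checked for the example data with Python's standard `statistics` module. This is a minimal sketch; the midrange is computed by hand as the average of the smallest and largest values, since the module has no function for it.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)             # sum of the values divided by their count
median = statistics.median(data)         # middle value of the sorted data
mode = statistics.mode(data)             # most frequent value
midrange = (min(data) + max(data)) / 2   # halfway between the extremes

# With an even number of values the median is the mean of the two middle ones.
even_median = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, even_median)
```

The single large value (10) pulls the mean and the midrange upward, while the median and the mode are unaffected by it.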

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
  - Measures of degree of dispersion (summary measures of spread)
    - Range

Example data: 1 – 1 – 3 – 5 – 10

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
  - Measures of degree of dispersion (summary measures of spread)
    - Percentiles and quartiles

Example data: 1 – 1 – 3 – 5 – 10 (1st quartile and 3rd quartile marked)
Example data: 1 – 2 – 5 – 6 – 8 – 9 – 11 – 13 – 17 – 20 – 22 (70th percentile and 90th percentile marked)

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
  - Measures of degree of dispersion (summary measures of spread)
    - Inter-quartile range (Q3 – Q1)

Example data: 1 – 1 – 3 – 5 – 10 (1st quartile and 3rd quartile marked)

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
  - Measures of degree of dispersion (summary measures of spread)
    - Variance and standard deviation

Example data: 1 – 1 – 3 – 5 – 10

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
  - Measures of degree of dispersion (summary measures of spread)

Example data: 1 – 1 – 3 – 5 – 10
• Range = 9
• Inter-quartile range = 4
• Variance = 11.2
• Standard deviation = 3.35
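
The measures of spread for the example data can be reproduced with the standard `statistics` module. This is a sketch under two assumptions, because the results depend on the convention used: quartiles are computed with the "inclusive" interpolation method, and variance and standard deviation use the population formulas (dividing by n). With those choices the values match the ones listed for this data set; the sample variance (dividing by n - 1) is shown for comparison.

```python
import statistics

data = [1, 1, 3, 5, 10]

data_range = max(data) - min(data)

# Quartiles; other quartile conventions can give a different Q1 and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance = statistics.pvariance(data)        # population variance: divides by n
sd = statistics.pstdev(data)                 # square root of the population variance
sample_variance = statistics.variance(data)  # sample variance: divides by n - 1

print(data_range, iqr, variance, round(sd, 2), sample_variance)
```

When the data are a sample used to estimate a population value, the n - 1 version is the usual choice.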

Descriptive statistics
• Numerical variables

Inferential statistics
(Informed guess)

• Making inferences about population parameters from sample statistics
  - Standard error (the SD of the statistic)
• Confidence interval (95% CI)
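
As an illustration of the standard error and a 95% confidence interval for a mean, here is a minimal Python sketch. The heart-rate values are hypothetical, and the interval uses the normal approximation (z of about 1.96); for a sample this small a t-distribution quantile would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats per minute).
sample = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)    # sample SD (divides by n - 1)
se = sd / math.sqrt(n)           # standard error of the mean

# Normal approximation: the 97.5th percentile of the standard normal.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean = {mean}, SE = {se:.2f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The standard error shrinks as the sample size grows, so larger samples give narrower intervals around the same mean.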

Inferential statistics
(Informed guess)

• Hypothesis testing
  - Almost all clinical research begins with a question.
  - For example, is stress a risk factor for breast cancer?
  - To answer questions like this, you have to transform the research question into a testable hypothesis called the null hypothesis, conventionally labeled H0.
  - This usually takes the following form:
    - H0: Stress is NOT a risk factor for breast cancer
    - H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

• Hypothesis testing
  - Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
  - To test this null hypothesis, researchers take samples, measure outcomes, and decide whether the data from the sample provide strong enough evidence to reject the null hypothesis.
  - If the evidence against the null hypothesis is strong enough for us to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

• Hypothesis testing
  - Example
Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) at decreasing heart rate, or not?

Inferential statistics
(Informed guess)

• Hypothesis testing
  - Example
Let the mean heart rate of all people taking Inderal be μ1.
Let the mean heart rate of all people taking the new drug be μ2.

Inferential statistics
(Informed guess)

• Type I & Type II Errors

                                H0 is true       H0 is false
The decision about H0: Accept   Good decision    Type II error
The decision about H0: Reject   Type I error     Good decision

Inferential statistics
(Informed guess)

• α: probability of committing a type I error
• β: probability of committing a type II error
• p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true.
• Sample size & power of the study
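
The definition of the p-value can be made concrete with an exact permutation test, sketched below in Python. This is one illustrative way to test "no difference in mean heart rate between two drugs", not necessarily the test used for this deck's example; the heart-rate values and the 0.05 significance level are assumptions.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats per minute) under each drug.
conventional = [78, 82, 75, 80, 77]
new_drug = [70, 74, 69, 72, 76]

observed = mean(conventional) - mean(new_drug)

pooled = conventional + new_drug
n = len(conventional)
total = sum(pooled)

extreme = 0
count = 0
# Under H0 the group labels are interchangeable, so every way of choosing
# which n values are labelled "conventional" is equally likely.
for idx in combinations(range(len(pooled)), n):
    group_sum = sum(pooled[i] for i in idx)
    diff = group_sum / n - (total - group_sum) / (len(pooled) - n)
    count += 1
    # Two-sided: count splits at least as extreme as the one observed.
    if abs(diff) >= abs(observed) - 1e-12:
        extreme += 1

p_value = extreme / count
alpha = 0.05                   # accepted probability of a type I error
reject_h0 = p_value < alpha

print(f"p = {p_value:.4f}, reject H0: {reject_h0}")
```

Here the p-value is literally the share of relabellings that produce a difference at least as large as the observed one when H0 is true.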

Inferential statistics
(Informed guess)

• How does it work? Probability distributions

Inferential statistics
(Informed guess)

• How does it work? Parametric vs. non-parametric tests

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  - Comparisons
    - One sample
    - Two independent samples
    - Two dependent samples
    - More than two samples (independent or dependent)
    - Comparing two or more factors
  - Association

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  - Comparisons
    - One sample
    - Two independent samples
    - Two dependent samples
    - More than two samples (independent or dependent)
  - Association
  - Prediction

[Scatter plot: Weight (kg, 0 to 45) vs. Age (y, 0 to 14) with fitted line y = 2.03x + 8]
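
Prediction with a fitted line can be sketched as an ordinary least-squares fit in Python. The data points below are hypothetical: they are placed exactly on the line y = 2.03x + 8 from the plot so that the fit recovers that slope and intercept; real measurements would scatter around the line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points lying exactly on y = 2.03x + 8.
ages = [2, 4, 6, 8, 10, 12]                 # age in years
weights = [2.03 * a + 8 for a in ages]      # weight in kg

slope, intercept = fit_line(ages, weights)
predicted = slope * 7 + intercept           # predicted weight (kg) at age 7

print(f"weight = {slope:.2f} * age + {intercept:.2f}; at age 7: {predicted:.2f} kg")
```

Predictions are only trustworthy inside the range of ages used for the fit.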

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  - Comparisons
    - One sample
    - Two independent samples
    - Two dependent samples
    - More than two samples (independent or dependent)
  - Association
  - Prediction
  - Diagnostic test (dichotomous or continuous)

Diagnostic tests

                        Disease (outcome) present   Disease (outcome) absent
The test is: Positive   TP                          FP
The test is: Negative   FN                          TN
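
The usual summary measures of a dichotomous diagnostic test follow directly from the four cells of this table. A minimal Python sketch, with hypothetical counts and a hypothetical function name:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summary measures computed from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased subjects who test positive
        "specificity": tn / (tn + fp),   # healthy subjects who test negative
        "ppv": tp / (tp + fp),           # positive tests that are true positives
        "npv": tn / (tn + fn),           # negative tests that are true negatives
    }

# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=40, fp=10, fn=5, tn=45)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.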

[Two ROC curves: Sensitivity (y-axis, 0.0 to 1.0) vs. 1 - Specificity (x-axis, 0.0 to 1.0). Diagonal segments are produced by ties.]
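
For a continuous diagnostic test, an ROC curve is traced by computing sensitivity and 1 - specificity at every possible cut-off. The sketch below uses hypothetical test values and labels, and summarizes the curve by its area (trapezoidal rule); the function names are illustrative.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # A subject is called test-positive when its score is >= the cut-off.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test values; label 1 = disease present, 0 = disease absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, labels)
area = auc(points)
print(points, area)
```

When a diseased and a healthy subject share the same test value, both coordinates change at that cut-off, which produces a diagonal segment in the curve.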

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  - Comparisons
    - One sample
    - Two independent samples
    - Two dependent samples
    - More than two samples (independent or dependent)
  - Association
  - Prediction
  - Diagnostic test (dichotomous or continuous)
  - Survival analysis (censored data)

Final words
• If valid data are analyzed improperly, then the results become invalid and the conclusions may well be inappropriate.
• At best, the net effect is to waste time, effort, and money for the project.
• At worst, therapeutic decisions may well be based upon invalid conclusions and patients' wellbeing may be jeopardized.

Thank you


Slide 24

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple

[Figure: simple bar chart "Postoperative Complications"; y-axis: number of patients, 0 to 25; x-axis: Nausea, Vomiting, Pain, Cough; slide annotations: alternative, width, spacing]

[Figure: simple bar chart "Postoperative Complications"; y-axis: number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple
      • Clustered bar chart

[Figure: clustered bar chart "Postoperative Complications" for Group I and Group II; y-axis: number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple
      • Clustered bar chart
      • Stacked bar chart

[Figure: bar chart "Postoperative Complications" for Group I and Group II; y-axis: number of patients (%), 0% to 45%; x-axis: Nausea, Vomiting, Pain, Cough]

[Figure: stacked bar chart "Postoperative Complications"; y-axis: number of patients (%), 0% to 100%; x-axis: Group I, Group II; segments: Cough, Pain, Vomiting, Nausea]

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple
      • Clustered bar chart
      • Stacked bar chart
    • Histogram

One variable, no gaps between bars

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple
      • Clustered bar chart
      • Stacked bar chart
    • Histogram
    • Frequency polygon
    • Cumulative frequency polygon (ogive)

[Figure: "Postoperative Complications" for Group I and Group II; y-axis: number of patients (%), 0% to 40%; x-axis: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
    • Mean (average)

Data: 1, 1, 3, 5, 10

Advantages – Disadvantages – Ratio

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
    • Median

Data: 1, 1, 3, 5, 10 (odd number of values)
Data: 1, 3, 5, 9 (even number of values)
Odd vs even

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
    • Mode

Data: 1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
    • Midrange

Data: 1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)

Data: 1, 1, 3, 5, 10
• Mean = 4
• Median = 3
• Mode = 1
• Midrange = 5.5
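The four measures of location can be checked with Python's standard `statistics` module. This is a minimal sketch: `midrange` is a small helper written here for illustration (it is not a library function) and takes the midrange as (minimum + maximum) / 2, which for this data is (1 + 10) / 2 = 5.5.

```python
from statistics import mean, median, mode


def midrange(values):
    """Midpoint between the smallest and the largest value."""
    return (min(values) + max(values)) / 2


data = [1, 1, 3, 5, 10]
print(mean(data))      # arithmetic mean: sum of values divided by their number
print(median(data))    # middle value of the sorted data (odd number of values)
print(mode(data))      # most frequent value
print(midrange(data))  # (1 + 10) / 2

even = [1, 3, 5, 9]
print(median(even))    # even number of values: average of the two middle values
```

The second data set illustrates the "odd vs even" case: with four values the median is the average of 3 and 5.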

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
  • Measures of degree of dispersion (summary measures of spread)
    • Range

Data: 1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
  • Measures of degree of dispersion (summary measures of spread)
    • Percentiles and quartiles

Data: 1, 1, 3, 5, 10
1st quartile
3rd quartile

Data: 1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22
90th percentile
70th percentile
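Quartiles and percentiles for the eleven-value data set can be sketched with `statistics.quantiles`. Note the assumption: several percentile definitions are in use and statistical packages may give slightly different answers. The "inclusive" method is chosen here only for illustration, and the helper name `percentile` is hypothetical.

```python
from statistics import quantiles

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# Quartiles: the three cut points that split the sorted data into four parts.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
print(q1, q2, q3)


def percentile(values, p):
    """Return the p-th percentile (p from 1 to 99) using the inclusive method."""
    cut_points = quantiles(values, n=100, method="inclusive")
    return cut_points[p - 1]  # cut_points[0] is the 1st percentile


print(percentile(data, 70))
print(percentile(data, 90))
```

The second quartile returned by `quantiles` is the median, so the same call also covers that measure of location.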

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
  • Measures of degree of dispersion (summary measures of spread)
    • Inter-quartile range (Q3 − Q1)

Data: 1, 1, 3, 5, 10
1st quartile
3rd quartile

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
  • Measures of degree of dispersion (summary measures of spread)
    • Variance and standard deviation

Data: 1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
  • Measures of degree of dispersion (summary measures of spread)

Data: 1, 1, 3, 5, 10
• Range = 9
• Inter-quartile range = 4
• Variance = 11.2 (dividing by n; dividing by n − 1 gives 14)
• Standard deviation = 3.35 (square root of 11.2)
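The spread measures for the data 1, 1, 3, 5, 10 can be reproduced with the standard `statistics` module. Two assumptions are worth stating: the quartiles use the "inclusive" method (other definitions give other inter-quartile ranges), and the variance of 11.2 corresponds to dividing by n (`pvariance`), whereas the sample variance (`variance`, dividing by n − 1) is 14.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)                 # largest minus smallest value
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                                       # inter-quartile range, Q3 - Q1
population_variance = pvariance(data)               # sum of squared deviations / n
sample_variance = variance(data)                    # sum of squared deviations / (n - 1)
population_sd = pstdev(data)                        # square root of the population variance

print(value_range, iqr, population_variance, sample_variance, round(population_sd, 2))
```

Which divisor is appropriate depends on whether the five values are treated as the whole population or as a sample from it.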

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

• Making inference about population parameters from sample statistics
  • Standard error (SD of the statistic)
• Confidence interval (95% CI)
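As a rough sketch of these two ideas, the standard error of a sample mean can be estimated as the sample SD divided by the square root of n, and an approximate 95% confidence interval as the mean plus or minus about 1.96 standard errors. The normal approximation used below is an assumption made to keep the example within the standard library; for a sample this small a t-distribution quantile would normally be used and would give a wider interval. The function name `mean_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, level=0.95):
    """Return (mean, standard error, lower, upper) using a normal approximation."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))     # sample SD (n - 1 divisor) over sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return m, se, m - z * se, m + z * se


m, se, lower, upper = mean_confidence_interval([1, 1, 3, 5, 10])
print(f"mean={m}, SE={se:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```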

Inferential statistics
(Informed guess)

• Hypothesis testing
  • Almost all clinical research begins with a question.
  • For example, is stress a risk factor for breast cancer?
  • To answer questions like this, you have to transform the research question into a testable hypothesis called the null hypothesis, conventionally labeled H0.
  • This usually takes the following form:
    • H0: Stress is NOT a risk factor for breast cancer
    • H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

• Hypothesis testing
  • Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
  • To test the null hypothesis, researchers take samples, measure outcomes, and decide whether the sample data provide strong enough evidence to reject it.
  • If the evidence against the null hypothesis is strong enough for us to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labeled H1, is probably true.

Inferential statistics
(Informed guess)

• Hypothesis testing
  • Example
    Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)

• Hypothesis testing
  • Example
    Let the mean heart rate of all people taking Inderal be μ1.
    Let the mean heart rate of all people taking the new drug be μ2.
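Following the "no difference" form of a null hypothesis, this example can be written as H0: μ1 = μ2. One way to test it without distributional assumptions is an exact permutation test, sketched below. This is an illustration rather than the test used in the lecture: the heart rates are hypothetical, the function name `exact_permutation_p_value` is made up for this sketch, and a t-test from a statistics package would be a common alternative.

```python
from itertools import combinations


def exact_permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means."""
    pooled = group_a + group_b
    total = sum(pooled)
    n_a, n_b = len(group_a), len(group_b)

    def scaled_abs_difference(sum_a):
        # |mean_a - mean_b| multiplied by n_a * n_b, so integer data compare exactly
        return abs(sum_a * n_b - (total - sum_a) * n_a)

    observed = scaled_abs_difference(sum(group_a))
    splits = list(combinations(range(len(pooled)), n_a))
    # Count the relabelings whose difference is at least as extreme as the observed one.
    extreme = sum(
        1 for idx in splits
        if scaled_abs_difference(sum(pooled[i] for i in idx)) >= observed
    )
    return extreme / len(splits)


# Hypothetical heart rates (beats per minute) in two small samples.
inderal = [72, 75, 70, 78, 74]
new_drug = [64, 66, 69, 63, 67]

p_value = exact_permutation_p_value(inderal, new_drug)
alpha = 0.05
print(f"p = {p_value:.4f}; reject H0: {p_value < alpha}")
```

The p-value is the share of all possible relabelings of the ten patients that give a difference in means at least as large as the one observed, which is the "outcome observed or one more extreme" idea under H0.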

Inferential statistics
(Informed guess)

• Type I & Type II errors

                    H0 is true       H0 is false
Accept H0           Good decision    Type II error
Reject H0           Type I error     Good decision

Inferential statistics
(Informed guess)

• α: probability of committing a type I error
• β: probability of committing a type II error
• p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true
• Sample size & power of the study (power = 1 − β)

Inferential statistics
(Informed guess)

• How does it work? Probability distributions

Inferential statistics
(Informed guess)

• How does it work? Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
    • Comparing two or more factors
  • Association

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
  • Association
  • Prediction

[Figure: scatter plot of weight (kg, 0 to 45) against age (y, 0 to 14) with fitted line y = 2.03x + 8]
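Prediction with a fitted straight line can be sketched as follows. The ages and weights below are hypothetical (the data behind the slide's line y = 2.03x + 8 are not given), and `fit_line` and `predict` are illustrative helper names implementing ordinary least squares for one predictor.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted y for a given x on the fitted line."""
    return slope * x + intercept


# Hypothetical ages (years) and weights (kg).
ages = [2, 4, 6, 8, 10, 12]
weights = [12, 17, 20, 24, 29, 32]

slope, intercept = fit_line(ages, weights)
print(f"weight = {slope:.2f} * age + {intercept:.2f}")

# Using the line shown on the slide to predict the weight of a 5-year-old.
print(predict(2.03, 8, 5))
```

Predictions are only trustworthy within the age range covered by the data used to fit the line.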

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
  • Association
  • Prediction
  • Diagnostic test (dichotomous or continuous)

Diagnostic tests

                  Disease present   Disease absent
Test positive     TP                FP
Test negative     FN                TN

[Figure: two ROC curves plotting sensitivity against 1 − specificity; note on each plot: "Diagonal segments are produced by ties."]
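The 2 × 2 table and the ROC curve can be connected with a short sketch. Sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP); for a continuous test, each possible cut-off gives one such pair, and plotting sensitivity against 1 − specificity over all cut-offs traces the ROC curve. The counts, scores and function names below are hypothetical.

```python
def sensitivity(tp, fn):
    """Proportion of diseased subjects that the test flags as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of disease-free subjects that the test reports as negative."""
    return tn / (tn + fp)


def roc_points(scores, has_disease):
    """Return (1 - specificity, sensitivity) pairs, one per distinct cut-off."""
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        # A subject tests positive when the score is at or above the cut-off.
        tp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and not d)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical 2 x 2 counts for a dichotomous test.
tp, fp, fn, tn = 40, 10, 5, 45
print(sensitivity(tp, fn), specificity(tn, fp))

# Hypothetical continuous test results; 1 marks subjects with the disease.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
has_disease = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, has_disease)
print(points)
```

Lowering the cut-off raises sensitivity at the cost of specificity, which is why the curve always runs from (0, 0) to (1, 1).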

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
  • Association
  • Prediction
  • Diagnostic test (dichotomous or continuous)
  • Survival analysis (censored data)

Final words
• If valid data are analyzed improperly, then the results become invalid and the conclusions may well be inappropriate.
• At best, the net effect is to waste time, effort, and money for the project.
• At worst, therapeutic decisions may well be based upon invalid conclusions and patients' wellbeing may be jeopardized.

Thank you


Slide 27

Statistics

When guessing is an informed decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of Medicine
Ain Shams University

Objectives
• Define statistics and understand its terminology
• Discuss the importance of, and need for, statistics in the medical field
• Distinguish types of data and variables
• Describe types of statistics and statistical tests

What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.

[Diagram: collecting data, describing data, and analyzing data lead to a good decision]

Why do we need statistics?
Variability (atropine)
• Causes:
  - Uncontrollable (too many) factors
  - Immeasurable factors
  - Unknown factors

Why do we need statistics?
Variability (atropine)
• Effect of variability
  - Large amount of data describing the same thing (many values for one variable)
  - No certainty ("deterministic vs probabilistic")
  - Sampling

Functions of statistics
(new hypothetical β-blocker drug)
• Describe (descriptive statistics)
• Inference (inferential statistics)

Variables & Data
• A variable is something whose value can vary. For example, age, gender and blood type are variables.
• Data are the values you get when you measure a variable.

The variables ... and the data

Variable      Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories.
• Gender (dichotomous, binary)
  - Male
  - Female
• Type of ICU admission
  - Medical
  - Surgical
  - Physical injuries
  - Poisoning
  - Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable
A categorical variable whose values are ordered.
• Degree of illness
  - Mild
  - Moderate
  - Severe
• Physical status
  - ASA I
  - ASA II
  - ASA III
  - ASA IV
  - ASA V
• Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
• Episodes of myocardial ischemia

b) Continuous (interval, ratio) variable
• Cardiac index
• Creatinine clearance

Types of variables
Changing data scales

Sample and Population
• A sample is a group (subset) taken from a population.
• A population is the group of ALL individuals (entities) sharing specific characteristics.

Sample and Population
• All human beings
• Day case surgical patients undergoing general anesthesia
• Low-risk CABG surgery patients
• ICU patients with septic shock
• Women undergoing CS under spinal anesthesia
• Human skeletal muscle fibers
• Cardiac muscles of rats

Descriptive statistics
• Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
• This may mean sorting the data by size, putting it into a table, presenting it in an appropriate chart, or summarizing it numerically, and so on.

Descriptive statistics
• Qualitative variables
  - Frequency
  - Relative frequency
  - Cumulative frequency
  - Cross tabulation
(What about numerical variables? Grouping.)
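The frequency measures listed above can be sketched in a few lines of Python. This is only an illustration: the observations below are hypothetical (they are not data from the slides), and the variable names are chosen for this example.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations of an ordinal variable (degree of illness).
observations = ["Mild", "Severe", "Moderate", "Mild", "Mild",
                "Moderate", "Severe", "Mild", "Moderate", "Mild"]
categories = ["Mild", "Moderate", "Severe"]  # kept in their natural order

counts = Counter(observations)
n = len(observations)

frequency = [counts[c] for c in categories]   # how many times each value occurs
relative = [f / n for f in frequency]         # share of all observations
cumulative = list(accumulate(frequency))      # running total, meaningful for ordered categories

for c, f, r, cf in zip(categories, frequency, relative, cumulative):
    print(f"{c:<10}{f:>4}{r:>8.0%}{cf:>6}")
```

Cumulative frequency only makes sense when the categories have an order, which is why an ordinal variable is used here. For a numerical variable, the values would first be grouped into intervals and then counted in the same way.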

Descriptive statistics
• Qualitative variables
  - In terms of describing data, an appropriate chart is almost always a good idea.
  - What "appropriate" means depends primarily on the type of data, as well as on what particular features of it you want to explore.
  - Finally, a chart can often be used to illustrate or explain a complex situation for which a form of words or a table might be clumsy, lengthy or otherwise inadequate.

Descriptive statistics
• Qualitative variables
  - Charts
    - Pie chart

[Pie chart: Postoperative Complications; categories: Nausea, Vomiting, Pain, Cough; slices: 23.5%, 27.5%, 9.8%, 39.2%]

Advantages:
1. Summarizes the variable (area = relative frequency)
2. Shows magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity if there are more than 4-5 categories
3. No cross tabulation ("separate pies")

Descriptive statistics
• Qualitative variables
  - Charts
    - Pie chart
    - Bar chart
      - Simple

[Bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients, 0 to 25. Annotations: alternative, width, spacing]

[Bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]

Descriptive statistics
• Qualitative variables
  - Charts
    - Pie chart
    - Bar chart
      - Simple
      - Clustered bar chart

[Clustered bar chart: Postoperative Complications for Group I and Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]

Descriptive statistics
• Qualitative variables
  - Charts
    - Pie chart
    - Bar chart
      - Simple
      - Clustered bar chart
      - Stacked bar chart

[Clustered bar chart: Postoperative Complications for Group I and Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]

[Stacked bar chart: Postoperative Complications; x-axis: Group I, Group II; stacked segments: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 100%]

Descriptive statistics
• Qualitative variables
  - Charts
    - Pie chart
    - Bar chart
      - Simple
      - Clustered bar chart
      - Stacked bar chart
    - Histogram (one variable, no gaps between bars)
    - Frequency polygon
    - Cumulative frequency polygon (ogive)

[Frequency polygon: Postoperative Complications for Group I and Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 40%]

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
    - Mean (average)

Example data: 1, 1, 3, 5, 10
Advantages, disadvantages, ratio

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
    - Median

Example data (odd number of values): 1, 1, 3, 5, 10
Example data (even number of values): 1, 3, 5, 9

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
    - Mode

Example data: 1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
    - Midrange

Example data: 1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)

Example data: 1, 1, 3, 5, 10
• Mean = 4
• Median = 3
• Mode = 1
• Midrange = 5.5 (the average of the smallest and largest values)
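As a quick check of the four measures of location, here is a minimal sketch using Python's standard `statistics` module on the slide's data set. The midrange is computed by hand as (minimum + maximum) / 2, which gives 5.5 for these values; the second list is the slide's even-sized example for the median.

```python
import statistics

data = [1, 1, 3, 5, 10]  # example data from the slides

mean = statistics.mean(data)             # sum of values / number of values -> 4
median = statistics.median(data)         # middle value of the sorted data -> 3
mode = statistics.mode(data)             # most frequent value -> 1
midrange = (min(data) + max(data)) / 2   # halfway between the extremes -> 5.5

# With an even number of values, the median is the mean of the two middle values.
median_even = statistics.median([1, 3, 5, 9])  # (3 + 5) / 2 -> 4.0

print(mean, median, mode, midrange, median_even)
```

The single large value (10) pulls the mean above the median, which is the usual argument for preferring the median when data are skewed.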

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
  - Measures of degree of dispersion (summary measures of spread)
    - Range

Example data: 1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
  - Measures of degree of dispersion (summary measures of spread)
    - Percentiles and quartiles

Example data: 1, 1, 3, 5, 10
• 1st quartile
• 3rd quartile
• 90th percentile
• 70th percentile

Example data: 1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22
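Quartiles and percentiles for the second data set can be sketched with `statistics.quantiles` (Python 3.8+). Several conventions exist for interpolating percentiles; the "inclusive" method used here is one assumption among them, and other methods (including the function's default "exclusive" method) can return slightly different values for the same data.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # second example data set on the slide

# "inclusive" treats the smallest value as the 0th percentile
# and the largest value as the 100th percentile.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# n=10 returns the nine cut points at 10%, 20%, ..., 90%.
deciles = statistics.quantiles(data, n=10, method="inclusive")
p70 = deciles[6]  # seventh cut point = 70th percentile
p90 = deciles[8]  # ninth cut point = 90th percentile

print(q1, q2, q3)  # 5.5 9.0 15.0
print(p70, p90)    # 13.0 20.0
```

The second quartile is the median, so `q2` equals `statistics.median(data)`.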

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
  - Measures of degree of dispersion (summary measures of spread)
    - Inter-quartile range (Q3 - Q1)

Example data: 1, 1, 3, 5, 10
• 1st quartile
• 3rd quartile

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
  - Measures of degree of dispersion (summary measures of spread)
    - Variance and standard deviation

Example data: 1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
  - Measures of central tendency (summary measures of location)
  - Measures of degree of dispersion (summary measures of spread)

Example data: 1, 1, 3, 5, 10
• Range = 9
• Inter-quartile range = 4
• Variance = 11.2
• Standard deviation = 3.35
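The measures of spread can be reproduced with a short sketch. Two assumptions are worth stating: the quartiles use the "inclusive" interpolation method, and the variance of 11.2 corresponds to the population formula (dividing by n). The sample formula (dividing by n - 1) gives 14 for the same data.

```python
import statistics

data = [1, 1, 3, 5, 10]  # example data from the slides

data_range = max(data) - min(data)  # 10 - 1 = 9

q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # 5.0 - 1.0 = 4.0

# Population formulas: divide the sum of squared deviations (56) by n = 5.
population_variance = statistics.pvariance(data)  # 11.2
population_sd = statistics.pstdev(data)           # about 3.35

# Sample formula: divide by n - 1 = 4 instead.
sample_variance = statistics.variance(data)       # 14

print(data_range, iqr, population_variance, round(population_sd, 2), sample_variance)
```

When the data are a sample used to estimate a population value, the n - 1 version is the conventional choice.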

Descriptive statistics
• Numerical variables

Inferential statistics
(Informed guess)
• Making inferences about population parameters from sample statistics
  - Standard error (SD of the statistic)
  - Confidence interval (95% CI)
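A minimal sketch of the standard error of a mean and an approximate 95% confidence interval follows. The heart-rate values are hypothetical, and the interval uses the normal approximation (mean ± 1.96 × SE) as a simplifying assumption; for a sample this small, a t-distribution multiplier would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample of resting heart rates (beats per minute).
sample = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)   # sample standard deviation (n - 1)
se = sd / math.sqrt(n)          # standard error of the mean

# Normal-approximation multiplier for a two-sided 95% interval (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean = {mean}, SE = {se:.2f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The standard error shrinks with the square root of the sample size, so quadrupling the sample halves the width of the interval.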

Inferential statistics
(Informed guess)
• Hypothesis testing
  - Almost all clinical research begins with a question.
  - For example, is stress a risk factor for breast cancer?
  - To answer questions like this, you have to transform the research question into a testable hypothesis called the null hypothesis, conventionally labeled H0.
  - This usually takes the following form:
    - H0: Stress is NOT a risk factor for breast cancer
    - H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)
• Hypothesis testing
  - Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
  - To test this null hypothesis, researchers will take samples and measure outcomes, and decide whether the data from the sample provide strong enough evidence to be able to reject the null hypothesis or not.
  - If evidence against the null hypothesis is strong enough for us to be able to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labeled H1, is probably true.

Inferential statistics
(Informed guess)
• Hypothesis testing
  - Example
Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) at decreasing heart rate?

Inferential statistics
(Informed guess)
• Hypothesis testing
  - Example
Let the mean heart rate of all people taking Inderal be μ1.
Let the mean heart rate of all people taking the new drug be μ2.
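One way to make this example concrete is to state the null hypothesis as H0: μ1 = μ2 and test it on sample data. The sketch below uses a permutation test, which is a technique chosen here for illustration because it needs only the standard library; it is not necessarily the test the lecture had in mind. All heart-rate values are hypothetical.

```python
import random

# Hypothetical heart rates (beats per minute) in two samples of patients.
inderal = [72, 75, 70, 78, 74, 71, 77, 73]
new_drug = [68, 70, 66, 72, 69, 65, 71, 67]

def mean(values):
    return sum(values) / len(values)

observed = mean(inderal) - mean(new_drug)  # observed difference in sample means

# Under H0 (mu1 = mu2) the group labels are interchangeable, so we shuffle the
# pooled values and see how often a difference this large arises by chance.
rng = random.Random(0)  # fixed seed so the run is repeatable
pooled = inderal + new_drug
n_perm = 5000
extreme = 0
for _ in range(n_perm):
    rng.shuffle(pooled)
    diff = mean(pooled[:len(inderal)]) - mean(pooled[len(inderal):])
    if abs(diff) >= abs(observed):
        extreme += 1

# Two-sided p-value; the +1 terms avoid reporting a p-value of exactly zero.
p_value = (extreme + 1) / (n_perm + 1)
print(f"difference = {observed}, p-value = {p_value:.4f}")
```

A small p-value means a difference at least this large would rarely occur if the two drugs really had the same mean effect, which is evidence against H0.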

Inferential statistics
(Informed guess)
• Type I & Type II errors

                               H0 is true       H0 is false
Decision about H0: accept      Good decision    Type II error
Decision about H0: reject      Type I error     Good decision
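The meaning of a type I error can be illustrated with a small simulation: when H0 is true and we test at a significance level of 0.05, we should wrongly reject H0 in roughly 5% of studies. The sketch below assumes normally distributed heart rates with a known standard deviation so that a simple z-test can be used; the population values, sample sizes, and number of trials are all hypothetical.

```python
import random
import statistics

alpha = 0.05
z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

rng = random.Random(1)  # fixed seed so the run is repeatable
n, sigma, n_trials = 20, 10.0, 2000
se = sigma * (2 / n) ** 0.5  # standard error of the difference of two means

rejections = 0
for _ in range(n_trials):
    # Both groups come from the same population, so H0 is true by construction.
    group_a = [rng.gauss(70, sigma) for _ in range(n)]
    group_b = [rng.gauss(70, sigma) for _ in range(n)]
    z = (statistics.fmean(group_a) - statistics.fmean(group_b)) / se
    if abs(z) > z_crit:
        rejections += 1  # rejecting a true H0 is a type I error

type1_rate = rejections / n_trials
print(f"observed type I error rate: {type1_rate:.3f}")
```

Drawing the two groups from populations with different means instead would turn every failure to reject into a type II error, and the same loop would then estimate that error rate.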

Inferential statistics
(Informed guess)
• α: probability of committing a type I error
• β: probability of committing a type II error
• p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true
• Sample size & power of the study
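The link between α, β, power, and sample size can be sketched with the standard normal-approximation formula for comparing two means: n per group = 2 × ((z for 1 - α/2) + (z for power))² × σ² / δ², where power = 1 - β. The standard deviation σ and the smallest difference worth detecting δ below are hypothetical values chosen only for illustration.

```python
import math
import statistics

def sample_size_per_group(alpha, power, sigma, delta):
    """Approximate patients per group for a two-sided comparison of two means."""
    normal = statistics.NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = normal.inv_cdf(power)          # about 0.84 for power = 0.80
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)  # round up to whole patients

# Hypothetical planning values: SD of 10 beats/min, difference of 5 beats/min.
n_per_group = sample_size_per_group(alpha=0.05, power=0.80, sigma=10, delta=5)
print(n_per_group)  # 63
```

Halving the difference to be detected roughly quadruples the required sample size, which is why the target difference should be fixed before the study starts.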

Inferential statistics
(Informed guess)
• How does it work? Probability distributions

Inferential statistics
(Informed guess)
• How does it work? Parametric vs non-parametric tests

Inferential statistics
(Informed guess)
• Some examples of hypothesis testing
  - Comparisons
    - One sample
    - Two independent samples
    - Two dependent samples
    - More than two samples (independent or dependent)
    - Comparing two or more factors
  - Association

Inferential statistics
(Informed guess)
• Some examples of hypothesis testing
  - Comparisons
    - One sample
    - Two independent samples
    - Two dependent samples
    - More than two samples (independent or dependent)
  - Association
  - Prediction

[Scatter plot with fitted line y = 2.03x + 8; x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
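The fitted line on the plot, y = 2.03x + 8, can be used directly for prediction, and a line of this kind is usually obtained by least squares. The sketch below shows both steps. The data points are hypothetical: they are generated to lie exactly on the slide's line so that the fit should recover the same slope and intercept, whereas real measurements would scatter around it.

```python
def predict_weight(age_years):
    """Predicted weight in kg from the regression line shown on the slide."""
    return 2.03 * age_years + 8

def least_squares(xs, ys):
    """Slope and intercept of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

ages = [2, 4, 6, 8, 10]                       # hypothetical ages in years
weights = [predict_weight(a) for a in ages]   # points lying exactly on the line

slope, intercept = least_squares(ages, weights)
print(round(slope, 2), round(intercept, 2))   # 2.03 8.0
print(round(predict_weight(6), 2))            # 20.18
```

Predictions are only trustworthy inside the range of ages used to fit the line (0 to 14 years on the plot).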

Inferential statistics
(Informed guess)
• Some examples of hypothesis testing
  - Comparisons
    - One sample
    - Two independent samples
    - Two dependent samples
    - More than two samples (independent or dependent)
  - Association
  - Prediction
  - Diagnostic test (dichotomous, continuous)

Diagnostic tests

                        Disease present        Disease absent
Test positive           TP (true positive)     FP (false positive)
Test negative           FN (false negative)    TN (true negative)
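From the four cells of this table, the usual summary measures of a diagnostic test follow directly. The sketch below uses the standard definitions; the counts passed to the function are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures computed from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }

# Hypothetical counts: 50 diseased and 100 healthy patients.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, while the predictive values also depend on how common the disease is in the group being tested.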

[Two ROC curves: sensitivity (y-axis, 0.0 to 1.0) plotted against 1 - specificity (x-axis, 0.0 to 1.0). Note on both plots: diagonal segments are produced by ties.]
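For a continuous test, each possible cut-off gives one pair of sensitivity and 1 - specificity, and plotting these pairs produces the ROC curve. The following sketch builds the points and the area under the curve by the trapezoid rule. The test scores are hypothetical and contain no ties, and the function names are chosen for this example.

```python
# Hypothetical continuous test results (higher score = more suspicious).
diseased = [0.9, 0.8, 0.6, 0.4]
healthy = [0.7, 0.5, 0.3, 0.2]

def roc_points(pos, neg):
    """(1 - specificity, sensitivity) for every cut-off, from strictest to loosest."""
    thresholds = sorted(set(pos + neg), reverse=True)
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for t in thresholds:
        sensitivity = sum(s >= t for s in pos) / len(pos)
        false_positive_rate = sum(s >= t for s in neg) / len(neg)
        points.append((false_positive_rate, sensitivity))
    return points

def area_under_curve(points):
    """Trapezoid rule over consecutive ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

points = roc_points(diseased, healthy)
auc = area_under_curve(points)
print(points)
print(auc)  # 0.8125
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to a perfect test. When a diseased and a healthy patient share the same score, the curve moves up and to the right in one step, which is the source of the diagonal segments mentioned on the plots.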

Inferential statistics
(Informed guess)
• Some examples of hypothesis testing
  - Comparisons
    - One sample
    - Two independent samples
    - Two dependent samples
    - More than two samples (independent or dependent)
  - Association
  - Prediction
  - Diagnostic test (dichotomous, continuous)
  - Survival analysis (censored data)

Final words
• If valid data are analyzed improperly, then the results become invalid and the conclusions may well be inappropriate.
• At best, the net effect is to waste time, effort, and money for the project.
• At worst, therapeutic decisions may well be based upon invalid conclusions and patients' wellbeing may be jeopardized.

Thank you


Slide 29

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Postoperative Complications

Number of patients (%)

100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%

Group I

Group II

Couph
Pain
Vomiting
Nausea

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

Number of patients (%)

Postoperative Complications
Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Mid range = 4.5

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22
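
The quartiles and percentiles of the eleven values above can be sketched with statistics.quantiles. Several conventions for locating a percentile exist; the default "exclusive" method used here (position p × (n + 1) in the sorted data) is an assumption, and other methods can give slightly different values.

```python
from statistics import quantiles

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# Default method="exclusive": the p-th quantile sits at position p * (n + 1).
q1, q2, q3 = quantiles(data, n=4)

# n=100 returns 99 cut points; index k - 1 holds the k-th percentile.
percentiles = quantiles(data, n=100)
p70 = percentiles[69]
p90 = percentiles[89]

print(q1, q2, q3, p70, p90)
```

With eleven values the quartiles land exactly on the 3rd, 6th and 9th sorted values, whereas the 70th and 90th percentiles are interpolated between neighbouring values.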

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2 (dividing by n = 5; dividing by n – 1 gives 14)
 Standard deviation = 3.35 (square root of 11.2)
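
The measures of spread listed above can be reproduced with a short Python sketch. Two choices here are assumptions: the population formulas (dividing by n) are used because they match the variance of 11.2, and the "inclusive" quartile method is used because it gives Q1 = 1 and Q3 = 5 for this data set.

```python
from statistics import pstdev, pvariance, quantiles

data = [1, 1, 3, 5, 10]

data_range = max(data) - min(data)

# method="inclusive" is an assumption chosen to give Q1 = 1 and Q3 = 5 here.
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance = pvariance(data)   # squared deviations from the mean, divided by n
std_dev = pstdev(data)       # square root of the population variance

print(data_range, iqr, variance, round(std_dev, 2))
```

Replacing pvariance and pstdev with variance and stdev would divide by n – 1 instead, which is the usual choice when the data are a sample from a larger population.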

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
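
A minimal sketch of both quantities for a sample mean follows. The five numbers are reused purely as illustrative data, and the interval uses the normal approximation (z ≈ 1.96), which is an assumption: for a sample this small a t-based interval would be wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

sample = [1, 1, 3, 5, 10]   # illustrative numbers only

n = len(sample)
sample_mean = mean(sample)

# Standard error of the mean: sample SD (n - 1 divisor) divided by sqrt(n).
standard_error = stdev(sample) / sqrt(n)

# Two-sided 95% interval under the normal approximation.
z = NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(round(standard_error, 2), round(ci_low, 2), round(ci_high, 2))
```

The interval is centred on the sample mean, and its width shrinks as the sample size grows because the standard error falls with the square root of n.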

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labeled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) at decreasing heart rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1.
Let the mean heart rate of all people
taking the new drug be μ2.
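
One way to test the null hypothesis that the two population means are equal is an exact permutation test, sketched below. The lecture does not name a specific test, so this technique is a substitute chosen for illustration; the heart-rate values are invented, and the function name permutation_p_value is hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min); invented for illustration only.
inderal = [72, 74, 76, 78, 80, 82]
new_drug = [60, 62, 64, 66, 68, 70]


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    as_extreme = 0
    total = 0
    # Under H0 the group labels are exchangeable, so try every relabelling.
    for idx in combinations(range(len(pooled)), len(group_a)):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


p_value = permutation_p_value(inderal, new_drug)
print(p_value)
```

The returned value is the proportion of relabellings that give a difference at least as large as the one observed. A small value is evidence against the null hypothesis of no difference.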

Inferential statistics
(Informed guess)

 Type I & Type II Errors
Decision about H0      H0 is true       H0 is false
Accept H0              Good decision    Type II error
Reject H0              Type I error     Good decision

Inferential statistics
(Informed guess)

 α: probability of committing a type I error
 β: probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study (power = 1 – β)

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Figure: scatter plot of weight (kg, 0 to 45) against age (years, 0 to 14) with the fitted line y = 2.03x + 8.]
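
The fitted line on the plot, y = 2.03x + 8, can be used for prediction as in the sketch below. The function name predict_weight_kg is hypothetical, and predictions are only meaningful inside the plotted age range (0 to 14 years).

```python
def predict_weight_kg(age_years):
    """Predicted weight from the fitted line y = 2.03x + 8."""
    slope = 2.03      # kg gained per additional year of age
    intercept = 8.0   # predicted weight (kg) at age 0
    return slope * age_years + intercept


for age in (2, 6, 10):
    print(age, round(predict_weight_kg(age), 2))
```

The slope is the expected change in weight for each extra year of age, and the intercept is the predicted weight at age zero.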

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
                          The disease (outcome) is:
                          Present      Absent
The test is:   Positive   TP           FP
               Negative   FN           TN
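
The usual accuracy measures follow directly from the four cells of this table. The sketch below uses invented counts, and the function name diagnostic_summary is hypothetical.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures from the four cells of the 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients with a positive test
        "specificity": tn / (tn + fp),   # disease-free patients with a negative test
        "ppv": tp / (tp + fp),           # positive tests that are true positives
        "npv": tn / (tn + fn),           # negative tests that are true negatives
    }


# Hypothetical counts, invented for illustration only.
summary = diagnostic_summary(tp=80, fp=10, fn=20, tn=90)
print(summary)
```

Sensitivity and specificity are computed down the columns of the table (by disease status), while the predictive values are computed along the rows (by test result).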

[Figure: two ROC curves. Y-axis: sensitivity (0.0 to 1.0). X-axis: 1 - specificity (0.0 to 1.0). Diagonal segments are produced by ties.]
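
For a continuous diagnostic test, each possible cut-off gives one (1 - specificity, sensitivity) point, and joining the points gives the ROC curve. The sketch below builds the points and the area under the curve from invented scores; the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Call the test "positive" when score >= threshold, from strict to lenient.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test results: 1 = disease present, 0 = disease absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
curve = roc_points(scores, labels)
auc = area_under_curve(curve)
print(curve)
print(auc)
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to perfect separation. When a diseased and a disease-free patient share the same score, one threshold moves both coordinates at once, which is what produces a diagonal segment.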

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you






Slide 32

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories

 Gender (dichotomous, binary)
 Male
 Female

 Type of ICU admission
 Medical
 Surgical
 Physical injuries
 Poisoning
 Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered

 Degree of illness
 Mild
 Moderate
 Severe

 Physical status
 ASA I
 ASA II
 ASA III
 ASA IV
 ASA V

 Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales
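
One reading of "changing data scales" is that a variable measured on a higher scale can be re-expressed on a lower one (continuous to ordinal to dichotomous), but not the other way round, because information is lost at each step. The sketch below illustrates that reading in Python. The ages, the cut-offs, and the function names `age_group` and `is_elderly` are illustrative assumptions, not taken from the slides.

```python
# Hypothetical ages in years (a continuous variable); not from the slides.
ages = [3, 17, 24, 32, 45, 61, 78]


def age_group(age):
    """Re-express a continuous age on an ordinal scale (illustrative cut-offs)."""
    if age < 18:
        return "child"
    if age < 65:
        return "adult"
    return "elderly"


def is_elderly(age):
    """Collapse further to a dichotomous (binary) variable."""
    return age >= 65


ordinal = [age_group(a) for a in ages]
binary = [is_elderly(a) for a in ages]

print(ordinal)
print(binary)
```

The original ages cannot be recovered from `ordinal` or `binary`, which is why the scale is usually changed only in the downward direction.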

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human skeletal muscle fibers
 Cardiac muscles of rats

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size,
putting it into a table, presenting it
in an appropriate chart, or
summarizing it numerically.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation
(what about numerical variables?, grouping)
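
The four summaries listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the ten patients, their severity grades and genders, and all variable names are hypothetical. Cumulative frequency is only meaningful when the categories have an order, which is why an ordinal variable is used here. A numerical variable could be summarized the same way after grouping its values into classes.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: degree of illness and gender for ten patients.
severity_order = ["Mild", "Moderate", "Severe"]
severity = ["Mild", "Severe", "Moderate", "Mild", "Mild",
            "Moderate", "Severe", "Mild", "Moderate", "Mild"]
gender = ["F", "M", "F", "F", "M", "M", "F", "M", "F", "M"]

n = len(severity)
counts = Counter(severity)

# Frequency: how many patients fall in each category.
frequency = [counts[level] for level in severity_order]
# Relative frequency: the same counts as a proportion of all patients.
relative_frequency = [f / n for f in frequency]
# Cumulative frequency: running total over the ordered categories.
cumulative_frequency = list(accumulate(frequency))
# Cross tabulation: counts for every (gender, severity) combination.
crosstab = Counter(zip(gender, severity))

print(f"{'Level':<10}{'n':>4}{'%':>8}{'cum n':>8}")
for level, f, rel, cum in zip(severity_order, frequency,
                              relative_frequency, cumulative_frequency):
    print(f"{level:<10}{f:>4}{rel:>8.0%}{cum:>8}")

for g in ("F", "M"):
    print(g, [crosstab[(g, level)] for level in severity_order])
```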

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
[Pie chart: slices of 23.5%, 27.5%, 9.8% and 39.2%; legend: Nausea, Vomiting, Pain, Cough]

Advantages:
1. Summarizes the data (relative frequency)
2. Shows magnitude (area represents relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation (“separate pies”)

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Bar chart: Postoperative Complications. Y-axis: Number of patients (0 to 25); categories: Nausea, Vomiting, Pain, Cough. Slide callouts: Alternative, Width, Spacing]

[Bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%); categories: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Clustered bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Clustered bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]

[Stacked bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 100%); bars: Group I, Group II; segments: Cough, Pain, Vomiting, Nausea]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

One variable, no gaps between bars

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 40%); categories: Nausea, Vomiting, Pain, Cough; series: Group I, Group II]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10
 Mean = 4
 Median = 3
 Mode = 1
 Midrange = 5.5
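
The four measures of location for the worked example 1, 1, 3, 5, 10 can be checked with Python's standard `statistics` module. This is a small sketch; the variable names are illustrative. The midrange is computed here as the average of the smallest and largest values, and the even-sized example 1, 3, 5, 9 shows that the median of an even number of values is the average of the two middle values.

```python
import statistics

data = [1, 1, 3, 5, 10]      # worked example from the slides (odd count)
even_data = [1, 3, 5, 9]     # second example from the slides (even count)

mean = statistics.mean(data)                 # sum of values / number of values
median = statistics.median(data)             # middle value of the sorted data
mode = statistics.mode(data)                 # most frequent value
midrange = (min(data) + max(data)) / 2       # average of the two extremes
median_even = statistics.median(even_data)   # average of the two middle values

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("midrange:", midrange)
print("median of even-sized sample:", median_even)
```

The mean (4) sits above the median (3) because the single large value 10 pulls it upward, which is the usual argument for reporting the median with skewed data.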

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35
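
The summary values above can be reproduced with Python's standard `statistics` module, under two assumptions that the slides do not state: that variance and standard deviation use the population formulas (dividing by n), and that quartiles use the "inclusive" interpolation method. Other quartile conventions give other answers for such a small data set. The percentile lines use the eleven-value example from the percentiles slide.

```python
import statistics

data = [1, 1, 3, 5, 10]
long_data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

data_range = max(data) - min(data)

# Quartiles with the "inclusive" method (assumed convention).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population formulas divide by n; they match the slide values.
variance = statistics.pvariance(data)
sd = statistics.pstdev(data)

# Sample formula divides by n - 1 and gives a larger value.
sample_variance = statistics.variance(data)

# Deciles of the longer example: cut points at 10%, 20%, ..., 90%.
deciles = statistics.quantiles(long_data, n=10, method="inclusive")
p70, p90 = deciles[6], deciles[8]

print("range:", data_range)
print("IQR:", iqr)
print("variance:", variance, "SD:", round(sd, 2))
print("sample variance:", sample_variance)
print("70th and 90th percentiles:", p70, p90)
```

When the data are a sample used to estimate a population value, the sample formulas (`statistics.variance` and `statistics.stdev`) are the more usual choice.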

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
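
As a rough illustration of the standard error and a 95% confidence interval for a mean, the sketch below uses a normal approximation (mean ± 1.96 × SE). The heart-rate values and the function name `mean_with_ci` are hypothetical. For a sample this small a t-based interval would be somewhat wider, so treat the result as an approximation.

```python
import math
import statistics


def mean_with_ci(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)      # sample SD (divides by n - 1)
    se = sd / math.sqrt(n)             # standard error of the mean
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)


# Hypothetical heart rates (beats per minute) from a sample of ten patients.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]
mean, se, (lower, upper) = mean_with_ci(heart_rates)

print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```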

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate, or not?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
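
With these two population means, the null hypothesis can be written as H0: μ1 = μ2 and a two-sided alternative as H1: μ1 ≠ μ2. The slides do not say which test is used, so the sketch below shows one simple option as an assumption: an exact permutation test on the difference in sample means. The heart-rate values and the function name `permutation_p_value` are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_1, group_2):
    """Exact two-sided permutation test for a difference in means.

    Under H0 the group labels are interchangeable, so every way of
    splitting the pooled values into two groups is equally likely.
    """
    pooled = list(group_1) + list(group_2)
    n1, n2 = len(group_1), len(group_2)
    total = sum(pooled)
    observed = abs(mean(group_1) - mean(group_2))

    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n1):
        sum_1 = sum(pooled[i] for i in indices)
        diff = abs(sum_1 / n1 - (total - sum_1) / n2)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical heart rates (beats per minute) after each drug.
inderal = [64, 70, 66, 72, 68]
new_drug = [58, 62, 60, 65, 61]

p_value = permutation_p_value(inderal, new_drug)
print(f"p = {p_value:.4f}")
```

A small p-value means a difference at least this large would rarely arise if the two drugs had the same effect; it does not by itself say how large or clinically important the difference is.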

Inferential statistics
(Informed guess)

 Type I & Type II Errors
The decision about H0 (rows) against the true state of H0 (columns):

               H0 is true      H0 is false
Accept H0      Good decision   Type II error
Reject H0      Type I error    Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study
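
The link between α, β, the p-value, and sample size can be illustrated for the simplest case, a z-test with a known standard deviation. This is a sketch under that assumption only; real studies usually need a t-test or dedicated sample-size software. The function names and the numbers in the example are hypothetical, and the power formula ignores the negligible chance of rejecting in the wrong direction.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def two_sided_p_value(z):
    """Probability of a result at least as extreme as z when H0 is true."""
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))


def power_two_sample_z(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided, two-sample z-test.

    delta: true difference in means; sigma: common standard deviation.
    """
    z_critical = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    standard_error = sigma * (2 / n_per_group) ** 0.5
    return STANDARD_NORMAL.cdf(abs(delta) / standard_error - z_critical)


print(f"p for z = 1.96: {two_sided_p_value(1.96):.3f}")

# Hypothetical design: detect a 5 bpm difference when SD is 10 bpm.
for n in (16, 32, 64):
    power = power_two_sample_z(delta=5, sigma=10, n_per_group=n)
    print(f"n = {n:>2} per group: power = {power:.2f}, beta = {1 - power:.2f}")
```

Power rises with sample size, which is why sample size is normally fixed before a study starts, once α and the smallest difference worth detecting have been chosen.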

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Scatter plot: Weight (kg), 0 to 45, against Age (y), 0 to 14, with fitted line y = 2.03x + 8]
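
Prediction with the fitted line shown in the plot, y = 2.03x + 8, amounts to substituting an age for x. The sketch below does that and also shows, as an assumption about how such a line is typically obtained, an ordinary least-squares fit for one predictor. The function names and the example ages are hypothetical; the slope and intercept are the ones printed on the slide.

```python
def predict_weight_kg(age_years, slope=2.03, intercept=8.0):
    """Predicted weight from the slide's line: weight = 2.03 * age + 8."""
    return slope * age_years + intercept


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


print(f"Predicted weight at age 10: {predict_weight_kg(10):.1f} kg")

# Points lying exactly on the line should give back the same coefficients.
ages = [2, 4, 6, 8, 10, 12]
weights = [predict_weight_kg(a) for a in ages]
slope, intercept = fit_line(ages, weights)
print(f"Recovered line: y = {slope:.2f}x + {intercept:.2f}")
```

The line should only be used within the age range it was fitted on (0 to 14 years on the plot's axis); extrapolating to adults would give implausible weights.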

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
The test result (rows) against the true disease (outcome) status (columns):

                 Disease present   Disease absent
Test positive    TP                FP
Test negative    FN                TN
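
The four cells of the diagnostic table (TP, FP, FN, TN) are the inputs to the usual accuracy measures for a dichotomous test. A minimal sketch follows; the counts and the function name `diagnostic_summary` are hypothetical.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures from the four cells of a 2x2 table."""
    return {
        # Proportion of diseased subjects the test detects.
        "sensitivity": tp / (tp + fn),
        # Proportion of disease-free subjects the test clears.
        "specificity": tn / (tn + fp),
        # Probability of disease given a positive test.
        "positive predictive value": tp / (tp + fp),
        # Probability of no disease given a negative test.
        "negative predictive value": tn / (tn + fn),
    }


# Hypothetical counts for 100 patients.
summary = diagnostic_summary(tp=40, fp=10, fn=5, tn=45)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, while the predictive values also depend on how common the disease is in the group tested.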

[Figure: two ROC curves, each plotting Sensitivity (0.0 to 1.0) against 1 - Specificity (0.0 to 1.0). Chart note: diagonal segments are produced by ties.]
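
For a continuous test, a ROC curve is obtained by treating every observed value as a possible cut-off and plotting sensitivity against 1 - specificity for each. The sketch below computes those points and the area under the curve by the trapezoidal rule. The scores, the disease labels, and the function names are hypothetical, and higher scores are assumed to indicate disease.

```python
def roc_points(scores, has_disease):
    """Return (1 - specificity, sensitivity) pairs, one per distinct cut-off.

    A subject is called positive when score >= cut-off. Tied scores share
    one cut-off, which is what produces diagonal segments in the plot.
    """
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and not d)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC points (0.5 = chance, 1.0 = perfect)."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test results for four diseased and four healthy subjects.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
has_disease = [True, True, False, True, True, False, False, False]

points = roc_points(scores, has_disease)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc:.3f}")
```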

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 34

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Postoperative Complications

Number of patients (%)

100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%

Group I

Group II

Couph
Pain
Vomiting
Nausea

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

Number of patients (%)

Postoperative Complications
Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Mid range = 4.5

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th quartile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) in decreasing heart
rate, or not?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
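
One common way to write the hypotheses for this example is sketched below, with μ1 the population mean heart rate under Inderal and μ2 the population mean under the new drug. The two-sided alternative is an assumption; the transcript does not state the hypotheses explicitly, and a one-sided alternative (μ2 < μ1) would fit the "more effective" wording equally well.

```latex
H_0 : \mu_1 = \mu_2
\qquad \text{versus} \qquad
H_1 : \mu_1 \neq \mu_2
```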

Inferential statistics
(Informed guess)

 Type I & Type II Errors
                          H0 is true       H0 is false
Decision: accept H0       Good decision    Type II error
Decision: reject H0       Type I error     Good decision
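
The two kinds of error can be seen in a small simulation. The sketch below is only illustrative: it assumes a two-sided z-test with known standard deviation 1, samples of 30, and a hypothetical true effect of 0.5 when H0 is false. All of these numbers are invented for the example.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def rejects_h0(sample, mu0=0.0, sigma=1.0, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0, sigma assumed known."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_mean, trials=2000, n=30, seed=1):
    """Fraction of simulated studies in which H0 (mean == 0) is rejected."""
    rng = random.Random(seed)
    hits = sum(
        rejects_h0([rng.gauss(true_mean, 1.0) for _ in range(n)])
        for _ in range(trials)
    )
    return hits / trials


type_1_rate = rejection_rate(true_mean=0.0)  # H0 true: every rejection is a type I error
power = rejection_rate(true_mean=0.5)        # H0 false: every rejection is a good decision
type_2_rate = 1 - power                      # H0 false but not rejected

print(type_1_rate, type_2_rate, power)
```

When H0 is true the rejection rate should sit near the chosen significance level of 0.05; when H0 is false, the share of studies that fail to reject is the type II error rate.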

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study
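
The p-value definition can be made concrete with a permutation test, sketched below. This is one possible way to obtain a p-value, not necessarily the test intended in the lecture. The heart-rate values are hypothetical, and `permutation_p_value` is an illustrative helper name.

```python
import random


def permutation_p_value(group_a, group_b, n_perm=10_000, seed=0):
    """Two-sided p-value for H0: the group labels make no difference."""

    def avg(values):
        return sum(values) / len(values)

    rng = random.Random(seed)
    observed = abs(avg(group_a) - avg(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign labels at random, as H0 allows
        diff = abs(avg(pooled[:n_a]) - avg(pooled[n_a:]))
        if diff >= observed:  # as extreme as, or more extreme than, observed
            extreme += 1

    # Add 1 to both counts so the estimate is never exactly zero.
    return (extreme + 1) / (n_perm + 1)


inderal = [68, 72, 70, 75, 71, 69, 74, 73]    # hypothetical heart rates
new_drug = [62, 66, 64, 67, 63, 65, 68, 61]   # hypothetical heart rates
p = permutation_p_value(inderal, new_drug)
print(p)
```

A small p-value says that a difference this large would rarely arise if the drug labels were irrelevant. It is then compared with the chosen α to decide whether to reject H0.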

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Figure: scatter plot of Weight (kg) against Age (y), with the fitted regression line y = 2.03x + 8]
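
Prediction with a fitted line can be sketched as follows. The `fit_line` helper is an ordinary least squares fit written for illustration, and the ages and weights passed to it are hypothetical. Only the coefficients 2.03 and 8 come from the figure.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from the line shown in the figure."""
    return slope * age_years + intercept


ages = [1, 3, 5, 8, 12]                     # hypothetical ages (y)
weights = [10.1, 13.9, 18.3, 24.0, 32.5]    # hypothetical weights (kg)
slope, intercept = fit_line(ages, weights)

print(slope, intercept)
print(predict_weight(5))   # predicted weight of a 5-year-old from the figure's line
```

Predictions are only trustworthy inside the age range that was used to fit the line (about 0 to 14 years in the figure).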

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
                          Disease present   Disease absent
Test positive             TP                FP
Test negative             FN                TN
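
The usual summary measures follow directly from the four cells of this table. The sketch below uses hypothetical counts, and `diagnostic_summary` is an illustrative name.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients correctly detected
        "specificity": tn / (tn + fp),   # healthy patients correctly cleared
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


result = diagnostic_summary(tp=90, fp=30, fn=10, tn=170)  # hypothetical counts
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test itself, whereas the predictive values also depend on how common the disease is in the group tested.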

[Figure: two ROC curves, each plotting Sensitivity against 1 - Specificity. Diagonal segments are produced by ties.]
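
For a continuous test, each possible cut-off gives one 2 x 2 table and therefore one point on the ROC curve. The sketch below builds those points and the area under the curve. The scores and labels are hypothetical, and the rule "positive when score >= cut-off" is an assumption.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per distinct cut-off.

    labels: 1 = disease present, 0 = disease absent.
    A case is called positive when its score is >= the cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]   # hypothetical test values
labels = [1, 1, 0, 1, 0, 0]               # hypothetical disease status
curve = roc_points(scores, labels)
print(curve)
print(auc(curve))
```

When a diseased and a healthy patient share the same score, one cut-off moves both coordinates at once, which is what produces a diagonal segment.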

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)
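
Censored data arise when follow-up ends before the event is seen. A common way to handle them is the Kaplan-Meier estimate, sketched below as one possible illustration; the lecture text does not name a specific method. The follow-up times and event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. death) was observed at that time,
            0 if the patient was censored (lost or still event-free)
    Returns (time, estimated survival probability) at each event time.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for ti in times if ti >= t)   # still under follow-up at t
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve


times = [2, 3, 3, 5, 8]     # hypothetical follow-up times (months)
events = [1, 0, 1, 1, 0]    # 1 = event observed, 0 = censored
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Censored patients count as at risk up to the time they leave the study and then drop out of the denominator. This lets them contribute information without being treated as events.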

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 37

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 27.5%, 9.8%, 39.2%]

Advantage:
1. Summarize (area shows relative frequency)
2. Show magnitude (relative frequency)

Disadvantage:
1. One variable only
2. Loses clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Bar chart: Postoperative Complications. Y-axis: Number of patients (0 to 25). Categories: Nausea, Vomiting, Pain, Cough. Slide annotations: Alternative, Width, spacing]

[Bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 45%). Categories: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Clustered bar chart: Postoperative Complications by group (Group I, Group II). Y-axis: Number of patients (%) (0% to 45%). Categories: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Clustered bar chart: Postoperative Complications by group (Group I, Group II). Y-axis: Number of patients (%) (0% to 45%). Categories: Nausea, Vomiting, Pain, Cough]

[Stacked bar chart: Postoperative Complications. Y-axis: Number of patients (%) (0% to 100%). Bars: Group I, Group II. Stacked segments: Cough, Pain, Vomiting, Nausea]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Chart: Postoperative Complications by group (Group I, Group II). Y-axis: Number of patients (%) (0% to 40%). Categories: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Mid range = 5.5
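
As a quick check, the four measures of location can be computed for the slide's data set 1, 1, 3, 5, 10 with Python's standard `statistics` module. This is only a sketch: the midrange is taken here as (smallest + largest) / 2, which gives 5.5 for these values, and the second list is the even-sized example 1, 3, 5, 9 from the median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)              # sum of the values divided by their count
median = statistics.median(data)          # middle value of the sorted data (odd count)
mode = statistics.mode(data)              # most frequent value
midrange = (min(data) + max(data)) / 2    # halfway between smallest and largest

# With an even number of values the median is the mean of the two middle values.
even_data = [1, 3, 5, 9]
median_even = statistics.median(even_data)

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("midrange:", midrange)
print("median of 1, 3, 5, 9:", median_even)
```

The single large value 10 pulls the mean (4) above the median (3), which illustrates why the median is often preferred for skewed data.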

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22
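
A minimal sketch of how these cut points could be computed for the 11 values above, using `statistics.quantiles` from the Python standard library. The choice of `method="inclusive"` is an assumption made for this example; other interpolation conventions (including whichever one the slides used) can give somewhat different numbers.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# n=100 returns 99 cut points; index p - 1 holds the p-th percentile.
# method="inclusive" is an assumed convention, not necessarily the slide's.
percentiles = statistics.quantiles(data, n=100, method="inclusive")

q1 = percentiles[24]    # 25th percentile = 1st quartile
q3 = percentiles[74]    # 75th percentile = 3rd quartile
p70 = percentiles[69]   # 70th percentile
p90 = percentiles[89]   # 90th percentile

print("1st quartile:", q1)
print("3rd quartile:", q3)
print("70th percentile:", p70)
print("90th percentile:", p90)
```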

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2 (dividing by n; dividing by n − 1 gives 14)
 Standard deviation = 3.35 (3.74 with the n − 1 divisor)
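
The spread measures for 1, 1, 3, 5, 10 can be reproduced as sketched below. Two assumptions are worth stating: the quartiles use `method="inclusive"` (which gives Q1 = 1 and Q3 = 5 for these data), and the slide's variance of 11.2 corresponds to the population formula that divides by n. The sample formula, which divides by n − 1, is shown alongside for comparison.

```python
import statistics

data = [1, 1, 3, 5, 10]

# Range: largest value minus smallest value.
value_range = max(data) - min(data)

# Inter-quartile range: Q3 - Q1 (quartile convention assumed: "inclusive").
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population formulas divide the sum of squared deviations by n.
pop_variance = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Sample formulas divide by n - 1 instead.
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print("range:", value_range)
print("IQR:", iqr)
print("variance (n):", pop_variance, " SD:", round(pop_sd, 2))
print("variance (n - 1):", sample_variance, " SD:", round(sample_sd, 2))
```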

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
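
To make the two terms concrete, here is a rough sketch that treats the earlier values 1, 1, 3, 5, 10 as a sample and estimates the standard error of the mean and a 95% confidence interval. It assumes a normal approximation (multiplier of about 1.96); with only five observations a t-based interval would be wider, so the numbers are illustrative only.

```python
import math
import statistics

data = [1, 1, 3, 5, 10]
n = len(data)

mean = statistics.mean(data)
sd = statistics.stdev(data)        # sample standard deviation (n - 1 divisor)

# Standard error of the mean: the SD of the sample mean as a statistic.
se = sd / math.sqrt(n)

# Normal-approximation multiplier for a two-sided 95% interval (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

ci_low = mean - z * se
ci_high = mean + z * se

print(f"mean = {mean}, SE = {se:.3f}")
print(f"approximate 95% CI: {ci_low:.2f} to {ci_high:.2f}")
```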

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate, or not?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
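
Writing μ1 for the population mean heart rate on Inderal and μ2 for the population mean heart rate on the new drug, one common way to state the hypotheses for this example is the two-sided form below. This is a sketch of the usual convention rather than a statement taken from the slides.

```latex
H_0:\ \mu_1 = \mu_2 \qquad \text{versus} \qquad H_1:\ \mu_1 \neq \mu_2
```

If the question is strictly whether the new drug lowers heart rate more, a one-sided alternative such as $H_1:\ \mu_2 < \mu_1$ could be used instead.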

Inferential statistics
(Informed guess)

 Type I & Type II Errors
                          H0 is true       H0 is false
Decision: accept H0       Good decision    Type II error
Decision: reject H0       Type I error     Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study
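
One way to see what a p-value means is an exact permutation test, sketched below for the β-blocker example. The heart rates are invented for illustration (they are not study data), and the permutation test is a stand-in chosen because it needs no distributional formulas; it is not necessarily the test the lecture had in mind. Under the null hypothesis the drug labels are interchangeable, so the p-value is the share of all possible relabelings that give a difference in means at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates in beats/min (invented for this sketch).
inderal = [72, 75, 70, 78, 74]
new_drug = [66, 69, 71, 64, 68]

observed = mean(inderal) - mean(new_drug)

pooled = inderal + new_drug
n1 = len(inderal)
n2 = len(new_drug)
total = sum(pooled)

# Enumerate every way of assigning n1 of the pooled values to "Inderal".
extreme = 0
count = 0
for idx in combinations(range(len(pooled)), n1):
    s1 = sum(pooled[i] for i in idx)
    diff = s1 / n1 - (total - s1) / n2
    count += 1
    # Two-sided: count differences at least as large in absolute value
    # (small tolerance guards against floating-point rounding).
    if abs(diff) >= abs(observed) - 1e-9:
        extreme += 1

p_value = extreme / count

alpha = 0.05  # conventional type I error rate, chosen before looking at the data
print(f"observed difference = {observed:.1f} beats/min")
print(f"p-value = {p_value:.4f} from {count} relabelings")
print("reject H0" if p_value < alpha else "do not reject H0")
```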

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Scatter plot with fitted line: Weight (kg) on the y-axis (0 to 45) against Age (y) on the x-axis (0 to 14). Fitted equation: y = 2.03x + 8]
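
Prediction with a fitted line amounts to plugging a new x value into the equation. The short sketch below uses the line y = 2.03x + 8 from the plot; the function name `predicted_weight_kg` is made up for this example, and predictions outside the plotted age range (0 to 14 years) would be extrapolation.

```python
def predicted_weight_kg(age_years: float) -> float:
    """Weight (kg) predicted from age (years) by the line y = 2.03x + 8."""
    return 2.03 * age_years + 8


for age in (2, 6, 10):
    print(f"age {age} y -> predicted weight {predicted_weight_kg(age):.2f} kg")
```

The slope says that predicted weight rises by about 2.03 kg for each additional year of age, and the intercept 8 is the predicted weight at age 0.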

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
                     Disease (outcome) present    Disease (outcome) absent
Test positive        TP                           FP
Test negative        FN                           TN
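
The two quantities plotted on a ROC curve, sensitivity and 1 − specificity, follow directly from the four cells of this table. Below is a minimal sketch; the counts are hypothetical and the function name `diagnostic_summary` is invented for the example.

```python
def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summarise a 2x2 diagnostic table (TP, FP, FN, TN counts)."""
    sensitivity = tp / (tp + fn)   # diseased patients correctly testing positive
    specificity = tn / (tn + fp)   # disease-free patients correctly testing negative
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "1 - specificity": fp / (fp + tn),   # false positive rate (ROC x-axis)
    }


# Hypothetical counts, for illustration only.
summary = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

For a continuous test, repeating this calculation at each possible cut-off and plotting sensitivity against 1 − specificity traces out the ROC curve.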

[Two ROC curve plots: Sensitivity (y-axis, 0.0 to 1.0) against 1 - Specificity (x-axis, 0.0 to 1.0). Note on each plot: Diagonal segments are produced by ties.]

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 40

Statistics

When guessing is an informed decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of Medicine
Ain Shams University

Objectives
• Define statistics and understand its terminology
• Discuss the importance of, and need for, statistics in the medical field
• Distinguish types of data and variables
• Describe types of statistics and statistical tests

What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.

[Diagram: Collecting data, Describing data, Analyzing data, Good decision]

Why do we need statistics?
Variability (atropine)
• Causes:
  • Uncontrollable (too many) factors
  • Immeasurable factors
  • Unknown factors

Why do we need statistics?
Variability (atropine)
• Effects of variability:
  • A large amount of data describing the same thing (many values for one variable)
  • No certainty: "deterministic vs probabilistic"
  • Sampling

Functions of statistics
(new hypothetical β-blocker drug)
• Description (descriptive statistics)
• Inference (inferential statistics)

Variables & Data
• A variable is something whose value can vary. For example, age, gender and blood type are variables.
• Data are the values you get when you measure a variable.

The variables (rows) and the data (values)

             Mrs Brown   Mr Patel   Ms Manda
Age          32          24         20
Gender       Female      Male       Female
Blood type   O           O          A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories.
• Gender (dichotomous, binary)
  • Male
  • Female
• Type of ICU admission
  • Medical
  • Surgical
  • Physical injuries
  • Poisoning
  • Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable
A categorical variable whose values are ordered.
• Degree of illness
  • Mild
  • Moderate
  • Severe
• Physical status
  • ASA I
  • ASA II
  • ASA III
  • ASA IV
  • ASA V
• Glasgow Coma Scale

Types of variables
2. Quantitative (numerical) variables
a) Discrete variable
• Episodes of myocardial ischemia

b) Continuous (interval, ratio) variable
• Cardiac index
• Creatinine clearance

Types of variables
Changing data scales

Sample and Population
• A sample is a group (subset) taken from a population.
• A population is the group of ALL individuals (entities) sharing specific characteristics.

Sample and Population
• All human beings
• Day-case surgical patients undergoing general anesthesia
• Low-risk CABG surgery patients
• ICU patients with septic shock
• Women undergoing CS under spinal anesthesia
• Human skeletal muscle fibers
• Cardiac muscles of rats

Descriptive statistics
• Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
• This may mean sorting the data by size, putting it into a table, presenting it in an appropriate chart, summarizing it numerically, and so on.

Descriptive statistics
• Qualitative variables
  • Frequency
  • Relative frequency
  • Cumulative frequency
  • Cross tabulation
    (what about numerical variables? grouping)
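As a minimal sketch of the four summaries listed above, the following Python snippet tabulates one qualitative variable. The severity values and the group labels are hypothetical and are not data from the lecture.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations of one ordinal variable (degree of illness).
severity = ["Mild", "Severe", "Moderate", "Mild", "Mild",
            "Moderate", "Severe", "Mild", "Moderate", "Mild"]
order = ["Mild", "Moderate", "Severe"]  # categories from lowest to highest

counts = Counter(severity)
n = len(severity)

frequency = [counts[level] for level in order]   # how many patients per category
relative = [f / n for f in frequency]            # share of the whole sample
cumulative = list(accumulate(frequency))         # running total along the ordered scale

for level, f, r, c in zip(order, frequency, relative, cumulative):
    print(f"{level:<9} {f:>3} {r:>7.1%} {c:>3}")

# Cross tabulation: count every (group, severity) combination.
# The group labels are hypothetical as well.
group = ["I", "I", "I", "I", "I", "II", "II", "II", "II", "II"]
crosstab = Counter(zip(group, severity))
for g in ("I", "II"):
    print(g, [crosstab[(g, level)] for level in order])
```

Cumulative frequency only makes sense when the categories have an order, which is why the sketch uses an ordinal variable.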

Descriptive statistics
• Qualitative variables
  • In terms of describing data, an appropriate chart is almost always a good idea.
  • What "appropriate" means depends primarily on the type of data, as well as on what particular features of it you want to explore.
  • Finally, a chart can often be used to illustrate or explain a complex situation for which a form of words or a table might be clumsy, lengthy or otherwise inadequate.

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart

[Pie chart: Postoperative Complications; slices: Nausea, Vomiting, Pain, Cough; shares shown: 23.5%, 27.5%, 9.8%, 39.2%]

Advantages:
1. Summarizes (area represents relative frequency)
2. Shows magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity if there are more than 4-5 categories
3. No cross tabulation ("separate pies")

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple

[Bar chart: Postoperative Complications; y-axis: Number of patients, 0 to 25; categories: Nausea, Vomiting, Pain, Cough]
Annotations: Alternative; Width; Spacing

[Bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; categories: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple
      • Clustered bar chart

[Clustered bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; series: Group I, Group II; categories: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple
      • Clustered bar chart
      • Stacked bar chart

[Bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 45%; series: Group I, Group II; categories: Nausea, Vomiting, Pain, Cough]

[Stacked bar chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 100%; bars: Group I, Group II; segments: Cough, Pain, Vomiting, Nausea]

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple
      • Clustered bar chart
      • Stacked bar chart
    • Histogram (one variable, no gaps between bars)

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple
      • Clustered bar chart
      • Stacked bar chart
    • Histogram
    • Frequency polygon
    • Cumulative frequency polygon (ogive)

[Chart: Postoperative Complications; y-axis: Number of patients (%), 0% to 40%; series: Group I, Group II; categories: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
    • Mean (average)

1, 1, 3, 5, 10

Advantages, disadvantages, ratio

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
    • Median

1, 1, 3, 5, 10
1, 3, 5, 9
Odd vs even number of values

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
    • Mode

1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
    • Midrange

1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)

1, 1, 3, 5, 10
• Mean = 4
• Median = 3
• Mode = 1
• Midrange = (1 + 10) / 2 = 5.5
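The four measures of location can be checked with a short Python sketch. It uses the data set from the slides and assumes the usual definition of the midrange as the average of the smallest and largest values.

```python
import statistics

data = [1, 1, 3, 5, 10]  # data set used on the slides

mean = statistics.mean(data)             # sum of the values divided by their number
median = statistics.median(data)         # middle value of the sorted data
mode = statistics.mode(data)             # most frequent value
midrange = (min(data) + max(data)) / 2   # halfway between the extremes

# With an even number of values the median is the mean of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The single large value (10) pulls the mean and the midrange upward, while the median and mode stay put. That sensitivity to extreme values is the usual argument for reporting the median for skewed data.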

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
  • Measures of degree of dispersion (summary measures of spread)
    • Range

1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
  • Measures of degree of dispersion (summary measures of spread)
    • Percentiles and quartiles

1, 1, 3, 5, 10: 1st quartile, 3rd quartile
1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22: 90th percentile, 70th percentile

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
  • Measures of degree of dispersion (summary measures of spread)
    • Inter-quartile range (Q3 - Q1)

1, 1, 3, 5, 10: 1st quartile, 3rd quartile

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
  • Measures of degree of dispersion (summary measures of spread)
    • Variance and standard deviation

1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
  • Measures of degree of dispersion (summary measures of spread)

1, 1, 3, 5, 10
• Range = 9
• Inter-quartile range = 4
• Variance = 11.2 (dividing by n; dividing by n - 1 gives 14)
• Standard deviation = 3.35 (square root of 11.2)
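A minimal Python sketch of the measures of spread for the same data set follows. Two choices here are assumptions rather than statements from the slides: quartiles are computed with the "inclusive" interpolation method, and the variance divides by n (the population formula). Both were picked because they reproduce the figures listed above; other quartile conventions give a different inter-quartile range.

```python
import statistics

data = [1, 1, 3, 5, 10]  # data set used on the slides

data_range = max(data) - min(data)

# Quartiles; method="inclusive" treats the data as the whole population.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)    # sum of squared deviations / n
pop_sd = statistics.pstdev(data)             # square root of the variance above
sample_variance = statistics.variance(data)  # sum of squared deviations / (n - 1)

print(data_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

When the data are a sample used to estimate a population value, the n - 1 version is the one normally reported.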

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)
• Making inferences about population parameters from sample statistics
  • Standard error (SD of the statistic)
  • Confidence interval (95% CI)
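To make the two ideas concrete, here is a hedged Python sketch that estimates a population mean from a sample. The heart-rate values are hypothetical, and the interval uses the normal approximation (z of about 1.96). For a sample this small a t-based interval would normally be preferred and would be somewhat wider.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min); not data from the lecture.
sample = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

n = len(sample)
mean = statistics.mean(sample)   # sample statistic that estimates the population mean
sd = statistics.stdev(sample)    # sample standard deviation (divides by n - 1)
se = sd / math.sqrt(n)           # standard error: the SD of the sample mean

# 95% confidence interval under the normal approximation.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean = {mean}, SE = {se:.2f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The standard error shrinks with the square root of the sample size, so quadrupling the sample roughly halves the width of the interval.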

Inferential statistics
(Informed guess)
• Hypothesis testing
  • Almost all clinical research begins with a question.
  • For example, is stress a risk factor for breast cancer?
  • To answer questions like this you have to transform the research question into a testable hypothesis called the null hypothesis, conventionally labeled H0.
  • This usually takes the following form:
    • H0: Stress is NOT a risk factor for breast cancer
    • H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)
• Hypothesis testing
  • Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
  • To test this null hypothesis, researchers take samples, measure outcomes, and decide whether the data from the sample provide strong enough evidence to reject the null hypothesis.
  • If the evidence against the null hypothesis is strong enough for us to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labeled H1, is probably true.

Inferential statistics
(Informed guess)
• Hypothesis testing
  • Example
    Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)
• Hypothesis testing
  • Example
    Let the mean heart rate of all people taking Inderal be μ1.
    Let the mean heart rate of all people taking the new drug be μ2.

Inferential statistics
(Informed guess)
• Type I & Type II errors

The decision about H0    H0 is true       H0 is false
Accept                   Good decision    Type II error
Reject                   Type I error     Good decision

Inferential statistics
(Informed guess)
• α: probability of committing a type I error
• β: probability of committing a type II error
• p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true.
• Sample size & power of the study (power = 1 - β)
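The p-value definition above can be illustrated with a permutation test, sketched here in Python. This is one possible technique and not necessarily the test the lecture has in mind; the heart-rate values for both drugs are hypothetical. Under H0 (no difference between the drugs) the group labels are interchangeable, so the p-value is the fraction of all relabelings that give a difference in means at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min) on each drug; illustrative only.
inderal = [78, 74, 80, 76, 82]
new_drug = [70, 72, 68, 75, 71]

observed = mean(inderal) - mean(new_drug)

pooled = inderal + new_drug
n1 = len(inderal)
n2 = len(pooled) - n1
total = sum(pooled)

extreme = 0   # relabelings at least as extreme as the observed difference
count = 0     # all possible relabelings
for idx in combinations(range(len(pooled)), n1):
    g1 = sum(pooled[i] for i in idx)
    diff = g1 / n1 - (total - g1) / n2
    count += 1
    # Two-sided comparison; the small tolerance guards against rounding error.
    if abs(diff) >= abs(observed) - 1e-9:
        extreme += 1

p_value = extreme / count
alpha = 0.05                   # accepted probability of a type I error
reject_h0 = p_value < alpha

print(f"difference = {observed:.1f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

Rejecting H0 whenever p is below α keeps the type I error rate at or below α. The type II error rate β depends on the true effect size and on the sample size.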

Inferential statistics
(Informed guess)
• How does it work? Probability distributions

Inferential statistics
(Informed guess)
• How does it work? Parametric vs non-parametric tests

Inferential statistics
(Informed guess)
• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
    • Comparing two or more factors
  • Association
  • Prediction

[Scatter plot with fitted line: Weight (kg), 0 to 45, against Age (y), 0 to 14; fitted line y = 2.03x + 8]
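As a small sketch of prediction, the fitted line shown on the chart (y = 2.03x + 8, weight in kg against age in years) can be used directly, and an ordinary least-squares fit can be written in a few lines of Python. The ages below are hypothetical, and the weights are generated from the slide's own equation, so the fit should simply recover that line.

```python
def predict_weight(age_years):
    """Predicted weight (kg) from the line on the slide: y = 2.03x + 8."""
    return 2.03 * age_years + 8


def least_squares(xs, ys):
    """Slope and intercept of the ordinary least-squares line through the points."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical ages; weights come from the slide's equation, not from patients.
ages = [2, 4, 6, 8, 10, 12]
weights = [predict_weight(a) for a in ages]

slope, intercept = least_squares(ages, weights)
print(f"weight = {slope:.2f} * age + {intercept:.2f}")
print(f"predicted weight at age 6: {predict_weight(6):.2f} kg")
```

Predictions are only trustworthy inside the age range the line was fitted on (0 to 14 years on the chart).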

Inferential statistics
(Informed guess)
• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
  • Association
  • Prediction
  • Diagnostic test (dichotomous or continuous)

Diagnostic tests

The test is:    Disease (outcome) present    Disease (outcome) absent
Positive        TP                           FP
Negative        FN                           TN
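The four cells of the table are what the usual accuracy measures are built from. The Python sketch below uses the standard definitions of sensitivity, specificity and the predictive values; the function name and the counts are hypothetical.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard accuracy measures from the four cells of a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # disease-free patients who test negative
        "ppv": tp / (tp + fp),          # positive results that are true positives
        "npv": tn / (tn + fn),          # negative results that are true negatives
    }


# Hypothetical counts: 50 patients with the disease, 100 without.
result = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.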

[Two ROC curves: Sensitivity (0.0 to 1.0) against 1 - Specificity (0.0 to 1.0). Diagonal segments are produced by ties.]
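For a continuous test, each possible cut-off gives one sensitivity and specificity pair, and the ROC curve joins those pairs. The Python sketch below builds the curve and its area with the trapezoidal rule. The scores and disease labels are hypothetical, and a result at or above the cut-off is assumed to count as positive.

```python
def roc_points(scores, labels):
    """(1 - specificity, sensitivity) for every distinct cut-off, highest first."""
    pos = sum(labels)            # patients with the disease (label 1)
    neg = len(labels) - pos      # patients without the disease (label 0)
    points = [(0.0, 0.0)]        # cut-off above every score: nobody tests positive
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test results; 1 = disease present, 0 = disease absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc:.3f}")
```

When a diseased and a disease-free patient share the same score, one cut-off moves both coordinates at once, which is what produces the diagonal segments mentioned in the chart note.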

Inferential statistics
(Informed guess)
• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
  • Association
  • Prediction
  • Diagnostic test (dichotomous or continuous)
  • Survival analysis (censored data)

Final words
• If valid data are analyzed improperly, then the results become invalid and the conclusions may well be inappropriate.
• At best, the net effect is to waste time, effort, and money for the project.
• At worst, therapeutic decisions may well be based upon invalid conclusions and patients' wellbeing may be jeopardized.

Thank you


Slide 42

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Postoperative Complications

Number of patients (%)

100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%

Group I

Group II

Couph
Pain
Vomiting
Nausea

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

Number of patients (%)

Postoperative Complications
Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10

Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
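The four location measures for the example values 1, 1, 3, 5, 10 can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative only. The midrange is taken here as (minimum + maximum) / 2, which for this set is (1 + 10) / 2 = 5.5. The second data set (1, 3, 5, 9) is the even-sized example from the median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]  # example values from the slides

mean = statistics.mean(data)            # sum of the values divided by their count
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the two extremes

# With an even number of values, the median is the mean of the two middle ones.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The mean and midrange are pulled upward by the single large value (10), while the median and mode are not, which is one reason the median is often preferred for skewed data.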

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22
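As a rough illustration, the quartiles and percentiles of the eleven example values can be computed with `statistics.quantiles`. Several conventions for percentiles exist, and the slides do not say which one is intended, so the `"inclusive"` method used below is an assumption; hand methods from a textbook may give slightly different quartiles. The helper name `percentile` is hypothetical.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # sorted example values

# 99 cut points that split the data into 100 groups;
# cut_points[k - 1] is the k-th percentile.
cut_points = statistics.quantiles(data, n=100, method="inclusive")

def percentile(k):
    """Return the k-th percentile (k from 1 to 99) of the example data."""
    return cut_points[k - 1]

q1 = percentile(25)   # 1st quartile
q3 = percentile(75)   # 3rd quartile
p70 = percentile(70)  # 70th percentile
p90 = percentile(90)  # 90th percentile

print(q1, q3, p70, p90)
```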

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35
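The spread measures listed above can be reproduced with the sketch below. It assumes the population formulas (dividing by n), because those are the ones that give 11.2 and 3.35 for this data set; the sample formulas (dividing by n - 1) give a variance of 14 instead. The quartile method (`"inclusive"`) is also an assumption, chosen because it yields an inter-quartile range of 4.

```python
import statistics

data = [1, 1, 3, 5, 10]  # example values from the slides

value_range = max(data) - min(data)

# Quartiles: Q1 and Q3 bound the middle half of the data.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance = statistics.pvariance(data)        # mean squared deviation, divides by n
std_dev = statistics.pstdev(data)            # square root of the variance above
sample_variance = statistics.variance(data)  # divides by n - 1 instead

print(value_range, iqr, variance, round(std_dev, 2), sample_variance)
```

When the data are a sample used to estimate a population's spread, the n - 1 version is the usual choice.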

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
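A minimal sketch of how a standard error and a 95% confidence interval for a mean might be computed is given below. The heart-rate values are hypothetical, the function name is illustrative, and the interval uses the normal approximation (z of about 1.96). For a sample this small, a t-based interval would be somewhat wider.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample SD (n - 1 denominator)
    se = sd / math.sqrt(n)         # standard error of the mean
    # Critical value: about 1.96 for a 95% interval.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)

heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]  # hypothetical sample
mean, se, (low, high) = mean_confidence_interval(heart_rates)
print(mean, round(se, 2), round(low, 1), round(high, 1))
```

The standard error shrinks as the sample size grows, so larger samples give narrower intervals around the same mean.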

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
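With the two population means defined, the null hypothesis can be written as H0: μ1 = μ2 and a one-sided alternative as H1: μ2 < μ1. The sketch below tests this with a permutation test, which is only one of several possible approaches and is not necessarily the test the lecture has in mind. The heart-rate values and the function name are hypothetical.

```python
import random

def permutation_p_value(group1, group2, n_permutations=10_000, seed=0):
    """One-sided p-value for H1: mean(group2) < mean(group1)."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n1 = len(group1)
    observed = sum(group1) / n1 - sum(group2) / len(group2)
    pooled = list(group1) + list(group2)
    count = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are interchangeable, so shuffle them.
        rng.shuffle(pooled)
        first, second = pooled[:n1], pooled[n1:]
        diff = sum(first) / len(first) - sum(second) / len(second)
        if diff >= observed:
            count += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0.
    return (count + 1) / (n_permutations + 1)

inderal = [72, 70, 75, 68, 74, 71, 73, 69]   # hypothetical heart rates (beats/min)
new_drug = [66, 64, 70, 63, 68, 65, 67, 62]  # hypothetical heart rates (beats/min)

p = permutation_p_value(inderal, new_drug)
print(p)
```

A small p-value means a difference this large would rarely arise if the two drugs had the same mean effect, which counts as evidence against H0.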

Inferential statistics
(Informed guess)

 Type I & Type II Errors
                        H0 is true       H0 is false
Decision: accept H0     Good decision    Type II error
Decision: reject H0     Type I error     Good decision

Inferential statistics
(Informed guess)

 α: probability of committing a type I error
 β: probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study
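The link between the type I error rate, the type II error rate and power (1 minus the type II error rate) can be illustrated by simulation. The sketch below repeatedly draws samples and applies a two-sided z-test with a known standard deviation. All numbers (means, standard deviation, sample size) and function names are assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: population mean == mu0, with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(true_mean, mu0=70.0, sigma=10.0, n=25,
                   alpha=0.05, trials=4000, seed=1):
    """Fraction of simulated studies in which H0 is rejected at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if z_test_p_value(sample, mu0, sigma) < alpha:
            rejections += 1
    return rejections / trials

type_i_rate = rejection_rate(true_mean=70.0)  # H0 true: estimates the type I error rate
power = rejection_rate(true_mean=76.0)        # H0 false: estimates the power
print(type_i_rate, power)
```

Raising the sample size `n` in this sketch increases the power for the same true difference, which is why sample size and power are planned together.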

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Figure: scatter plot of weight (kg) against age (y), ages 0 to 14, with fitted line y = 2.03x + 8.]
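Prediction with the fitted line from the figure, y = 2.03x + 8, amounts to substituting an age for x. The sketch below does this; the function name is hypothetical, and the line should only be trusted within the age range shown on the plot (roughly 0 to 14 years).

```python
def predict_weight_kg(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from the fitted line y = 2.03x + 8."""
    return slope * age_years + intercept

for age in (2, 6, 10):
    # Each extra year of age adds the slope (2.03 kg) to the prediction.
    print(age, round(predict_weight_kg(age), 2))
```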

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
                  Disease present        Disease absent
Test positive     TP (true positive)     FP (false positive)
Test negative     FN (false negative)    TN (true negative)
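From the four cells of the table, the usual accuracy measures of a dichotomous diagnostic test follow directly. The sketch below computes sensitivity, specificity and the two predictive values; the counts are hypothetical and the function name is illustrative.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures of a dichotomous test from the 2 x 2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }

metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)  # hypothetical counts
for name, value in metrics.items():
    print(name, round(value, 3))
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.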

[Figure: two ROC curves plotting sensitivity (y-axis) against 1 - specificity (x-axis), both from 0.0 to 1.0. Diagonal segments are produced by ties.]
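For a continuous diagnostic test, an ROC curve is built by sliding the positivity threshold across the observed values and recording sensitivity against 1 - specificity at each step. The sketch below does this for a small hypothetical data set and estimates the area under the curve with the trapezoidal rule. The function names and data are assumptions for illustration.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per distinct threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is called positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]  # hypothetical test values
labels = [1, 1, 0, 1, 1, 0, 0, 0]                    # 1 = disease present

points = roc_points(scores, labels)
area = auc(points)
print(points)
print(area)
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to perfect discrimination. Tied scores shared by diseased and healthy patients move both coordinates at once, which produces the diagonal segments mentioned in the figure note.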

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 45

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables and the Data

              Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories

 Gender (dichotomous, binary)
 Male
 Female

 Type of ICU admission
 Medical
 Surgical
 Physical injuries
 Poisoning
 Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered

 Degree of illness
 Mild
 Moderate
 Severe

 Physical status
 ASA I
 ASA II
 ASA III
 ASA IV
 ASA V

 Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human skeletal muscle fibers
 Cardiac muscles of rats

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation
(What about numerical variables? Grouping)
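The four summaries listed above can be sketched in a few lines of Python. This is only an illustration: the observations below are hypothetical (they are not the data behind the slides), and the helper names frequency_table and cross_tabulate are made up for this example. Cumulative frequency is most meaningful when the categories have a natural order.

```python
from collections import Counter

# Hypothetical observations of one qualitative variable (not data from the slides).
complications = ["Nausea", "Pain", "Pain", "Vomiting", "Nausea",
                 "Pain", "Cough", "Pain", "Nausea", "Vomiting"]


def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0
    # most_common() lists categories from the most to the least frequent.
    for category, freq in counts.most_common():
        running += freq
        rows.append((category, freq, freq / total, running))
    return rows


def cross_tabulate(row_values, col_values):
    """Count how often each (row category, column category) pair occurs."""
    return Counter(zip(row_values, col_values))


for category, freq, rel, cum in frequency_table(complications):
    print(f"{category:<10}{freq:>4}{rel:>8.1%}{cum:>5}")
```

For a numerical variable the same table can be built after grouping the values into intervals, which is the "grouping" idea hinted at in the last line of the slide.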

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 27.5%, 9.8%, 39.2%.]

Advantages:
1. Summarizes the data (area represents relative frequency)
2. Shows magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity if there are more than 4-5 categories
3. No cross tabulation ("separate pies")

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Bar chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (0 to 25). Annotations: Alternative, Width, Spacing.]

[Bar chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%).]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Clustered bar chart: Postoperative Complications, Group I vs Group II. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%).]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Clustered bar chart: Postoperative Complications, Group I vs Group II. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%).]

[Stacked bar chart: Postoperative Complications. x-axis: Group I, Group II; y-axis: Number of patients (%) (0% to 100%); stacked segments: Cough, Pain, Vomiting, Nausea.]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

One variable, no gaps between bars

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Chart: Postoperative Complications, Group I vs Group II. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 40%).]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 – 10
 Mean = 4
 Median = 3
 Mode = 1
 Midrange = 5.5
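As a quick check of the four measures of location, here is a small sketch using Python's standard statistics module on the example values 1, 1, 3, 5, 10. The midrange is taken to be (minimum + maximum) / 2, which for these values is (1 + 10) / 2 = 5.5. The second data set, 1, 3, 5, 9, illustrates the median of an even number of values.

```python
import statistics

data = [1, 1, 3, 5, 10]  # example values used on the slides

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the extremes

# With an even number of values the median is the mean of the two middle ones.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

Note how the single large value (10) pulls the mean above the median, while the median and mode are unaffected by it.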

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35
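The figures above can be reproduced with a short sketch using Python's statistics module. Two assumptions are worth stating: the variance of 11.2 corresponds to the population formula (dividing by n), and the quartiles are computed with the "inclusive" method, which is one of several conventions in use. The sample formula (dividing by n - 1) is shown alongside for comparison.

```python
import statistics

data = [1, 1, 3, 5, 10]  # example values used on the slides

value_range = max(data) - min(data)

# Quartiles; method="inclusive" treats the data as containing the extremes.
# Other quartile conventions can give a different inter-quartile range.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population formulas (divide by n): these match the values quoted above.
pop_variance = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Sample formulas (divide by n - 1): usual when the data are a sample.
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print(value_range, iqr, pop_variance, round(pop_sd, 2))
print(sample_variance, round(sample_sd, 2))
```

The sum of squared deviations from the mean is 56, so dividing by n = 5 gives 11.2 and dividing by n - 1 = 4 gives 14.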

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
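A minimal sketch of both ideas for a sample mean, assuming a large-sample normal approximation: the standard error is the sample standard deviation divided by the square root of n, and the 95% confidence interval is the mean plus or minus about 1.96 standard errors. The heart-rate values are hypothetical, and the function name mean_confidence_interval is made up for this example. For small samples a critical value from the t distribution would normally replace the normal quantile.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, lower limit, upper limit).

    Uses a normal approximation, so it is only a rough guide for small samples.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # SD of the sample mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return mean, se, mean - z * se, mean + z * se


# Hypothetical heart rates in beats/min (not data from the slides).
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70, 73, 78]
mean, se, lower, upper = mean_confidence_interval(heart_rates)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```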

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labeled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1.
Let the mean heart rate of all people
taking the new drug be μ2.
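With the two population means defined, the null hypothesis is presumably that they are equal and the alternative that they differ. One way to test this from two samples is sketched below using an exact permutation test. This technique is swapped in for illustration because it needs only the standard library; it is not necessarily the test used in the lecture (a two-sample t-test would be the more usual choice). The heart-rate values and the function name permutation_p_value are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group1, group2):
    """Two-sided exact permutation test for a difference in means.

    Under H0 the group labels are arbitrary, so every way of splitting the
    pooled values into groups of the original sizes is equally likely.
    The p-value is the share of splits whose mean difference is at least
    as large as the one observed.
    """
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    n2 = len(pooled) - n1
    total = sum(pooled)
    observed = abs(mean(group1) - mean(group2))
    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), n1):
        s1 = sum(pooled[i] for i in idx)
        diff = abs(s1 / n1 - (total - s1) / n2)
        count += 1
        if diff >= observed - 1e-12:  # small tolerance for rounding
            extreme += 1
    return extreme / count


# Hypothetical heart rates in beats/min (not data from the slides).
inderal = [68, 72, 70, 75, 71]
new_drug = [64, 66, 69, 63, 67]
print(f"p = {permutation_p_value(inderal, new_drug):.3f}")
```

The approach enumerates every split, so it is only practical for small samples.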

Inferential statistics
(Informed guess)

 Type I & Type II Errors
                         H0 is true       H0 is false
Decision: accept H0      Good decision    Type II error
Decision: reject H0      Type I error     Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study (power = 1 − β)
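To show how the two error probabilities feed into sample size, here is a sketch of the common normal-approximation formula for comparing two means with equal group sizes: n per group = 2 × ((z for 1 − α/2 + z for power) × σ / δ)², where power = 1 − β. The formula assumes a known common standard deviation σ and a two-sided test. The function name and the example values of δ and σ are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate patients per group needed to detect a mean difference delta.

    delta: smallest difference in means worth detecting
    sigma: assumed common standard deviation of the outcome
    alpha: accepted probability of a type I error (two-sided)
    power: 1 - beta, where beta is the accepted probability of a type II error
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical planning values: detect a 5 beats/min difference, SD 10 beats/min.
print(sample_size_per_group(delta=5, sigma=10))
```

Asking for a smaller α or a higher power increases the required sample size, as does a smaller difference to detect.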

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Scatter plot with fitted regression line y = 2.03x + 8. x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45.]
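A prediction line such as the one in the weight-versus-age plot is usually obtained by least squares. The sketch below fits slope and intercept by hand: the slope is the sum of cross-deviations divided by the sum of squared x-deviations, and the intercept makes the line pass through the point of means. The ages and weights are hypothetical and will not reproduce the slide's coefficients; fit_line is a made-up helper name.

```python
def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical ages (years) and weights (kg), not the data behind the slide.
ages = [1, 3, 5, 7, 9, 11, 13]
weights = [10.5, 14.0, 18.5, 22.0, 26.5, 30.0, 34.5]

slope, intercept = fit_line(ages, weights)
predicted = slope * 6 + intercept  # predicted weight of a 6-year-old
print(f"weight = {slope:.2f} * age + {intercept:.2f}; at age 6: {predicted:.1f} kg")
```

Predictions are only trustworthy inside the range of ages that was actually observed.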

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests

                  Disease (outcome) present   Disease (outcome) absent
Test positive     TP                          FP
Test negative     FN                          TN
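The four cells of the table give the usual summary measures of a dichotomous diagnostic test. The sketch below uses the standard definitions: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), positive predictive value = TP / (TP + FP) and negative predictive value = TN / (TN + FN). The counts are hypothetical and diagnostic_measures is a made-up helper name.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Summary measures of a dichotomous diagnostic test from its 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly detected
        "specificity": tn / (tn + fp),  # healthy patients correctly cleared
        "ppv": tp / (tp + fp),          # positive results that are true
        "npv": tn / (tn + fn),          # negative results that are true
    }


# Hypothetical counts (not data from the slides).
measures = diagnostic_measures(tp=90, fp=20, fn=10, tn=80)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.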

[Two ROC curve plots. x-axis: 1 - Specificity (0.0 to 1.0); y-axis: Sensitivity (0.0 to 1.0). Note on each plot: Diagonal segments are produced by ties.]
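For a continuous diagnostic test, a ROC curve plots sensitivity against 1 - specificity as the cut-off is moved across the observed values. The sketch below builds the points and the area under the curve (by the trapezoid rule) from hypothetical scores, where label 1 means disease present. When a diseased and a healthy patient share the same score, both rates change at once, which gives the diagonal segments mentioned in the plot note. The function names are made up for this example.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) points, one per distinct cut-off.

    A patient is called test-positive when the score is >= the cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody is test-positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoid-rule area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test scores and true disease status (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
points = roc_points(scores, labels)
print(points)
print(f"AUC = {area_under_curve(points):.3f}")
```

An area of 1.0 means the test separates the two groups perfectly, and 0.5 means it does no better than chance.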

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 47

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Postoperative Complications

Number of patients (%)

100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%

Group I

Group II

Couph
Pain
Vomiting
Nausea

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

Number of patients (%)

Postoperative Complications
Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Mid range = 4.5

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th quartile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) at decreasing heart
rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2

Inferential statistics
(Informed guess)

 Type I & Type II Errors
                         The H0 is:
The decision about H0    True             False
Accept                   Good decision    Type II error
Reject                   Type I error     Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study
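
The p-value definition above can be sketched directly in code. The slides do not name a specific test; an exact permutation test is used here only because it follows the definition literally: under the null hypothesis the group labels are interchangeable, so the p-value is the share of all possible regroupings whose difference in means is at least as extreme as the one observed. The heart-rate values and the function name are invented for illustration.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means.

    Intended for small samples of whole-number measurements.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def scaled_diff(sum_a):
        # |mean_a - mean_b| multiplied by n_a * n_b, so integers stay exact
        return abs(sum_a * n_b - (total - sum_a) * n_a)

    observed = scaled_diff(sum(group_a))
    splits = 0
    as_extreme = 0
    # Every way of choosing which pooled values form the first group
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        splits += 1
        if scaled_diff(sum_a) >= observed:
            as_extreme += 1
    return as_extreme / splits


# Hypothetical heart rates (beats/min), not taken from the slides
inderal = [72, 75, 78, 80, 74]
new_drug = [62, 65, 68, 70, 64]

p_value = permutation_p_value(inderal, new_drug)
print(f"p-value = {p_value:.4f}")
```

A small p-value means the observed difference would be unusual if the null hypothesis were true; it is then compared with the chosen α.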

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Figure: scatter plot of Weight (kg) against Age (y) with fitted line y = 2.03x + 8]
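
A small sketch of how the fitted line y = 2.03x + 8 from the weight-against-age plot could be used for prediction. The function and constant names are hypothetical, and the line is only assumed to be reasonable within the plotted age range (roughly 0 to 14 years).

```python
# Coefficients read from the fitted line on the slide: y = 2.03x + 8
SLOPE_KG_PER_YEAR = 2.03
INTERCEPT_KG = 8.0


def predict_weight_kg(age_years):
    """Predict weight (kg) from age (years) using the slide's regression line."""
    return SLOPE_KG_PER_YEAR * age_years + INTERCEPT_KG


for age in (2, 6, 10):
    print(f"age {age} y -> predicted weight {predict_weight_kg(age):.1f} kg")
```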

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
                The disease (outcome) is:
The test is:    Present    Absent
Positive        TP         FP
Negative        FN         TN
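
The four cells of the table above are usually summarized as proportions. The sketch below computes sensitivity and specificity (the quantities plotted on a ROC curve) and, as an addition not named on the slide, the positive and negative predictive values. The counts are invented for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summarize a 2 x 2 diagnostic table as proportions."""
    return {
        # Proportion of diseased subjects with a positive test
        "sensitivity": tp / (tp + fn),
        # Proportion of disease-free subjects with a negative test
        "specificity": tn / (tn + fp),
        # Proportion of positive tests that are truly diseased
        "ppv": tp / (tp + fp),
        # Proportion of negative tests that are truly disease-free
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, not taken from the slides
metrics = diagnostic_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```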

[Figure: two ROC curves, each plotting Sensitivity against 1 - Specificity on axes from 0.0 to 1.0. Diagonal segments are produced by ties.]
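
For a continuous diagnostic test, a ROC curve is traced by moving the cut-off and recording sensitivity against 1 - specificity at each step. The sketch below is one simple way to do that, with the area under the curve computed by the trapezoidal rule. The scores and function names are invented for illustration; a score shared by a diseased and a healthy subject would move both coordinates at once, which is where diagonal segments come from.

```python
def roc_points(diseased_scores, healthy_scores):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off.

    A subject tests positive when its score is >= the cut-off.
    """
    cutoffs = sorted(set(diseased_scores) | set(healthy_scores), reverse=True)
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in cutoffs:
        tp = sum(score >= cutoff for score in diseased_scores)
        fp = sum(score >= cutoff for score in healthy_scores)
        points.append((fp / len(healthy_scores), tp / len(diseased_scores)))
    return points


def area_under_curve(points):
    """Trapezoidal area under points already ordered by increasing x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test scores, not taken from the slides
diseased = [0.9, 0.8, 0.6, 0.55]
healthy = [0.7, 0.5, 0.4, 0.3]

points = roc_points(diseased, healthy)
auc = area_under_curve(points)
print("ROC points:", points)
print("AUC:", auc)
```

An area of 0.5 corresponds to a test no better than chance, and 1.0 to perfect separation.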

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 48

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance of and need for
statistics in the medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.

Collecting data → Describing data → Analyzing data → Good decision

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)
 Describe (Descriptive statistics)
 Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables… and the Data

              Mrs Brown    Mr Patel    Ms Manda
Age           32           24          20
Gender        Female       Male        Female
Blood type    O            O           A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



 Gender (dichotomous, binary)
   Male
   Female
 Type of ICU admission
   Medical
   Surgical
   Physical injuries
   Poisoning
   Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered
 Degree of illness
   Mild
   Moderate
   Severe
 Physical status
   ASA I
   ASA II
   ASA III
   ASA IV
   ASA V
 Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rats

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation
(what about numerical variables? grouping)
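
A minimal sketch of a frequency table for one qualitative variable, showing frequency, relative frequency and cumulative relative frequency. The observations and the function name are invented for illustration; categories are listed from most to least frequent, and cumulative frequency is most meaningful when the categories have a natural order.

```python
from collections import Counter


def frequency_table(observations):
    """Return rows of (category, frequency, relative freq, cumulative relative freq)."""
    counts = Counter(observations)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for category, count in counts.most_common():
        cumulative += count
        rows.append((category, count, count / total, cumulative / total))
    return rows


# Hypothetical postoperative complications for 10 patients (not the slide's data)
complications = ["Nausea"] * 4 + ["Pain"] * 3 + ["Vomiting"] * 2 + ["Cough"] * 1

table = frequency_table(complications)
for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>3}{rel:>8.0%}{cum:>8.0%}")
```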

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Figure: pie chart "Postoperative Complications" with slices for Nausea, Vomiting, Pain and Cough; slice labels 23.5%, 27.5%, 9.8%, 39.2%]

Advantage:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)

Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation (“separate pies”)

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Figure: simple bar chart "Postoperative Complications": Number of patients (0 to 25) for Nausea, Vomiting, Pain and Cough. Annotations: alternative width, spacing]
[Figure: the same bar chart with the vertical axis as Number of patients (%), 0% to 45%]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Figure: clustered bar chart "Postoperative Complications": Number of patients (%), 0% to 45%, for Nausea, Vomiting, Pain and Cough, with separate bars for Group I and Group II]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Figure: clustered bar chart "Postoperative Complications": Number of patients (%), 0% to 45%, for Nausea, Vomiting, Pain and Cough, with separate bars for Group I and Group II]
[Figure: stacked bar chart "Postoperative Complications": Number of patients (%), 0% to 100%, one bar each for Group I and Group II, divided into Cough, Pain, Vomiting and Nausea]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable, no gaps between bars

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Figure: "Postoperative Complications" chart: Number of patients (%), 0% to 40%, for Nausea, Vomiting, Pain and Cough in Group I and Group II]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10
 Mean = 4
 Median = 3
 Mode = 1
 Midrange = 5.5
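
The measures of central tendency can be checked with Python's standard `statistics` module, as in the sketch below. The midrange is taken as (minimum + maximum) / 2, which comes to 5.5 for these data, and the second data set (1, 3, 5, 9) shows the median for an even number of values.

```python
import statistics

data = [1, 1, 3, 5, 10]   # odd number of values (from the slide)
even_data = [1, 3, 5, 9]  # even number of values (from the slide)

mean = statistics.mean(data)
median_odd = statistics.median(data)         # middle value
median_even = statistics.median(even_data)   # average of the two middle values
mode = statistics.mode(data)                 # most frequent value
midrange = (min(data) + max(data)) / 2       # halfway between the extremes

print("Mean:", mean)
print("Median (odd n):", median_odd)
print("Median (even n):", median_even)
print("Mode:", mode)
print("Midrange:", midrange)
```

The single large value (10) pulls the mean and the midrange above the median, which is why the median is often preferred for skewed data.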

Slide 50

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Pie chart: Postoperative Complications (Nausea, Vomiting, Pain, Cough); slice labels 23.5%, 27.5%, 9.8%, 39.2%]

Advantages:
1. Summarizes one variable (area = relative frequency)
2. Shows magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity if there are more than 4-5 categories
3. No cross tabulation (“separate pies”)

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (0 to 25)]

Alternative – Width – Spacing

[Bar chart: Postoperative Complications; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Clustered bar chart: Postoperative Complications for Group I and Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Clustered bar chart: Postoperative Complications for Group I and Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%]

[Stacked bar chart: Postoperative Complications; x-axis: Group I, Group II; stacked segments: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 100%]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Chart: Postoperative Complications for Group I and Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 40%]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 – 10

 Mean = 4
 Median = 3
 Mode = 1
 Midrange = 5.5
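
The four measures of location can be checked with Python's standard `statistics` module. This is only a sketch on the slide's example data; the `midrange` helper is a name introduced here, taking the midrange as (minimum + maximum) / 2.

```python
from statistics import mean, median, multimode

data = [1, 1, 3, 5, 10]


def midrange(values):
    """Midpoint between the smallest and largest value."""
    return (min(values) + max(values)) / 2


data_mean = mean(data)          # sum of values divided by their count
data_median = median(data)      # middle value of the sorted data (odd n)
data_modes = multimode(data)    # most frequent value(s), returned as a list
data_midrange = midrange(data)

# Even number of values: the median is the mean of the two middle values.
even_median = median([1, 3, 5, 9])

print(data_mean, data_median, data_modes, data_midrange, even_median)
```

The single large value (10) pulls the mean above the median, which is why the median is often preferred for skewed data.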

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22
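
A short sketch of how quartiles and percentiles might be computed for the 11-value example, using `statistics.quantiles` from the Python standard library. Several percentile conventions exist and give slightly different answers; the "inclusive" method used here is one choice, not necessarily the one intended on the slide.

```python
from statistics import quantiles

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# Quartiles: three cut points dividing the sorted data into four parts.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# Percentiles: 99 cut points; index k - 1 holds the k-th percentile.
percentiles = quantiles(data, n=100, method="inclusive")
p70 = percentiles[69]
p90 = percentiles[89]

print("Q1:", q1, "median:", q2, "Q3:", q3)
print("70th percentile:", p70, "90th percentile:", p90)
```

Note that the second quartile is the median and equals the 50th percentile.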

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35
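
The spread measures for the example data can be reproduced as sketched below. The slide's variance (11.2) corresponds to dividing by n (population variance); dividing by n - 1 (sample variance) gives a different value, which is shown for comparison. The quartile method ("inclusive") is an assumption.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Inter-quartile range: Q3 - Q1 (quartile convention assumed: "inclusive").
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population variance and SD: squared deviations from the mean divided by n.
pop_variance = pvariance(data)
pop_sd = pstdev(data)

# Sample variance: same sum of squares divided by n - 1.
sample_variance = variance(data)

print(data_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

With a mean of 4, the squared deviations are 9, 9, 1, 1 and 36 (sum 56), so 56 / 5 = 11.2 and 56 / 4 = 14.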

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
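
As an illustration of the standard error and the 95% confidence interval for a mean, here is a minimal sketch. The heart-rate sample is hypothetical, `mean_confidence_interval` is a helper name introduced here, and the interval uses the normal approximation (z about 1.96); for a sample this small a t-based interval would normally be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    sample_mean = mean(sample)
    # Standard error of the mean: sample SD divided by the square root of n.
    standard_error = stdev(sample) / sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return sample_mean, standard_error, (sample_mean - margin, sample_mean + margin)


# Hypothetical heart rates (beats/min) from a sample of 10 patients.
heart_rates = [72, 68, 75, 70, 66, 74, 71, 69, 73, 72]

sample_mean, standard_error, (lower, upper) = mean_confidence_interval(heart_rates)
print(f"mean = {sample_mean}, SE = {standard_error:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The interval narrows as the sample size grows, because the standard error shrinks with the square root of n.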

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers take
samples, measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to reject the null
hypothesis.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
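
One way to test the null hypothesis that the two population means are equal is sketched below with a two-sided permutation test. This is an illustrative technique chosen here, not necessarily the test used in the lecture, and the heart-rate values and the `permutation_test` helper are hypothetical.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference and an approximate p-value.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are exchangeable, so shuffle them.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical heart rates (beats/min) in two independent samples.
inderal = [68, 72, 70, 75, 66, 71, 69, 73]
new_drug = [62, 65, 60, 67, 63, 61, 66, 64]

observed, p_value = permutation_test(inderal, new_drug)
print(f"observed difference = {observed}, p = {p_value:.4f}")
```

A small p-value means a difference at least this large would rarely arise if the two drugs had the same mean effect, so the null hypothesis would be rejected.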

Inferential statistics
(Informed guess)

 Type I & Type II Errors
                        The H0 is:
Decision about H0       True             False
Accept                  Good decision    Type II error
Reject                  Type I error     Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Scatter plot with fitted line: Weight (kg) against Age (y); fitted equation y = 2.03x + 8]
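
The fitted line from the chart, y = 2.03x + 8, can be used for prediction as in this small sketch. The function name and constants are introduced here for illustration, and predictions are only reasonable inside the age range covered by the plot (roughly 0 to 14 years).

```python
# Coefficients read from the fitted equation y = 2.03x + 8.
SLOPE_KG_PER_YEAR = 2.03
INTERCEPT_KG = 8.0


def predict_weight(age_years):
    """Predicted weight in kg for a given age in years, from the fitted line."""
    return SLOPE_KG_PER_YEAR * age_years + INTERCEPT_KG


for age in (0, 5, 10):
    print(f"age {age} y -> predicted weight {predict_weight(age):.2f} kg")
```

The slope says that predicted weight rises by about 2 kg per year of age; the intercept is the predicted weight at age 0.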

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
                  The disease (outcome) is:
The test is:      Present    Absent
Positive          TP         FP
Negative          FN         TN
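
From the four cells of this table the usual accuracy measures of a diagnostic test follow directly. The sketch below uses hypothetical counts, and the helper name `diagnostic_measures` is introduced here.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 diagnostic table."""
    return {
        # Proportion of diseased patients the test correctly detects.
        "sensitivity": tp / (tp + fn),
        # Proportion of disease-free patients the test correctly clears.
        "specificity": tn / (tn + fp),
        # Probability of disease given a positive test.
        "ppv": tp / (tp + fp),
        # Probability of no disease given a negative test.
        "npv": tn / (tn + fn),
    }


# Hypothetical counts: TP = 40, FP = 10, FN = 5, TN = 45.
measures = diagnostic_measures(tp=40, fp=10, fn=5, tn=45)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

An ROC curve plots sensitivity against 1 - specificity as the cut-off of a continuous test is varied.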

[Figure: two ROC curves, each plotting Sensitivity (y-axis) against 1 - Specificity (x-axis) from 0.0 to 1.0. Diagonal segments are produced by ties.]

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you




Slide 53

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
• Define statistics and understand its terminology
• Discuss the importance of and need for statistics in the medical field
• Distinguish types of data and variables
• Describe types of statistics & statistical tests

What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.

[Diagram: Collecting data, Describing data, Analyzing data, leading to a Good Decision]

Why do we need statistics?
Variability (atropine)
• Causes:
  • Uncontrollable (too many) factors
  • Immeasurable factors
  • Unknown factors

Why do we need statistics?
Variability (atropine)
• Effect of variability
  • Large amount of data describing the same thing (many values for one variable)
  • No certainty: “deterministic vs probabilistic”
  • Sampling

Functions of statistics
(new hypothetical β-blocker drug)
• Describe (Descriptive statistics)
• Inference (Inferential statistics)

Variables & Data
• A variable is something whose value can vary. For example, age, gender and blood type are variables.
• Data are the values you get when you measure a variable.

The Variables and the Data

              Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A

Types of variables
1. Qualitative variables (data)
a) Categorical (nominal) variable
The values (data) of a categorical variable are categories.
• Gender (dichotomous, binary)
  • Male
  • Female
• Type of ICU admission
  • Medical
  • Surgical
  • Physical injuries
  • Poisoning
  • Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable
A categorical variable whose values are ordered.
• Degree of illness
  • Mild
  • Moderate
  • Severe
• Physical status
  • ASA I
  • ASA II
  • ASA III
  • ASA IV
  • ASA V
• Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
• Episodes of myocardial ischemia

b) Continuous (interval or ratio) variable
• Cardiac index
• Creatinine clearance

Types of variables
Changing data scales

Sample and Population
• A sample is a group (subset) taken from a population.
• A population is the group of ALL individuals (entities) sharing specific characteristics.

Sample and Population
• All human beings
• Day case surgical patients undergoing general anesthesia
• Low-risk CABG surgery patients
• ICU patients with septic shock
• Women undergoing CS under spinal anesthesia
• Human skeletal muscle fibers
• Cardiac muscles of rats

Descriptive statistics
• Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
• This may mean sorting the data by size, perhaps putting it into a table, maybe presenting it in an appropriate chart, or summarizing it numerically, and so on.

Descriptive statistics
• Qualitative variables
  • Frequency
  • Relative frequency
  • Cumulative frequency
  • Cross tabulation
(What about numerical variables? Grouping)
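
The four summaries listed above can be sketched in a few lines of Python. This is only an illustration: the observations are hypothetical (they are not the data behind the charts in this deck), and `frequency_table` and `cross_tab` are names made up for the example.

```python
from collections import Counter

# Hypothetical observations of one qualitative variable (not the deck's data).
complications = ["Pain", "Nausea", "Pain", "Vomiting", "Pain",
                 "Nausea", "Cough", "Pain", "Vomiting", "Nausea"]


def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = len(values)
    rows, running = [], 0
    for category, freq in counts.most_common():  # most frequent category first
        running += freq
        rows.append((category, freq, freq / total, running))
    return rows


def cross_tab(pairs):
    """Count how often each (row value, column value) pair occurs."""
    return Counter(pairs)


for category, freq, rel, cum in frequency_table(complications):
    print(f"{category:<10}{freq:>3}{rel:>8.0%}{cum:>4}")
```

Cumulative frequency depends on the order of the categories, so it is most meaningful for ordinal variables. A numerical variable can be summarized the same way once its values are grouped into intervals.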

Descriptive statistics
• Qualitative variables
  • In terms of describing data, an appropriate chart is almost always a good idea.
  • What ‘appropriate’ means depends primarily on the type of data, as well as on what particular features of it you want to explore.
  • Finally, a chart can often be used to illustrate or explain a complex situation for which a form of words or a table might be clumsy, lengthy or otherwise inadequate.

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart

[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 27.5%, 9.8%, 39.2%]

Advantages:
1. Summarizes the data (area represents relative frequency)
2. Shows magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity with more than 4-5 categories
3. No cross tabulation (“separate pies”)

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple

[Bar chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (0 to 25). Annotations: alternative, width, spacing]

[Bar chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%)]

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple
      • Clustered bar chart

[Clustered bar chart: Postoperative Complications for Group I and Group II. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%)]

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple
      • Clustered bar chart
      • Stacked bar chart

[Bar chart: Postoperative Complications for Group I and Group II. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%)]

[Stacked bar chart: Postoperative Complications. x-axis: Group I, Group II; stacked segments: Cough, Pain, Vomiting, Nausea; y-axis: Number of patients (%) (0% to 100%)]

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple
      • Clustered bar chart
      • Stacked bar chart
    • Histogram (one variable, no gaps between bars)

Descriptive statistics
• Qualitative variables
  • Charts
    • Pie chart
    • Bar chart
      • Simple
      • Clustered bar chart
      • Stacked bar chart
    • Histogram
    • Frequency polygon
    • Cumulative frequency polygon (Ogive)

[Chart: Postoperative Complications for Group I and Group II. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 40%)]

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
    • Mean (average): advantages, disadvantages, ratio
    • Median: odd vs even number of values (examples: 1, 1, 3, 5, 10 and 1, 3, 5, 9)
    • Mode
    • Midrange

Example: 1, 1, 3, 5, 10
• Mean = 4
• Median = 3
• Mode = 1
• Midrange = (1 + 10) / 2 = 5.5
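
The four measures of location can be checked with Python's standard `statistics` module. The sketch below uses the example values 1, 1, 3, 5, 10 from the slides and takes the midrange to be (minimum + maximum) / 2, which gives 5.5 for these values; half the range, (10 - 1) / 2, would give 4.5 instead.

```python
import statistics

data = [1, 1, 3, 5, 10]  # example values from the slides

mean = statistics.mean(data)            # sum of the values divided by their count
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the extremes

# With an even number of values the median is the mean of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The single large value 10 pulls the mean (4) above the median (3), which is one reason the median is often preferred for skewed data.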

Descriptive statistics
• Numerical variables
  • Measures of central tendency (summary measures of location)
  • Measures of degree of dispersion (summary measures of spread)
    • Range
    • Percentiles and quartiles: 1st quartile, 3rd quartile, 70th percentile, 90th percentile (example data: 1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22)
    • Inter-quartile range (Q3 - Q1)
    • Variance and standard deviation

Example: 1, 1, 3, 5, 10
• Range = 9
• Inter-quartile range = 4 (Q1 = 1, Q3 = 5)
• Variance = 11.2 (dividing by n = 5)
• Standard deviation = 3.35
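
The figures quoted for 1, 1, 3, 5, 10 can be reproduced with Python's `statistics` module, under two assumptions that the slides do not state: quartiles are computed with the "inclusive" method, and the variance divides by n (the population formula). Dividing by n - 1, as is usual for a sample, gives 14 rather than 11.2.

```python
import statistics

data = [1, 1, 3, 5, 10]  # example values from the slides

value_range = max(data) - min(data)

# Assumption: "inclusive" quartiles, which treat the minimum and maximum
# as the 0th and 100th percentiles of the data.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance_n = statistics.pvariance(data)   # divides by n
variance_n1 = statistics.variance(data)   # divides by n - 1
sd_n = statistics.pstdev(data)            # square root of variance_n

print(value_range, iqr, variance_n, variance_n1, round(sd_n, 2))
```

Other quartile conventions give different answers for small data sets, so it is worth checking which one a statistics package uses before comparing results.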

Descriptive statistics
• Numerical variables

Inferential statistics
(Informed guess)
• Making inferences about population parameters from sample statistics
• Standard error (SD of the statistic)
• Confidence interval (95% CI)
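
As a rough sketch of how a sample statistic leads to an interval for a population parameter, the code below computes the standard error of a sample mean and an approximate 95% confidence interval. The heart-rate values and the function name `mean_ci` are hypothetical, and the interval uses the normal approximation (about 1.96 standard errors either side of the mean); for a sample this small a t-distribution quantile would give a somewhat wider interval.

```python
import math
import statistics


def mean_ci(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # SD of the sample mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return mean, se, (mean - z * se, mean + z * se)


# Hypothetical heart rates (beats per minute) from a sample of 10 patients.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 73]

mean, se, (lower, upper) = mean_ci(heart_rates)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```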

Inferential statistics
(Informed guess)
• Hypothesis testing
  • Almost all clinical research begins with a question.
  • For example, is stress a risk factor for breast cancer?
  • To answer questions like this you have to transform the research question into a testable hypothesis called the null hypothesis, conventionally labeled H0.
  • This usually takes the following form:
    • H0: Stress is NOT a risk factor for breast cancer
    • H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)
• Hypothesis testing
  • Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
  • To test the null hypothesis, researchers take samples, measure outcomes, and decide whether the sample data provide strong enough evidence to reject the null hypothesis.
  • If the evidence against the null hypothesis is strong enough for us to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)
• Hypothesis testing
  • Example
Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)
• Hypothesis testing
  • Example
Let the mean heart rate of all people taking Inderal be μ1.
Let the mean heart rate of all people taking the new drug be μ2.
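
Writing μ1 for the mean heart rate of all people taking Inderal and μ2 for the mean heart rate of all people taking the new drug, the hypotheses for this example might be stated as follows. The two-sided alternative is an assumption; the transcript does not show which form the lecture used.

```latex
H_0:\ \mu_1 = \mu_2
\qquad\text{versus}\qquad
H_1:\ \mu_1 \neq \mu_2
```

If only a greater reduction in heart rate by the new drug is of interest, a one-sided alternative such as H1: μ2 < μ1 could be used instead.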

Inferential statistics
(Informed guess)
• Type I & Type II errors

                        H0 is true       H0 is false
Decision: accept H0     Good decision    Type II error
Decision: reject H0     Type I error     Good decision

Inferential statistics
(Informed guess)
• α: probability of committing a type I error
• β: probability of committing a type II error
• p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true
• Sample size & power of the study (power = 1 - β)
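
The definition of the p-value can be made concrete with an exact permutation test, sketched below. This is one possible way to obtain a p-value, not necessarily the test the lecture had in mind; the heart-rate values and the function name `permutation_p_value` are hypothetical. Under the null hypothesis the group labels are interchangeable, so the p-value is the proportion of all possible relabelings whose difference in means is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = count = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # small tolerance for floating-point rounding
            extreme += 1
    return extreme / count


# Hypothetical heart rates (beats per minute) after treatment.
inderal = [70, 72, 68, 75, 71]
new_drug = [64, 66, 63, 69, 65]

p = permutation_p_value(inderal, new_drug)
print(f"p = {p:.4f}")
```

With these made-up values only 4 of the 252 possible relabelings are as extreme as the observed split, so p is about 0.016. If α had been set at 0.05 beforehand, the null hypothesis of no difference would be rejected.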

Inferential statistics
(Informed guess)
• How does it work?
  • Probability distributions
  • Parametric vs non-parametric tests

Inferential statistics
(Informed guess)
• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
    • Comparing two or more factors
  • Association

Inferential statistics
(Informed guess)
• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
  • Association
  • Prediction

[Figure: Weight (kg) plotted against Age (y), with fitted line y = 2.03x + 8]
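
The fitted line y = 2.03x + 8 from the figure can be used for prediction, and a line of this kind can be obtained by least squares. The sketch below shows both; the function names and the sample points are made up for the illustration, and the points are generated from the line itself, so the fit simply recovers it.

```python
from statistics import mean


def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from age (years) using the line shown in the figure."""
    return slope * age_years + intercept


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical points lying exactly on the line.
ages = [2, 4, 6, 8]
weights = [predict_weight(a) for a in ages]

slope, intercept = fit_line(ages, weights)
print(round(slope, 2), round(intercept, 2), predict_weight(5))
```

Predictions are only reasonable within the range of ages used to fit the line, which on the figure's axis runs from 0 to 14 years.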

Inferential statistics
(Informed guess)
• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
  • Association
  • Prediction
  • Diagnostic test (dichotomous or continuous)

Diagnostic tests

                    Disease (outcome) present   Disease (outcome) absent
Test positive       TP                          FP
Test negative       FN                          TN

(TP = true positive, FP = false positive, FN = false negative, TN = true negative)
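
The usual summary measures of a diagnostic test follow directly from the four cells TP, FP, FN and TN. The sketch below uses hypothetical counts, and `diagnostic_metrics` is a name invented for the example.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard measures derived from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly detected
        "specificity": tn / (tn + fp),  # healthy patients correctly cleared
        "ppv": tp / (tp + fp),          # positive results that are true
        "npv": tn / (tn + fn),          # negative results that are true
    }


# Hypothetical counts: 50 patients with the disease, 100 without.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a continuous test, each possible cut-off yields one such table, and plotting sensitivity against 1 - specificity over all cut-offs gives the ROC curve.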

[Figure: two ROC curves, each plotting Sensitivity against 1 - Specificity on axes from 0.0 to 1.0. Note on both panels: diagonal segments are produced by ties.]

Inferential statistics
(Informed guess)
• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
  • Association
  • Prediction
  • Diagnostic test (dichotomous or continuous)
  • Survival analysis (censored data)

Final words
• If valid data are analyzed improperly, then the results become invalid and the conclusions may well be inappropriate.
• At best, the net effect is to waste time, effort, and money for the project.
• At worst, therapeutic decisions may well be based upon invalid conclusions and patients’ wellbeing may be jeopardized.

Thank you


Slide 55

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Postoperative Complications

Number of patients (%)

100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%

Group I

Group II

Couph
Pain
Vomiting
Nausea

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

Number of patients (%)

Postoperative Complications
Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
• Numerical variables
• Measures of central tendency (summary measures of location)
• Mean (Average)

1 – 1 – 3 – 5 – 10

Advantages – Disadvantages – Ratio

Descriptive statistics
• Numerical variables
• Measures of central tendency (summary measures of location)
• Median

1 – 1 – 3 – 5 – 10
1 – 3 – 5 – 9
Odd vs even number of values

Descriptive statistics
• Numerical variables
• Measures of central tendency (summary measures of location)
• Mode

1 – 1 – 3 – 5 – 10

Descriptive statistics
• Numerical variables
• Measures of central tendency (summary measures of location)
• Midrange

1 – 1 – 3 – 5 – 10

Descriptive statistics
• Numerical variables
• Measures of central tendency (summary measures of location)

1 – 1 – 3 – 5 – 10

Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
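The four location measures for the example values can be checked with a short sketch using Python's standard `statistics` module. The midrange has no library function, so it is computed directly as (minimum + maximum) / 2, which gives (1 + 10) / 2 = 5.5 for these values. The variable names are illustrative only.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values used on the slides

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the two extremes

# With an even number of values the median is the mean of the two middle ones.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The mean and midrange are pulled upward by the single large value 10, while the median and mode are not, which is the usual argument for reporting the median with skewed data.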

Descriptive statistics
• Numerical variables
• Measures of central tendency (summary measures of location)
• Measures of degree of dispersion (summary measures of spread)
• Range

1 – 1 – 3 – 5 – 10

Descriptive statistics
• Numerical variables
• Measures of central tendency (summary measures of location)
• Measures of degree of dispersion (summary measures of spread)
• Percentiles – quartiles

1 – 1 – 3 – 5 – 10
1st quartile
3rd quartile

90th percentile

70th percentile

1 – 2 – 5 – 6 – 8 – 9 – 11 – 13 – 17 – 20 – 22
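Quartiles and percentiles for the two example data sets can be sketched as below. Several conventions for computing percentiles exist and they can give different answers on small samples; this sketch assumes the "inclusive" interpolation method of Python's `statistics.quantiles`, which is not necessarily the method used in the lecture.

```python
import statistics

small = [1, 1, 3, 5, 10]
large = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# n=4 returns the three quartile cut points (Q1, median, Q3).
q1, q2, q3 = statistics.quantiles(small, n=4, method="inclusive")

# n=10 returns the nine decile cut points: 10th, 20th, ..., 90th percentile.
deciles = statistics.quantiles(large, n=10, method="inclusive")
p70 = deciles[6]  # 70th percentile
p90 = deciles[8]  # 90th percentile

print(q1, q2, q3)
print(p70, p90)
```

With 11 sorted values the inclusive method places each decile exactly on a data point, so the 70th and 90th percentiles are the 8th and 10th values (13 and 20).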

Descriptive statistics
• Numerical variables
• Measures of central tendency (summary measures of location)
• Measures of degree of dispersion (summary measures of spread)
• Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 – 10
1st quartile
3rd quartile

Descriptive statistics
• Numerical variables
• Measures of central tendency (summary measures of location)
• Measures of degree of dispersion (summary measures of spread)
• Variance – standard deviation

1 – 1 – 3 – 5 – 10

Descriptive statistics
• Numerical variables
• Measures of central tendency (summary measures of location)
• Measures of degree of dispersion (summary measures of spread)

1 – 1 – 3 – 5 – 10
• Range = 9
• Inter-quartile range = 4
• Variance = 11.2 (population formula, dividing by n)
• Standard deviation = 3.35
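The spread measures for the example values can be reproduced with the sketch below. The variance of 11.2 corresponds to the population formula (sum of squared deviations divided by n); the sample formula (divided by n - 1) is computed as well for comparison. The quartile method ("inclusive") is an assumption, chosen because it gives an inter-quartile range of 4.

```python
import statistics

data = [1, 1, 3, 5, 10]

data_range = max(data) - min(data)

# Quartiles by the "inclusive" method; other conventions may differ.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_var = statistics.pvariance(data)    # squared deviations / n
pop_sd = statistics.pstdev(data)        # square root of pop_var
sample_var = statistics.variance(data)  # squared deviations / (n - 1)
sample_sd = statistics.stdev(data)      # square root of sample_var

print(data_range, iqr)
print(pop_var, round(pop_sd, 2))
print(sample_var, round(sample_sd, 2))
```

Statistical packages usually report the sample versions by default, so a variance of 14 and a standard deviation of about 3.74 for the same five values would not be an error.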

Descriptive statistics
• Numerical variables

Inferential statistics
(Informed guess)

• Making inference about population parameters from sample statistics
• Standard Error (SD of the statistic)
• Confidence interval (95% CI)
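As a minimal sketch of these two ideas, the standard error of a sample mean is the sample SD divided by the square root of n, and an approximate 95% confidence interval is the mean plus or minus about 1.96 standard errors. The heart-rate values are hypothetical, and the normal approximation is an assumption; for a sample this small a t-distribution multiplier would normally be used and would give a slightly wider interval.

```python
import math
import statistics

# Hypothetical sample of heart rates (beats/min), for illustration only.
sample = [72, 75, 68, 80, 77, 71, 74, 69, 78, 76]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)  # sample SD (divides by n - 1)
se = sd / math.sqrt(n)         # standard error of the mean

# 97.5th percentile of the standard normal distribution, about 1.96.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = mean - z * se
ci_high = mean + z * se

print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

The SD describes the spread of individual patients, while the SE describes how precisely the sample mean estimates the population mean; the SE shrinks as n grows, the SD does not.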

Inferential statistics
(Informed guess)

• Hypothesis testing
• Almost all clinical research begins with a question.
• For example, is stress a risk factor for breast cancer?
• To answer questions like this, you have to transform the research question into a testable hypothesis called the null hypothesis, conventionally labeled H0.
• This usually takes the following form:
• H0: Stress is NOT a risk factor for breast cancer
• H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

• Hypothesis testing
• Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
• To test this null hypothesis, researchers take samples, measure outcomes, and decide whether the data from the sample provide strong enough evidence to reject the null hypothesis.
• If the evidence against the null hypothesis is strong enough for us to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labeled H1, is probably true.

Inferential statistics
(Informed guess)

• Hypothesis testing
• Example
Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) at decreasing heart rate, or not?

Inferential statistics
(Informed guess)

• Hypothesis testing
• Example
Let the mean heart rate of all people taking Inderal be μ1.
Let the mean heart rate of all people taking the new drug be μ2.

Inferential statistics
(Informed guess)

• Type I & Type II Errors

                          H0 is true       H0 is false
Decision: accept H0       Good decision    Type II error
Decision: reject H0       Type I error     Good decision

Inferential statistics
(Informed guess)

• α: probability of committing a type I error
• β: probability of committing a type II error
• p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true.
• Sample size & power of the study (power = 1 – β)
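The definition of the p-value can be made concrete with an exact permutation test, sketched below. This is a swapped-in illustration, not necessarily the test used in the lecture: if H0 is true the drug labels are interchangeable, so the p-value is the fraction of all possible relabelings whose difference in mean heart rate is at least as extreme as the one observed. The heart-rate values, the variable names and the significance level of 0.05 are assumptions.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min); not real trial data.
inderal = [78, 82, 75, 80, 77, 84]
new_drug = [70, 74, 69, 76, 72, 71]

observed = abs(mean(inderal) - mean(new_drug))

pooled = inderal + new_drug
n_a = len(inderal)
n_b = len(pooled) - n_a
total_sum = sum(pooled)

extreme = 0
count = 0
# Under H0 every way of assigning n_a of the pooled values to the first
# group is equally likely, so enumerate all of them.
for idx in combinations(range(len(pooled)), n_a):
    sum_a = sum(pooled[i] for i in idx)
    diff = abs(sum_a / n_a - (total_sum - sum_a) / n_b)
    if diff >= observed - 1e-12:  # small tolerance for float rounding
        extreme += 1
    count += 1

p_value = extreme / count
alpha = 0.05  # assumed significance level
reject_h0 = p_value < alpha

print(f"p = {p_value:.4f}, reject H0: {reject_h0}")
```

Rejecting H0 whenever p is below α keeps the probability of a type I error at or below α; it says nothing directly about β, which depends on the sample size and the true size of the effect.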

Inferential statistics
(Informed guess)

• How does it work? Probability distributions

Inferential statistics
(Informed guess)

• How does it work? Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
• Comparisons
• One sample
• Two independent samples
• Two dependent samples
• More than two samples (independent – dependent)
• Comparing two or more factors
• Association

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
• Comparisons
• One sample
• Two independent samples
• Two dependent samples
• More than two samples (independent – dependent)
• Association
• Prediction

[Scatter plot with fitted regression line y = 2.03x + 8. X axis: Age (y), 0 to 14. Y axis: Weight (kg), 0 to 45.]
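The regression line on this slide, y = 2.03x + 8, predicts weight (kg) from age (years). The sketch below applies that equation and also shows how such a line can be fitted by ordinary least squares. The function names and the five (age, weight) pairs are hypothetical; the slide's own data points are not recoverable from the transcript.

```python
def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from age (years) using the line on the slide."""
    return slope * age_years + intercept


def least_squares(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical (age, weight) pairs, invented for illustration.
ages = [2, 4, 6, 8, 10]
weights = [12, 16, 20, 25, 28]
slope, intercept = least_squares(ages, weights)

print(predict_weight(6))
print(round(slope, 2), round(intercept, 2))
```

A fitted line should only be used for prediction within the range of ages that were observed (0 to 14 years on the slide's axis); extrapolating to adults would give nonsense.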

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
• Comparisons
• One sample
• Two independent samples
• Two dependent samples
• More than two samples (independent – dependent)
• Association
• Prediction
• Diagnostic test (dichotomous – continuous)

Diagnostic tests

                    Disease present    Disease absent
Test positive       TP                 FP
Test negative       FN                 TN
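The four cells of the diagnostic table give the usual summary measures of a dichotomous test. The sketch below uses the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), and the two predictive values); the function name and the counts are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summary measures of a dichotomous diagnostic test from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are true positives
        "npv": tn / (tn + fn),          # negative tests that are true negatives
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test itself, whereas the predictive values also depend on how common the disease is in the group being tested.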

[Two ROC curves. Y axis: Sensitivity (0.0 to 1.0). X axis: 1 - Specificity (0.0 to 1.0). Note on each plot: Diagonal segments are produced by ties.]
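For a continuous diagnostic test, an ROC curve is built by trying every possible cut-off and plotting sensitivity against 1 - specificity. A minimal sketch follows, with the area under the curve obtained by the trapezoidal rule. The scores, the disease labels and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) points, one per cut-off.

    labels: 1 = disease present, 0 = disease absent.
    A result is called positive when its score is >= the cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nothing is positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


# Hypothetical test results, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)

print(points)
print(auc)
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to a perfect test; lowering the cut-off always trades specificity for sensitivity.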

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
• Comparisons
• One sample
• Two independent samples
• Two dependent samples
• More than two samples (independent – dependent)
• Association
• Prediction
• Diagnostic test (dichotomous – continuous)
• Survival analysis (censored data)

Final words
• If valid data are analyzed improperly, then the results become invalid and the conclusions may well be inappropriate.
• At best, the net effect is to waste time, effort, and money for the project.
• At worst, therapeutic decisions may well be based upon invalid conclusions and patients’ wellbeing may be jeopardized.

Thank you


Slide 58

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)
 Describe (Descriptive statistics)
 Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables and the Data

Variable      Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
 Gender (dichotomous, binary)
 Male
 Female
 Type of ICU admission
 Medical
 Surgical
 Physical injuries
 Poisoning
 Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
 Degree of illness
 Mild
 Moderate
 Severe
 Physical status
 ASA I
 ASA II
 ASA III
 ASA IV
 ASA V
 Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human skeletal muscle fibers
 Cardiac muscles of rats

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation
(What about numerical variables? Grouping)
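
The frequency measures listed here can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the patient values are hypothetical, and the helper names frequency_table and cross_tabulate are made up for this example. Cumulative frequency is shown for an ordinal variable (degree of illness), because the running total is only meaningful when the categories have an order.

```python
from collections import Counter

# Hypothetical ordinal data: degree of illness for 10 patients (illustration only).
severity = ["Mild", "Severe", "Moderate", "Mild", "Mild",
            "Moderate", "Severe", "Mild", "Moderate", "Mild"]
order = ["Mild", "Moderate", "Severe"]  # ordinal categories, lowest to highest


def frequency_table(values, categories):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = len(values)
    rows, running = [], 0
    for cat in categories:
        freq = counts.get(cat, 0)
        running += freq  # cumulative frequency up to and including this category
        rows.append((cat, freq, freq / total, running))
    return rows


def cross_tabulate(row_values, col_values):
    """Count how often each (row value, column value) pair occurs."""
    return Counter(zip(row_values, col_values))


table = frequency_table(severity, order)
for cat, freq, rel, cum in table:
    print(f"{cat:<10}{freq:>4}{rel:>8.0%}{cum:>4}")

# Hypothetical second variable for the cross tabulation: gender of the same patients.
gender = ["F", "M", "F", "F", "M", "M", "F", "M", "F", "M"]
crosstab = cross_tabulate(severity, gender)
for (sev, sex), n in sorted(crosstab.items()):
    print(sev, sex, n)
```

A numerical variable can be tabulated the same way once its values are grouped into intervals, which is the "grouping" the slide hints at.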

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 27.5%, 9.8%, 39.2%]

Advantage:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)

Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Bar chart: Postoperative Complications. y-axis: Number of patients (0 to 25); x-axis: Nausea, Vomiting, Pain, Cough. Annotations: Alternative, Width, Spacing]

[Bar chart: Postoperative Complications. y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Clustered bar chart: Postoperative Complications by group (Group I, Group II). y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Clustered bar chart: Postoperative Complications by group (Group I, Group II). y-axis: Number of patients (%) (0% to 45%); x-axis: Nausea, Vomiting, Pain, Cough]

[Stacked bar chart: Postoperative Complications. y-axis: Number of patients (%) (0% to 100%); x-axis: Group I, Group II; stacked segments: Cough, Pain, Vomiting, Nausea]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable, no gaps between bars

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Chart: Postoperative Complications by group (Group I, Group II). y-axis: Number of patients (%) (0% to 40%); x-axis: Nausea, Vomiting, Pain, Cough]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

Odd number of values: 1 – 1 – 3 – 5 – 10 (median = 3, the middle value)
Even number of values: 1 – 3 – 5 – 9 (median = (3 + 5) / 2 = 4)

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 – 10
 Mean = 4
 Median = 3
 Mode = 1
 Midrange = 5.5
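
The four measures of location can be checked with Python's standard statistics module. This is a small sketch using the slide's example values; the midrange has no library function, so it is computed directly as the average of the smallest and largest value, (1 + 10) / 2 = 5.5.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values used on the slides

mean = statistics.mean(data)              # sum of values / number of values -> 4
median = statistics.median(data)          # middle value of the sorted data -> 3
mode = statistics.mode(data)              # most frequent value -> 1
midrange = (min(data) + max(data)) / 2    # average of the extremes -> 5.5

# With an even number of values the median is the mean of the two middle values.
median_even = statistics.median([1, 3, 5, 9])  # (3 + 5) / 2 -> 4.0

print(mean, median, mode, midrange, median_even)
```

Note how the single large value (10) pulls the mean and the midrange upward while the median stays at 3, which is why the median is often preferred for skewed data.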

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 – 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2 (dividing by n; dividing by n − 1 gives 14)
 Standard deviation = 3.35
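
The measures of spread can be reproduced with the standard statistics module (Python 3.8 or later for quantiles). Two conventions are assumptions of this sketch rather than something the slides state: the quartiles use the "inclusive" interpolation method, and the variance is the population form (dividing by n). With those choices the numbers agree with the slide; other conventions give different but equally legitimate values.

```python
import statistics

data = [1, 1, 3, 5, 10]  # the example values used on the slides

value_range = max(data) - min(data)  # largest minus smallest value

# Quartile values depend on the interpolation convention; "inclusive" is one common choice.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # inter-quartile range, Q3 - Q1

pop_variance = statistics.pvariance(data)    # divides the sum of squares by n
sample_variance = statistics.variance(data)  # divides the sum of squares by n - 1
pop_sd = statistics.pstdev(data)             # square root of the population variance

print(value_range, q1, q3, iqr)
print(round(pop_variance, 2), sample_variance, round(pop_sd, 2))
```

When the five values are treated as a sample from a larger population, the n − 1 form (sample_variance) is the usual choice.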

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
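
As a rough sketch of how a standard error and a 95% confidence interval for a mean are obtained: the standard error is the sample SD divided by the square root of the sample size, and the interval is the mean plus or minus a multiplier times the standard error. The heart-rate values and the function name mean_confidence_interval are hypothetical, and the multiplier comes from the normal distribution (about 1.96), which is an approximation; for small samples a t-distribution multiplier is normally used instead.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, lower, upper) using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)   # sample SD (divides by n - 1)
    se = sd / math.sqrt(n)          # standard error of the mean
    # Two-sided multiplier from the standard normal distribution (about 1.96 for 95%).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, mean - z * se, mean + z * se


# Hypothetical heart rates (beats/min), for illustration only.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]
mean, se, lower, upper = mean_confidence_interval(heart_rates)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

A larger sample shrinks the standard error and therefore narrows the interval around the sample mean.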

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.,
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
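
One way to test a "no difference between the two means" null hypothesis on small samples is an exact permutation test, sketched below. This is an illustration under stated assumptions, not the method used in the slides: the heart-rate values are hypothetical, the function name permutation_p_value is made up, and the test is two-sided. It asks how often a random relabelling of the patients gives a difference in group means at least as large as the one observed.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for floating-point rounding
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical heart rates (beats/min) after each drug, for illustration only.
inderal = [70, 72, 68, 75, 71]
new_drug = [64, 66, 63, 69, 65]
p_value = permutation_p_value(inderal, new_drug)
print(f"p-value = {p_value:.3f}")
```

A small p-value means the observed difference would be unusual if the two drugs had the same effect; whether that is "small enough" depends on the significance level chosen before the study.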

Inferential statistics
(Informed guess)

 Type I & Type II Errors
Decision about H0    H0 is true       H0 is false
Accept H0            Good decision    Type II error
Reject H0            Type I error     Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study (power = 1 − β)

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Scatter plot with fitted regression line y = 2.03x + 8. x-axis: Age (y), 0 to 14; y-axis: Weight (kg), 0 to 45]
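
Prediction with a fitted line can be sketched as follows. The equation y = 2.03x + 8 (weight in kg from age in years) is the one shown on the plot; the function names and the small data set used to demonstrate the least-squares fit are hypothetical and only illustrate how such a line is obtained.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) from age (years) using the line shown on the plot."""
    return slope * age_years + intercept


# Hypothetical ages (years) and weights (kg), for illustration only.
ages = [2, 4, 6, 8, 10, 12]
weights = [12.0, 16.5, 20.0, 24.0, 28.5, 32.5]
slope, intercept = fit_line(ages, weights)
print(f"fitted line: y = {slope:.2f}x + {intercept:.2f}")
print(f"predicted weight at age 10: {predict_weight(10):.1f} kg")
```

Predictions are only trustworthy inside the range of ages used to fit the line (here roughly 0 to 14 years on the plot).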

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests

                  Disease (outcome) present   Disease (outcome) absent
Test positive     TP                          FP
Test negative     FN                          TN
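
The four cells of the diagnostic table give the usual accuracy measures for a dichotomous test. The sketch below uses the standard definitions; the counts and the function name diagnostic_metrics are hypothetical and serve only as an illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from the four cells of a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly detected
        "specificity": tn / (tn + fp),  # disease-free patients correctly cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, while the predictive values also depend on how common the disease is in the group being tested.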

[Figure: two ROC curves. x-axis: 1 - Specificity (0.0 to 1.0); y-axis: Sensitivity (0.0 to 1.0). Plot note: Diagonal segments are produced by ties.]
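
For a continuous test result, an ROC curve is built by trying each observed value as the cut-off for "positive" and plotting sensitivity against 1 - specificity. The sketch below is a minimal illustration with hypothetical scores and made-up function names; the area under the curve is computed with the trapezoidal rule. When a diseased and a disease-free patient share the same score, one cut-off moves both coordinates at once, which is what produces a diagonal segment.

```python
def roc_points(scores, has_disease):
    """Return (1 - specificity, sensitivity) pairs, one per candidate cut-off."""
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        # A patient tests positive when the score is at or above the cut-off.
        tp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, has_disease) if s >= cutoff and not d)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test scores and true disease status, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
has_disease = [True, True, False, True, True, False, False, False]
points = roc_points(scores, has_disease)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc:.3f}")
```

An area of 0.5 corresponds to a test no better than chance, and an area of 1.0 to a test that separates the two groups perfectly.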

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 60

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Postoperative Complications

Number of patients (%)

100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%

Group I

Group II

Couph
Pain
Vomiting
Nausea

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

Number of patients (%)

Postoperative Complications
Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Mid range = 4.5

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th quartile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
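A minimal sketch of both ideas for a sample mean, reusing the five values from the earlier slides purely for illustration. It assumes the normal approximation (z of about 1.96) for the 95% interval; with so few observations a t-distribution critical value would give a wider interval.

```python
import math
import statistics

sample = [1, 1, 3, 5, 10]   # illustrative sample, reused from the earlier slides
n = len(sample)

mean = statistics.mean(sample)
sd = statistics.stdev(sample)        # sample standard deviation (n - 1)
se = sd / math.sqrt(n)               # standard error of the mean

# z value leaving 2.5% in each tail of the standard normal distribution
z = statistics.NormalDist().inv_cdf(0.975)

ci_low = mean - z * se
ci_high = mean + z * se

print(f"mean = {mean}, SE = {se:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

The standard error shrinks as the sample size grows, so larger samples give narrower confidence intervals.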

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2

Inferential statistics
(Informed guess)

 Type I & Type II Errors
Decision about H0    H0 is true       H0 is false
Accept               Good decision    Type II error
Reject               Type I error     Good decision

Inferential statistics
(Informed guess)

 α: probability of committing a type I error
 β: probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study
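To make the p-value definition concrete, here is a sketch of an exact permutation test for the β-blocker example. The heart-rate values are hypothetical, invented only for illustration, and the permutation test is one possible technique rather than the method named in the lecture. It counts how many ways of splitting the pooled values into two groups give a difference in means at least as extreme as the one observed.

```python
from itertools import combinations

# Hypothetical heart rates (beats/min); not real trial data
inderal = [72, 75, 78, 80, 76]
new_drug = [65, 68, 70, 66, 71]

pooled = inderal + new_drug
n_a = len(inderal)
n_b = len(pooled) - n_a
total = sum(pooled)


def diff_in_means(sum_a):
    """Difference in group means, given the sum of the first group."""
    return sum_a / n_a - (total - sum_a) / n_b


observed = abs(diff_in_means(sum(inderal)))

# Under H0 the group labels are exchangeable, so every split is equally likely
n_splits = 0
as_extreme = 0
for idx in combinations(range(len(pooled)), n_a):
    n_splits += 1
    d = abs(diff_in_means(sum(pooled[i] for i in idx)))
    if d >= observed - 1e-12:   # small tolerance for floating-point rounding
        as_extreme += 1

p_value = as_extreme / n_splits   # two-sided p-value
alpha = 0.05                      # accepted type I error probability
reject_h0 = p_value < alpha

print(f"p = {p_value:.4f}, reject H0: {reject_h0}")
```

Here only the observed split and its mirror image are as extreme as the data, so the p-value is small and H0 would be rejected at α = 0.05.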

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Chart: Weight (kg) plotted against Age (y), with fitted line y = 2.03x + 8]
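The fitted line from the chart can be used for prediction, as in this sketch. The function name is illustrative, and the coefficients are simply those printed on the chart; predictions are only sensible within the age range shown on its axis (roughly 0 to 14 years).

```python
def predicted_weight_kg(age_years: float) -> float:
    """Predict weight from age using the line y = 2.03x + 8 from the chart."""
    slope = 2.03      # kg gained per year of age
    intercept = 8.0   # predicted weight at age 0
    return slope * age_years + intercept


for age in (2, 6, 10):
    print(age, "years ->", round(predicted_weight_kg(age), 1), "kg")
```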

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
                 Disease present    Disease absent
Test positive    TP                 FP
Test negative    FN                 TN
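Sensitivity and specificity, the two quantities plotted on a ROC curve, follow directly from the four cells of this table. The counts below are hypothetical and serve only to illustrate the arithmetic.

```python
# Hypothetical counts for the four cells of the 2x2 table
tp = 40   # test positive, disease present
fp = 10   # test positive, disease absent
fn = 5    # test negative, disease present
tn = 45   # test negative, disease absent

sensitivity = tp / (tp + fn)   # proportion of diseased patients detected
specificity = tn / (tn + fp)   # proportion of healthy patients correctly cleared
false_positive_rate = 1 - specificity   # x-axis of a ROC curve

print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

For a continuous test, each possible cut-off value gives one such table and therefore one point (1 - specificity, sensitivity) on the ROC curve.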

ROC Curve
[Two charts: Sensitivity plotted against 1 - Specificity, both axes from 0.0 to 1.0. Diagonal segments are produced by ties.]

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 63

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
• Qualitative variables
    • Frequency
    • Relative frequency
    • Cumulative frequency
    • Cross tabulation
(What about numerical variables? Grouping.)
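The four summaries listed on this slide can be sketched in a few lines of Python. The patient records, group labels and counts below are hypothetical values chosen for illustration; they are not data from the lecture.

```python
from collections import Counter

# Hypothetical observations: one (group, complication) pair per patient.
records = (
    [("Group I", "Nausea")] * 6 + [("Group I", "Vomiting")] * 4
    + [("Group II", "Nausea")] * 2 + [("Group II", "Vomiting")] * 8
)

def frequency_table(values):
    """Return rows of (category, frequency, relative, cumulative relative)."""
    counts = Counter(values)
    total = sum(counts.values())
    rows, running = [], 0
    for category in sorted(counts):
        running += counts[category]
        rows.append((category, counts[category],
                     counts[category] / total, running / total))
    return rows

def cross_tabulation(pairs):
    """Count how often each (row, column) combination occurs."""
    return Counter(pairs)

table = frequency_table(complication for _, complication in records)
crosstab = cross_tabulation(records)

for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>4}{rel:>8.1%}{cum:>8.1%}")
for (group, complication), n in sorted(crosstab.items()):
    print(f"{group:<10}{complication:<10}{n:>4}")
```

Cumulative frequency is most informative when the categories have a natural order (ordinal data); for unordered categories the running total depends on the arbitrary order in which the rows are listed.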

Descriptive statistics
• Qualitative variables
    • For describing data, an appropriate chart is almost always a good idea.
    • What ‘appropriate’ means depends primarily on the type of data, as well as on which features of it you want to explore.
    • Finally, a chart can often illustrate or explain a complex situation for which words or a table might be clumsy, lengthy or otherwise inadequate.

Descriptive statistics
• Qualitative variables
    • Charts
        • Pie chart

[Figure: pie chart "Postoperative Complications" with four slices (Nausea, Vomiting, Pain, Cough); slice sizes 23.5%, 27.5%, 9.8% and 39.2%]

Advantages:
1. Summarizes the data (slice area shows relative frequency)
2. Shows magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity with more than 4-5 categories
3. No cross tabulation (separate pies are needed)

Descriptive statistics
• Qualitative variables
    • Charts
        • Pie chart
        • Bar chart
            • Simple

[Figure: simple bar chart "Postoperative Complications"; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: number of patients, 0 to 25. Annotations: alternative, width, spacing]

[Figure: the same bar chart with the y-axis as number of patients (%), 0% to 45%]

Descriptive statistics
• Qualitative variables
    • Charts
        • Pie chart
        • Bar chart
            • Simple
            • Clustered bar chart

[Figure: clustered bar chart "Postoperative Complications" for Group I and Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: number of patients (%), 0% to 45%]

Descriptive statistics
• Qualitative variables
    • Charts
        • Pie chart
        • Bar chart
            • Simple
            • Clustered bar chart
            • Stacked bar chart

[Figure: clustered bar chart "Postoperative Complications" for Group I and Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: number of patients (%), 0% to 45%]

[Figure: stacked bar chart "Postoperative Complications"; one bar each for Group I and Group II, stacked by Cough, Pain, Vomiting and Nausea; y-axis: number of patients (%), 0% to 100%]

Descriptive statistics
• Qualitative variables
    • Charts
        • Pie chart
        • Bar chart
            • Simple
            • Clustered bar chart
            • Stacked bar chart
        • Histogram: one variable, no gaps between bars

Descriptive statistics
• Qualitative variables
    • Charts
        • Pie chart
        • Bar chart
            • Simple
            • Clustered bar chart
            • Stacked bar chart
        • Histogram
        • Frequency polygon
        • Cumulative frequency polygon (ogive)

[Figure: chart "Postoperative Complications" for Group I and Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: number of patients (%), 0% to 40%]

Descriptive statistics
• Numerical variables
    • Measures of central tendency (summary measures of location)
        • Mean (average)

Data: 1, 1, 3, 5, 10

Advantages – Disadvantages – Ratio

Descriptive statistics
• Numerical variables
    • Measures of central tendency (summary measures of location)
        • Median

Odd number of values: 1, 1, 3, 5, 10
Even number of values: 1, 3, 5, 9
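The odd versus even distinction can be made concrete with a short sketch. The function below is a minimal illustration written for this note (the name `median_of` is hypothetical): with an odd count the median is the middle value, and with an even count it is the mean of the two middle values.

```python
def median_of(values):
    """Middle value for an odd count; mean of the two middle values for an even count."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([1, 1, 3, 5, 10]))  # odd count: the middle value, 3
print(median_of([1, 3, 5, 9]))      # even count: (3 + 5) / 2 = 4.0
```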

Descriptive statistics
• Numerical variables
    • Measures of central tendency (summary measures of location)
        • Mode

Data: 1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
    • Measures of central tendency (summary measures of location)
        • Midrange

Data: 1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
    • Measures of central tendency (summary measures of location)

Data: 1, 1, 3, 5, 10
    • Mean = 4
    • Median = 3
    • Mode = 1
    • Midrange = 5.5
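As a quick check, the four measures of location can be computed for the example data with Python's standard `statistics` module. This is only a sketch; the midrange is taken here as (minimum + maximum) / 2, which gives 5.5 for this data set.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean_value = statistics.mean(data)        # sum of the values / number of values
median_value = statistics.median(data)    # middle value of the sorted data
mode_value = statistics.mode(data)        # most frequent value
midrange = (min(data) + max(data)) / 2    # halfway between the two extremes

print(mean_value, median_value, mode_value, midrange)
```

The single large value (10) pulls the mean and the midrange upward, while the median and mode are unaffected by it.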

Descriptive statistics
• Numerical variables
    • Measures of central tendency (summary measures of location)
    • Measures of dispersion (summary measures of spread)
        • Range

Data: 1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
    • Measures of central tendency (summary measures of location)
    • Measures of dispersion (summary measures of spread)
        • Percentiles and quartiles

Data: 1, 1, 3, 5, 10
1st quartile
3rd quartile

90th percentile

70th percentile

Data: 1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22
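A minimal sketch of how a percentile can be computed for the eleven-value data set follows. Several conventions for percentiles exist and they can give slightly different answers; the linear-interpolation rule used here is one common choice and is an assumption, not necessarily the rule used in the lecture. The function name `percentile` is introduced only for this illustration.

```python
def percentile(values, p):
    """p-th percentile (0 to 100) by linear interpolation between sorted values."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * p / 100   # fractional index into the sorted data
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]
for p in (25, 75, 90, 70):
    print(p, percentile(data, p))
```

Under this rule the 1st quartile (25th percentile) is 5.5, the 3rd quartile (75th percentile) is 15.0, the 90th percentile is 20.0 and the 70th percentile is 13.0.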

Descriptive statistics
• Numerical variables
    • Measures of central tendency (summary measures of location)
    • Measures of dispersion (summary measures of spread)
        • Inter-quartile range (Q3 – Q1)

Data: 1, 1, 3, 5, 10
1st quartile
3rd quartile

Descriptive statistics
• Numerical variables
    • Measures of central tendency (summary measures of location)
    • Measures of dispersion (summary measures of spread)
        • Variance and standard deviation

Data: 1, 1, 3, 5, 10

Descriptive statistics
• Numerical variables
    • Measures of central tendency (summary measures of location)
    • Measures of dispersion (summary measures of spread)

Data: 1, 1, 3, 5, 10
    • Range = 9
    • Inter-quartile range = 4 (Q1 = 1, Q3 = 5)
    • Variance = 11.2 (dividing by n; dividing by n - 1 gives 14)
    • Standard deviation = 3.35 (square root of 11.2)
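The figures on this slide can be reproduced with Python's standard `statistics` module, as sketched below. Two assumptions are made: the quartiles use the module's "inclusive" method (other quartile rules give a different inter-quartile range), and the variance of 11.2 corresponds to the population formula that divides by n. The sample formula, dividing by n - 1, is shown alongside for comparison.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1
pop_variance = statistics.pvariance(data)     # sum of squared deviations / n
pop_sd = statistics.pstdev(data)              # square root of pop_variance
sample_variance = statistics.variance(data)   # sum of squared deviations / (n - 1)

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

When the five values are treated as a sample from a larger population, the n - 1 version (14 here) is the usual estimate of the population variance.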

Descriptive statistics
• Numerical variables

Inferential statistics
(Informed guess)

• Making inferences about population parameters from sample statistics
    • Standard error (SD of the statistic)
    • Confidence interval (95% CI)
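To illustrate the two terms, the sketch below computes the standard error of a sample mean (sample SD divided by the square root of n) and a 95% confidence interval as mean ± z × SE. The heart-rate values are hypothetical, and the use of the normal quantile (about 1.96) is a simplifying assumption; for a sample this small a t-distribution quantile would give a somewhat wider interval.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Mean, standard error and normal-approximation CI for the mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # SE of the mean = SD / sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return mean, se, (mean - z * se, mean + z * se)

# Hypothetical heart rates (beats/min) from a sample of 10 patients.
heart_rates = [72, 75, 68, 80, 77, 71, 69, 74, 78, 76]
mean, se, (low, high) = mean_confidence_interval(heart_rates)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({low:.1f}, {high:.1f})")
```

The sample mean is a statistic; the interval is the informed guess about where the population mean (the parameter) plausibly lies.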

Inferential statistics
(Informed guess)

• Hypothesis testing
    • Almost all clinical research begins with a question.
        • For example, is stress a risk factor for breast cancer?
    • To answer questions like this you have to transform the research question into a testable hypothesis called the null hypothesis, conventionally labeled H0.
    • This usually takes the following form:
        • H0: Stress is NOT a risk factor for breast cancer
        • H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

• Hypothesis testing
    • Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
    • To test this null hypothesis, researchers will take samples and measure outcomes, and decide whether the data from the sample provide strong enough evidence to reject the null hypothesis or not.
    • If evidence against the null hypothesis is strong enough for us to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

• Hypothesis testing
    • Example
Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) in decreasing heart rate, or not?

Inferential statistics
(Informed guess)

• Hypothesis testing
    • Example
Let the mean heart rate of all people taking Inderal be μ1.
Let the mean heart rate of all people taking the new drug be μ2.
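Writing μ1 for the population mean heart rate on Inderal and μ2 for the population mean heart rate on the new drug, one way to state the hypotheses is given below. The slide text does not spell them out, so this formulation is an assumption. The first line is the two-sided form; the second is a one-sided alternative that matches the question of whether the new drug lowers heart rate more.

```latex
H_0:\ \mu_1 = \mu_2 \qquad \text{versus} \qquad H_1:\ \mu_1 \neq \mu_2

H_0:\ \mu_1 = \mu_2 \qquad \text{versus} \qquad H_1:\ \mu_2 < \mu_1
```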

Inferential statistics
(Informed guess)

• Type I & Type II errors

                               H0 is true        H0 is false
Decision: accept H0            Good decision     Type II error
Decision: reject H0            Type I error      Good decision

Inferential statistics
(Informed guess)

• α: probability of committing a type I error
• β: probability of committing a type II error
• p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true
• Sample size and power of the study (power = 1 − β)
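The definition of the p-value can be made concrete with an exact permutation test, sketched below. This is one possible technique chosen for illustration, not necessarily the test used in the lecture, and the heart-rate values are hypothetical. If H0 is true the group labels are interchangeable, so the p-value is the share of all possible relabellings whose difference in means is at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = count = 0
    # Under H0 the labels are exchangeable, so try every way of relabelling.
    for indices in combinations(range(len(pooled)), len(group_a)):
        sum_a = sum(pooled[i] for i in indices)
        mean_a = sum_a / len(group_a)
        mean_b = (total - sum_a) / len(group_b)
        count += 1
        if abs(mean_a - mean_b) >= observed - 1e-12:
            extreme += 1
    return extreme / count

# Hypothetical heart rates (beats/min); not data from the lecture.
inderal = [68, 70, 72, 74, 76]
new_drug = [58, 60, 62, 64, 66]
alpha = 0.05
p = permutation_p_value(inderal, new_drug)
print(f"p = {p:.4f}; reject H0 at alpha = {alpha}: {p < alpha}")
```

Here only 2 of the 252 possible relabellings are as extreme as the observed split, so p is about 0.008 and H0 would be rejected at α = 0.05. With larger groups the number of relabellings grows quickly, and a random subset of them is normally used instead.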

Inferential statistics
(Informed guess)

• How does it work? Probability distributions

Inferential statistics
(Informed guess)

• How does it work? Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
    • Comparisons
        • One sample
        • Two independent samples
        • Two dependent samples
        • More than two samples (independent or dependent)
        • Comparing two or more factors
    • Association

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
    • Comparisons
        • One sample
        • Two independent samples
        • Two dependent samples
        • More than two samples (independent or dependent)
    • Association
    • Prediction

[Figure: scatter plot of Weight (kg) against Age (y) with fitted line y = 2.03x + 8; age axis 0 to 14, weight axis 0 to 45]
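Prediction with a fitted line can be sketched as follows. The function below is an ordinary least-squares fit written for this illustration, and the data points are hypothetical: they are placed exactly on the line y = 2.03x + 8 shown in the figure, so the fit simply recovers that slope and intercept. Real measurements would scatter around the line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(x, slope, intercept):
    """Predicted y for a given x on the fitted line."""
    return slope * x + intercept

# Hypothetical points placed exactly on the line y = 2.03x + 8.
ages = [2, 4, 6, 8, 10, 12]
weights = [2.03 * age + 8 for age in ages]

slope, intercept = fit_line(ages, weights)
print(round(slope, 2), round(intercept, 2))
print(round(predict(7, slope, intercept), 2))  # predicted weight (kg) at age 7
```

Predictions are only trustworthy inside the range of ages used to fit the line; extrapolating beyond it assumes the same linear trend continues.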

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
    • Comparisons
        • One sample
        • Two independent samples
        • Two dependent samples
        • More than two samples (independent or dependent)
    • Association
    • Prediction
    • Diagnostic test (dichotomous or continuous)

Diagnostic tests

                      Disease present    Disease absent
Test positive         TP                 FP
Test negative         FN                 TN
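The four cells of the table lead to the usual accuracy measures of a dichotomous test. The sketch below uses hypothetical counts; the function name and the choice of which measures to report are illustrative.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures derived from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients with a positive test
        "specificity": tn / (tn + fp),  # disease-free patients with a negative test
        "ppv": tp / (tp + fp),          # positive tests that are true positives
        "npv": tn / (tn + fn),          # negative tests that are true negatives
    }

# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values (PPV, NPV) also depend on how common the disease is in the group tested.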

[Figure: two ROC curves, each plotting Sensitivity against 1 - Specificity, both axes from 0.0 to 1.0. Note on each panel: diagonal segments are produced by ties.]
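For a continuous diagnostic test, an ROC curve is obtained by trying each possible cut-off and plotting sensitivity against 1 - specificity. The sketch below shows the idea with hypothetical marker values and disease labels; the function names are illustrative, and a score at or above the cut-off is assumed to count as a positive test.

```python
def roc_points(scores, labels):
    """(1 - specificity, sensitivity) pairs; a score >= threshold counts as positive."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical marker values; label 1 = disease present, 0 = disease absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(auc)
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to perfect discrimination. When a diseased and a disease-free patient share the same score, one cut-off moves both coordinates at once, which produces a diagonal segment on the curve.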

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
    • Comparisons
        • One sample
        • Two independent samples
        • Two dependent samples
        • More than two samples (independent or dependent)
    • Association
    • Prediction
    • Diagnostic test (dichotomous or continuous)
    • Survival analysis (censored data)

Final words
• If valid data are analyzed improperly, then the results become invalid and the conclusions may well be inappropriate.
• At best, the net effect is to waste time, effort, and money for the project.
• At worst, therapeutic decisions may well be based upon invalid conclusions and patients’ wellbeing may be jeopardized.

Thank you


Slide 65

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Postoperative Complications

Number of patients (%)

100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%

Group I

Group II

Couph
Pain
Vomiting
Nausea

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

Number of patients (%)

Postoperative Complications
Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Mid range = 4.5

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th quartile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.,
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker is more
efficient than another conventional βblocker (e.g. Inderal) in decreasing heart
rate or not.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
having Inderal is 1
Let the mean heart rate of all people
having the other new drug is 2

Inferential statistics
(Informed guess)

 Type I & Type II Errors
The Ho is:

The
Accept
decision
about
reject
Ho

True

false

Good decision

Type II error

Type I error

Good decision

Inferential statistics
(Informed guess)

 : Probability of conducting type I error
 : Probability of conducting type II error
 p–value: the probability of getting the

outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Figure: scatter plot of Weight (kg) against Age (y), with the fitted line y = 2.03x + 8]
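
As a rough sketch of how a prediction line such as y = 2.03x + 8 (weight in kg against age in years) can be obtained and used, the example below fits a least-squares line and evaluates the slide's line at one age. The data points and the function names fit_line and predict_weight are assumptions for illustration. Only the slope 2.03 and intercept 8 come from the slide.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_weight(age_years, slope=2.03, intercept=8.0):
    """Predicted weight (kg) using the line shown on the slide."""
    return slope * age_years + intercept


# Hypothetical (age, weight) pairs, invented for illustration only.
ages = [2, 4, 6, 8, 10, 12]
weights = [12.0, 16.5, 20.0, 24.0, 28.5, 32.0]

slope, intercept = fit_line(ages, weights)
print(f"fitted line: y = {slope:.2f}x + {intercept:.2f}")
print(f"slide line at age 5: {predict_weight(5):.2f} kg")
```

Predictions are only meaningful inside the age range covered by the data, which is 0 to 14 years on the slide's axis.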

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
                     Disease present    Disease absent
Test positive        TP                 FP
Test negative        FN                 TN
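
The four cells TP, FP, FN and TN are enough to compute the usual summary measures of a dichotomous diagnostic test. The sketch below uses made-up counts. The function name diagnostic_metrics is an assumption for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summary measures from a 2x2 diagnostic table."""
    return {
        # Proportion of diseased subjects the test detects.
        "sensitivity": tp / (tp + fn),
        # Proportion of disease-free subjects the test clears.
        "specificity": tn / (tn + fp),
        # Probability of disease given a positive test.
        "ppv": tp / (tp + fp),
        # Probability of no disease given a negative test.
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, invented for illustration only.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, while the predictive values also depend on how common the disease is in the sample.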

ROC Curve

[Figure: two ROC curves, each plotting Sensitivity against 1 - Specificity on axes from 0.0 to 1.0. Diagonal segments are produced by ties.]
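
For a continuous diagnostic test, an ROC curve is built by moving the cut-off value and recording sensitivity against 1 - specificity at each step. The sketch below shows one simple way to do this and to estimate the area under the curve. The scores, labels and function names are assumptions for illustration, not data from the lecture.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off.

    A subject is called test-positive when score >= cut-off.
    labels use 1 for disease present and 0 for disease absent.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test scores and true disease status, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

curve = roc_points(scores, labels)
print(curve)
print(f"AUC = {auc(curve):.3f}")
```

When a diseased and a disease-free subject share the same score, both coordinates change at that cut-off, which is what produces a diagonal segment.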

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 66

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in the medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
reach a good decision.

[Diagram: Collecting data, Describing data and Analyzing data, leading to a Good Decision]

Why do we need statistics?
Variability (atropine)
 Causes:
    Uncontrollable (too many) factors
    Immeasurable factors
    Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)
 Describe (Descriptive statistics)
 Infer (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables and the Data

              Mrs Brown    Mr Patel    Ms Manda
Age           32           24          20
Gender        Female       Male        Female
Blood type    O            O           A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
 Gender (dichotomous – binary)
    Male
    Female
 Type of ICU admission
    Medical
    Surgical
    Physical injuries
    Poisoning
    Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered
 Degree of illness
    Mild
    Moderate
    Severe
 Physical status
    ASA I
    ASA II
    ASA III
    ASA IV
    ASA V
 Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rats

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation
(what about numerical variables? Grouping)
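
A minimal sketch of how frequency, relative frequency, cumulative frequency and a cross tabulation might be computed for a qualitative variable. The severity grades, the group labels and the function names frequency_table and cross_tabulate are assumptions invented for this example.

```python
from collections import Counter
from itertools import accumulate


def frequency_table(values, order):
    """Rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    n = len(values)
    freqs = [counts[c] for c in order]
    cumulative = list(accumulate(freqs))
    return [(c, f, f / n, cum) for c, f, cum in zip(order, freqs, cumulative)]


def cross_tabulate(rows, cols):
    """Count how often each (row value, column value) pair occurs."""
    return Counter(zip(rows, cols))


# Hypothetical degree of illness and study group for eight patients.
severity = ["Mild", "Severe", "Moderate", "Mild",
            "Mild", "Moderate", "Severe", "Mild"]
group = ["I", "I", "I", "I", "II", "II", "II", "II"]

table = frequency_table(severity, ["Mild", "Moderate", "Severe"])
for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>3}{rel:>8.1%}{cum:>4}")

crosstab = cross_tabulate(group, severity)
print(crosstab[("I", "Mild")], crosstab[("II", "Severe")])
```

Cumulative frequency is only meaningful when the categories have a natural order, as with an ordinal variable. For a numerical variable the values would first be grouped into intervals.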

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Figure: pie chart of Postoperative Complications with slices for Nausea, Vomiting, Pain and Cough; slice labels 23.5%, 27.5%, 9.8% and 39.2%]

Advantage:
1. Summarize (area – relative frequency)
2. Show magnitude (relative frequency)

Disadvantage:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Figure: two simple bar charts of Postoperative Complications (Nausea, Vomiting, Pain, Cough), one showing Number of patients (0 to 25) and one showing Number of patients (%) (0% to 45%); annotations: Alternative, Width, Spacing]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Figure: clustered bar chart of Postoperative Complications, Number of patients (%) for Nausea, Vomiting, Pain and Cough in Group I and Group II]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Figure: Postoperative Complications in Group I and Group II, shown first as a clustered bar chart of Number of patients (%) and then as a 100% stacked bar chart with segments for Cough, Pain, Vomiting and Nausea]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable, no gaps between bars

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Figure: frequency polygon of Postoperative Complications, Number of patients (%) for Nausea, Vomiting, Pain and Cough in Group I and Group II]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10
 Mean = 4
 Median = 3
 Mode = 1
 Midrange = 5.5
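
The four measures of central tendency for the data 1 – 1 – 3 – 5 - 10 can be checked with Python's standard statistics module, as in the short sketch below. The midrange helper is written by hand as (minimum + maximum) / 2, which gives 5.5 for this data set.

```python
from statistics import mean, median, mode

data = [1, 1, 3, 5, 10]


def midrange(values):
    """Midpoint between the smallest and largest value."""
    return (min(values) + max(values)) / 2


summary = {
    "mean": mean(data),          # sum of values / number of values
    "median": median(data),      # middle value of the sorted data
    "mode": mode(data),          # most frequent value
    "midrange": midrange(data),  # (min + max) / 2
}
print(summary)

# With an even number of values the median is the mean of the two middle ones.
even_median = median([1, 3, 5, 9])
print(even_median)
```

The single large value 10 pulls the mean and the midrange upward, while the median and the mode are unaffected by it.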

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22
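
As a sketch, the quartiles and percentiles of the eleven values above can be computed with the standard statistics module. Several conventions exist for locating a percentile and the slide does not say which one is meant. This example assumes the (n + 1) * p / 100 position rule, which Python calls the "exclusive" method. The helper name percentile is an assumption.

```python
from statistics import quantiles

values = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]


def percentile(data, p):
    """p-th percentile (1 to 99) using the (n + 1) * p / 100 position rule."""
    return quantiles(data, n=100, method="exclusive")[p - 1]


q1 = percentile(values, 25)   # 1st quartile
q3 = percentile(values, 75)   # 3rd quartile
p70 = percentile(values, 70)  # 70th percentile
p90 = percentile(values, 90)  # 90th percentile
print(q1, q3, p70, p90)
```

With 11 values the quartile positions fall exactly on the 3rd and 9th ordered values, so Q1 = 5 and Q3 = 17. The 70th and 90th percentiles fall between two values and are interpolated.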

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35
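
The figures above can be reproduced with the short sketch below. It assumes the variance is the population form (sum of squared deviations divided by n), which matches 11.2, and that the quartiles use the "inclusive" convention of Python's statistics module, which matches an inter-quartile range of 4. The sample form, divided by n - 1, is shown for comparison.

```python
from statistics import pstdev, pvariance, quantiles, variance

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles with the "inclusive" convention (data treated as the whole population).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_var = pvariance(data)    # squared deviations from the mean, divided by n
pop_sd = pstdev(data)        # square root of the population variance
sample_var = variance(data)  # same sum of squares, divided by n - 1

print(f"range = {value_range}, IQR = {iqr}")
print(f"population variance = {pop_var}, SD = {pop_sd:.2f}")
print(f"sample variance = {sample_var}")
```

When the data are a sample used to estimate a population value, the n - 1 form is the usual choice and gives a slightly larger result.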

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
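
A minimal sketch of the standard error of a sample mean and an approximate 95% confidence interval for the population mean. The heart-rate values and the function name mean_confidence_interval are assumptions for illustration. The interval uses the normal critical value (about 1.96). For a sample this small a t-distribution critical value would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, level=0.95):
    """Return (mean, standard error, (lower, upper)) for the sample mean."""
    n = len(sample)
    m = mean(sample)
    # Standard error: the standard deviation of the sample mean.
    se = stdev(sample) / sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return m, se, (m - z * se, m + z * se)


# Hypothetical heart rates (beats/min), invented for illustration only.
heart_rates = [72, 75, 78, 80, 74, 77, 69, 71, 76, 73]

m, se, (lower, upper) = mean_confidence_interval(heart_rates)
print(f"mean = {m:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

A larger sample gives a smaller standard error and therefore a narrower interval around the same mean.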

Slide 68

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Postoperative Complications

Number of patients (%)

100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%

Group I

Group II

Couph
Pain
Vomiting
Nausea

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Chart: Postoperative Complications, Group I vs Group II; x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 40%)]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
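
The four measures of location can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative. The midrange is taken as (minimum + maximum) / 2, and the second data set (1-3-5-9) is the even-count example from the median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)            # sum / count = 20 / 5
median = statistics.median(data)        # middle value of the sorted data (odd count)
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the two extremes

# With an even count, the median is the mean of the two middle values.
median_even = statistics.median([1, 3, 5, 9])  # (3 + 5) / 2

print(mean, median, mode, midrange, median_even)
```

The mean and the midrange are both pulled upward by the single large value (10), while the median and the mode are not.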

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 9
 Inter-quartile range = 4
 Variance = 11.2
 Standard deviation = 3.35
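
The measures of spread above can be reproduced as follows. This is a sketch under two assumptions: quartiles are computed with the "inclusive" convention (other conventions give different quartiles for such a small data set), and the variance is the population form that divides by n, which is the one that yields 11.2 here.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartile conventions differ between textbooks and software;
# the "inclusive" method is assumed here.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population formulas divide by n; sample formulas divide by n - 1.
pop_variance = statistics.pvariance(data)    # 56 / 5
pop_sd = statistics.pstdev(data)             # square root of the population variance
sample_variance = statistics.variance(data)  # 56 / 4

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

When the five values are treated as a sample rather than a whole population, the n - 1 form applies and the variance becomes 14 instead of 11.2.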

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
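
As an illustration of these two ideas, the sketch below computes the standard error of the mean and an approximate 95% confidence interval for the small data set used earlier. It assumes the normal approximation (mean ± 1.96 × SE) to stay within the standard library; with only five values a t-based interval would be wider, so the numbers are illustrative only.

```python
import math
import statistics
from statistics import NormalDist

sample = [1, 1, 3, 5, 10]
n = len(sample)

mean = statistics.mean(sample)
sd = statistics.stdev(sample)    # sample SD (divides by n - 1)
se = sd / math.sqrt(n)           # standard error of the mean

# Normal quantile for a two-sided 95% interval (about 1.96).
z = NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean={mean}, SE={se:.2f}, 95% CI=({ci_low:.2f}, {ci_high:.2f})")
```

The standard error shrinks as the sample size grows, so a larger sample gives a narrower interval around the same mean.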

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provide strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2

Inferential statistics
(Informed guess)

 Type I & Type II Errors
The H0 is:                    True             False
Decision: accept H0           Good decision    Type II error
Decision: reject H0           Type I error     Good decision
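
A small simulation can make the Type I error cell concrete. In the sketch below H0 is true by construction, so every rejection is a Type I error. All numbers (population mean 70, SD 10, sample size 25, significance level 0.05) are hypothetical, and a z-test with known SD is assumed purely to keep the code short.

```python
import random
from statistics import NormalDist, mean

ALPHA = 0.05   # chosen significance level
MU0 = 70.0     # population mean stated by H0 (hypothetical)
SIGMA = 10.0   # population SD, assumed known
N = 25         # patients per simulated study

rng = random.Random(0)  # fixed seed so the run is repeatable
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)


def rejects_h0(sample):
    """Two-sided z-test of H0: population mean equals MU0."""
    z = (mean(sample) - MU0) / (SIGMA / N ** 0.5)
    return abs(z) > z_crit


# H0 is true in every simulated study, so each rejection is a Type I error.
trials = 4000
false_rejections = sum(
    rejects_h0([rng.gauss(MU0, SIGMA) for _ in range(N)])
    for _ in range(trials)
)
type_i_rate = false_rejections / trials
print(f"Type I error rate: {type_i_rate:.3f}")
```

The observed rate should land close to the chosen significance level, which is what the level is designed to control.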

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p–value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study
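
The p-value definition can be made concrete with an exact permutation test. The lecture does not name a specific test at this point; a permutation test is used here only because it counts "outcomes as extreme as the one observed" directly. The heart-rate values are hypothetical and do not come from the lecture.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates (beats/min) for two groups of five patients.
inderal = [72, 75, 70, 78, 74]
new_drug = [66, 69, 64, 71, 68]

observed = mean(inderal) - mean(new_drug)
pooled = inderal + new_drug

# Under H0 the group labels are interchangeable, so every way of splitting
# the ten values into two groups of five is equally likely.
extreme = 0
total = 0
for idx in combinations(range(len(pooled)), len(inderal)):
    group_a = [pooled[i] for i in idx]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    diff = mean(group_a) - mean(group_b)
    # Count splits at least as extreme as the observed one (two-sided).
    if abs(diff) >= abs(observed) - 1e-12:
        extreme += 1
    total += 1

p_value = extreme / total
print(f"observed difference = {observed:.1f}, p = {p_value:.4f}")
```

A small p-value means that a difference this large would rarely arise from relabelling alone, which is the evidence used to reject H0.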

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Scatter plot with fitted line: Weight (kg) against Age (y), ages 0 to 14; fitted line y = 2.03x + 8]
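
Prediction with the fitted line y = 2.03x + 8 amounts to plugging an age into the equation. The sketch below assumes x is age in years and y is weight in kg, as the axis labels suggest; the function name is illustrative.

```python
SLOPE = 2.03     # kg gained per year of age, from the fitted line
INTERCEPT = 8.0  # predicted weight (kg) at age 0


def predict_weight(age_years: float) -> float:
    """Predicted weight in kg for a given age, using y = 2.03x + 8."""
    return SLOPE * age_years + INTERCEPT


for age in (2, 5, 10):
    print(f"age {age:>2} y -> {predict_weight(age):.2f} kg")
```

The line should only be trusted within the range of ages used to fit it (roughly 0 to 14 years on the plot); predictions outside that range are extrapolation.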

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests
The disease (outcome) is:     Present    Absent
The test is: Positive         TP         FP
The test is: Negative         FN         TN
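
The four cells of this table give the two quantities plotted on an ROC curve: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below uses hypothetical counts; the function name is illustrative.

```python
def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity and specificity from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)  # diseased patients who test positive
    specificity = tn / (tn + fp)  # healthy patients who test negative
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "one_minus_specificity": 1 - specificity,  # x-axis of an ROC curve
    }


# Hypothetical counts: 50 patients with the disease, 50 without.
summary = diagnostic_summary(tp=40, fp=5, fn=10, tn=45)
print(summary)
```

For a test with a continuous result, each possible cut-off produces its own 2x2 table, and plotting sensitivity against 1 - specificity over all cut-offs traces the ROC curve.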

[Two ROC curves: Sensitivity plotted against 1 - Specificity, both axes from 0.0 to 1.0. Diagonal segments are produced by ties.]

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 71

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.

[Diagram: collecting data, describing data, analyzing data, good decision]

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)
• Describe (Descriptive statistics)
• Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables and the Data

              Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories
• Gender (dichotomous, binary): Male, Female
• Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable
Categorical variable whose values are ordered
• Degree of illness: Mild, Moderate, Severe
• Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
• Glasgow Coma Scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
• All human beings
• Day case surgical patients undergoing general anesthesia
• Low-risk CABG surgery patients
• ICU patients with septic shock
• Women undergoing CS under spinal anesthesia
• Human skeletal muscle fibers
• Cardiac muscles of rats

Descriptive statistics
• Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
• This may mean sorting the data by size, putting it into a table, presenting it in an appropriate chart, or summarizing it numerically, and so on.

Descriptive statistics
• Qualitative variables
  • Frequency
  • Relative frequency
  • Cumulative frequency
  • Cross tabulation
(what about numerical variables?, grouping)
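The frequency measures listed above can be sketched in a few lines of Python. This is only an illustration: the ten patient records and the two group labels are hypothetical data, and the helper names `frequency_table` and `cross_tabulation` are invented for this example.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical complications recorded for ten patients (illustrative data only).
complications = ["Nausea", "Pain", "Pain", "Vomiting", "Nausea",
                 "Pain", "Cough", "Pain", "Nausea", "Vomiting"]
# Hypothetical study group of each of the same ten patients.
groups = ["I", "I", "II", "I", "II", "II", "I", "I", "II", "II"]


def frequency_table(values):
    """Return rows of (category, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = sum(counts.values())
    categories = sorted(counts)
    cumulative = accumulate(counts[c] for c in categories)
    return [(c, counts[c], counts[c] / total, cum)
            for c, cum in zip(categories, cumulative)]


def cross_tabulation(rows, cols):
    """Count how often each (row category, column category) pair occurs."""
    return Counter(zip(rows, cols))


table = frequency_table(complications)
crosstab = cross_tabulation(complications, groups)

for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>3}{rel:>7.0%}{cum:>4}")
print("Pain in Group I:", crosstab[("Pain", "I")])
```

Cumulative frequency is most informative when the categories have a natural order (ordinal data); for purely nominal categories such as these, the running total depends on the arbitrary order in which the categories are listed.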

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Figure: pie chart of postoperative complications (nausea, vomiting, pain, cough) with slices of 23.5%, 27.5%, 9.8% and 39.2%]

Advantages:
1. Summarize (area = relative frequency)
2. Show magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Lose clarity if more than 4-5 categories
3. No cross tabulation ("separate pies")

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Figure: two simple bar charts of postoperative complications (nausea, vomiting, pain, cough), one with number of patients on the y-axis and one with number of patients (%); annotations: alternative, width, spacing]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Figure: clustered bar chart of postoperative complications (nausea, vomiting, pain, cough) for Group I and Group II; y-axis: number of patients (%)]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Figure: clustered bar chart of postoperative complications (nausea, vomiting, pain, cough) for Group I and Group II; y-axis: number of patients (%)]
[Figure: stacked bar chart of postoperative complications, one bar each for Group I and Group II, split into cough, pain, vomiting and nausea; y-axis: number of patients (%), 0% to 100%]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Figure: chart of postoperative complications (nausea, vomiting, pain, cough) for Group I and Group II; y-axis: number of patients (%)]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 – 10
• Mean = 4
• Median = 3
• Mode = 1
• Midrange = 5.5
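The four measures of central tendency can be checked on the slide's data set with Python's standard `statistics` module. This is a minimal sketch; the midrange has no library function here, so it is computed directly as the average of the smallest and largest values, (1 + 10) / 2.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)              # sum of values / number of values
median = statistics.median(data)          # middle value of the sorted data
mode = statistics.mode(data)              # most frequent value
midrange = (min(data) + max(data)) / 2    # halfway between smallest and largest

# With an even number of values the median is the mean of the two middle ones.
even_median = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, even_median)
```

The single large value 10 pulls the mean and the midrange upward, while the median and mode are unaffected, which is why the median is usually preferred for skewed data.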

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22
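Quartiles and percentiles of the eleven-value data set can be sketched with `statistics.quantiles`. Several conventions exist for interpolating between observations; the default used below places the p-th quantile at position (n + 1) × p in the sorted data, which is an assumption about the convention intended on the slide. Other software may give slightly different values.

```python
import statistics

data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

# n=4 returns the three cut points Q1, Q2 (the median) and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

# n=100 returns the 1st to 99th percentiles; index k - 1 holds the k-th percentile.
percentiles = statistics.quantiles(data, n=100)
p70 = percentiles[69]
p90 = percentiles[89]

print("Quartiles:", q1, q2, q3)
print("70th and 90th percentiles:", p70, p90)
```

With eleven values, (n + 1) = 12, so the quartiles fall exactly on the 3rd, 6th and 9th observations and no interpolation is needed for them.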

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 – 10
• Range = 9
• Inter-quartile range = 4
• Variance = 11.2 (population formula, dividing by n)
• Standard deviation = 3.35
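The measures of spread for the same five values can be sketched as follows. Two choices here are assumptions: the population formulas (dividing by n) are used for the variance and standard deviation, and the "inclusive" quantile method is used for the quartiles. The sample variance (dividing by n - 1) is shown for comparison.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles with the "inclusive" method, which treats the data as the whole population.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

variance = statistics.pvariance(data)        # population variance: divide by n
std_dev = statistics.pstdev(data)            # square root of the population variance
sample_variance = statistics.variance(data)  # sample variance: divide by n - 1

print(value_range, iqr, variance, round(std_dev, 2), sample_variance)
```

The inter-quartile range in particular depends on the quartile convention, so different statistical packages may report different values for such a small data set.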

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
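As a rough sketch of these two ideas, the standard error of a sample mean is the sample standard deviation divided by the square root of the sample size, and an approximate 95% confidence interval is the mean plus or minus about 1.96 standard errors. The function name `mean_confidence_interval` is invented for this example, and the normal approximation is an assumption that suits large samples; for a sample as small as the one below, a t-distribution multiplier would give a wider and more appropriate interval.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample SD (n - 1 denominator) over sqrt(n).
    se = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value of the standard normal distribution (about 1.96 for 95%).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)


sample = [1, 1, 3, 5, 10]  # the small data set used earlier, treated as a sample
mean, se, (low, high) = mean_confidence_interval(sample)
print(f"mean = {mean}, SE = {se:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```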

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.,
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) at decreasing heart rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people taking Inderal be μ1.
Let the mean heart rate of all people taking the new drug be μ2.

Inferential statistics
(Informed guess)

 Type I & Type II Errors
Decision about H0     H0 is true       H0 is false
Accept H0             Good decision    Type II error
Reject H0             Type I error     Good decision

Inferential statistics
(Informed guess)

• α: probability of committing a type I error
• β: probability of committing a type II error
• p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true.
• Sample size & power of the study
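The definition of the p-value can be made concrete with an exact permutation test, which is one possible technique and not necessarily the test the lecture has in mind. Under the null hypothesis the group labels are interchangeable, so the p-value is the proportion of all possible relabelings whose difference in means is at least as extreme as the one observed. The heart-rate values and the function name `permutation_p_value` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means (integer data)."""
    pooled = list(group_a) + list(group_b)
    total, n, k = sum(pooled), len(pooled), len(group_a)

    def mean_difference(sum_a):
        # Exact difference between the mean of one group and the mean of the rest.
        return Fraction(sum_a, k) - Fraction(total - sum_a, n - k)

    observed = abs(mean_difference(sum(group_a)))
    extreme = splits = 0
    # Enumerate every way of assigning k of the n values to the first group.
    for indices in combinations(range(n), k):
        splits += 1
        if abs(mean_difference(sum(pooled[i] for i in indices))) >= observed:
            extreme += 1
    return extreme / splits


# Hypothetical heart rates (beats/min) after treatment; illustrative data only.
inderal = [72, 75, 78, 80]
new_drug = [60, 62, 65, 70]

alpha = 0.05  # accepted probability of a type I error
p_value = permutation_p_value(inderal, new_drug)
print(f"p = {p_value:.4f}; reject H0 at alpha = {alpha}: {p_value < alpha}")
```

Enumerating every relabeling is only practical for very small groups; with larger samples a random subset of relabelings, or a parametric test, would normally be used instead.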

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
    • Comparing two or more factors
  • Association

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
  • Association
  • Prediction

[Figure: plot of weight (kg) against age (y), with the fitted line y = 2.03x + 8]
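Prediction with a fitted line amounts to substituting a value of x into the equation. The sketch below uses the slope and intercept shown in the figure; the function name `predict_weight` is invented for this example, and the restriction to ages 0 to 14 is an assumption based on the range of the plotted axis.

```python
SLOPE = 2.03      # kg gained per year of age, from the fitted line y = 2.03x + 8
INTERCEPT = 8.0   # predicted weight in kg at age 0


def predict_weight(age_years):
    """Predict weight (kg) from age (years) using the fitted straight line."""
    if not 0 <= age_years <= 14:
        # Outside the plotted age range the line has not been checked against data.
        raise ValueError("age outside the range covered by the plot")
    return SLOPE * age_years + INTERCEPT


for age in (2, 6, 10):
    print(f"age {age} y: predicted weight {predict_weight(age):.2f} kg")
```

Refusing to extrapolate beyond the observed range is a deliberate design choice: a straight line that fits children's weights well would give implausible values for adults.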

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
  • Association
  • Prediction
  • Diagnostic test (dichotomous or continuous)

Diagnostic tests

                      Disease present   Disease absent
Test positive         TP                FP
Test negative         FN                TN
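From the four cells of this table the usual accuracy measures of a dichotomous test follow directly. The sketch below computes sensitivity and specificity, plus the positive and negative predictive values, which are standard measures added here for completeness. The counts are hypothetical, and the function name `diagnostic_metrics` is invented for this example.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures of a dichotomous diagnostic test from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # disease-free patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly disease-free
    }


# Hypothetical counts: 50 patients with the disease and 100 without it.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group being tested.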

[Figure: two ROC curves, each plotting sensitivity (y-axis) against 1 - specificity (x-axis), both axes from 0.0 to 1.0. Diagonal segments are produced by ties.]
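For a continuous test, each possible cut-off gives one 2x2 table and therefore one point on the ROC curve. The following sketch builds those points and estimates the area under the curve with the trapezoidal rule. The scores and disease labels are hypothetical, and the function names are invented for this example.

```python
def roc_points(scores, diseased):
    """Return (1 - specificity, sensitivity) for every cut-off, highest score first."""
    positives = sum(diseased)
    negatives = len(diseased) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    # Using each distinct score once means tied scores move the curve in a single
    # step, which is what produces diagonal segments.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, diseased) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, diseased) if s >= cutoff and not d)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under consecutive ROC points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test results: a higher score suggests disease (1 = diseased).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
diseased = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, diseased)
auc = area_under_curve(points)
print(points)
print("AUC =", auc)
```

An area of 0.5 corresponds to a test no better than chance, and an area of 1.0 to a test that separates diseased from disease-free patients perfectly.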

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  • Comparisons
    • One sample
    • Two independent samples
    • Two dependent samples
    • More than two samples (independent or dependent)
  • Association
  • Prediction
  • Diagnostic test (dichotomous or continuous)
  • Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 73

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size;
perhaps putting it into a table, may be
presenting it in an appropriate chart, or
summarizing it numerically; and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency

Descriptive statistics
 Qualitative variables





Frequency
Relative frequency
Cumulative frequency
Cross tabulation

Descriptive statistics
 Qualitative variables






Frequency
Relative frequency
Cumulative frequency
Cross tabulation
(what about numerical variables?, grouping)

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

Postoperative Complications
23.5%
27.5%

9.8%

Nausea
Vomiting
Pain
Couph

39.2%

Disadvantage:Advantage:-

one variable
only
1.1.Summarize
(Area-relative
frequency)
loosemagnitude
clarity if more
thanfrequency)
4-5 categories.
2.2.show
(relative
3. no cross tabulation “separate pies”

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

Postoperative Complications

Number of patients

25
20
15
10
5
0

Nausea

Vomiting

Pain

Alternative
Width
spacing

Couph

Postoperative Complications

Number of patients (%)

45%
40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

Postoperative Complications

Number of patients (%)

45%

Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Postoperative Complications

Number of patients (%)

100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%

Group I

Group II

Couph
Pain
Vomiting
Nausea

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

Number of patients (%)

Postoperative Complications
Group I
Group II

40%
35%
30%
25%
20%
15%
10%
5%
0%

Nausea

Vomiting

Pain

Couph

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Mid range = 4.5

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th quartile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 – 10
• Range = 9
• Inter-quartile range = 4
• Variance = 11.2 (population formula, dividing by n = 5)
• Standard deviation = 3.35
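
The figures above can be reproduced with the sketch below, again using Python's standard library as an illustrative choice. Two assumptions are needed to match them: the quartiles use the "inclusive" convention (Q1 = 1, Q3 = 5), and the variance divides by n, not by n - 1. The sample (n - 1) versions are computed alongside for comparison.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# "inclusive" quartiles: the smallest and largest values are the 0th and 100th percentiles.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)      # divides by n
pop_sd = statistics.pstdev(data)
sample_variance = statistics.variance(data)    # divides by n - 1
sample_sd = statistics.stdev(data)

print(value_range, iqr)                  # 9 4.0
print(pop_variance, round(pop_sd, 2))    # 11.2 3.35
print(sample_variance, round(sample_sd, 2))
```

When the five values are treated as a sample from a larger population, the n - 1 versions (variance 14, standard deviation about 3.74) are the usual estimates.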

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
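
A minimal sketch of both ideas for a sample mean follows. The heart-rate values are hypothetical, and the 95% CI uses the normal approximation (z about 1.96); for a sample this small a t-based multiplier would normally be preferred, so treat the interval as illustrative only.

```python
import math
import statistics

# Hypothetical heart rates (beats/min) from a sample of 10 patients.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)    # sample SD (divides by n - 1)

# Standard error of the mean: the SD of the sample mean viewed as a statistic.
standard_error = sample_sd / math.sqrt(n)

# 95% CI using the normal approximation.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean:.1f}, SE = {standard_error:.2f}")
print(f"95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```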

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker (e.g. Inderal) at decreasing heart
rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1.
Let the mean heart rate of all people
taking the new drug be μ2.
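
Writing the two population means as μ1 (Inderal) and μ2 (new drug), a natural formulation, assumed here because the slide text does not spell it out, is H0: μ1 = μ2 against H1: μ1 ≠ μ2. The sketch below estimates the difference from two hypothetical samples and computes Welch's t statistic, one common choice for two independent samples.

```python
import math
import statistics

# Hypothetical heart rates (beats/min) after treatment; not data from the lecture.
inderal = [70, 72, 68, 75, 71, 69, 74, 73]
new_drug = [66, 64, 69, 65, 68, 63, 67, 66]

mean_1 = statistics.mean(inderal)     # sample estimate of mu_1
mean_2 = statistics.mean(new_drug)    # sample estimate of mu_2
difference = mean_1 - mean_2

# Welch's t statistic: the difference in means divided by its standard error.
se_difference = math.sqrt(
    statistics.variance(inderal) / len(inderal)
    + statistics.variance(new_drug) / len(new_drug)
)
t_statistic = difference / se_difference

print(f"difference = {difference:.2f}, t = {t_statistic:.2f}")
```

A large absolute t value means the observed difference is large relative to its sampling variability, which counts as evidence against H0.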

Inferential statistics
(Informed guess)

 Type I & Type II Errors
Decision about H0    H0 is true       H0 is false
Accept H0            Good decision    Type II error
Reject H0            Type I error     Good decision
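
To make the type I error cell concrete, the following simulation sketch repeatedly compares two groups drawn from the same population, so H0 is true by construction, and counts how often a two-sided z test at the 5% level rejects it anyway. The population mean (70), SD (10), group size and seed are all hypothetical choices.

```python
import math
import random
import statistics

random.seed(1)    # fixed seed so the sketch is repeatable
alpha = 0.05
z_critical = statistics.NormalDist().inv_cdf(1 - alpha / 2)

n_trials = 2000
n_per_group = 30
sigma = 10.0      # SD assumed known and equal in both groups
se = math.sqrt(sigma**2 / n_per_group + sigma**2 / n_per_group)

false_rejections = 0
for _ in range(n_trials):
    # H0 is true: both groups come from the same distribution.
    group_a = [random.gauss(70, sigma) for _ in range(n_per_group)]
    group_b = [random.gauss(70, sigma) for _ in range(n_per_group)]
    z = (sum(group_a) / n_per_group - sum(group_b) / n_per_group) / se
    if abs(z) > z_critical:
        false_rejections += 1    # rejecting a true H0 is a type I error

type_i_rate = false_rejections / n_trials
print(f"estimated type I error rate: {type_i_rate:.3f}")
```

The estimated rate should land close to the chosen significance level of 0.05.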

Inferential statistics
(Informed guess)

• α: Probability of committing a type I error
• β: Probability of committing a type II error
• p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true.
• Sample size & power of the study
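
The p-value definition can be applied literally with an exact permutation test, sketched below on hypothetical data. This is one way to obtain a p-value, not necessarily the method used in the lecture. Under H0 the group labels are interchangeable, so the p-value is the fraction of all possible relabellings whose difference in means is at least as extreme as the one observed.

```python
from itertools import combinations

# Hypothetical heart rates (beats/min); small samples keep the enumeration fast.
inderal = [70, 72, 68, 75, 71]
new_drug = [66, 64, 69, 65, 68]

observed = sum(inderal) / len(inderal) - sum(new_drug) / len(new_drug)

pooled = inderal + new_drug
n_first = len(inderal)
extreme = 0
total = 0

# Enumerate every way of splitting the pooled values into two groups of five.
for first_idx in combinations(range(len(pooled)), n_first):
    first = [pooled[i] for i in first_idx]
    second = [pooled[i] for i in range(len(pooled)) if i not in first_idx]
    diff = sum(first) / len(first) - sum(second) / len(second)
    if abs(diff) >= abs(observed) - 1e-12:    # as extreme or more extreme
        extreme += 1
    total += 1

p_value = extreme / total
print(f"observed difference = {observed:.1f}, p-value = {p_value:.4f}")
```

The test is two-sided because it compares absolute differences; a p-value below the chosen α leads to rejecting H0.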

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  - Comparisons
    - One sample
    - Two independent samples
    - Two dependent samples
    - More than two samples (independent or dependent)
    - Comparing two or more factors
  - Association

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  - Comparisons
    - One sample
    - Two independent samples
    - Two dependent samples
    - More than two samples (independent or dependent)
  - Association
  - Prediction

[Figure: scatter plot of weight (kg) against age (y), ages 0 to 14, with fitted line y = 2.03x + 8]
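
The fitted line reported on the chart, y = 2.03x + 8 (weight in kg, age in years), can be used for prediction as in the sketch below. The function name is ours, and predictions are only meaningful within the age range shown on the chart (0 to 14 years).

```python
def predict_weight_kg(age_years: float) -> float:
    """Predicted weight from the fitted line y = 2.03x + 8 shown on the chart."""
    slope = 2.03       # predicted kg gained per additional year of age
    intercept = 8.0    # predicted weight (kg) at age 0
    return slope * age_years + intercept


for age in (2, 6, 10):
    print(f"age {age} y -> predicted weight {predict_weight_kg(age):.1f} kg")
```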

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  - Comparisons
    - One sample
    - Two independent samples
    - Two dependent samples
    - More than two samples (independent or dependent)
  - Association
  - Prediction
  - Diagnostic test (dichotomous – continuous)

Diagnostic tests

Test result    Disease (outcome) present    Disease (outcome) absent
Positive       TP                           FP
Negative       FN                           TN
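
The four cells (TP, FP, FN, TN) are the inputs to the usual accuracy measures of a dichotomous test. The sketch below uses hypothetical counts to show how sensitivity, specificity and the predictive values follow from them.

```python
# Hypothetical counts for a 2x2 diagnostic table (not data from the lecture).
tp, fp = 45, 10    # test positive: disease present / disease absent
fn, tn = 5, 90     # test negative: disease present / disease absent

sensitivity = tp / (tp + fn)    # proportion of diseased patients who test positive
specificity = tn / (tn + fp)    # proportion of disease-free patients who test negative
ppv = tp / (tp + fp)            # positive predictive value
npv = tn / (tn + fn)            # negative predictive value

print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```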

ROC Curve
[Figure: two ROC curves plotting sensitivity against 1 - specificity, both axes from 0.0 to 1.0. Note on the plots: diagonal segments are produced by ties.]
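
For a continuous test, each possible cut-off gives one (1 - specificity, sensitivity) point, and joining these points gives the ROC curve. The sketch below builds the curve from hypothetical scores and estimates the area under it with the trapezoidal rule. It assumes higher scores indicate disease.

```python
# Hypothetical continuous test results; higher values suggest disease.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
diseased = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]    # 1 = disease present

n_pos = sum(diseased)
n_neg = len(diseased) - n_pos

# Sweep the cut-off from high to low; "positive" means score >= cut-off.
roc_points = [(0.0, 0.0)]
for cutoff in sorted(set(scores), reverse=True):
    tp = sum(1 for s, d in zip(scores, diseased) if s >= cutoff and d == 1)
    fp = sum(1 for s, d in zip(scores, diseased) if s >= cutoff and d == 0)
    roc_points.append((fp / n_neg, tp / n_pos))    # (1 - specificity, sensitivity)

# Area under the curve by the trapezoidal rule.
auc = sum(
    (x2 - x1) * (y1 + y2) / 2
    for (x1, y1), (x2, y2) in zip(roc_points, roc_points[1:])
)
print(f"AUC = {auc:.2f}")
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to a perfect test. Tied scores shared by diseased and disease-free patients would produce diagonal segments on the curve.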

Inferential statistics
(Informed guess)

• Some examples of hypothesis testing
  - Comparisons
    - One sample
    - Two independent samples
    - Two dependent samples
    - More than two samples (independent or dependent)
  - Association
  - Prediction
  - Diagnostic test (dichotomous – continuous)
  - Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you


Slide 76

Statistics

When guessing is informed
decision
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of medicine
Ain Shams University

Objectives
 Define statistics and understand its
terminology
 Discuss the importance and need of
statistics in medical field
 Distinguish types of data and variables
 Describe types of statistics & statistical
tests

What is statistics?
Statistics is the science of
collecting, describing, and
analyzing data in order to
get a good decision.

Collecting
data

Describing
data

Good
Decision

Analyzing
data

Why do we need statistics?
Variability (atropine)
 Causes:  Uncontrollable (too many) factors
 Immeasurable factors
 Unknown factors

Why do we need statistics?
Variability (atropine)
 Effect of variability
 Large amount of data describing the same
thing (many values for one variable)
 No certainty “Deterministic vs probabilistic”
 Sampling

Functions of statistics
(new hypothetical β-blocker drug)




Describe (Descriptive statistics)
Inference (Inferential statistics)

Variables & Data
 A variable is something whose value
can vary. For example, age, gender
and blood type are variables.
 Data are the values you get when you
measure a variable

The Variables……

………and the Data

Mrs
Brown
Age

Gender
Blood
type

Mr
Patel

Ms
Manda

32

24

20

Female

Male

Female

O

O

A

Types of variables
1. Qualitative variables (data)
a) Categorical variable
The values (data) of a categorical variable are categories



Gender: (dichotomous –binary)
 Male
 Female



Type of ICU admission






Medical
Surgical
Physical injuries
Poisoning
Others

Types of variables
1. Qualitative variables (data)
b) Ordinal variable

Categorical variable whose values are ordered





Degree of illness

 Mild
 Moderate
 Severe

Physical status









ASA
ASA
ASA
ASA
ASA

I
II
III
IV
V

Glasgow’s coma scale

Types of variables
2. Quantitative (Numerical) variables
a) Discrete variable
 Episodes of myocardial ischemia

b) Continuous (interval – ratio) variable
 Cardiac index
 Creatinine clearance

Types of variables
Changing data scales

Sample and Population
 A sample is a group (subset) taken
from a population.
 Population is the group of ALL
individuals (entities) sharing specific
characteristics

Sample and Population
 All human beings
 Day case surgical patients undergoing
general anesthesia
 Low-risk CABG surgery patients
 ICU patients with septic shock
 Women undergoing CS under spinal
anesthesia
 Human Skeletal muscle fibers
 Cardiac muscles of rates

Descriptive statistics
 Descriptive statistics is a series of
procedures designed to illuminate the
data, so that its principal characteristics
and main features are revealed.
 This may mean sorting the data by size,
perhaps putting it into a table, maybe
presenting it in an appropriate chart, or
summarizing it numerically, and so on.

Descriptive statistics
 Qualitative variables
 Frequency
 Relative frequency
 Cumulative frequency
 Cross tabulation
(What about numerical variables? Grouping)
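
The frequency measures listed above can be sketched in a few lines of Python. This is only an illustration: the observations and the helper name `frequency_table` are invented here and do not come from the slides. Cumulative frequency is shown for completeness, although it is most meaningful when the categories are ordered.

```python
from collections import Counter

# Hypothetical observations of one qualitative variable (not from the slides).
complications = ["Nausea", "Pain", "Pain", "Vomiting", "Nausea",
                 "Pain", "Cough", "Pain", "Nausea", "Pain"]

def frequency_table(values):
    """Return rows of (category, frequency, relative, cumulative relative)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0
    # most_common() lists categories from the most to the least frequent.
    for category, count in counts.most_common():
        running += count
        rows.append((category, count, count / total, running / total))
    return rows

table = frequency_table(complications)
for category, count, rel, cum in table:
    print(f"{category:<10}{count:>3}{rel:>8.1%}{cum:>8.1%}")
```

A cross tabulation could be built the same way by counting pairs, for example `Counter(zip(groups, complications))` with a second hypothetical list `groups`.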

Descriptive statistics
 Qualitative variables
 In terms of describing data, an appropriate
chart is almost always a good idea.
 What ‘appropriate’ means depends primarily
on the type of data, as well as on what
particular features of it you want to explore.
 Finally, a chart can often be used to
illustrate or explain a complex situation for
which a form of words or a table might be
clumsy, lengthy or otherwise inadequate.

Descriptive statistics
 Qualitative variables
 Charts
 Pie chart

[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slices: 23.5%, 27.5%, 9.8%, 39.2%]

Advantages:
1. Summarizes the data (area shows relative frequency)
2. Shows magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity with more than 4-5 categories
3. No cross tabulation (“separate pies”)

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple

[Bar chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (0 to 25). Alternatives: bar width, spacing]

[Bar chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%)]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart

[Clustered bar chart: Postoperative Complications, Group I vs Group II. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%)]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

[Clustered bar chart: Postoperative Complications, Group I vs Group II. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 45%)]

[Stacked bar chart: Postoperative Complications. x-axis: Group I, Group II; y-axis: Number of patients (%) (0% to 100%); segments: Cough, Pain, Vomiting, Nausea]

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram

one variable – no gapping

Descriptive statistics
 Qualitative variables
 Charts
 Pie Chart
 Bar Chart
 Simple
 Clustered bar chart
 Stacked bar chart

 Histogram
 Frequency polygon
 Cumulative frequency polygon (Ogive)

[Chart: Postoperative Complications, Group I vs Group II. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%) (0% to 40%)]

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mean (Average)

1 – 1 – 3 – 5 - 10

Advantages – Disadvantages – Ratio

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Median

1 – 1 – 3 – 5 - 10
1-3-5-9
Odd vs Even

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Mode

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Midrange

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

1 – 1 – 3 – 5 - 10





Mean = 4
Median = 3
Mode = 1
Midrange = (1 + 10) / 2 = 5.5
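
As a quick check of these four summary measures, here is a minimal sketch using Python's standard `statistics` module on the slide's data set. The midrange has no library function, so it is computed directly as the average of the smallest and largest value. The second data set (1-3-5-9) is the even-sized example from the median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]          # the five values used on the slides

mean = statistics.mean(data)      # sum of values / number of values
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between the extremes

# With an even number of values the median is the mean of the two middle ones.
even_median = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, even_median)
```

The mean (4) is pulled toward the single large value 10, while the median (3) is not, which is one reason the median is often preferred for skewed data.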

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Range

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Percentiles – quartiles

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

90th percentile

70th percentile

1-2-5-6-8-9-11-13-17-20-22

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Inter-quartile range (Q3 – Q1)

1 – 1 – 3 – 5 - 10
1st quartile
3rd quartile

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

 Variance – standard deviation

1 – 1 – 3 – 5 - 10

Descriptive statistics
 Numerical variables
 Measures of Central tendency
(summary measures of location)

 Measures of degree of dispersion
(summary measures of spread)

1 – 1 – 3 – 5 - 10
 Range = 10 - 1 = 9
 Inter-quartile range = Q3 - Q1 = 5 - 1 = 4
 Variance = 11.2 (dividing by n; 14 when dividing by n - 1)
 Standard deviation = 3.35 (square root of 11.2)
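
The spread measures above can be reproduced with the standard `statistics` module. This is a sketch under two assumptions that the slides do not state: the quartiles use the "inclusive" interpolation method (other conventions give a different inter-quartile range for so few values), and the quoted variance of 11.2 divides by n rather than n - 1.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# "inclusive" treats the data as the whole population when interpolating;
# other quartile conventions can give a different IQR for so few values.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)    # sum of squared deviations / n
pop_sd = statistics.pstdev(data)             # square root of pop_variance
sample_variance = statistics.variance(data)  # sum of squared deviations / (n - 1)

print(value_range, iqr, pop_variance, round(pop_sd, 2), sample_variance)
```

When the five values are a sample from a larger population, the n - 1 version (`statistics.variance`, `statistics.stdev`) is the usual choice.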

Descriptive statistics
 Numerical variables

Inferential statistics
(Informed guess)

 Making inference about population
parameters from sample statistics
 Standard Error (SD of the statistic)

 Confidence interval (95% CI)
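
To make the standard error and the 95% confidence interval concrete, here is a minimal sketch for a sample mean. It assumes the normal approximation (mean ± z × SE); for a sample this small a t-based interval would normally be wider, so treat the numbers as illustrative. The function name `mean_confidence_interval` is made up for this example.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation CI for the mean: mean +/- z * SE."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample SD divided by sqrt(n).
    se = statistics.stdev(sample) / math.sqrt(n)
    # z is about 1.96 for a 95% interval.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)

mean, se, (low, high) = mean_confidence_interval([1, 1, 3, 5, 10])
print(f"mean={mean}, SE={se:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```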

Inferential statistics
(Informed guess)

 Hypothesis testing
 Almost all clinical research begins with a question.
 For example, is stress a risk factor for breast
cancer?
 To answer questions like this you have to
transform the research question into a testable
hypothesis called the null hypothesis,
conventionally labeled H0.
 This usually takes the following form:
 H0: Stress is NOT a risk factor for breast cancer
 H0: The drug has NO effect on mean heart rate

Inferential statistics
(Informed guess)

 Hypothesis testing
 Null hypotheses reflect the conservative position
of no difference, no risk, no effect, etc.,
 To test this null hypothesis, researchers will take
samples and measure outcomes, and decide
whether the data from the sample provides strong
enough evidence to be able to reject the null
hypothesis or not.
 If evidence against the null hypothesis is strong
enough for us to be able to reject it, then we are
implicitly accepting that some specified alternative
hypothesis, usually labelled H1, is probably true.

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Is the new hypothetical β-blocker more
effective than a conventional β-blocker
(e.g. Inderal) in decreasing heart rate?

Inferential statistics
(Informed guess)

 Hypothesis testing
 Example
Let the mean heart rate of all people
taking Inderal be μ1
Let the mean heart rate of all people
taking the new drug be μ2
H0: μ1 = μ2 (no difference)
H1: μ1 ≠ μ2 (there is a difference)

Inferential statistics
(Informed guess)

 Type I & Type II Errors
Decision about H0   H0 is true       H0 is false
Accept H0           Good decision    Type II error
Reject H0           Type I error     Good decision

Inferential statistics
(Informed guess)

 α: Probability of committing a type I error
 β: Probability of committing a type II error
 p-value: the probability of getting the
outcome observed (or one more extreme),
assuming the null hypothesis to be true.
 Sample size & power of the study (power = 1 - β)
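
The p-value definition can be illustrated with an exact permutation test, which is not a method named in the slides but follows the definition directly: if H0 is true the group labels are interchangeable, so the p-value is the share of all possible regroupings whose difference in means is at least as extreme as the one observed. The heart rates and the function name `permutation_p_value` are invented for this sketch.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical heart rates (beats/min), invented for illustration only.
inderal = [70, 72, 74, 76, 78]
new_drug = [60, 62, 64, 66, 68]

p = permutation_p_value(inderal, new_drug)
alpha = 0.05  # accepted probability of a type I error
print(f"p = {p:.4f}; reject H0 at alpha = {alpha}: {p < alpha}")
```

With five patients per group there are only 252 regroupings, so full enumeration is cheap; larger samples would need random sampling of regroupings or a parametric test.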

Inferential statistics
(Informed guess)

 How does it work? Probability distributions

Inferential statistics
(Informed guess)

 How does it work?
Parametric vs non-parametric tests

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Comparing two or more factors
 Association

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction

[Scatter plot with fitted line: Weight (kg) against Age (y, 0 to 14); y = 2.03x + 8]
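
Prediction with a fitted line such as y = 2.03x + 8 can be sketched as follows. The least-squares fit is written out by hand, and the ages and weights are invented (the slide's raw data are not available), so only the final prediction uses numbers from the slide.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(slope, intercept, x):
    """Predicted y for a given x on the fitted line."""
    return slope * x + intercept

# Hypothetical ages (years) and weights (kg), invented for illustration.
ages = [2, 4, 6, 8, 10]
weights = [12, 16, 20, 24, 28]
slope, intercept = fit_line(ages, weights)

# Prediction with the line shown on the slide, y = 2.03x + 8, at age 6.
slide_prediction = predict(2.03, 8, 6)
print(slope, intercept, round(slide_prediction, 2))
```

Predictions are only trustworthy inside the range of ages that was used to fit the line.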

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)

Diagnostic tests

                  Disease present   Disease absent
Test positive     TP                FP
Test negative     FN                TN
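
The usual measures derived from this 2 × 2 table can be sketched as below. The counts are hypothetical and the function name `diagnostic_measures` is made up for the example; the formulas themselves are the standard definitions.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Standard accuracy measures from a 2 x 2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # healthy patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }

# Hypothetical counts: 100 diseased and 100 healthy patients.
m = diagnostic_measures(tp=90, fp=20, fn=10, tn=80)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the group tested.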

[Two ROC curves: Sensitivity (0.0 to 1.0) against 1 - Specificity (0.0 to 1.0). Diagonal segments are produced by ties.]
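
For a continuous diagnostic test, an ROC curve is obtained by treating every observed value as a possible cut-off and plotting sensitivity against 1 - specificity. The sketch below assumes that higher scores indicate disease; the scores, labels and function names are invented for illustration. The area under the curve is estimated with the trapezoidal rule.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the cut-off step by step; a score >= cut-off counts as test positive.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under consecutive ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical test results: label 1 = disease present, 0 = disease absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(round(auc, 3))
```

An area of 0.5 corresponds to a test no better than chance, and 1.0 to a test that separates the two groups perfectly.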

Inferential statistics
(Informed guess)

 Some examples of hypothesis testing
 Comparisons
 One sample
 Two independent samples
 Two dependent samples
 More than two samples (independent-dependent)
 Association
 Prediction
 Diagnostic test (dichotomous – continuous)
 Survival analysis (censored data)

Final words
 If valid data are analyzed improperly, then
the results become invalid and the
conclusions may well be inappropriate.
 At best, the net effect is to waste time,
effort, and money for the project.
 At worst, therapeutic decisions may well be
based upon invalid conclusions and patients’
wellbeing may be jeopardized.

Thank you

